Hi.
Please create a new VPC for fedora-ci/osci in AWS.
fedora-ci has two parts: tft and osci. TFT has 1 EKS cluster and OSCI has 1 EKS cluster. Both EKS clusters are located in 1 VPC.
From the beginning this VPC had 1 CIDR with 250 IPs, i.e. 2 subnets (in different zones) with 128 IPs each. Each container, each EC2 instance, and each EKS node takes an IP from these subnets. It became clear there are not enough IPs: we cannot add new nodes to EKS, and we cannot spin up new containers: https://pagure.io/fedora-infrastructure/issue/9462
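(A quick way to confirm the exhaustion is to ask EC2 for the free-IP count per subnet; a minimal sketch, with a hypothetical placeholder for the old VPC id:)

```
# Show each subnet in the old VPC with its remaining free-IP count
# (vpc-xxxxxxxx is a hypothetical placeholder for the old VPC id)
aws ec2 describe-subnets \
    --filters Name=vpc-id,Values=vpc-xxxxxxxx \
    --query 'Subnets[].{id:SubnetId,cidr:CidrBlock,free:AvailableIpAddressCount}' \
    --output table
```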
There is an upstream bug: https://github.com/aws/containers-roadmap/issues/170 Unfortunately, it is necessary to recreate the EKS cluster to update the subnets.
To prevent future collisions of TFT and OSCI resources, please create a new VPC, a new token (if necessary), and a new policy so that we can create a new EKS cluster. Please create a CIDR + subnets with size /19 each.
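(Roughly what is being asked for, as a sketch; the CIDRs and availability zones here are illustrative, not the exact values to use:)

```
# Create a VPC with room for several /19 subnets (example CIDR)
aws ec2 create-vpc --cidr-block 10.200.0.0/16

# Carve out two /19 subnets in different availability zones
# (vpc-xxxxxxxx is a placeholder for the id returned above)
aws ec2 create-subnet --vpc-id vpc-xxxxxxxx \
    --cidr-block 10.200.0.0/19 --availability-zone us-east-1a
aws ec2 create-subnet --vpc-id vpc-xxxxxxxx \
    --cidr-block 10.200.32.0/19 --availability-zone us-east-1b
```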
The osci/tft/eln processes are tightly connected, and this blocks eln/fedora-ci.
History ticket: https://pagure.io/fedora-infrastructure/issue/8958#comment-656010
@mvadkert @msrb @bookwar @sgallagh FYI
@mobrien @kevin hi guys, this is unfortunately blocking ELN testing; if you have time to prioritize it, that would help. Thank you!
@mvadkert We have reached the maximum number of VPCs for this region so I have requested an increase on our allowance. These requests can take some time.
If region is not an important factor, it would be quicker to move to a different one, with us-east-2 (Ohio) being the closest to us-east-1. The one drawback is that us-east-1 is the default region for a lot of things, so you will have to explicitly state the region much of the time. If using the aws cli, you can set a default region on your cli profile.
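For example, with the aws cli:

```
# Set us-east-2 as the default region for the default profile
aws configure set region us-east-2

# Or scope it to a named profile (the profile name here is an example)
aws configure set region us-east-2 --profile fedora-ci-osci
```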
Metadata Update from @humaton: - Issue tagged with: high-gain, medium-trouble, ops
It is ok to have the new VPC in a different region. us-east-2 sounds good.
Ok, I have created a new VPC in the us-east-2 region: vpc-0f6baa3d6bae8d912
There are 2 subnets:

- subnet-010f90da92f36876e 10.200.0.0/19
- subnet-0a704a759f7671044 10.200.32.0/19

The subnets are in 2 different availability zones to provide some high availability.
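(This can be verified from the cli against the new VPC:)

```
# Confirm both subnets, their CIDRs, and their availability zones
aws ec2 describe-subnets --region us-east-2 \
    --filters Name=vpc-id,Values=vpc-0f6baa3d6bae8d912 \
    --query 'Subnets[].[SubnetId,CidrBlock,AvailabilityZone]' \
    --output table
```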
@mobrien hi, thank you very much for the provided solution. I confirm that it works. I created an EKS cluster + node groups. The next step is moving our running EKS services to the new EKS cluster. The ticket can be closed. Thank you for your help!
Metadata Update from @smooge: - Issue close_status updated to: Fixed - Issue status updated to: Closed (was: Open)
Hi @mobrien, sorry to trouble you.

I cannot create an EC2 instance:
```
aws ec2 run-instances --image-id ami-000e7ce4dd68e7a11 \
    --key-name astepano \
    --instance-type r5.xlarge \
    --region us-east-2 \
    --subnet-id subnet-0a704a759f7671044 \
    --count 2 \
    --tag-specifications \
    'ResourceType=instance,Tags=[{Key=osci,Value=jenkinsslavecentos8},{Key=FedoraGroup,Value=ci}]' \
    'ResourceType=volume,Tags=[{Key=osci,Value=jenkinsslavecentos8},{Key=FedoraGroup,Value=ci}]'
```

```
An error occurred (UnauthorizedOperation) when calling the RunInstances operation:
You are not authorized to perform this operation.
Encoded authorization failure message: RXT...
```
I was able to run this command successfully in us-east-1.
```
aws sts get-caller-identity
{
    "UserId": "AIDAR2OOCKQW5VBCTDBGN",
    "Account": "125523088429",
    "Arn": "arn:aws:iam::125523088429:user/fedora-ci-osci"
}
```
Metadata Update from @astepano: - Issue status updated to: Open (was: Closed)
(Screenshot attached: Screenshot_from_2020-11-20_10-11-51.png)
I also tried via the web interface with my account (Federated Login: aws-fedora-ci/astepano).

The behavior is the same. This blocks us from moving to the new VPC.
@mobrien @kevin please take a look; we cannot move to the new VPC. This is very urgent: fedora-pipelines and ELN cannot work without it.
@astepano would you be able to provide me with the full encoded authorization failure, please? I can decode it to help find the issue.
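(For reference, an encoded failure message can be decoded with STS, assuming the caller is allowed sts:DecodeAuthorizationMessage; a minimal sketch, with the blob as a placeholder:)

```
# Decode an UnauthorizedOperation failure message into readable JSON;
# paste the encoded blob in place of the placeholder
aws sts decode-authorization-message \
    --encoded-message '<encoded-message-blob>' \
    --query DecodedMessage --output text
```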
This is from the command line for "Arn": "arn:aws:iam::125523088429:user/fedora-ci-osci":
```
An error occurred (UnauthorizedOperation) when calling the RunInstances operation:
You are not authorized to perform this operation.
Encoded authorization failure message: Twb5_4tRyd-QUC3SYbge7Ee4oYQekPuGdBcu5DNdyirYKWjWZIiK8v2puVToj0lB9B0FOMpMONNfbb0L8NEbCUq5B7usVfglsRdRSc9cKDkO3bMm_KQBwo7hlKG24hhm0aaKRspHy2y-TAVFgWOfQb8-wJjezsqC2AA3cdEJSvRaxc9q-J4o3ye9TcHW_9TkhEp8YANJWzw8sJ_H2ZpQQr8d_F9Qxv6_IXiu9zyb2qTm27JIq5bHlRKhs_9qcvYFxF3Hjx8w4QdIWQDaPYDt8C10RLMHpjD452QCzLztGpem-dzBeTf3HjEatfmZoHSBD0AWnukqGyn2e3cfZPurWBdDKNvflrsXNWmZRW4W66SgrXNIjDHB5LdauP8ehvOxPoZLc-adbbXto4k8kguZyvkGLFJYk7zpl5qJnA3jvHRc3t1yblS3OLD3wQ6NexHyzFtAYsJcdOAPWBRJ9Pcpfj2p4u57bZm4NH4Q40b3LwhuEQ-5aCx7vKLZtvYxPdL7SyqVLju1gFwsx0eUH-dZcB36G4xcs_D3wk8kWx34zwmOThx2Cs1hyPH0In7Tg0KKOD8nc_6IuQ9Wb2FKSCohB7fCtTDp67rlrw3Qu4-YqcGw5udF8m5BLbllaPix3XZCHC77mlpAi5zK3_tWpY_n2DYzdtmGmoYUHK3sPhdH6G6tGC-iGB917p8UulYFnZCJohBHpt1-H6PI
```
And this is from the web UI (Federated Login: aws-fedora-ci/astepano):
```
You are not authorized to perform this operation.
Encoded authorization failure message: oN5QwrPTItOyr2Whu4fL2eBW88Wxl67i7bGbxyLe107RGlBMGx3nsmEJWoaPJ6IHC1Y38_yegymI9T-XmOK4kS2zE8d0kArU1vjxdYXjp2hsicCdpja6sHrqs41RDN5i8d9OQV_QMIRwyRtNvRc9HgIqWr6jkAJM8B_kXgtObqn9L1saObvwrXNf5Rq3MPX400WivhusLb0t6ehKndxSwEyZvMYGMjVcGEUBqzstrBm4uC4p9uN1yIt8xJHLRhUnUkJXLyIckfc70zeDq3bmu_QF_Vzsrode_dPOPKhsLVfQFk3Hr04qVxjYFq65D3L3Uv_QXYjfd2Grls7vZ24IZi0EKHeqvyDQ0eRlwT8k1kiakQeqSg-4QlQSGoUDgolZf_B9a0BvI_uGsPj1b-xwCVzm0bnBPVncqcfM9I_mmobF7ZeA6UU4pkm7GELdBIyK-Is5dFUwaXo8SxwD2zsM8rvoaJ42j6yKMRu2X678RxgVmkbhqQMd5KNQa8VSUl90wszfC1PAf48VqdMXdRNPrIPkEdsqlXxa5eLL9AiskOt8wrEHtEi4KmhNYB_XjF7qFIE8tCV2ZfEAAMotT3hXmO4BHpxOxYGrTfbgjVJJnFLxAkVA3v3z4wY1QLJm7cj3RhnmDREHjwZ1Hp_KD3j18AsJSNYXCNHIiyaB6C9MzVA2VF03XmBTdwyvji7GbSkyTiO7J6aDZoOJwnjbKbc
```
The issue has now been resolved. It was due to the security group's FedoraGroup tag being set to aws-fedora-ci rather than ci.
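(For reference, correcting such a tag is a one-liner, since create-tags overwrites an existing value for the same key; a sketch with a hypothetical security-group id:)

```
# Overwrite the FedoraGroup tag on the security group
# (sg-xxxxxxxx is a placeholder for the real group id)
aws ec2 create-tags --region us-east-2 \
    --resources sg-xxxxxxxx \
    --tags Key=FedoraGroup,Value=ci
```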
Metadata Update from @mobrien: - Issue close_status updated to: Fixed - Issue status updated to: Closed (was: Open)