#11939 Kanban - OpenShift app bringup
Closed: Fixed with Explanation 16 days ago by zlopez. Opened 24 days ago by frantisekz.

Describe what you would like us to do:


We (Fedora QE Team) would like (ehm, not that, been told that we need) to deploy a new web app to the Fedora oc cluster. I've already dumped the initial plays and roles to the ansible repository in https://pagure.io/fedora-infra/ansible/c/6b6718dda4531f2fb8fb81e9f78f5e2d0d3aa15d?branch=main and https://pagure.io/fedora-infra/ansible/c/9877831c7cba00c3728daef529214b22c4f54b79?branch=main .

Now, I'd like to ask for help with setting up a bunch of things (it's been some time since I've added apps to the cluster):

  • secrets - I've put some into kanban_secrets.txt in my home dir (/home/fedora/frantisekz) on batcave in name:value notation; I'd need them added to the storage
  • oidc tokens - are those self-service, or should they be requested here?
  • database - should I just add some randomized password, or would someone else need to create the user/pw/db on the database cluster?
  • dns - I believe somebody with more sudo rights would need to do this
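
For illustration only, a file in name:value notation like the one mentioned in the first bullet could be read with a few lines of Python. This is a sketch under assumptions: the real kanban_secrets.txt is not shown in this ticket, `parse_secrets` is a hypothetical helper, and the handling of blank lines and `#` comments is a guess at a sensible convention.

```python
def parse_secrets(text):
    """Parse 'name:value' lines into a dict.

    Assumptions (not from the ticket): blank lines and lines starting
    with '#' are skipped, and only the first ':' separates name from value,
    so values may themselves contain colons.
    """
    secrets = {}
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition(":")
        if not sep or not name.strip():
            raise ValueError(f"line {lineno}: expected name:value")
        secrets[name.strip()] = value.strip()
    return secrets


# Placeholder values, not real secrets.
example = "kanban_secret_key_stg: not-a-real-secret\nkanban_db_user: kanban\n"
parsed = parse_secrets(example)
print(sorted(parsed))
```

Splitting on the first colon only keeps values such as URLs intact.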

Thanks a lot!

pinging @jskladan for updates.

When do you need this to be done by? (YYYY/MM/DD)



The secrets could be provided in the same way as in https://pagure.io/fedora-infra/ansible/blob/9877831c7cba00c3728daef529214b22c4f54b79/f/roles/openshift-apps/kanban/templates/client-secrets.json. Is this OK for you?

The OIDC tokens will need to be created by somebody from sysadmin team, but we will provide you an ansible variable you can use. Looking at https://pagure.io/fedora-infra/ansible/blob/9877831c7cba00c3728daef529214b22c4f54b79/f/roles/openshift-apps/kanban/templates/client-secrets.json it seems that you already have the token available.

For database we need to create the new user/password/db, but you can choose the username and db name.

And for DNS, yes, that needs more privileged rights. I'm not sure how it works for OpenShift, though; @kevin should be able to fill in the gaps.

Metadata Update from @zlopez:
- Issue priority set to: Waiting on Assignee (was: Needs Review)
- Issue tagged with: OpenShift, medium-gain, medium-trouble, ops

24 days ago

@zlopez thank you!

We don't have the OIDC tokens or any other ansible variables (secret or not) set up (AFAIK); the file you referenced was just created based on the OpenShift configs of the rest of our apps.

We'll need these variables to be set up and accessible (and that is, IMO, what @frantisekz is referring to in the first bullet point):

kanban_secret_key:
kanban_secret_key_stg:
kanban_db_name:
kanban_db_user:
kanban_db_pass:
kanban_db_pass_stg:
kanban_oidc_client_id:
kanban_oidc_client_id_stg:
kanban_oidc_secret:
kanban_oidc_secret_stg:

WRT the DB name/username - would probably make the most sense to just set it to kanban, I guess.
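
As a minimal sketch (not part of the actual playbooks), the variable list above could be checked mechanically before a playbook run. The `defined_vars` dict stands in for whatever the private vars provide; how those variables are really loaded is not shown in this ticket, and `missing_vars` is a hypothetical helper.

```python
# Variable names copied from the list in this comment.
REQUIRED_VARS = [
    "kanban_secret_key",
    "kanban_secret_key_stg",
    "kanban_db_name",
    "kanban_db_user",
    "kanban_db_pass",
    "kanban_db_pass_stg",
    "kanban_oidc_client_id",
    "kanban_oidc_client_id_stg",
    "kanban_oidc_secret",
    "kanban_oidc_secret_stg",
]


def missing_vars(defined_vars):
    """Return the required names that are absent or empty, in list order."""
    return [name for name in REQUIRED_VARS if not defined_vars.get(name)]


# Placeholder data: only the DB name and user are set here.
defined_vars = {"kanban_db_name": "kanban", "kanban_db_user": "kanban"}
for name in missing_vars(defined_vars):
    print("missing:", name)
```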

So, out of curiosity here... what is this app? I guess it's a kanban board? I fear naming it 'kanban' generically will cause some folks to assume they can use it for any fedora thing... is this targeted only to QE folks? or ?

I assume the kanban view in pagure is not sufficient for your needs?

@kevin I'm more than open to renaming it, I just used the repo name, as we did with the other apps.

It is a kanban board that aggregates tickets from multiple Pagure (GitHub planned in the near future) repositories into a single view, and allows tracking story points/velocity and (in the future) other agile/scrum metrics.

The MVP I put together now does not support "multiple teams" using it (i.e. targeted to our team specifically), but if there's interest, I don't see it being an unreasonably huge RFE to fulfill.

Does that answer your question? As I said - I'm more than happy to rename the thing (within the OpenShift context), if you feel the name is possibly confusing.

@kevin WRT the naming - are you concerned about the name within OpenShift, or the proposed url kanban.qa.fp.o? I read the message as "within openshift/infra repo", but I can see it being either way.

Anyway, we have no issue going with anything you feel is appropriate. Neither the name, nor the subdomain are of any particular importance to us.

Well, my thought was that community folks would see 'hey, a kanban app!' and want to use it... The url does make it more clear that it's for qa folks.

Anyhow, I don't want to block anything here, just trying to see what scope and such is. ;)

Metadata Update from @zlopez:
- Issue assigned to zlopez

18 days ago

I will take this one and start working on it. I will create everything for staging first, we can later do it for production.

One question about OIDC entry. What e-mail contact should be used there?

Thanks a ton!

Let's make it fzatlouk[at]redhat.com

All the secrets for staging should be in place. I created the db and db user as well (hopefully I set the correct db permissions as well) and executed the ipsilon playbook to reflect the changes in OIDC.

So everything should be ready to deploy the kanban app on staging. Let me know if anything is missing.

It seems there are some missing acls:

$ sudo rbac-playbook openshift-apps/kanban.yml -l staging
NOTIFY: [rbac-playbook] FAILURE frantisekz ran openshift-apps/kanban.yml
('Details:  \n'
 "limit: ['staging']  \n"
 'check: False  \n'
 'tags: None  \n'
 'user: None  \n'
 'start_at_task: None  \n'
 'Sha256: 6d996ffa33c60fe74466c26c66f40b0b3e58ae326565b12aa27f0e1a87bd679e')
user frantisekz is not authorized to run openshift-apps/kanban.yml

Could you try it now? I added the acls for sysadmin-qa, as it needs a group to be specified.

Nope :(, I tried it (after re-logging in over ssh) and it ends with the same issue.

I probably need to run some playbook as well, let me look.

Try it now, I ran the batcave playbook with rbac tag.

Play is running now, thanks!

And obviously failing ... :D

fatal: [os-control01.stg.iad2.fedoraproject.org]: FAILED! => {"changed": true, "cmd": "oc -n kanban apply --validate=strict -f /etc/openshift_apps/kanban/deploymentconfig.yml", "delta": "0:00:00.368934", "end": "2024-05-28 14:13:24.327423", "msg": "non-zero return code", "rc": 1, "start": "2024-05-28 14:13:23.958489", "stderr": "error: error parsing /etc/openshift_apps/kanban/deploymentconfig.yml: error converting YAML to JSON: yaml: line 41: did not find expected key", "stderr_lines": ["error: error parsing /etc/openshift_apps/kanban/deploymentconfig.yml: error converting YAML to JSON: yaml: line 41: did not find expected key"], "stdout": "", "stdout_lines": []}

Which points to this in our dc:

        env: |-
          {{ load_file('envvars.jinja') | indent(8) }}

I'd guess there are missing vars in the secret storage?

According to the error it's expecting JSON and the file has YAML structure.

I checked the ansible-private and all the keys are there. You can try to run it with -vv to get more info. I would like to know which key is actually missing.

@zlopez to me, it does not read as "I'm expecting json, and it is yaml" but "Encountered an error when converting the /etc/openshift_apps/kanban/deploymentconfig.yml YAML file to JSON", and the error is "did not find expected key (on line 41)"

The {{ load_file .. }} Jinja line reads the envvars.jinja file, renders it, and outputs the rendered outcome to the current template (indenting it 8 spaces).

Since the deploymentconfig is yaml, I'd think the rendered contents should be YAML too.

Looking at the template, we still might have an error there - the |- should probably not be there, since the contents will be rendered as a string. Maybe it is the root cause? That would also make sense with the error, since the parser is probably expecting env to be a dictionary, and not a string.

But I'm no expert here :/
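
To make the indentation behaviour concrete, here is a small sketch. It assumes `load_file` simply returns the rendered text and that Jinja's `indent` filter is used with its default options, where the first line is not indented. `jinja_like_indent` is a simplified stand-in written for this example, and the env var contents are hypothetical, since envvars.jinja is not shown in the ticket.

```python
def jinja_like_indent(text, width):
    """Simplified stand-in for Jinja's indent filter with default options:
    every line except the first (and except blank lines) gets `width` spaces."""
    lines = text.splitlines()
    pad = " " * width
    return "\n".join(
        line if i == 0 or not line else pad + line
        for i, line in enumerate(lines)
    )


def leading_spaces(line):
    """Count the spaces at the start of a line."""
    return len(line) - len(line.lstrip(" "))


# Hypothetical rendered contents of envvars.jinja.
envvars = "- name: A\n  value: '1'\n- name: B\n  value: '2'"

# The template places the expression 10 spaces in, then applies indent(8).
rendered = "        env: |-\n" + " " * 10 + jinja_like_indent(envvars, 8)
print(rendered)

widths = [leading_spaces(line) for line in rendered.splitlines()]
print(widths)  # [8, 10, 10, 8, 10]
```

In this sketch the first list item lands at column 10 and the second at column 8, so the items no longer line up under `env`. Whether that is what the parser tripped over at line 41 is not confirmed anywhere in this ticket.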

@jskladan That could be the cause, but it's always hard to tell when working with templates. We can try to remove it and see if it helps.

I don't have access, sadly. Hopefully @frantisekz will get back online soon :D

The issues in the ansible were resolved.

@zlopez is it possible that the psql user was created wrong? The mid-hook of the deployment is failing at CREATE TABLE with "no schema has been selected to create in":

INFO [alembic.runtime.migration] Will assume transactional DDL.
Traceback (most recent call last):
File "/opt/app-root/lib64/python3.11/site-packages/sqlalchemy/engine/base.py", line 1971, in _exec_single_context
self.dialect.do_execute(
File "/opt/app-root/lib64/python3.11/site-packages/sqlalchemy/engine/default.py", line 919, in do_execute
cursor.execute(statement, parameters)
psycopg2.errors.InvalidSchemaName: no schema has been selected to create in
LINE 2: CREATE TABLE alembic_version (

Yeah, @jskladan hacked a small python-psql shell (as I am unable to connect to db01 or db01.stg from batcave via psql as I used to do), and:

on our other app on stg:

SELECT current_schema();
('public',)

and for kanban user/db:

SELECT current_schema();
(None,)
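
The actual script is not shown in this ticket. A hypothetical reconstruction of such a tiny query helper, written against the generic Python DB-API, might look like the sketch below. It uses an in-memory sqlite3 database as a stand-in so it runs anywhere; against PostgreSQL one would pass a psycopg2 connection instead and run `SELECT current_schema();`.

```python
import sqlite3


def run_query(conn, sql):
    """Execute one statement on a DB-API connection and return all rows."""
    cur = conn.cursor()
    try:
        cur.execute(sql)
        return cur.fetchall()
    finally:
        cur.close()


# Stand-in connection; sqlite has no current_schema(), so a literal is
# selected just to show the shape of the printed rows.
conn = sqlite3.connect(":memory:")
for row in run_query(conn, "SELECT 'public'"):
    print(row)
```

Each row prints as a tuple, which matches the shape of the output quoted above.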

That is possible, this was the first time I created a new db with users in our postgres DB, let me check what I missed.

I added a few more permissions to kanban_user, could you try it now?

Now, it ended with:

psycopg2.errors.InsufficientPrivilege: permission denied for schema public
LINE 2: CREATE TABLE alembic_version

We're getting somewhere, but I believe we're still missing some perms?

I hope I fixed that now (granted CREATE privilege on public schema to kanban_user). Could you try again?

Yes, it worked fine now, thanks!
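
For reference, a sketch of the kind of statements involved. Only the CREATE grant on the public schema is confirmed by the comment above; the USAGE grant is an assumption added for completeness, and the exact commands that were run are not shown in this ticket. `schema_grants` is a hypothetical helper that only builds the SQL text.

```python
import re

# Conservative check so only plain lowercase identifiers are interpolated.
_IDENT = re.compile(r"^[a-z_][a-z0-9_]*$")


def schema_grants(role, schema="public"):
    """Build GRANT statements giving `role` access to `schema`."""
    for ident in (role, schema):
        if not _IDENT.match(ident):
            raise ValueError(f"unexpected identifier: {ident!r}")
    return [
        # Assumption: USAGE is included so the role can resolve the schema.
        f"GRANT USAGE ON SCHEMA {schema} TO {role};",
        # Confirmed by the thread: CREATE on the public schema.
        f"GRANT CREATE ON SCHEMA {schema} TO {role};",
    ]


for statement in schema_grants("kanban_user"):
    print(statement)
```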

We're now crash-looping for reasons on our side, will investigate.

Nice, let me know if you need anything else from me.

Okay, we've resolved the deployment issues on our side, some more things if I may ask:

We have changed the oidc endpoint:

From: "https://kanban.qa{{env_suffix}}.fedoraproject.org/oidc_callback"
To: "https://kanban.qa{{env_suffix}}.fedoraproject.org/flask_oidc/authorize"

I believe this would need to be changed on the auth server side?
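
A small sketch of how the new callback URL expands per environment. It assumes `env_suffix` is ".stg" on staging and empty on production, which is consistent with the kanban.qa.stg name used elsewhere in this ticket but is not spelled out here; `redirect_uri` and `is_registered` are hypothetical helpers.

```python
def redirect_uri(env_suffix=""):
    """Expand the new callback URL for a given environment suffix."""
    return f"https://kanban.qa{env_suffix}.fedoraproject.org/flask_oidc/authorize"


def is_registered(uri, registered_uris):
    """Check for an exact match against the URIs known to the auth server."""
    return uri in registered_uris


# Placeholder registration still pointing at the old endpoint.
registered = ["https://kanban.qa.stg.fedoraproject.org/oidc_callback"]

print(redirect_uri(".stg"))
print(is_registered(redirect_uri(".stg"), registered))
```

With the old endpoint still registered, the check fails for the new URL, which is why the change has to be made on the auth server as well.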

And apart from that, can you proceed with production env preparation (db, oidc)?

@kevin Would you have time/cycles to add dns? Production would be ready asap once @zlopez has time to prepare the env (I do believe route needs to be up first, if I am not mistaken).

Let me change the OIDC endpoint first and I will start preparing everything for production as well.

The OIDC endpoint is now changed. I will start preparing the secrets for production and I will look at the DNS as well.

The DB, OIDC and secrets are now available for production. I will now look at the DNS.

So I looked in our DNS repository and can't find how this is done for OpenShift: I don't see anything in the documentation and can't find anything hosted on OpenShift here.

@kevin How is the DNS done for OpenShift? I probably missed something.

So, what dns exactly. ;)

Do you want:

kanban.qa.fedoraproject.org ?

or

qa.fedoraproject.org/kanban

Currently blockerbugs is set up like the second one. I think it would be consistent to just add this under there too? But I guess it's not a big deal either way.

For the first one we would have to add a dns entry in fedoraproject.org.template for kanban.qa and kanban.qa.stg and then add a website to playbooks/includes/proxies-websites and then a reverseproxy to playbooks/includes/proxies-reverseproxy.

For the second one we don't need dns or a new website, just a reverseproxy addition with qa.fedoraproject.org website.

kanban.qa.fedoraproject.org ?

This one please, if it's not a huge trouble.

One of the reasons (not the only one) is that we can have a separate oc project, where we tried some refactorings of the deployment that we'll be able to backport to our other applications.

ok, stg should be all live.

Will push prod out in a few.

Is there anything left to do here?

I think we did everything from our side. But let's leave this open until the app is deployed in production and working, so we can address any issues along the way.

Yes, production is up too, it seems to work just fine (minus the stuff not yet implemented in the app).

Huge thanks both to @zlopez and @kevin for getting this up so quickly!!!

(Feel free to close this one)

Metadata Update from @zlopez:
- Issue close_status updated to: Fixed with Explanation
- Issue status updated to: Closed (was: Open)

16 days ago

I'm glad everything is working for you :-)
