#11505 Deploy new sign hardware/software
Opened 8 months ago by kevin. Modified 7 days ago

We are getting replacement hardware for our sign-vault machines, we have a new autosign machine, and a new sigul release is out. Our current setup is still on rhel8.

We need to:

  • get sigul-1.2 in epel9. (I am already doing this)
  • make backup of staging sigul data
  • reinstall autosign01.stg, sign-vault01.stg, sign-bridge01.stg with rhel9 and sigul 1.2
  • restore old data and confirm things work as expected
  • once the new sign-vault hardware arrives, install it with rhel9 and deploy sigul 1.2 and a copy of prod data
  • stop robosignatory and sign-bridge/sign-vault.
  • bring up new sign-vault, then bridge, then test
  • bring up new autosign01/robosignatory and confirm all is working.
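The production cutover order in the last three items can be written down as data, which makes it easy to review or print as a checklist. This is only a sketch: the `Step` class, the `CUTOVER` list and the helper functions are hypothetical names made up for illustration, nothing here talks to real hosts, and the relative order of stopping sign-bridge and sign-vault is an assumption (the list above only says both get stopped after robosignatory).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    action: str     # "stop", "start", or "test"
    component: str  # descriptive label, not a real systemd unit name


# Order taken from the list above. The bridge-before-vault stop order
# is an assumption; the ticket does not specify it.
CUTOVER = [
    Step("stop", "robosignatory"),
    Step("stop", "sign-bridge"),
    Step("stop", "sign-vault"),
    Step("start", "sign-vault"),
    Step("start", "sign-bridge"),
    Step("test", "signing end to end"),
    Step("start", "robosignatory"),
]


def render(plan):
    """Return the plan as numbered checklist lines."""
    return [f"{i}. {s.action} {s.component}" for i, s in enumerate(plan, 1)]


def started_before(plan, first, second):
    """True if `first` is started earlier in the plan than `second`."""
    starts = [s.component for s in plan if s.action == "start"]
    return starts.index(first) < starts.index(second)


if __name__ == "__main__":
    print("\n".join(render(CUTOVER)))
```

The `started_before` check is there so the "vault, then bridge, then robosignatory" ordering can be asserted rather than eyeballed if the plan gets edited later.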

Metadata Update from @zlopez:
- Issue priority set to: Waiting on Assignee (was: Needs Review)
- Issue tagged with: high-gain, medium-trouble, ops

8 months ago

Just a status update here: I've got sigul 1.2 built and working on epel9. I have sign-bridge01.stg and sign-vault01.stg installed with rhel9, and sigul 1.2 is set up there.

I ran into an issue making an ECC key. I need to sort that out and then sign something in staging to test end to end.

So, making progress.

There are a few small issues left in sigul. Hopefully they will get fixed in the next week or so.

Hardware has arrived at the datacenter and should be racked the week after next.

New hardware is in and racked.

I have installed sign-vault02 with rhel9.
The new sign-vault01 is racked, but I haven't installed anything on it yet.

The old sign-vault01 is still rhel8/old sigul. I need to test the new sigul and get everything working. Once it tests ok I will switch over to using sign-vault02 and then we can retire 01.

Just a (sorry, late) update here: I was poised to roll out the new version last week, but then... some security updates came along and took up all my time. ;(

If things are quiet this week, I might look at asking for a freeze break for it. Most everything is lined up, and it should be easy to back out if it causes issues.

F40 goes out tomorrow and the final freeze is over Wednesday... so I will probably try to land this later this week if I can.

ok. After 7 months... finally I managed to get things all deployed.

So, sign-bridge01 and sign-vault02 are rhel9 and running sigul-1.2

Please report any signing issues, but it seems to be working as expected.

Next steps:

  • We need to fix head-signing. It's not working correctly right now, so we need to get it fixed and then enable it.

  • I need to generate ima certs. I plan to do that monday and get kernel developers to test them, then publish them.

  • We need to sort out the pesign changes needed, how to hook them into kernel and systemd builds, and how to migrate from our current setup. I'll try to drive this forward next week also. This will allow systemd-boot signing and aarch64 kernel/systemd-boot signing.

  • I need to install/configure the other new vault server and decommission the existing sign-vault01 server, but I want to wait a week or so at least to make sure we don't need to roll back to it.

In case for some reason we do need to roll back:

  • Old rhel8 sign-bridge01 xml is on bvmhost-x86-03 in /root, and its disk is in sign-bridge01.iad2.fedoraproject.org-20240426, so take the rhel9 one down, redefine the old one from that xml, and rename the disk.
  • old sign-vault01 (rhel8) is still all there, stop sigul on sign-vault02, start on 01 (needs passphrases, etc).
  • bodhi-backend01 has sigul upgraded. Downgrade back to old version in f38 base repos.

Oh, and I need to fix a call to krb_login in sigul that was still there. ;(


Metadata
Boards 1
ops Status: Backlog