archive-repo-manager

Tool to manage the Fedora Archive yum repository

How to use the repo:

sudo tee /etc/yum.repos.d/fedora-updates-archive.repo <<'EOF'
# This is a repo that contains all the old update packages from the
# Fedora updates yum repository (i.e. the packages that have made it
# to "stable". This repo is needed for OSTree based systems where users
# may be trying to layer packages on top of a base layer that doesn't
# have the latest stable content. Since base layer content is locked
# the package layering operation will fail unless there is older versions
# of packages available.
#
# This repo is given a high cost in order to prefer the normal Fedora
# yum repositories, which means only packages that can't be found
# elsewhere will be downloaded from here.
[updates-archive]
name=Fedora $releasever - $basearch - Updates Archive
#baseurl=https://fedoraproject-updates-archive.s3.amazonaws.com/fedora/$releasever/$basearch/
baseurl=https://fedoraproject-updates-archive.fedoraproject.org/fedora/$releasever/$basearch/
enabled=1
metadata_expire=1d
repo_gpgcheck=0
type=rpm
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-$releasever-$basearch
skip_if_unavailable=True
# Make this repo a higher cost. We only want this repo to get
# used if needed packages can't be retrieved from any other repo.
# (The default cost is 1000.)
cost=10000
EOF

Then install packages as usual:

dnf install foo
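
On an OSTree based system (the main target for this repo) the equivalent package layering operation would look something like this, with foo as a placeholder package name:

rpm-ostree install foo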

Rough notes for running archive-repo-manager in a container:

HISTCONTROL='ignoreboth'
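# The leading space on the export lines below keeps the credentials out of
# shell history ('ignoreboth' includes 'ignorespace').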
 export S3BUCKET=dustymabe-archive-repo-poc
 export AWSACCESSKEYID=
 export AWSSECRETACCESSKEY=
 export AWS_ACCESS_KEY_ID=
 export AWS_SECRET_ACCESS_KEY=
podman build -t archive-repo-manager .
podman run -it --rm             \
    -e AWS_ACCESS_KEY_ID        \
    -e AWS_SECRET_ACCESS_KEY    \
    -e AWSACCESSKEYID           \
    -e AWSSECRETACCESSKEY       \
    -e S3BUCKET                 \
    --device /dev/fuse          \
    --name archive-repo-manager \
    archive-repo-manager

The two sets of env vars for AWS credentials are needed because the aws CLI uses one form and s3fs uses the other.
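
For example, you can fill in one form and mirror it into the other (xxx are placeholder values; the leading spaces keep them out of shell history):

 export AWS_ACCESS_KEY_ID=xxx
 export AWS_SECRET_ACCESS_KEY=xxx
 export AWSACCESSKEYID=$AWS_ACCESS_KEY_ID
 export AWSSECRETACCESSKEY=$AWS_SECRET_ACCESS_KEY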

If you'd like, you can add --entrypoint=/bin/bash to the podman run command. Then you can do the s3fs mount and run /usr/local/lib/archive_repo_manager.py by hand.
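
Roughly, assuming the same mount point used later in these notes (the exact path the script expects may differ):

podman run -it --rm -e AWSACCESSKEYID -e AWSSECRETACCESSKEY -e S3BUCKET \
    --device /dev/fuse --entrypoint=/bin/bash archive-repo-manager
# then, inside the container:
mkdir -p /tmp/bucket
s3fs -o uid=$(id -u),gid=$(id -g) $S3BUCKET /tmp/bucket
/usr/local/lib/archive_repo_manager.py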

Rough notes for running archive-repo-manager on FCOS:

Follow the README in the provisioning directory to bring up the instance using tofu in AWS.
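
The usual OpenTofu workflow applies; the exact variables and layout are described in that README:

cd provisioning/
tofu init
tofu apply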

After logging in you can switch to the worker user and monitor the systemd user units:

sudo machinectl shell worker@
journalctl -b0 --user
systemctl --user status
podman logs -f archive-repo-manager

Rough notes for creating a bucket in S3:

Set up credentials. One way is to use the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables.
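
For example (placeholder values):

export AWS_ACCESS_KEY_ID=xxx
export AWS_SECRET_ACCESS_KEY=xxx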

Then create the bucket:

aws s3 mb s3://myarchivebucket
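
You can confirm it was created by listing your buckets:

aws s3 ls | grep myarchivebucket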

Set the bucket to be completely public (nothing private is stored here).

https://docs.aws.amazon.com/AmazonS3/latest/dev/example-bucket-policies.html#example-bucket-policies-use-case-2

POLICY='{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": "*",
      "Action":["s3:GetObject","s3:GetObjectVersion"],
      "Resource": "arn:aws:s3:::myarchivebucket/*"
    }
  ]
}'
aws s3api put-bucket-policy --bucket myarchivebucket --policy "$POLICY"
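
If you want to double check what was applied:

aws s3api get-bucket-policy --bucket myarchivebucket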

Create an IAM policy that allows read/write access to the bucket.

POLICY='{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:*"
            ],
            "Resource": [
                "arn:aws:s3:::myarchivebucket"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:*"
            ],
            "Resource": [
                "arn:aws:s3:::myarchivebucket/*"
            ]
        }
    ]
}'
aws iam create-policy --policy-name read-write-archive-repo-s3-bucket --policy-document "$POLICY"
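
The output of the create-policy command includes the policy ARN, which is needed in the next step. If you need to look it up later, something like this works:

aws iam list-policies --scope Local --query "Policies[?PolicyName=='read-write-archive-repo-s3-bucket'].Arn" --output text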

Create a user and attach the policy to it.

# Optional: create new user first.
aws iam create-user --user-name userforarchiverepoaccess

# Attach the policy to the user (use the policy ARN from the create-policy output)
aws iam attach-user-policy --user-name userforarchiverepoaccess --policy-arn arn:aws:iam::011111111111:policy/read-write-archive-repo-s3-bucket

Then use the credentials for that user to manage the repo. The credentials used by the automation script for uploading to the S3 bucket should really be limited to that bucket and nothing else.
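
To generate an access key pair for the user (the key ID and secret are printed in the command output):

aws iam create-access-key --user-name userforarchiverepoaccess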

Rough notes for initializing the archive repo from scratch:

The directory structure we're adopting for now is fedora/${release}/${arch}. For Fedora 33 this looks like:

  • /fedora/33/aarch64/
  • /fedora/33/armhfp/
  • /fedora/33/ppc64le/
  • /fedora/33/s390x/
  • /fedora/33/x86_64/

To create the structure, mount up the newly created bucket using s3fs:

 export AWSACCESSKEYID=xxx
 export AWSSECRETACCESSKEY=xxx
mkdir /tmp/bucket/
s3fs -o uid=$(id -u),gid=$(id -g) $S3BUCKET /tmp/bucket
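
A quick sanity check that the mount is in place:

findmnt /tmp/bucket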

Then create the needed directory structure and the stub repo metadata:

RELEASES='31 32 33'
ARCHES='aarch64 armhfp ppc64le s390x x86_64'
pushd /tmp/bucket/
mkdir -p fedora && pushd fedora
for release in $RELEASES; do
    for arch in $ARCHES; do
        mkdir -p "${release}/${arch}"
        pushd "${release}/${arch}"
        createrepo_c --no-database --zck --zck-dict-dir "/usr/share/fedora-repo-zdicts/f${release}" .
        popd
    done
done
popd; popd
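
Afterwards you can spot check that the stub metadata landed in the bucket (using the example bucket name from earlier):

aws s3 ls s3://myarchivebucket/fedora/33/x86_64/repodata/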

Rough notes for draining a repo:

Once a release is EOL we don't need to keep the RPMs around, but we do want people to still be able to rebase from those EOL releases. To do that, remove all the RPMs from the repo for that release but leave the repodata in place. Something like:

aws s3 rm s3://fedoraproject-updates-archive/fedora/32/ --recursive --exclude '*' --include '*rpm' --dryrun

After verifying the dry run output, re-run the command without --dryrun to actually delete. Then you can verify only the repodata is left with:

aws s3 ls s3://fedoraproject-updates-archive/fedora/32/ --recursive