#7 Use /boot on Btrfs by default
Opened 4 years ago by chrismurphy. Modified a day ago

Summary

  • Instead of a 1G ext4 volume, create a 1-2G Btrfs volume and subvolume to mount at /boot.
  • due to limited a11y and i18n support in GRUB, unlikely to use GRUB LUKS
  • a separate /boot volume is needed to support LUKS encrypted sysroot
  • /boot on btrfs and btrfs subvolume/snapshot is already possible and supported by BLS and GRUB and Anaconda

Implementation details:

  • mkfs.btrfs --mixed helps reduce data/metadata block group space management issues, and the tiny performance loss won't matter for this volume

Bonus: in the interest of Keeping Things Simple™, if anyone can figure out a way to:

  • support snapshots and rollback without /boot on Btrfs, i.e. either ext4 or FAT
  • advantages of using /boot on Btrfs across all of Fedora instead of ext4 or FAT
    • Btrfs has no journal, it should always be consistent, no fsck needed
    • In case of uncertainty about GRUB treelog support, use the notreelog mount option to turn all fsync() into a full file system sync(); slower but more robust and simpler for bootloaders (which do not do ext4 or XFS journal replay)

Metadata Update from @chrismurphy:
- Issue set to the milestone: Fedora 34
- Issue tagged with: Dev

4 years ago

Bootloader Spec: Fedora runs into a few problems with the spec. Anaconda doesn't create an Extended Boot Loader Partition when it should. $BOOT should be readable by the firmware, which means FAT. $BOOT should be mounted at /boot or /efi but not at /boot/efi.

EfiFs wraps GRUB read-only file system modules into EFI drivers. EFI firmware could be taught to read Btrfs /boot directly. It'd support Btrfs raid1. Does it detect csum mismatches and automatically use the good copy?

Background
Fedora has a Hidden GRUB menu feature. It depends on the ability of GRUB to modify the values of boot_success and boot_indeterminate located in the grubenv file, from within the GRUB preboot environment.

Some configurations (e.g. BIOS x86_64) place grubenv in /boot/grub2, which would be on Btrfs if we proceed with /boot on Btrfs. UEFI configurations can be ignored, since grubenv is on the FAT formatted EFI System partition.
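To make the grubenv format concrete, here is a minimal Python sketch of the environment block: a fixed 1024-byte file that starts with a "# GRUB Environment Block" header, holds key=value lines, and is padded with '#'. The function names (make_grubenv, parse_grubenv, set_variable) are made up for illustration, and the sketch ignores the escaping GRUB applies to special characters in values. It is not how grub2-editenv is implemented.

```python
GRUBENV_SIZE = 1024  # the environment block is exactly 1 KiB
GRUBENV_HEADER = b"# GRUB Environment Block\n"


def make_grubenv(variables):
    """Build a 1 KiB environment block from a dict of string variables."""
    body = GRUBENV_HEADER + b"".join(
        f"{key}={value}\n".encode() for key, value in variables.items()
    )
    if len(body) > GRUBENV_SIZE:
        raise ValueError("variables do not fit in the environment block")
    # Unused space is filled with '#' so the file size never changes.
    return body + b"#" * (GRUBENV_SIZE - len(body))


def parse_grubenv(block):
    """Return the key=value pairs stored in an environment block."""
    if len(block) != GRUBENV_SIZE or not block.startswith(GRUBENV_HEADER):
        raise ValueError("not a GRUB environment block")
    variables = {}
    for line in block.decode("utf-8", errors="replace").split("\n"):
        if not line or line.startswith("#"):
            continue  # header, comments, and '#' padding
        key, sep, value = line.partition("=")
        if sep:
            variables[key] = value
    return variables


def set_variable(block, key, value):
    """Return a new block with one variable changed; the size stays 1 KiB."""
    variables = parse_grubenv(block)
    variables[key] = value
    return make_grubenv(variables)


env = make_grubenv({"boot_success": "0", "boot_indeterminate": "1"})
env = set_variable(env, "boot_success", "1")
print(parse_grubenv(env))
```

Because the block has a fixed size, an update never needs to allocate or free space; it only rewrites bytes that already exist on disk, which is why GRUB can do it without a writable file system driver.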

Problem
GRUB file system drivers are read-only. GRUB writes to grubenv by directly modifying the blocks that hold the file.
Writing to a grubenv file located on Btrfs from GRUB would be indistinguishable from corruption, and therefore the GRUB btrfs driver currently refuses the save_env command when grubenv is on Btrfs (or ZFS, LUKS, md RAID, and maybe LVM).
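A toy Python model of why an in-place write looks like corruption on a checksumming file system. The ChecksummedStore class is hypothetical and is not Btrfs code. It uses zlib.crc32 as a stand-in checksum (Btrfs defaults to crc32c, which the Python standard library does not provide). The point is only that a writer which bypasses the file system changes the data but not the stored checksum.

```python
import zlib

BLOCK_SIZE = 4096


class ChecksummedStore:
    """Toy block store that keeps a checksum for every block it writes."""

    def __init__(self):
        self.blocks = {}
        self.csums = {}

    def write(self, number, data):
        # A file-system-aware writer updates data and checksum together.
        self.blocks[number] = bytes(data)
        self.csums[number] = zlib.crc32(data)

    def raw_overwrite(self, number, offset, patch):
        # A preboot writer patches the bytes on disk and knows nothing
        # about the checksum stored elsewhere.
        block = bytearray(self.blocks[number])
        block[offset:offset + len(patch)] = patch
        self.blocks[number] = bytes(block)

    def read(self, number):
        data = self.blocks[number]
        if zlib.crc32(data) != self.csums[number]:
            raise OSError(f"checksum mismatch on block {number}")
        return data


store = ChecksummedStore()
store.write(0, b"boot_success=0\n".ljust(BLOCK_SIZE, b"#"))
store.raw_overwrite(0, 0, b"boot_success=1\n")
try:
    store.read(0)
except OSError as err:
    print("read failed:", err)
```

From the reader's side, the intentional update and a flipped bit on failing media are the same event: the data no longer matches its checksum.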

Workarounds
I don't know whether it's possible or appropriate for GRUB to dynamically insert a boot parameter as a way of sending a message to the booted environment, or some other way of sending messages from preboot to boot.

A 7-year-old idea: put grubenv in the embed area.

Flags NODATASUM|NODATACOW being set on /boot/grub2/grubenv aren't sufficient. The grubenv is exactly 1KiB and therefore likely to be an inline extent inserted into the 16KiB leaf containing the grubenv's inode. It's still checksummed because metadata is always checksummed. There is a mount option, max_inline=0, that would prevent use of inline extents, but that may not be adequate for this use case. It probably needs to be a flag or xattr set per file. That way GRUB can easily determine this file can be modified, and the Linux btrfs driver knows not to make this file's extents inline. This suggests kernel fs/btrfs, btrfs-progs, and GRUB development work.

@javierm @jwrdegoede

support snapshots and rollback without /boot on Btrfs, i.e. either ext4 or FAT

To create a snapshot, one can back up the kernels and initrds under /boot/snapshots/[root snapshot name] and generate the appropriate bootloader configuration files.

A proof of concept (apologies for the project name) inspired loosely by SuSE's transactional update system. Root subvolumes are /snapshots/root/0 or /snapshots/root/1, with corresponding boot snapshot directories /boot/snapshots/0 and /boot/snapshots/1. See in particular the backup_bootdir() function.
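The boot-file backup described above can be sketched roughly as follows. This is not the proof of concept's backup_bootdir() function, only an illustration of the idea. The function name backup_boot_files and the file name patterns (vmlinuz-* and initramfs-*.img, following Fedora's usual naming) are assumptions, and generating the matching bootloader configuration is left out.

```python
import shutil
from pathlib import Path


def backup_boot_files(boot_dir, snapshot_name,
                      patterns=("vmlinuz-*", "initramfs-*.img")):
    """Copy kernels and initrds into <boot_dir>/snapshots/<snapshot_name>.

    Returns the names of the copied files. The patterns are an assumption
    about how kernels and initrds are named on the system.
    """
    boot = Path(boot_dir)
    dest = boot / "snapshots" / snapshot_name
    dest.mkdir(parents=True, exist_ok=True)

    copied = []
    for pattern in patterns:
        # glob() is not recursive here, so files already under snapshots/
        # are never picked up again.
        for src in sorted(boot.glob(pattern)):
            if src.is_file():
                shutil.copy2(src, dest / src.name)  # keeps mode and mtime
                copied.append(src.name)
    return copied
```

Calling backup_boot_files("/boot", "0") after creating root snapshot 0 would populate /boot/snapshots/0. This works the same on ext4 or FAT, at the cost of one full copy of the kernel and initrd per snapshot.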

Metadata Update from @ngompa:
- Issue set to the milestone: Future Release (was: Fedora 34)

4 years ago

Recently, @karolherbst asked me to consider changing Fedora's default Btrfs layout to use a subvolume for /boot instead of a separate partition, because the NVIDIA GSP firmware is huge and will exhaust the space available in /boot on most systems today.

Additionally, he asked for us to consider coming up with an easy migration mechanism for existing systems due to the nature of the reason for this conversion.

What about existing and future LUKS installations? (Yes it's all much more straightforward with fscrypt btrfs but we can't assume exclusively fscrypt layouts.)

GRUB can support LUKS. But the installer doesn't support configuring GRUB for LUKS. And even if it did, we then lock out alternate bootloaders until they too support LUKS. It shifts the problem around significantly.

Are efforts to get / mounted faster exhausted? And how does /boot on Btrfs help solve this problem with non-Btrfs installations? I assume RHEL is going to run into this problem too and will need some kind of non-Btrfs solution?

We now have btrfs-efi packaged in Fedora.

There's also a ticket about getting it secure boot signed: releng#12300

Looks like the folks at SUSE have made a patch for allowing Btrfs bootloader header space to be used for grubenv: https://build.opensuse.org/projects/Base:System/packages/grub2/files/grub2-grubenv-in-btrfs-header.patch

We should get this imported into RH/Fedora GRUB so that we can use it.
