Re: How much free space in /var is required for upgrades?

Nico Kadel-Garcia <nkadel@xxxxxxxxx> · Sat, 14 May 2022 08:35:10 -0400

On Fri, May 13, 2022 at 2:54 PM Jason L Tibbitts III <j@xxxxxx> wrote:
>
> So I went to do a dnf system-upgrade from F35 to F36 on a test machine,
> as part of my usual testing.  In the middle of the process, it appears
> that /var filled up and that left the system in an unfortunate state.
> Surprisingly (to me) it did boot with a random mix of F35 and F36
> packages and even though it's a throwaway test box, I wanted to play
> around with fixing it a bit and trying to understand why it ran out of
> space instead of just reinstalling.

It can be a problem for bulky upgrades, and it's why I loathe the
"let's make every partition as small as possible" approach to laying
out disks. Clearing the dnf caches and then updating one package at a
time keeps each transaction small:

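    # Quote the glob so the shell does not expand it against local files
    dnf clean all --enablerepo='*'
    # Keep only the first field (the package name) of each check-update line
    dnf check-update | grep '^[a-zA-Z0-9]' | while read -r name _rest; do
        dnf update "$name" -y
    done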

> Turns out that "dnf --releasever 36 --nogpgcheck remove --duplicates"
> was able to effectively reinstall everything in the system, and while running this
> /var filled up again.  When that happened, dnf couldn't even be aborted;
> I had to kill -9.  The culprit is the write-ahead log,
> /var/lib/rpm/rpmdb.sqlite-wal.  I resized /var and reran, and by the end
> of the process the file had grown to over 9GB:
>
> -rw-r--r--. 1 root root 9124576392 May 13 13:11 rpmdb.sqlite-wal
>
> Of course it immediately went to 0 once the transaction completed,
> though rpmdb.sqlite went from:
>
> -rw-r--r--. 1 root root 281739264 May 11 14:24 rpmdb.sqlite
>
> to
>
> -rw-r--r--. 1 root root 730648576 May 13 13:15 rpmdb.sqlite
>
> which seems... odd for what's effectively just reinstalling the existing
> package set.
>
> Anyway, obviously the solution is to make sure that /var is "big enough"
> before you do a system upgrade.  And we do have warnings about
> filesystems being too small, but nothing about needing an extra 10GB for
> this.  Certainly my case might be somewhat pathological and it was good
> that in the end I was able to get the system back into a useful state
> without wiping it.  But in the end I wonder:

It's an unusual situation. /var/cache/ is one of the culprits for this
kind of bulky upgrade.
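
To see how much of the problem is the cache, it helps to compare the
free space on the filesystem with what the cache directory holds. This
is a rough sketch; the function names (free_kb, used_kb, report) are
made up for illustration, and it relies only on POSIX df and du:

```shell
#!/bin/sh
# Illustrative helpers (hypothetical names, not part of dnf or rpm).

# Print the space available, in KiB, on the filesystem holding "$1".
free_kb() {
    # df -P guarantees one line per filesystem; column 4 is "Available"
    df -P -k "$1" | awk 'NR == 2 { print $4 }'
}

# Print the total space used, in KiB, by the directory tree at "$1".
used_kb() {
    # du -s -k prints "<KiB> <path>"; keep only the number
    du -s -k "$1" 2>/dev/null | awk '{ print $1 }'
}

# Print both figures for a directory and a cache directory beneath it.
report() {
    echo "free on $1: $(free_kb "$1") KiB"
    echo "used by $2: $(used_kb "$2") KiB"
}
```

Running `report /var /var/cache` as root before and after
`dnf clean all` shows how much space the cache was holding.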

> 1) Is it really expected that the wal file will grow to that size?
>
> 2) Is there anything to be done to reduce the size of the log?
>
> 3) Is there any better way to handle a lack of space in /var during an
> RPM transaction?
>
> 4) Can we estimate how large the file will grow, and refuse to start a
> system upgrade if there is not enough space?  Certainly we already do
> this to some degree, but it seems that the estimate of the required
> space is a bit too small.
>
>  - J<
> _______________________________________________
> devel mailing list -- devel@xxxxxxxxxxxxxxxxxxxxxxx
> To unsubscribe send an email to devel-leave@xxxxxxxxxxxxxxxxxxxxxxx
> Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
> List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
> List Archives: https://lists.fedoraproject.org/archives/list/devel@xxxxxxxxxxxxxxxxxxxxxxx
> Do not reply to spam on the list, report it: https://pagure.io/fedora-infrastructure
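
On question 4, a crude check can at least be done by hand before
starting. The sketch below is not what dnf does internally; the
function names and the 10 GiB threshold are assumptions, the latter
taken from the WAL size reported above:

```shell
#!/bin/sh
# Hypothetical pre-flight check, not part of dnf.

# Succeed only if the filesystem holding "$1" has at least "$2" KiB free.
enough_space() {
    avail=$(df -P -k "$1" | awk 'NR == 2 { print $4 }')
    [ -n "$avail" ] && [ "$avail" -ge "$2" ]
}

# 10 GiB expressed in KiB; an assumed figure, adjust to taste.
REQUIRED_KB=$((10 * 1024 * 1024))

# Print a verdict and return non-zero when there is too little space.
preflight() {
    if enough_space "$1" "$2"; then
        echo "ok: $1 has at least $2 KiB free"
    else
        echo "refusing: $1 has less than $2 KiB free" >&2
        return 1
    fi
}
```

For example:
`preflight /var "$REQUIRED_KB" && dnf system-upgrade download --releasever=36`
only starts the download when the check passes.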