On 5/16/22 15:06, Nico Kadel-Garcia wrote:
On Mon, May 16, 2022 at 6:39 AM Panu Matilainen <pmatilai@xxxxxxxxxx> wrote:
On 5/13/22 21:54, Jason L Tibbitts III wrote:
So I went to do a dnf system-upgrade from F35 to F36 on a test machine,
as part of my usual testing. In the middle of the process, it appears
that /var filled up and that left the system in an unfortunate state.
Surprisingly (to me) it did boot with a random mix of F35 and F36
packages and even though it's a throwaway test box, I wanted to play
around with fixing it a bit and trying to understand why it ran out of
space instead of just reinstalling.
Turns out that "dnf --releasever 36 --nogpgcheck remove --duplicates"
was able to effectively reinstall everything in the system, and while running this
/var filled up again. When that happened, dnf couldn't even be aborted;
I had to kill -9. The culprit is the write-ahead log,
/var/lib/rpm/rpmdb.sqlite-wal. I resized /var and reran, and by the end
of the process the file had grown to over 9GB:
-rw-r--r--. 1 root root 9124576392 May 13 13:11 rpmdb.sqlite-wal
Of course it immediately went to 0 once the transaction completed,
though rpmdb.sqlite went from:
-rw-r--r--. 1 root root 281739264 May 11 14:24 rpmdb.sqlite
to
-rw-r--r--. 1 root root 730648576 May 13 13:15 rpmdb.sqlite
which seems... odd for what's effectively just reinstalling the existing
package set.
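
For readers unfamiliar with SQLite's write-ahead log, the behaviour described
above (a -wal file that grows during a large write and drops to 0 afterwards)
can be reproduced in a minimal sketch. This is a generic SQLite illustration
using Python's standard sqlite3 module, not a description of how rpm manages
its database; the table name, row count and payload size are made up.

```python
import os
import sqlite3
import tempfile


def wal_sizes(row_count=20000):
    """Return (WAL size after commit, WAL size after a TRUNCATE checkpoint)."""
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "demo.sqlite")  # hypothetical database
        wal_path = db_path + "-wal"
        conn = sqlite3.connect(db_path)
        try:
            # Switch the database to write-ahead logging.
            conn.execute("PRAGMA journal_mode=WAL").fetchall()
            conn.execute(
                "CREATE TABLE pkg (id INTEGER PRIMARY KEY, payload TEXT)"
            )
            # One large transaction: every changed page lands in the WAL.
            conn.executemany(
                "INSERT INTO pkg (payload) VALUES (?)",
                (("x" * 500,) for _ in range(row_count)),
            )
            conn.commit()
            after_commit = os.path.getsize(wal_path)
            # Copy the WAL back into the main file and truncate it.
            conn.execute("PRAGMA wal_checkpoint(TRUNCATE)").fetchall()
            after_checkpoint = os.path.getsize(wal_path)
        finally:
            conn.close()
    return after_commit, after_checkpoint


if __name__ == "__main__":
    print(wal_sizes())
```

The first number is the log size while the changes are still sitting in the
WAL; the second is 0 once the checkpoint has run.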
Anyway, obviously the solution is to make sure that /var is "big enough"
before you do a system upgrade. And we do have warnings about
filesystems being too small, but nothing about needing an extra 10GB for
this. Certainly my case might be somewhat pathological and it was good
that in the end I was able to get the system back into a useful state
without wiping it. But in the end I wonder:
1) Is it really expected that the wal file will grow to that size?
No.
2) Is there anything to be done to reduce the size of the log?
Yeah, such as reporting incidents like this.
3) Is there any better way to handle a lack of space in /var during an
RPM transaction?
4) Can we estimate how large the file will grow, and refuse to start a
system upgrade if there is not enough space? Certainly we already do
this to some degree, but it seems that the estimate of the required
space is a bit too small.
Rpm has had a heuristic on the rpmdb growth for years, but no heuristics
can help against unexpected events eating the space.
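
As an illustration of the kind of pre-flight check being discussed, here is a
minimal sketch that refuses to proceed when a filesystem lacks a given amount
of free space. It is not rpm's or dnf's actual heuristic; the function name,
the reserve fraction and the way the required size is chosen are all
assumptions.

```python
import shutil


def has_room(path, required_bytes, reserve_fraction=0.05):
    """Return True if `path` has `required_bytes` free beyond a safety reserve.

    `reserve_fraction` is a hypothetical margin (a share of the filesystem's
    total size) kept back for anything else that writes there meanwhile.
    """
    usage = shutil.disk_usage(path)
    reserve = int(usage.total * reserve_fraction)
    return usage.free - reserve >= required_bytes


if __name__ == "__main__":
    # Size of the write-ahead log reported earlier in this thread.
    observed_wal_bytes = 9_124_576_392
    if not has_room("/var", observed_wal_bytes):
        print("Not enough free space on /var; refusing to start.")
```

A check like this only covers what is known before the transaction starts; it
cannot account for space consumed by something else while it runs.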
An in-place system upgrade is not an "unexpected event". It is a risky
transaction.
I never said an upgrade is unexpected, that'd be absurd. It's a
long-running (and indeed risky) process that is transactional only in
the sense that it will not start if the pre-flight check does not pass.
The big space pig is not /var/lib/rpm: it's /var/cache/dnf, which can
be quite flooded by updated packages for tool suites such as openoffice
or tetex. Another of my favorites for such in-place upgrades is to
take a package list beforehand and delete such bulky suites, to
re-install them after the upgrade is complete.
That stuff is already downloaded when the space calculations happen.
- Panu -