On Thu, Nov 28, 2019 at 8:29 PM Chris Murphy <lists@xxxxxxxxxxxxxxxxx> wrote:
> FAT, ext4, and XFS all set a kind of "dirty bit" upon mount, which is
> cleared on a clean unmount. So if the file system isn't mounted but
> the "dirty bit" is set, we can assume it was not cleanly unmounted.
> Both kernel code and each file system's fsck can detect this, and the
> message you see depends on which discovers the problem first. The
> subsequent messages about how the problem gets handled can, I think,
> be ignored; as you say, they will vary. All we care about is the
> indicator that the file system was not properly unmounted. Here are
> those indicators for each file system:
> FAT fsck (since /etc/fstab sets the EFI system partition's fs_passno
> to 2, this is what's displayed on default installations):
> Nov 28 12:04:21 localhost.localdomain systemd-fsck[681]: 0x41: Dirty bit is set. Fs was not properly unmounted and some data may be corrupt.
>
> FAT kernel:
> [ 205.317346] FAT-fs (vdb1): Volume was not properly unmounted. Some data may be corrupt. Please run fsck.
>
> ext4 fsck (since /etc/fstab sets fs_passno for /, /boot, and /home to
> 1 or 2, this is what's displayed on default installations):
> Nov 28 12:07:21 localhost.localdomain systemd-fsck[681]: /dev/vdb2: recovering journal
>
> ext4 kernel:
> [ 316.756778] EXT4-fs (vdb2): recovery complete
>
> XFS kernel (since /etc/fstab sets fs_passno for / to 0, this is the
> only message we should see on default installations):
> [ 372.027026] XFS (vdb3): Starting recovery (logdev: internal)
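
Side note: if we ever want to check a device's state directly instead
of grepping the journal, read-only checks along these lines should
work against unmounted devices (a rough, untested sketch; the
/dev/vdbX names are simply taken from the examples above):

    # FAT: -n is no-op mode; on a dirty volume it prints the same
    # "0x41: Dirty bit is set." message shown above
    fsck.fat -n /dev/vdb1

    # ext4: the superblock records whether the fs is "clean"
    dumpe2fs -h /dev/vdb2 | grep -i 'filesystem state'

    # XFS: -n is no-modify mode; it refuses to run if the log is dirty
    xfs_repair -n /dev/vdb3
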
> If the test case is constrained to default installations only, the
> messages to test for are:
> "0x41: Dirty bit is set"
> "recovering journal"
> "XFS" and "Starting recovery"
> If the test case is broader, accounting for non-default additional
> volumes that may not be listed in fstab or may not have fs_passno
> set, also include:
> "EXT4-fs" and "recovery complete"
> "FAT-fs" and "Volume was not properly unmounted"
> In each case I'm choosing the first message that indicates a previous
> unclean shutdown. Whether fsck or kernel messages, they should be
> fairly stable; I'm not expecting them to change multiple times per
> year.
Thanks, this covers it pretty thoroughly. We can put these patterns into one big `journalctl | grep` to detect unmount issues from the previous boot, which should make this easy to automate.
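Something along these lines, maybe (an untested sketch; the exact
regexes would need verifying against a real unclean boot):

    # Scan the previous boot's journal for any of the indicators above;
    # exit status 0 (a match) means the prior shutdown was unclean
    journalctl -b -1 | grep -E \
        -e 'Dirty bit is set' \
        -e 'recovering journal' \
        -e 'XFS .*: Starting recovery' \
        -e 'EXT4-fs .*: recovery complete' \
        -e 'FAT-fs .*: Volume was not properly unmounted'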
Who feels like updating the test case?
> The gotcha is: how would we know? If these messages change, the
> automated parse will fail to match them and will indicate a clean
> shutdown. *shrug*
If that happens and a bug appears, I guess somebody will tell us sooner or later and we'll fix it. Parsing text output can only take us so far.
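We could also exercise the detection itself once in a while: force an
unclean shutdown in a throwaway VM and check that the patterns still
match on the next boot. For example (run as root, in a VM you don't
care about):

    # SysRq 'b' reboots the machine immediately, skipping sync and
    # unmount, so the next boot should show the indicators above
    echo b > /proc/sysrq-trigger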
>>> Steps 4-7: I'm not following the purpose of these steps. What I'd
>>> like to see for step 4 is: if we get a bad result (any result 2
>>> messages), we need to collect the journal for the prior boot (`sudo
>>> journalctl -b-1 > journal.log`) and attach it to a bug report; or we
>>> could maybe parse for systemd messages suggesting it didn't get
>>> everything unmounted. But offhand I don't know what those messages
>>> would be; I'd have to go dig into systemd code to find them.
>>
>> I think the purpose is to verify that both reboot and poweroff shut
>> down the system correctly without any filesystem issues (which means
>> fully committed journals and no dirty bits set).
> Gotcha. Yeah, I think it's reasonable to test the LiveOS reboot as
> well as the installed system's reboot, to make sure they are both
> properly unmounting file systems.
Sounds reasonable to test both the LiveOS and the installed system. Does it make sense to test both the installed system's reboot and its poweroff, though? Are there any meaningful differences between them?