Re: [RFC PATCH] ext4: add unmount filesystem message

Jan Kara <jack@xxxxxxx> · Wed, 13 Apr 2022 10:16:44 +0200

On Wed 13-04-22 14:33:53, Zhang Yi wrote:
> On 2022/4/13 11:51, Darrick J. Wong wrote:
> > On Wed, Apr 13, 2022 at 10:23:31AM +0800, Zhang Yi wrote:
> >> On 2022/4/13 9:35, Theodore Ts'o wrote:
> >>> On Tue, Apr 12, 2022 at 12:01:37PM -0400, Gabriel Krisman Bertazi wrote:
> >>>> Zhang Yi <yi.zhang@xxxxxxxxxx> writes:
> >>>>
> >>>>> Now that we have a kernel message at mount time, system administrators
> >>>
> >>> "Now that we have...." is a bit misleading, since (at least to an
> >>> English speaker) it implies that this is something that was recently
> >>> added, and that's not the case.
> >>>
> >>>>> could acquire the mount time, device and options easily. But we don't
> >>>>> have a corresponding message at umount time, so we cannot easily tell
> >>>>> whether someone has unmounted a filesystem. Some modern filesystems
> >>>>> (e.g. xfs) print a kernel message on unmount, so add one for ext4
> >>>>> as well, for convenience.
> >>>>>
> >>>>>  EXT4-fs (sdb): mounted filesystem with ordered data mode. Quota mode: none.
> >>>>>  EXT4-fs (sdb): unmounting filesystem.
> >>>>
> >>>> I don't think sysadmins should be relying on the kernel log for this,
> >>>> since the information can easily be overwritten by new messages there.
> >>>> Is there a reason why you can't just monitor /proc/self/mountinfo?
> >>>
> >>> You're right that it can be dangerous for sysadmins to be relying on
> >>> the kernel log for mount and umount notifications --- but it depends
> >>> on what they think it means, and the potential pitfalls are there for
> >>> both the mount and unmount messages.  The problem, of course, is that
> >>> there are bind mounts and mount namespaces, so if the question is whether a
> >>> file system is available at a particular mount point, then using the
> >>> kernel log is definitely not going to be reliable.
> >>>
> >>> But if the goal is to determine whether a particular device is safe to
> >>> run fsck or otherwise access directly, or for the purposes of
> >>> debugging the kernel and looking at the logs to understand when the
> >>> device is being accessed by the kernel and when the file system is
> >>> done with the device, I can see how it might be useful.
> >>>
> >>
> >> Yes, I understand that the kernel log is not reliable, and neither is
> >> /proc/self/mountinfo. Our goal is simple: as Ted said, just add a
> >> way to help sysadmins know whether a particular ext4 device is really
> >> going through the unmount procedure, which could help us debug the
> >> kernel and locate kernel bugs.
> > 
> > But if the mount/unmount messages are ratelimited, how will you know for
> > sure if the ratelimiting mechanism elides the message?
> > 
> 
> It is expected that the messages are ratelimited; this is just a "best effort"
> way to let us gather more information. It's best if it writes something down,
> and not surprising if it doesn't. If the messages are ratelimited, we will get
> the "...suppressed" message and know what happened, and we will combine other
> logs (e.g. the systemd log) to make things as clear as possible.

Just to add my 2c: several times, when I was debugging some issue and
staring at the kernel logs, I was trying to figure out whether some ext4
filesystem was still mounted or not, and a message about unmounting a fs in
the kernel log would have been useful to me (e.g. when I was trying to
figure out whether a shared device with an ext4 filesystem really got mounted
from two nodes at once or whether it was first unmounted on one of the
nodes). Sure, you can live without that, and sure, it isn't 100% reliable in
all the corner cases, but it is convenient at times...
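
As a rough illustration of the kind of log reading described above, here is
a minimal sketch that reports the last mount/unmount event seen per device.
It assumes the two message formats quoted from the RFC earlier in this
thread (the "unmounting filesystem" line is the proposed message, not
something the kernel is confirmed to print), and the name last_known_state
is hypothetical. Because messages can be ratelimited or rotated out of the
log, the result is only a hint.

```python
import re

# Matches lines such as
#   EXT4-fs (sdb): mounted filesystem with ordered data mode. Quota mode: none.
#   EXT4-fs (sdb): unmounting filesystem.
# The second form is the message proposed in the RFC (an assumption here).
EXT4_EVENT = re.compile(
    r"EXT4-fs \((?P<dev>[^)]+)\): (?P<what>mounted|unmounting) filesystem"
)


def last_known_state(log_lines):
    """Return {device: "mounted" | "unmounted"} based on the last event per device."""
    state = {}
    for line in log_lines:
        match = EXT4_EVENT.search(line)
        if match is None:
            continue  # unrelated kernel message
        if match.group("what") == "mounted":
            state[match.group("dev")] = "mounted"
        else:
            state[match.group("dev")] = "unmounted"
    return state


if __name__ == "__main__":
    sample = [
        "EXT4-fs (sdb): mounted filesystem with ordered data mode. Quota mode: none.",
        "EXT4-fs (sdc): mounted filesystem with ordered data mode. Quota mode: none.",
        "EXT4-fs (sdb): unmounting filesystem.",
    ]
    for dev, what in sorted(last_known_state(sample).items()):
        print(dev, what)
```

A device missing from the result only means that no matching message
survived in the log, not that the device was never mounted.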

								Honza
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR