On Thu, Nov 07, 2024 at 02:09:19AM GMT, Aleksa Sarai wrote: > On 2024-11-06, Amir Goldstein <amir73il@xxxxxxxxx> wrote: > > On Wed, Nov 6, 2024 at 12:00 PM Amir Goldstein <amir73il@xxxxxxxxx> wrote: > > > > > > On Wed, Nov 6, 2024 at 10:59 AM Christian Brauner <brauner@xxxxxxxxxx> wrote: > > > > > > > > On Wed, Nov 06, 2024 at 02:09:58PM +1100, Aleksa Sarai wrote: > > > > > overlayfs helpfully provides a lot of of information when setting up a > > > > > mount, but unfortunately when using the fsopen(2) API, a lot of this > > > > > information is mixed in with the general kernel log. > > > > > > > > > > In addition, some of the logs can become a source of spam if programs > > > > > are creating many internal overlayfs mounts (in runc we use an internal > > > > > overlayfs mount to protect the runc binary against container breakout > > > > > attacks like CVE-2019-5736, and xino_auto=on caused a lot of spam in > > > > > dmesg because we didn't explicitly disable xino[1]). > > > > > > > > > > By logging to the fs_context, userspace can get more accurate > > > > > information when using fsopen(2) and there is less dmesg spam for > > > > > systems where a lot of programs are using fsopen("overlay"). Legacy > > > > > mount(2) users will still see the same errors in dmesg as they did > > > > > before (though the prefix of the log messages will now be "overlay" > > > > > rather than "overlayfs"). > > > > > > I am not sure about the level of risk in this format change. > > > Miklos, WDYT? > > > > > > > > > > > > > [1]: https://bbs.archlinux.org/viewtopic.php?pid=2206551 > > > > > > > > > > Signed-off-by: Aleksa Sarai <cyphar@xxxxxxxxxx> > > > > > --- > > > > > > > > To me this sounds inherently useful! So I'm all for it. > > > > > > > > > > [CC: Karel] > > > > > > I am quite concerned about this. > > > I have a memory that Christian suggested to make this change back in > > > the original conversion to new mount API, but back then mount tool > > > did not print out the errors to users properly and even if it does > > > print out errors, some script could very well be ignoring them. > > I think Christian mentioned this at LSF/MM (or maybe LPC), but it seems > that util-linux does provide the log information now in the case of > fsconfig(2) errors: > > % strace -e fsopen,fsconfig mount -t overlay -o userxattr=str x /tmp/a > fsopen("overlay", FSOPEN_CLOEXEC) = 3 > fsconfig(3, FSCONFIG_SET_STRING, "source", "foo", 0) = 0 > fsconfig(3, FSCONFIG_SET_STRING, "userxattr", "str", 0) = -1 EINVAL (Invalid argument) > mount: /tmp/a: fsconfig system call failed: overlay: Unexpected value for 'userxattr'. > dmesg(1) may have more information after failed mount system call. > > (Using the current HEAD of util-linux -- openSUSE's util-linux isn't > compiled with support for fsopen apparently.) After failed mount-related syscalls, libmount reads messages prefixed with "e " from the file descriptor created by fdopen(). These messages are later printed by mount(8). mount(8) or libmount does not read anything from kmesg. > However, it doesn't output any of the info-level ancillary > information if there were no errors. This is the expected default behavior. mount(8) does not print any additional information. We can enhance libmount to read and print other messages on stdout if requested by the user. For example, the mount(8) command has a --verbose option that is currently only used by some /sbin/mount.<type> helpers, but not by mount(8) itself. We can improve this and use it in libmount to read and print info-level messages. I can prepare a libmount/mount(8) patch for this. > So there will definitely be some loss of > information for pr_* logs that don't cause an actual error (which is a > little unfortunate, since that is the exact dmesg spam that caused me to > write this patch). > > I could take a look at sending a patch to get libmount to output that > information, but that won't help with the immediate issue, and this > doesn't help with the possible concern with some script that scrapes > dmesg. (Though I think it goes without saying that such scripts are kind > of broken by design -- since unprivileged users can create overlayfs > mounts and thus spam the kernel log with any message, there is no > practical way for a script to correctly get the right log information > without using the new mount API's logging facilities.) > I can adjust this patch to only include the log+return-an-error cases, > but that doesn't really address your primary concern, I guess. > > > > My strong feeling is that suppressing legacy errors to kmsg should be opt-in > > > via the new mount API and that it should not be the default for libmount. > > > IMO, it is certainly NOT enough that new mount API is used by userspace > > > as an indication for the kernel to suppress errors to kmsg. For me, it seems like we are mixing two things together. kmesg is a *log*, and tools like systemd read and save it. It is used for later issue debugging or by log analyzers. This means that all relevant information should be included. The stderr/stdout output from tools such as mount(8) is simply feedback for users or scripts, and informational messages are just hints. They should not be considered a replacement for system logging facilities. The same applies to messages read from the new mount API; they should not be a replacement for system logs. In my opinion, it is acceptable to suppress optional and unimportant messages and not save them into kmesg. However, all other relevant messages should be included regardless of the tool or API being used. Additionally, it should be noted that mount(8)/libmount is only a userspace tool and is not necessary for mounting filesystems. The kernel should not rely on libmount behavior; there are other tools available such as busybox. > I can see an argument for some kind of MS_SILENT analogue for > fsconfig(), though it will make the spam problem worse until programs > migrate to setting this new flag. Yes, the ideal solution would be to have mount options that can control this behavior. This would allow users to have control over it and save their settings to fstab, as well as keep it specific to the mount node. > Also, as this is already an issue ever since libmount added support for > the new API (so since 2.39 I believe?), I think it would make just as > much sense for this flag to be opt-in -- so libmount could set the > "verbose" or "kmsglog" flag by default but most normal programs would > not get the spammy behaviour by default. I prefer if the default behavior is defined by the kernel, rather than by userspace tools like libmount. If we were to automatically add any mount options through libmount, it would make it difficult to coexist with settings in fstab, etc. It's always better to have transparency and avoid any hidden factors in the process. Karel -- Karel Zak <kzak@xxxxxxxxxx> http://karelzak.blogspot.com