Hi,
On 03/03/2020 09:48, Miklos Szeredi wrote:
> On Tue, Mar 3, 2020 at 10:26 AM Miklos Szeredi <miklos@xxxxxxxxxx> wrote:
>> On Tue, Mar 3, 2020 at 10:13 AM David Howells <dhowells@xxxxxxxxxx> wrote:
>>> Miklos Szeredi <miklos@xxxxxxxxxx> wrote:
>>>> I'm doing a patch. Let's see how it fares in the face of all these
>>>> preconceptions.
>>> Don't forget the efficiency criterion. One reason for going with fsinfo(2) is
>>> that scanning /proc/mounts when there are a lot of mounts in the system is
>>> slow (not to mention the global lock that is held during the read).
> BTW, I do feel that there's room for improvement in userspace code as
> well. Even quite big mount table could be scanned for *changes* very
> efficiently. I.e. cache previous contents of /proc/self/mountinfo and
> compare with new contents, line-by-line. Only need to parse the
> changed/added/removed lines.
> Also it would be pretty easy to throttle the number of updates so
> systemd et al. wouldn't hog the system with unnecessary processing.
> Thanks,
> Miklos
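For concreteness, the diff-based scan described above might look roughly like this (an illustrative Python sketch, not an existing tool; the class name, the 0.2s throttle interval, and the polling model are all assumptions):

```python
import time

def diff_mount_table(old_text: str, new_text: str):
    """Return (added, removed) mountinfo lines between two snapshots,
    so that only the delta needs to be re-parsed."""
    old_lines = set(old_text.splitlines())
    new_lines = set(new_text.splitlines())
    return new_lines - old_lines, old_lines - new_lines

class MountTableWatcher:
    """Hypothetical watcher: rescans /proc/self/mountinfo on demand,
    coalescing bursts of change events via a minimum rescan interval."""

    def __init__(self, path="/proc/self/mountinfo", min_interval=0.2):
        self.path = path
        self.min_interval = min_interval  # throttle: at most one rescan per interval
        self.last_scan = 0.0
        self.snapshot = ""

    def poll(self):
        now = time.monotonic()
        if now - self.last_scan < self.min_interval:
            return None  # coalesce: skip this burst, catch it on the next rescan
        self.last_scan = now
        with open(self.path) as f:
            new = f.read()
        added, removed = diff_mount_table(self.snapshot, new)
        self.snapshot = new
        return added, removed
```

Since the diff compares whole lines, any rate of intermediate changes collapses into a single delta at the next rescan, which is where the throttling claim comes from.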
At least having patches to compare would allow us to look at the
performance here and gain some numbers, which would help frame the
discussion. However, I'm not seeing how it would be easy to throttle
updates... they occur at whatever rate they are generated, and that rate
can be fairly high. I'm also not sure that I follow how the notifications
and the dumping of the whole table are synchronized in this case, either.
Al has pointed out before that a single mount operation on a subtree can
generate a large number of changes on that subtree. That kind of
scenario will need to be handled efficiently so that we don't miss
things, minimize the possibility of overruns, and keep the additional
overhead on the mount changes themselves low by keeping the notification
messages small.
We should also look at what the likely worst case might be. I seem to
remember from what Ian has said in the past that there can be tens of
thousands of autofs mounts on some large systems. I assume the worst
case might be something like that, multiplied by however many
containers might be on a system. Can anybody think of a situation which
might require even more mounts?
The network subsystem had a similar problem: it uses rtnetlink for
routing information, and just like the proposal here it provides a
dump mechanism and a way to listen to events (add/remove routes) which
is synchronized with that dump. Ian did start looking at netlink some
time ago. It has some issues of its own (it is tied to the network
namespace rather than the fs namespace, and it has accumulated various
things over the years that we don't need for filesystems), but it was
part of the original inspiration for the fs notifications.
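To illustrate the rtnetlink pattern being referred to, here is a minimal sketch of a synchronized dump-plus-subscribe: one netlink socket is bound to the route multicast group and then issues a dump request, so the dump and subsequent events arrive in order on the same socket. The constants are from <linux/rtnetlink.h>; the function names are illustrative, and this is Linux-only:

```python
import socket
import struct

NETLINK_ROUTE = 0
RTM_GETROUTE = 26
NLM_F_REQUEST = 0x1
NLM_F_DUMP = 0x300          # NLM_F_ROOT | NLM_F_MATCH
NLMSG_DONE = 3
RTMGRP_IPV4_ROUTE = 0x40

def build_route_dump_request(seq: int = 1) -> bytes:
    """struct nlmsghdr (16 bytes) followed by struct rtmsg (12 bytes),
    asking the kernel to dump the IPv4 routing table."""
    rtmsg = struct.pack("=BBBBBBBBI", socket.AF_INET, 0, 0, 0, 0, 0, 0, 0, 0)
    nlmsghdr = struct.pack("=IHHII", 16 + len(rtmsg), RTM_GETROUTE,
                           NLM_F_REQUEST | NLM_F_DUMP, seq, 0)
    return nlmsghdr + rtmsg

def subscribe_and_dump():
    """Hypothetical helper: subscribe to route change events, then dump.
    Messages up to NLMSG_DONE are the dump; later ones are live events."""
    s = socket.socket(socket.AF_NETLINK, socket.SOCK_RAW, NETLINK_ROUTE)
    s.bind((0, RTMGRP_IPV4_ROUTE))  # join the route add/del multicast group
    s.send(build_route_dump_request())
    return s
```

The key property is that a change racing with the dump shows up either in the dump or as a subsequent event on the same socket, never lost between the two; that is the attribute the mount-notification interface would want to reproduce.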
There is also, of course, /proc/net/route, which can be useful in many
circumstances, but for efficiency and synchronization reasons it is not
the interface of choice for routing protocols. David's proposal has a
number of the important attributes of an rtnetlink-like (in a conceptual
sense) solution, and I remain skeptical that a sysfs or similar
interface would be an efficient solution to the original problem, even
if it might perhaps make a useful addition.
There is also the chicken-and-egg issue, in the sense that if the
interface is via a filesystem (sysfs, proc or whatever), how does one
receive a notification for that filesystem itself being mounted, when
the notification cannot be delivered until after the mount? Maybe that
is not a particular problem, but I think a cleaner solution would not
require a mount in order to watch for other mounts.
Steve.