Eric W. Biederman <ebiederm@xxxxxxxxxxxx> wrote: > I have read through these patches and I noticed a significant issue. > > Today in mount_bdev we do something that looks like: > > mount_bdev(...) > { > s = sget(..., bdev); > if (s->s_root) { > /* Noop */ > } else { > err = fill_super(s, ...); > if (err) { > deactivate_locked_super(s); > return ERR_PTR(err); > } > s->s_flags |= SB_ATTIVE; > bdev->bd_super = s; > } > return dget(s->s_root); > } > > The key point is that we don't process the mount options at all if > a super block already exists in the kernel. Similar to what > your fscontext changes are doing (after parsing the options). Actually, no, that's not the case. The fscontext code *requires* you to parse the parameters *before* any attempt to access the superblock is made. Note that this will actually be a problem for, say, ext4 which passes a text string stored in the superblock through the parser *before* parsing the mount syscall data. Fun. I'm intending to deal with that particular case by having ext4 create multiple private contexts, one filled in from the user data, and then a second one filled in from the superblock string. These can then be validated one against the other before the super_block struct is published. And if the super_block struct already exists, the user's specified parameters can be validated against that. > Your fscontext changes do not improve upon this area of the mount api at > all and that concerns me. This is an area where people can and already > do shoot themselves in their feet. This *will* be dealt with, but I wanted to get the core changes upstream before tackling all the filesystems. The legacy wrapper is just that and should be got rid of when all the filesystems have been converted. > ... > > Creating a new mount and finding an old mount are the same operation in > the kernel today. This is fundamentally confusing. In the new api > could we please separate these two operations? > > Perhaps someting like: > x create > x find > > With the "x create" case failing if the filesystem already exists, > still allowing "x find"? And with the "x find" case failing if > the superblock is not already created in the kernel. No. What you're suggesting introduces a userspace-userspace and a userspace-kernel race - unless you're willing to let userspace lock against superblock creation by other parties. Further, some filesystems *have* to be parameterised before you can do the check for the superblock. Network filesystems, for example, where you have to set the network parameters upfront and the key to the superblock might not be known until you've queried the server. > That should make it clear to a userspace program what is going on > and give it a chance to mount a filesystem anyway. That said, I'm willing to provide a "fail if already extant" option if we think that's actually likely to be of use. However, you'd still have to provide parameters before the check can be made. > In a similar vein could we please clarify the rules for changing mount > options for an existing superblock are in the new api? You mean remount/reconfigure? Note that we have to provide backward compatibility with every single filesystem. > Today mount assumes that it has to provide all of the existing options to > reconfigure a mount. What people want to do and what most filesystems > support is just specifying the options that need to be changed. Can we > please make this the rule of how this are expected to work for fscontext? > That only changing mount options need to be specified before: "x > reconfigure" Fine by me - but it must *also* support every option being specified if that is what mount currently does. I don't really want to supply extra parsers if I can avoid it. Miklós, for example wanted a different, incompatible interface, so you'd do: write(fd, "o +foo"); write(fd, "o -bar"); write(fd, "x reconfig"); sort of thing to enable or disable options... but this assumes that options are binary and requires a separate parser to the one that does the initial configuration - and you still have to support the old remount data parse. I'm okay with specifying that you should just specify the options you want to change and that the normal way to 'disable' something is to prefix it with "no". I guess I could pass a flag through to indicate that this came from sys_mount(MS_REMOUNT) rather than the new method. David