Andreas Gruenbacher <agruenba@xxxxxxxxxx> writes: > On Tue, May 24, 2016 at 5:41 PM, Djalal Harouni <tixxdz@xxxxxxxxx> wrote: >> On Mon, May 23, 2016 at 03:09:49PM +0200, Andreas Gruenbacher wrote: >>> Currently, getxattr() and setxattr() check for the xattr names >>> "system.posix_acl_{access,default}" and perform in-place UID / GID >>> namespace mappings in the xattr values. Filesystems then again check for >>> the same xattr names to handle those attributes, almost always using the >>> standard posix_acl_{access,default}_xattr_handler handlers. This is >>> unnecessary overhead; move the namespace conversion into the xattr >>> handlers instead. >> >> Please, are you sure that the changes in posix_acl_xattr_get() and >> posix_acl_xattr_set() are safe ? you are reading into current user >> namespace, from a first view this is not safe unless I'm missing >> something... they should map into init_user_ns... > > Yes, moving the namespace conversion from the VFS into those functions > so that we don't have to check for those attributes and parse them > twice is exactly the point of this patch. In general I am in favor of cleaning up the xattr and acl code in the kernel. However I am not certain that your changes succeed in getting us where we want to go. My feel is that what we want to do is to update the cached acl when we write it from userspace, to update the disk with the new acl when the inode is sync'd to disk, and let the vfs handle the translation from the cached posix acl to the vfs getxattr and setxattr system calls. In the long term anything else is madness. Currently posix acl reads and updates bypass the vfs acl cache for the inode and that just looks wrong. The reason that fixup happens in a separate pass from everything else today is that when I was wrote posix_acl_fix_xattr_to_user and posix_acl_xattr_from_user a number of filesystems had a very strange structure that completely bypassed any code conversion routines and made some strange assumptions. I don't remember which filesystems those were but it was neither lustre, nor cifs, nor nfs that were the problem cases when I was looking. I don't see your patch addressing that problem so either someone has already fixed those issues or they have been overlooked. There is a complication that is comming shortly (next merge window unless major unfixable bugs show up between now and them). There is a new field s_user_ns that is being introduced for filesystems to record their owner and their default translation rules to kuids. That will make the hard coded &init_user_ns values in your patch wrong. >> Please Cc the user namespace maintainers before. Thank you! > > Eric, Andy, anyone else? Serge Hallyn has a pending patch that adds a similar translation to the security.capability xattr. Which is one more case of where caching and translation at the VFS layer are makes sense. All of that said I am definitely in favor of cleaning up this area of code. Eric -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html