From: Christian Brauner <christian.brauner@xxxxxxxxxx> The new openat2() syscall verifies that no unknown O-flag values are set and returns an error to userspace if they are while the older open syscalls like open() and openat2() simply ignore unknown flag values: #define O_FLAG_CURRENTLY_INVALID (1 << 31) struct open_how how = { .flags = O_RDONLY | O_FLAG_CURRENTLY_INVALID, .resolve = 0, }; /* fails */ fd = openat2(-EBADF, "/dev/null", &how, sizeof(how)); /* succeeds */ fd = openat(-EBADF, "/dev/null", O_RDONLY | O_FLAG_CURRENTLY_INVALID); However, openat2() silently truncates the upper 32 bits meaning: #define O_FLAG_CURRENTLY_INVALID_LOWER32 (1 << 31) #define O_FLAG_CURRENTLY_INVALID_UPPER32 (1 << 40) struct open_how how_lowe32 = { .flags = O_RDONLY | O_FLAG_CURRENTLY_INVALID_LOWE32, .resolve = 0, }; struct open_how how_upper32 = { .flags = O_RDONLY | O_FLAG_CURRENTLY_INVALID_LOWE32, .resolve = 0, }; /* fails */ fd = openat2(-EBADF, "/dev/null", &how_lower32, sizeof(how_lower32)); /* succeeds */ fd = openat2(-EBADF, "/dev/null", &how_upper32, sizeof(how_upper32)); That seems like a bug. Fix it by preventing the truncation in build_open_flags(). There's a snafu here though stripping FMODE_* directly from flags would cause the upper 32 bits to be truncated as well due to integer promotion rules since FMODE_* is unsigned int, O_* are signed ints (yuck). This change shouldn't regress old open syscalls since they silently truncate any unknown values. Cc: Christoph Hellwig <hch@xxxxxx> Cc: Aleksa Sarai <cyphar@xxxxxxxxxx> Cc: Al Viro <viro@xxxxxxxxxxxxxxxxxx> Cc: linux-fsdevel@xxxxxxxxxxxxxxx Reported-by: Richard Guy Briggs <rgb@xxxxxxxxxx> Signed-off-by: Christian Brauner <christian.brauner@xxxxxxxxxx> --- fs/open.c | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/fs/open.c b/fs/open.c index e53af13b5835..96644aa325eb 100644 --- a/fs/open.c +++ b/fs/open.c @@ -1002,12 +1002,17 @@ inline struct open_how build_open_how(int flags, umode_t mode) inline int build_open_flags(const struct open_how *how, struct open_flags *op) { - int flags = how->flags; + u64 flags = how->flags; + u64 strip = FMODE_NONOTIFY | O_CLOEXEC; int lookup_flags = 0; int acc_mode = ACC_MODE(flags); - /* Must never be set by userspace */ - flags &= ~(FMODE_NONOTIFY | O_CLOEXEC); + /* + * Strip flags that either shouldn't be set by userspace like + * FMODE_NONOTIFY or that aren't relevant in determining struct + * open_flags like O_CLOEXEC. + */ + flags &= ~strip; /* * Older syscalls implicitly clear all of the invalid flags or argument -- 2.27.0