Hi, Here's v3 of the LOOKUP_CACHED change. It's exposed as both a flag for openat2(), and it's used internally by io_uring to speed up (and make more efficient) the openat/openat2 support there. As posted in the v3 thread, performance numbers for various levels of the filename lookup already being cached: Cached 5.10-git 5.10-git+LOOKUP_CACHED Speedup --------------------------------------------------------------- 33% 1,014,975 900,474 1.1x 89% 545,466 292,937 1.9x 100% 435,636 151,475 2.9x The more cache hot we are, the faster the inline LOOKUP_CACHED optimization helps. This is unsurprising and expected, as a thread offload becomes a more dominant part of the total overhead. If we look at io_uring tracing, doing an IORING_OP_OPENAT on a file that isn't in the dentry cache will yield: 275.550481: io_uring_create: ring 00000000ddda6278, fd 3 sq size 8, cq size 16, flags 0 275.550491: io_uring_submit_sqe: ring 00000000ddda6278, op 18, data 0x0, non block 1, sq_thread 0 275.550498: io_uring_queue_async_work: ring 00000000ddda6278, request 00000000c0267d17, flags 69760, normal queue, work 000000003d683991 275.550502: io_uring_cqring_wait: ring 00000000ddda6278, min_events 1 275.550556: io_uring_complete: ring 00000000ddda6278, user_data 0x0, result 4 which shows a failed nonblock lookup, then punt to worker, and then we complete with fd == 4. This takes 65 usec in total. Re-running the same test case again: 281.253956: io_uring_create: ring 0000000008207252, fd 3 sq size 8, cq size 16, flags 0 281.253967: io_uring_submit_sqe: ring 0000000008207252, op 18, data 0x0, non block 1, sq_thread 0 281.253973: io_uring_complete: ring 0000000008207252, user_data 0x0, result 4 shows the same request completing inline, also returning fd == 4. This takes 6 usec. Using openat2, we see that an attempted RESOLVE_CACHED open of an uncached file will fail with -EAGAIN, and a subsequent attempt will too as it's still not cached. ls the file and retry, and we successfully open it with RESOLVE_CACHED: [test@archlinux ~]$ ./openat2-cached /etc/nanorc open: -1 openat2: Resource temporarily unavailable [test@archlinux ~]$ ./openat2-cached /etc/nanorc open: -1 openat2: Resource temporarily unavailable [test@archlinux ~]$ ls -al /etc/nanorc -rw-r--r-- 1 root root 10066 Dec 17 16:15 /etc/nanorc [test@archlinux ~]$ ./openat2-cached /etc/nanorc open: 3 Minor polish since v3: - Rename LOOKUP_NONBLOCK -> LOOKUP_CACHED, and ditto for the RESOLVE_ flag. This better explains what the feature does, making it more self explanatory in terms of both code readability and for the user visible part. - Remove dead LOOKUP_NONBLOCK check after we've dropped LOOKUP_RCU already, spotted by Al. - Add O_TMPFILE to the checks upfront, so we can drop the checking in do_tmpfile(). -- Jens Axboe