On 4/1/22 2:40 AM, Miklos Szeredi wrote:
> On Wed, 30 Mar 2022 at 19:49, Jens Axboe <axboe@xxxxxxxxx> wrote:
>>
>> On 3/30/22 9:53 AM, Jens Axboe wrote:
>>> On 3/30/22 9:17 AM, Jens Axboe wrote:
>>>> On 3/30/22 9:12 AM, Miklos Szeredi wrote:
>>>>> On Wed, 30 Mar 2022 at 17:05, Jens Axboe <axboe@xxxxxxxxx> wrote:
>>>>>>
>>>>>> On 3/30/22 8:58 AM, Miklos Szeredi wrote:
>>>>>>> Next issue: seems like file slot reuse is not working correctly.
>>>>>>> Attached program compares reads using io_uring with plain reads of
>>>>>>> proc files.
>>>>>>>
>>>>>>> In the below example it is using two slots alternately but the number
>>>>>>> of slots does not seem to matter, read is apparently always using a
>>>>>>> stale file (the prior one to the most recent open on that slot). See
>>>>>>> how the sizes of the files lag by two lines:
>>>>>>>
>>>>>>> root@kvm:~# ./procreads
>>>>>>> procreads: /proc/1/stat: ok (313)
>>>>>>> procreads: /proc/2/stat: ok (149)
>>>>>>> procreads: /proc/3/stat: read size mismatch 313/150
>>>>>>> procreads: /proc/4/stat: read size mismatch 149/154
>>>>>>> procreads: /proc/5/stat: read size mismatch 150/161
>>>>>>> procreads: /proc/6/stat: read size mismatch 154/171
>>>>>>> ...
>>>>>>>
>>>>>>> Any ideas?
>>>>>>
>>>>>> Didn't look at your code yet, but with the current tree, this is the
>>>>>> behavior when a fixed file is used:
>>>>>>
>>>>>> At prep time, if the slot is valid it is used. If it isn't valid,
>>>>>> assignment is deferred until the request is issued.
>>>>>>
>>>>>> Which granted is a bit weird. It means that if you do:
>>>>>>
>>>>>> <open fileA into slot 1, slot 1 currently unused><read slot 1>
>>>>>>
>>>>>> the read will read from fileA. But for:
>>>>>>
>>>>>> <open fileB into slot 1, slot 1 is fileA currently><read slot 1>
>>>>>>
>>>>>> since slot 1 is already valid at prep time for the read, the read will
>>>>>> be from fileA again.
>>>>>>
>>>>>> Is this what you are seeing? It's definitely a bit confusing, and the
>>>>>> only reason why I didn't change it is because it could potentially break
>>>>>> applications. Don't think there's a high risk of that, however, so may
>>>>>> indeed be worth it to just bite the bullet and the assignment is
>>>>>> consistent (eg always done from the perspective of the previous
>>>>>> dependent request having completed).
>>>>>>
>>>>>> Is this what you are seeing?
>>>>>
>>>>> Right, this explains it. Then the only workaround would be to wait
>>>>> for the open to finish before submitting the read, but that would
>>>>> defeat the whole point of using io_uring for this purpose.
>>>>
>>>> Honestly, I think we should just change it during this round, making it
>>>> consistent with the "slot is unused" use case. The old use case is more
>>>> more of a "it happened to work" vs the newer consistent behavior of "we
>>>> always assign the file when execution starts on the request".
>>>>
>>>> Let me spin a patch, would be great if you could test.
>>>
>>> Something like this on top of the current tree should work. Can you
>>> test?
>>
>> You can also just re-pull for-5.18/io_uring, it has been updated. A last
>> minute edit make a 0 return from io_assign_file() which should've been
>> 'true'...
>
> Yep, this works now.
>
> Next issue: will get ENFILE even though there are just 40 slots.
> When running as root, then it will get as far as invoking the OOM
> killer, which is really bad.
>
> There's no leak, this apparently only happens when the worker doing
> the fputs can't keep up. Simple solution: do the fput() of the
> previous file synchronously with the open_direct operation; fput
> shouldn't be expensive... Is there a reason why this wouldn't work?

I take it you're continually reusing those slots? If you have a test
case that'd be ideal. Agree that it sounds like we just need an
appropriate breather to allow fput/task_work to run. Or it could be the
deferral free of the fixed slot.

--
Jens Axboe
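
For reference, the slot assignment rule discussed in the quoted text can
be illustrated with a toy model. This is only a sketch in Python, not
kernel code and not the attached procreads program: the function name
`simulate_reads`, the `assign_at_prep` flag, and the idea of standing in
for each file by its size are all assumptions made for illustration. It
models a chain of <open into fixed slot><read slot> pairs where both
requests are prepared before the open executes.

```python
def simulate_reads(sizes, nr_slots, assign_at_prep):
    """Toy model of linked <open into slot><read slot> request pairs.

    sizes: size of each file, in the order the files are opened
           (a file is represented only by its size here).
    assign_at_prep: True models the old rule (a valid slot is bound to
           the read at prep time), False models the new rule (the file
           is always assigned when the read starts executing).
    Returns the size that each read observes.
    """
    slots = [None] * nr_slots
    observed = []
    for i, size in enumerate(sizes):
        slot = i % nr_slots  # reuse the slots in round-robin order

        # Prep time: the read is prepared before the open has run, so a
        # slot that is already valid still holds the previous file.
        bound = slots[slot] if assign_at_prep else None

        # Issue time: the open runs and installs the new file.
        slots[slot] = size

        # The read uses the file bound at prep time if there was one,
        # otherwise it resolves the slot now.
        observed.append(bound if bound is not None else slots[slot])
    return observed


if __name__ == "__main__":
    # Expected sizes taken from the procreads output quoted above.
    sizes = [313, 149, 150, 154, 161, 171]
    print("old rule:", simulate_reads(sizes, 2, assign_at_prep=True))
    print("new rule:", simulate_reads(sizes, 2, assign_at_prep=False))
```

With two slots, the old rule makes each read lag two files behind the
matching open once the slots have been filled, which is the same pattern
as the mismatches in the report. With assignment at issue time every
read sees the file that was just opened.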