On Wed, 2022-03-02 at 02:31 +0800, Hao Xu wrote:
> > > +	ne = kmalloc(sizeof(*ne), GFP_NOWAIT);
> > > +	if (!ne)
> > > +		goto out;
> > IMHO, we need to handle -ENOMEM here, I cut off the error handling
> > when I did the quick coding. Sorry for misleading.

If you are correct, I would be shocked about this. I went back to my
'Linux Device Drivers' book, and nowhere is it mentioned that kmalloc()
can return anything other than a pointer.

No mention at all of the return value in the man page:
https://www.kernel.org/doc/htmldocs/kernel-api/API-kmalloc.html

API doc:
https://www.kernel.org/doc/html/latest/core-api/mm-api.html?highlight=kmalloc#c.kmalloc

header file:
https://elixir.bootlin.com/linux/latest/source/include/linux/slab.h#L522

I did browse through the kmalloc() code. There are a lot of paths to
cover, but from a preliminary reading, it pretty much seems that
kmalloc() only returns a valid pointer or NULL...

/**
 * kmem_cache_alloc - Allocate an object
 * @cachep: The cache to allocate from.
 * @flags: See kmalloc().
 *
 * Allocate an object from this cache. The flags are only relevant
 * if the cache has no available objects.
 *
 * Return: pointer to the new object or %NULL in case of error
 */

/**
 * __do_kmalloc - allocate memory
 * @size: how many bytes of memory are required.
 * @flags: the type of memory to allocate (see kmalloc).
 * @caller: function caller for debug tracking of the caller
 *
 * Return: pointer to the allocated memory or %NULL in case of error
 */

I'll need someone else to confirm the possible kmalloc() return values,
perhaps with an example. I am a bit skeptical that something special
needs to be done here...

Or perhaps you are suggesting that io_add_napi() return an error code
when the allocation fails, as done here:
https://elixir.bootlin.com/linux/latest/source/arch/alpha/kernel/core_marvel.c#L867

If that is what you suggest, what would this info do for the caller?
IMHO, it wouldn't help in any way...

> > 
> > @@ -7519,7 +7633,11 @@ static int __io_sq_thread(struct io_ring_ctx
> > *ctx, bool cap_entries)
> >  		    !(ctx->flags & IORING_SETUP_R_DISABLED))
> >  			ret = io_submit_sqes(ctx, to_submit);
> >  		mutex_unlock(&ctx->uring_lock);
> > -
> > +#ifdef CONFIG_NET_RX_BUSY_POLL
> > +		if (!list_empty(&ctx->napi_list) &&
> > +		    io_napi_busy_loop(&ctx->napi_list))
> 
> I'm afraid we may need lock for sqpoll too, since io_add_napi() could
> be in iowq context.
> 
> I'll take a look at the lock stuff of this patch tomorrow, too late
> now in my timezone.

Ok, please do. I'm not a big user of io workers, so I may have omitted
to consider this possibility. If that is the case, I think this would
be very easy to fix by locking the spinlock while __io_sq_thread() is
using napi_list.

> > How about:
> > if (list is singular) {
> > 	do something;
> > 	return;
> > }
> > while (!io_busy_loop_end() && io_napi_busy_loop())
> > 	;

Is there a concern with the current code? What would be the benefit of
your suggestion over the current code?

To me, it seems that if io_blocking_napi_busy_loop() is called, a
reasonable expectation is that some busy looping actually gets done;
otherwise, the function could return without doing anything, which
would, IMHO, be misleading.

By definition, napi_busy_loop() is not blocking, and if you want the
device to stay in busy-poll mode, you need to busy poll it once in a
while, or else, after a certain time, the device will return to its
interrupt mode.
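To make that concrete, the shape I have in mind is roughly the
following (just a sketch: the arguments passed to io_napi_busy_loop()
and io_busy_loop_end() are assumed for illustration, this is not the
exact patch code):

	/*
	 * Busy poll at least once before letting loop_end() stop the
	 * loop, the same way napi_busy_loop() itself behaves.
	 */
	do {
		io_napi_busy_loop(&ctx->napi_list);
	} while (!io_busy_loop_end(ctx, start_time));
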
IOW, io_blocking_napi_busy_loop() follows the same logic as
napi_busy_loop(), which does not call loop_end() before having
performed one loop iteration.

> Btw, start_time seems not used in singular branch.

I know. This is why it is conditionally initialized.

Greetings,
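P.S. If what you meant is simply that io_add_napi() should propagate
the allocation failure to its caller, a minimal sketch would be
something like the following (the io_add_napi() prototype and the napi
entry struct name are assumed for illustration; only the kmalloc() call
is from the patch):

static int io_add_napi(struct io_ring_ctx *ctx)
{
	struct napi_entry *ne;

	ne = kmalloc(sizeof(*ne), GFP_NOWAIT);
	if (!ne)
		return -ENOMEM;	/* kmalloc() hands back a valid pointer or NULL */

	/* fill in ne and link it into ctx->napi_list here */
	return 0;
}

But as said above, I don't see what the caller could usefully do with
that error in this path.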