On 7/25/20 2:31 AM, Pavel Begunkov wrote:
> That's not final for several reasons, but good enough for discussion.
> That brings io_kiocb down to 192B. I didn't try to benchmark it
> properly, but a quick nop test gave a +5% throughput increase:
> 7531 vs 7910 KIOPS with fio/t/io_uring
>
> The whole situation is obviously a bunch of tradeoffs. For instance,
> instead of shrinking it, we can inline apoll to speed up the apoll path.
>
> [2/2] is just for reference, I'm thinking about other ways to shrink it.
> e.g. ->link_list can be a singly linked list with linked timeouts
> storing a back-reference. This can turn out to be better, because
> that would move ->fixed_file_refs to the 2nd cacheline, so we won't
> ever touch the 3rd cacheline in the submission path.
> Any other ideas?

Nothing noticeable for me, still about the same performance. But
generally speaking, I don't necessarily think we need to go all in on
making this as tiny as possible. It's much more important to chase the
items where we only use 2 cachelines for the hot path, and then we have
the extra space in there already for the semi-hot paths like poll driven
retry. Yes, we're still allocating from a pool that has slightly larger
objects, but that doesn't really matter _that_ much. Avoiding an extra
kmalloc+kfree for the semi-hot paths is a bigger deal than making
io_kiocb smaller and smaller.

That said, for no-brainer changes, we absolutely should make it smaller.
I just don't want to jump through convoluted hoops to get there.

-- 
Jens Axboe
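
For illustration only, here is a minimal sketch of the layout idea discussed above: keep the hot submission fields in the first two cachelines and embed the semi-hot state after them, rather than allocating it separately. Everything in it is an assumption made for the example: struct demo_req, its field names, and the helpers are hypothetical and are not the real struct io_kiocb, and the 64-byte cacheline is assumed (it is consistent with 192B spanning three lines).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cachelines, so 192B == 3 lines. */
#define CACHELINE 64

/* Hypothetical request object; NOT the real struct io_kiocb. */
struct demo_req {
	/* Cachelines 0-1: fields touched on every submission. */
	uint64_t user_data;
	uint32_t opcode;
	uint32_t flags;
	uint64_t file_handle;
	uint64_t ctx_handle;
	unsigned char hot_rest[96];	/* stand-in for the other hot fields */

	/*
	 * Cacheline 2: semi-hot state kept inline, so a poll driven
	 * retry would not need a separate allocation.
	 */
	struct {
		uint64_t wait_head;
		uint32_t events;
		uint32_t done;
	} inline_poll;
};

/* Index of the cacheline that contains the given byte offset. */
static inline size_t cacheline_of(size_t offset)
{
	return offset / CACHELINE;
}

/* Number of cachelines needed to hold an object of the given size. */
static inline size_t lines_spanned(size_t size)
{
	return (size + CACHELINE - 1) / CACHELINE;
}

/* Fail the build if the hot/semi-hot boundary ever moves. */
_Static_assert(offsetof(struct demo_req, inline_poll) == 2 * CACHELINE,
	       "hot fields must fill exactly the first two cachelines");
```

The compile-time check is the useful part of the sketch: a later change that pushes a hot field past the second cacheline breaks the build instead of showing up as a quiet regression. The sketch says nothing about the actual field order in io_kiocb.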