On Sun, Jan 09, 2022 at 07:43:26PM -0700, Jens Axboe wrote:
> On 1/9/22 7:38 PM, Ming Lei wrote:
> > On Sun, Jan 09, 2022 at 06:54:21PM -0700, Jens Axboe wrote:
> >> On 1/9/22 6:50 PM, Ming Lei wrote:
> >>> Only the last sbitmap_word can have a different depth; all the others
> >>> must have the same depth of 1U << sb->shift, so it is not necessary to
> >>> store it in sbitmap_word, and it can be retrieved easily and
> >>> efficiently by adding one internal helper, __map_depth(sb, index).
> >>>
> >>> No performance effect seen when running an IOPS test on null_blk.
> >>>
> >>> This way saves us one cacheline (usually 64 bytes) per sbitmap_word.
> >>
> >> We probably want to kill the ____cacheline_aligned_in_smp from 'word' as
> >> well.
> >
> > But sbitmap_deferred_clear_bit() is called in the put fast path, and
> > then the cacheline becomes shared with the get path, and I guess this
> > way isn't expected.
> 
> Just from 'word', not from 'cleared'. They will still be in separate
> cache lines, but it usually doesn't make sense to have the leading member
> marked as cacheline aligned; that's a whole-struct property at that
> point.

OK, got it, it is just a sort of code-style thing, not any functional
change. Will include it in V2.

Thanks,
Ming