Re: [PATCH v2] mm: alloc_pages_bulk: remove assumption of populating only NULL elements

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2025/3/4 17:17, Qu Wenruo wrote:
> 
> 
> 在 2025/2/28 20:14, Yunsheng Lin 写道:
>> As mentioned in [1], it seems odd to check NULL elements in
>> the middle of page bulk allocating, and it seems caller can
>> do a better job of bulk allocating pages into a whole array
>> sequentially without checking NULL elements first before
>> doing the page bulk allocation for most of existing users.
>>
>> Through analyzing of bulk allocation API used in fs, it
>> seems that the callers are depending on the assumption of
>> populating only NULL elements in fs/btrfs/extent_io.c and
>> net/sunrpc/svc_xprt.c while erofs and btrfs don't, see:

I should have said 'while erofs and xfs don't depend on the
assumption of populating only NULL elements'.

>> commit 91d6ac1d62c3 ("btrfs: allocate page arrays using bulk page allocator")
> 
> If you want to change the btrfs part, please run full fstests with SCRATCH_DEV_POOL populated at least.

The above is a helpful suggestion/comment to someone like me, who
is not very familiar with fs yet, thanks for the suggestion.

But I am not sure about some of the other comments below.

> 
> [...]
>> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
>> index f0a1da40d641..ef52cedd9873 100644
>> --- a/fs/btrfs/extent_io.c
>> +++ b/fs/btrfs/extent_io.c
>> @@ -623,13 +623,26 @@ int btrfs_alloc_page_array(unsigned int nr_pages, struct page **page_array,
>>                  bool nofail)
>>   {
>>       const gfp_t gfp = nofail ? (GFP_NOFS | __GFP_NOFAIL) : GFP_NOFS;
>> -    unsigned int allocated;
>> +    unsigned int allocated, ret;
>>   -    for (allocated = 0; allocated < nr_pages;) {
>> -        unsigned int last = allocated;
>> +    /* Defragment page_array so pages can be bulk allocated into remaining
>> +     * NULL elements sequentially.
>> +     */
>> +    for (allocated = 0, ret = 0; ret < nr_pages; ret++) {
>> +        if (page_array[ret]) {
> 
> You just prove how bad the design is.
> 

Below is the reason you think the design is bad? If not, it would be
good to be more specific about why it is a bad design.

> All the callers have their page array members to initialized to NULL, or do not care and just want alloc_pages_bulk() to overwrite the uninitialized values.

Actually there are two use cases here as mentioned in the commit log:
1. Allocating an array of pages sequentially by providing an array as
   output parameter.
2. Refilling pages to NULL elements in an array by providing an array
   as both input and output parameter.

Most of users calling the bulk alloc API is allocating an array of pages
sequentially except btrfs and sunrpc, the current page bulk alloc API
implementation is not only doing the unnecessay NULL checking for most
users, but also require most of its callers to pass all NULL array via
memset, kzalloc, etc, which is also unnecessary overhead.

That means there is some space for improvement from performance and
easy-to-use perspective for most existing use cases here, this patch
just change alloc_pages_bulk() API to treat the page_array as only
the output parameter by mirroring kmem_cache_alloc_bulk() API.

For the existing btrfs and sunrpc case, I am agreed that there
might be valid use cases too, we just need to discuss how to
meet the requirements of different use cases using simpler, more
unified and effective APIs.

> 
> The best example here is btrfs_encoded_read_regular().
> Now your code will just crash encoded read.

It would be good to be more specific about the 'crash' here,
as simple testing mentioned in below seems fine for btrfs fs
too, but I will do more testing by running full fstests with
SCRATCH_DEV_POOL populated after I learn how to use the fstests.

https://lore.kernel.org/all/91fcdfca-3e7b-417c-ab26-7d5e37853431@xxxxxxxxxx/

> 
> Read the context before doing stupid things.
> 
> I find it unacceptable that you just change the code, without any testing, nor even just check all the involved callers.

What exactly is the above 'context' is referring to? If it is a good advice,
I think I will take it seriously.

May I suggest that it might be better to be more humble and discuss
more before jumpping to conclusion here as it seems hard for one
person to be familiar with all the subsystem in the kernel?

> 
>> +            page_array[allocated] = page_array[ret];
>> +            if (ret != allocated)
>> +                page_array[ret] = NULL;
>> +
>> +            allocated++;
>> +        }
>> +    }
>>   -        allocated = alloc_pages_bulk(gfp, nr_pages, page_array);
>> -        if (unlikely(allocated == last)) {
>> +    while (allocated < nr_pages) {
>> +        ret = alloc_pages_bulk(gfp, nr_pages - allocated,
>> +                       page_array + allocated);
> 
> I see the new interface way worse than the existing one.
> 
> All btrfs usage only wants a simple retry-until-all-fulfilled behavior.

As above, I am agreed that the above might be what btrfs usage want, so
let's discuss how to meet the requirements of different use cases using
simpler, more unified and effective API, like introducing a function like
alloc_pages_refill_array() to meet the above requirement as mentioned in
below?
https://lore.kernel.org/all/74827bc7-ec6e-4e3a-9d19-61c4a9ba6b2c@xxxxxxxxxx/

> 
> NACK for btrfs part, and I find you very unresponsible not even bother running any testsuit and just submit such a mess.
> 
> Just stop this, no one will ever take you serious anymore.
> 
> Thanks,
> Qu
> 
> 




[Index of Archives]     [KVM Development]     [Libvirt Development]     [Libvirt Users]     [CentOS Virtualization]     [Netdev]     [Ethernet Bridging]     [Linux Wireless]     [Kernel Newbies]     [Security]     [Linux for Hams]     [Netfilter]     [Bugtraq]     [Yosemite Forum]     [MIPS Linux]     [ARM Linux]     [Linux RAID]     [Linux Admin]     [Samba]

  Powered by Linux