On 8/19/24 18:29, Daniel Jordan wrote:
> On Sat, Aug 17, 2024 at 03:12:37PM GMT, Herbert Xu wrote:
>> On Mon, Aug 12, 2024 at 10:04:07AM -0400, Waiman Long wrote:
>>> Anyway, using DIV_ROUND_UP() is a slight change in behavior, as chunk_size
>>> will be increased by 1 in most cases. I am a bit hesitant to make this
>>> change without looking in more detail at the rationale behind the
>>> current code.
>>
>> I don't think it matters much. Just look at the two lines after
>> the division; they're both rounding the value up. So clearly this
>> is expected to handle the case where work gets bunched up into the
>> first N CPUs, potentially leaving some CPUs unused.
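
For context, the sizing code being discussed looks roughly like the sketch
below (a paraphrase of the chunk-size block in padata_do_multithreaded(),
not a verbatim copy of kernel/padata.c, so take the exact identifiers with
a grain of salt):

	/* One helper's share of the job; this is where the 0 can appear. */
	ps.chunk_size = job->size / (ps.nworks * load_balance_factor);
	/*
	 * The patch under discussion clamps right here with
	 * ps.chunk_size = max(ps.chunk_size, 1ul); Herbert's alternative is
	 * to make the division above a DIV_ROUND_UP() instead.
	 */

	/* The "two lines after the division" that both round the value up. */
	ps.chunk_size = max(ps.chunk_size, job->min_chunk);
	ps.chunk_size = roundup(ps.chunk_size, job->align);
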
> Yeah, the caller is supposed to use min_chunk as a hint for what a
> reasonable amount of work is per thread, and so avoid a wasteful number
> of threads.
>
>> But Daniel wrote the code, so he can have the last say on whether
>> we should round up after the division or after the other two ops.
>
> I think either way works fine with the three existing users and how they
> choose job->min_chunk and job->size.
>
> The DIV_ROUND_UP approach reads a bit nicer to me, but I can imagine
> oddball cases (say, near-zero values for size, min_chunk, and align;
> padata_work_alloc_mt() returning many fewer works than requested; and a
> single unit of work being very expensive) where rounding up makes a
> bigger difference and so would be undesirable. So the way it is now
> seems ok.
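
To put made-up numbers on that: with job->size = 5 and a divisor
(ps.nworks * load_balance_factor) of 4, plain division gives a chunk_size
of 1 while DIV_ROUND_UP() gives 2, so the job is carved into three chunks
instead of five. If each unit of work were very expensive, the threads
that get 2-unit chunks would do twice the work of the one that gets the
1-unit chunk, which is the kind of oddball case where the rounding choice
becomes visible. With the values the existing callers actually pass,
either way is in the noise.
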
> By the way, this bug must have come from hugetlb_pages_alloc_boot(),
> right, Waiman? The other padata users have hardcoded min_chunk. I guess
> it was a case of
>
>	h->max_huge_pages < num_node_state(N_MEMORY) * 2

Yes, I guess the hugetlbfs caller is the cause of this div-by-0 problem.
That is likely a bug in the caller that needs to be fixed. The current
patch does guarantee that padata won't crash like that even with a rogue
caller.
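
To spell out the failure mode with hypothetical numbers (assuming the
hugetlb boot path derives job.min_chunk from max_huge_pages and the number
of memory nodes, as the condition Daniel quotes above suggests): with 2
memory nodes and h->max_huge_pages = 3, the computed min_chunk comes out
as 0, the initial division in padata_do_multithreaded() can also come out
as 0 for such a small job, and neither max(chunk_size, min_chunk) nor
roundup(chunk_size, align) with align == 1 lifts it above 0. The helper
then ends up dividing by that 0 chunk_size (via roundup()), which is the
divide-by-0 panic being fixed here. The extra clamp to at least 1 in the
patch cuts that chain off regardless of what the caller passes in.
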
Cheers,
Longman