On Sat, Aug 17, 2024 at 03:12:37PM GMT, Herbert Xu wrote:
> On Mon, Aug 12, 2024 at 10:04:07AM -0400, Waiman Long wrote:
> >
> > Anyway, using DIV_ROUND_UP() is a slight change in behavior as chunk_size
> > will be increased by 1 in most cases. I am a bit hesitant to make this
> > change without looking into more detail about the rationale behind the
> > current code.
>
> I don't think it matters much.  Just look at the two lines after
> the division, they're both rounding the value up.  So clearly this
> is expected to handle the case where work gets bunched up into the
> first N CPUs, potentially leaving some CPUs unused.

Yeah, the caller is supposed to use min_chunk as a hint for what a
reasonable amount of work per thread is, and so avoid a wasteful number
of threads.

> But Daniel wrote the code so he can have the last say of whether
> we should round up after the division or after the other two ops.

I think either way works fine with the three existing users and how they
choose job->min_chunk and job->size.  The DIV_ROUND_UP approach reads a
bit nicer to me, but I can imagine oddball cases (say, near-zero values
for size, min_chunk, and align; padata_work_alloc_mt returning many
fewer works than requested; a single unit of work being very expensive)
where rounding up makes a bigger difference and could be undesirable.
So, the way it is now seems ok.

By the way, this bug must've happened coming from
hugetlb_pages_alloc_boot(), right, Waiman?  Because the other padata
users have hardcoded min_chunk.  I guess it was a case of

    h->max_huge_pages < num_node_state(N_MEMORY) * 2
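
To make the rounding question above concrete, here's a throwaway
userspace sketch (not the actual kernel/padata.c code, and the job
numbers are made up).  The two steps after the division mirror the "two
lines after the division" from the quoted mail: clamp to min_chunk,
then round up to the alignment.

    #include <stdio.h>

    #define DIV_ROUND_UP(n, d)  (((n) + (d) - 1) / (d))

    static unsigned long max_ul(unsigned long a, unsigned long b)
    {
            return a > b ? a : b;
    }

    static unsigned long chunk_size(unsigned long size, unsigned long nworks,
                                    unsigned long min_chunk, unsigned long align,
                                    int round_up_div)
    {
            unsigned long c = round_up_div ? DIV_ROUND_UP(size, nworks)
                                           : size / nworks;

            c = max_ul(c, min_chunk);               /* rounds up */
            return DIV_ROUND_UP(c, align) * align;  /* rounds up, like roundup() */
    }

    int main(void)
    {
            /*
             * Hypothetical job: 9 units of work, 4 works, min_chunk = 1,
             * align = 1.  Truncating division gives chunks of 2, so all
             * four works have something to do; DIV_ROUND_UP gives chunks
             * of 3, so three works cover everything and the fourth is
             * left idle (the "CPUs unused" case).
             */
            printf("truncating:   %lu\n", chunk_size(9, 4, 1, 1, 0));
            printf("DIV_ROUND_UP: %lu\n", chunk_size(9, 4, 1, 1, 1));
            return 0;
    }

Either way the difference only shows up with small sizes relative to
the number of works, which is why it doesn't matter much for the
existing users.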