On Thu, Nov 03, 2022 at 04:34:04PM +0100, Andrew Jones wrote:
> On Thu, Nov 03, 2022 at 04:02:12PM +0100, Borislav Petkov wrote:
> > On Thu, Nov 03, 2022 at 01:59:45PM +0100, Andrew Jones wrote:
> > > The patch I'm proposing ensures cpumask_next()'s range, which is actually
> > > [-1, nr_cpus_ids - 1),
> >
> > Lemme make sure I understand it correctly: on the upper boundary, if you
> > supply for n the value nr_cpu_ids - 2, then it will return potentially
> > the last bit if the mask is set, i.e., the one at position (nr_cpu_ids - 1).
> >
> > If you supply nr_cpus_ids - 1, then it'll return nr_cpu_ids to signal no
> > further bits set.
> >
> > Yes, no?
>
> Yes
>
> >
> > > I'll send a v4 with another stab at the commit message.
> >
> > Yes, and it is still an unreadable mess: "A kernel compiled with commit
> > ... but not its revert... " Nope.
> >
> > First make sure cpumask_next()'s valid accepted range has been settled
> > upon, has been explicitly documented in a comment above it and then I'll
> > take a patch that fixes whatever is there to fix.
>
> That's fair, but I'll leave that to Yury.

I'll take care of it.

> > Callers should not have to filter values before passing them in - the
> > function either returns an error or returns the next bit in the mask.
>
> That's reasonable, but cpumask folk probably need to discuss it because
> not all cpumask functions have a return value where an error may be
> placed.

Callers should pass sane arguments into internal functions if they
expect sane output. An API that is not exported to userspace shouldn't
sanity-check all of its input arguments. For example, cpumask_next()
doesn't check srcp for NULL.

However, the cpumask API is exposed to drivers, and that's why the
optional cpumask_check() exists. (Probably. It was done long before I
took over this.)

The current *generic* implementation guarantees that an out-of-range
offset prevents cpumask_next() from dereferencing srcp and makes it
return nr_cpu_ids. This behavior is expected by many callers. However,
there are a couple of non-generic cpumask implementations, and one of
them is written in assembler. So portable code shouldn't expect more
from cpumasks than the documentation says: for a _valid_ offset,
cpumask_next() returns the next set bit or a value >= nr_cpu_ids.

cpumask_check() has been broken for years. My attempt to fix it met so
much resistance that I had to revert the patch. Now there's an ongoing
discussion about whether we need this check at all. My opinion is that
if all implementations of cpumask (more precisely, of the underlying
bitmap API) are safe against an out-of-range offset, we can simply
remove cpumask_check(). Users like cpuinfo, which waste time on a
useless last iteration, will bear that cost themselves.

Thanks,
Yury
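
For illustration only, here is a minimal userspace sketch of the
behavior described above for the generic implementation: an offset past
the end returns the mask size without touching srcp. This is not the
kernel source. model_cpumask_next(), model_find_next_bit() and
NR_CPU_IDS are hypothetical names made up for this sketch, the bit
search is a naive loop, and the non-generic implementations may behave
differently.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for nr_cpu_ids; a real system sets this at boot. */
#define NR_CPU_IDS 8u
#define MODEL_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Naive model of a "find next set bit" helper: returns the index of the
 * first set bit at or after 'offset', or 'size' if there is none.
 * When offset >= size it returns early and never dereferences 'addr'.
 */
static unsigned int model_find_next_bit(const unsigned long *addr,
					unsigned int size,
					unsigned int offset)
{
	if (offset >= size)
		return size;

	for (; offset < size; offset++) {
		unsigned long word = addr[offset / MODEL_BITS_PER_LONG];

		if (word & (1UL << (offset % MODEL_BITS_PER_LONG)))
			return offset;
	}
	return size;
}

/*
 * Model of cpumask_next(): n == -1 starts the search at bit 0, and
 * n == NR_CPU_IDS - 1 (or anything larger) yields NR_CPU_IDS.
 * The addition is done in unsigned arithmetic so that -1 wraps to 0.
 */
static unsigned int model_cpumask_next(int n, const unsigned long *srcp)
{
	return model_find_next_bit(srcp, NR_CPU_IDS, (unsigned int)n + 1u);
}
```

With a mask that has bits 0, 2 and 7 set, passing n = NR_CPU_IDS - 2
returns 7 (the last bit), and passing n = NR_CPU_IDS - 1 returns
NR_CPU_IDS, which matches the upper-boundary case discussed in the
quoted text.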