Re: [git pull] device mapper fixes for 6.7-rc2

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sat, 18 Nov 2023 at 10:33, Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> What are the alleged "number of bugs all over the kernel" that the old
> MAX_ORDER definition had? In light of the other "MAX" definitions I
> found from a second of grepping, I really think the argument for that
> was just wrong.

Bah. I looked at the commits leading up to it, and I really think that
"if that's the fallout from 24 years of history, then it wasn't a huge
deal".

Compared to the inevitable problems that the semantic change will
cause for backports (and already caused for this merge window), I
still think this was all likely a mistake.

In fact, the biggest takeaway from those patches is that we probably
should have introduced some kind of "this is the max _size_ you can
allocate", because a few of them were related to that particular
calculation.

The floppy driver example is actually a good example, in that that is
*exactly* what it wanted, but it had been just done wrong as

  #define MAX_LEN (1UL << MAX_ORDER << PAGE_SHIFT)

and variations of that is what half the fixes were..

IOW, I really think we should just have added a

  #define MAX_GFP_SIZE (1UL << (PAGE_SHIFT+MAX_ORDER-1))

and the rest were basically all just the trivial off-by-one things.

The __iommu_dma_alloc_pages() code is just so odd that any MAX_ORDER
issues are more about "WTH is this code doing" And the "fix" is just
more of the same.

The *whole* point of having the max for a power-of-two thing is that
you can do things like

        mask = (1<<MAX)-1;

and that's exactly what the iommu code was trying to do. Except for
some *unfathomable* reason the iommu code did

        order_mask &= (2U << MAX_ORDER) - 1;

and honestly, you can't blame MAX_ORDER for the confusion. What a
strange off-by-one. MAX_ORDER was *literally* designed to be easy for
this case, and people screwed it up.

And no, it's *not* because "inclusive is more intuitive". As
mentioned, this whole "it's past the end* is the common case for MAX
values in the mm, for exactly these kinds of reasons. I already
listted several other cases.

The reason there were 9 bugs is because MAX_ORDER is an old thing, and
testing doesn't catch it because nobody triggers the max case. The
crazy iommu code goes back to 2016, for example.

That

    84 files changed, 223 insertions(+), 253 deletions(-)

really seems bogus for the 9 fixes that had accumulated in this area
in the last 24 years.

Oh well. Now we have places with 'MAX_ORDER + 1' instead of 'MAX_ORDER
- 1'. I'm not seeing that it's a win, and I do think "semantic change
after 24 years" is a loss and is going to cause problems.

               Linus




[Index of Archives]     [DM Crypt]     [Fedora Desktop]     [ATA RAID]     [Fedora Marketing]     [Fedora Packaging]     [Fedora SELinux]     [Yosemite Discussion]     [KDE Users]     [Fedora Docs]

  Powered by Linux