Re: [patch 5/9] x86: Cure per CPU madness on UP

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Mar 15 2024 at 16:23, Linus Torvalds wrote:
> On Fri, 15 Mar 2024 at 15:55, Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>> So the proper thing to do is to check for num_possible_cpus() == 1 in
>> that function.
>
> I think that's _one_ proper thing. I still think that the deeper
> problem is that it still looks at local apic rules even when those
> rules are completely nonsensical.
>
> For example, that MAX_LOCAL_APIC range test may not matter simply
> because it's testing a constant value, but it still smells entirely
> wrong to even check for that, when the system doesn't necessarily have
> one.

cpu_info.apic_id defaults to 0, so unless the calling code is completely
broken it will be correct. And I rather catch the case of calling code
being broken in the !APIC case if we still want to support systems
without a local APIC.

> So I think your patch may fix the immediate bug, but I think it's
> still just a band-aid.
>
> Either we should just make all machines look like they have the proper
> local apic mappings, or we shouldn't look at any local apic rules AT
> ALL.

Sure. I can simply check if there was an APIC registered instead.

>> Sure you can argue that we could avoid it for SMP=n builds completely,
>> but I think the right thing to do is to aim for removing CONFIG_SMP and
>> make the UP build a subset of a generic SMP capable build which has
>> CONFIG_NR_CPUS=1, i.e. num_possible_cpus() = 1. Why?
>
> I wouldn't be entirely opposed to just doing that. UP has become
> fairly irrelevant.
>
> That said, UP is *not* entirely irrelevant on other architectures, and
> if we drop UP support on x86, we'll be effectively dropping a lot of
> coverage testing. The number of people who do cross-compilers is
> pretty small.
>
> End result: I'd *much* rather get rid of X86_UP_APIC and the "nolapic"
> kernel command line, and say "even UP has to have a local APIC".
>
> We already require a Pentium-class CPU, so in practice we already
> require that local APIC setup. And yes, machines existed where it
> could be turned off, but I don't think that is relevant any more.

You wish. We still support 486 and some of the still produced 486 clones
do not have a local APIC.

Not that I care and yes I'm all for getting rid of CONFIG_.*_APIC and of
the related config/command line options. If we refuse to boot on
hardware which does not enumerate an APIC then even better.

But that is only a part of the overall problem.

> Put another way: I think "UP config for wider build testing" is a
> _lot_ more relevant than "no LAPIC support".

I really have to disagree here.

The concept of making UP a proper subset of SMP has absolutely nothing
to do with x86 and UP test coverage.

We want SMP as a general concept and overhaul the whole kernel to get
rid of this ever increasing non-sensical UP burden. The real world UP
small system use cases have moved over to other OSes like Zephyr & Co
long ago.

Just because some esoteric architectures (m68k comes to my mind) will
have serious issues with that for the very wrong reasons does not mean
that we should not go there.

It's going to be quite some effort, but the overall benefit is worth it.

OTOH, it's absolutely not rocket science to pretend to be SMP capable
and if some architectures fail to accomodate on the way then we just
should remove them as that's a clear sign of being unmaintained and
irrelevant.

The amount of untested SMP=n code in the kernel becomes frigthening and
your argument that build coverage is making a difference is wishful
thinking at best.

Anything else than making the kernel SMP capable and making UP builds a
well defined subset via CONFIG_NR_CPUS=1 is a complete waste of time and
effort.

If your intention is to indulge in the historical glory of Linux running
on any (by now) irrelevant hardware on the planet, then I stop arguing
right here.

If not, can we please have a serious discussion about going SMP only and
making UP the simple and obvious NR_CPUS=1 subset?

The amount of subtle SMP=n fallout has been kinda exponentially
increasing over the years and it's just putting burden on the wrong
people. TBH, I'm tired of this nonsense.

Thanks,

        tglx




[Index of Archives]     [Newbies FAQ]     [LKML]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Trinity Fuzzer Tool]

  Powered by Linux