On Thu, Nov 3, 2022 at 12:36 PM Nathan Chancellor <nathan@xxxxxxxxxx> wrote: > > Thanks, I also realized that only a couple minutes after I sent my > initial message. I just got done testing the following diff, which > resolves my issue. That looks obviously correct. Except in this case "obviously correct patch" is to some very non-obvious code, and I think the whole code around it is very very questionable. I had to actually go check that this code can only be enabled on x86-64 (because "IRQ_REMAP" has a "depends on X86_64" on it), because it also uses cmpxchg_double and that now exists on x86-32 too (but only does 64 bits, not 128 bits, of course). Now, to make things even more confusing, I think that #if defined(CONFIG_HAVE_CMPXCHG_DOUBLE) has *never* made sense, since it's always enabled for x86. HOWEVER - there were actually early AMD x86-64 machines that didn't have CMPXCHG16B. So the conditional kind of makes sense, but doing it based on CONFIG_HAVE_CMPXCHG_DOUBLE does not. It turns out that we do do this all correctly, except we do it at boot time, with a test for boot_cpu_has(X86_FEATURE_CX16): /* * Note: GA (128-bit IRTE) mode requires cmpxchg16b supports. * XT, GAM also requires GA mode. Therefore, we need to * check cmpxchg16b support before enabling them. */ if (!boot_cpu_has(X86_FEATURE_CX16) || ... but that #ifdef has apparenrly never been valid (I didn't go back and see if we at some point had a config entry for those old CPUs). And even after I checked *that*, I then checked the 'struct irte' to check that it's actually properly defined, and it isn't. Considering that this all requires 16-byte alignment to work, I think that type should also be marked as being 16-byte aligned. In fact, I wonder if we should aim to actually force compile-time checking, because right now we have VM_BUG_ON((unsigned long)(p1) % (2 * sizeof(long))); VM_BUG_ON((unsigned long)((p1) + 1) != (unsigned long)(p2)); in our x86-64 __cmpxchg_double() macro, and honestly, that first one might be better as a compile time check of __alignof__, and the second one shouldn't exisrt at all because our interface shouldn't be using two different pointers when the only possible use is for one single aligned value. If somebody actually wants the old m68k type of "DCAS" that did a cmpxchg on two actually *different* pointers, we should call it somethign else (and that's not what any current architecture does). So honestly, just looking at this trivially correct patch, I got into a rats nest of horribly wrong code. Nasty. Linus