On Thu, 8 Aug 2024 at 09:12, Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>
> > But it looks like "$$divU" should be somewhere between $$divoI and
> > $$divI_2, and in Guenter's bad case that's
> >
> > 0000000041218c70 T $$divoI
> > 00000000412190d0 T $$divI_2
> >
> > so *maybe* $$divU is around a page boundary? 0000000041218xxx turning
> > into 0000000041219000?
>
> It uses $$divU which is at $$divoI + 0x250. I validated that in the
> disassembly.

Well, that does support "maybe we have a page crosser issue", but it's
not quite at the delayed branch.

Because that would mean that $$divU starts at 0x41218ec0, and that
means that there are 80 instructions from the start of $$divU to the
end of that 0x41218xxx page.

And if I counted instructions right (I don't have a disassembler, so
I'm just looking at the libgcc sources), that puts the page crosser
not quite at the delayed branch slot, but it does put it somewhere
roughly at or around

        ds      temp,arg1,temp          /* 29th divide step */
        addc    retreg,retreg,retreg    /* shift retreg with/into carry */

so it's around the last few bits of the result. The ones we get wrong.

Which is intriguing, but honestly, I don't see how we could get itlb
misses horribly wrong and not crash left and right.

The $$divU routine is unusual in that it uses that millicode calling
convention, but I think that's just a different register for the
return address. And it obviously depends on the carry flag, which is
pretty unusual. Maybe if the ITLB fill messes up C, it wouldn't show
up in other areas?

But the $$divU result error is more than one single bit getting cleared.

                Linus
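
For anyone who wants to double-check the page-boundary arithmetic above, here is a rough sketch. It assumes 4 KiB pages and fixed 4-byte instructions; the helper name insns_to_page_end is made up for illustration, and the addresses are the ones quoted from Guenter's bad case. It only reproduces the address math and says nothing about which instruction actually sits at the boundary.

```python
# Sketch only: reproduces the address arithmetic from the mail.
# Assumptions (not stated in the thread): 4 KiB pages, 4-byte instructions.
PAGE_SIZE = 0x1000
INSN_SIZE = 4

DIVOI = 0x41218c70    # "0000000041218c70 T $$divoI" from the quoted symbol table
DIVU_OFFSET = 0x250   # $$divU is at $$divoI + 0x250, per the disassembly


def insns_to_page_end(addr, page_size=PAGE_SIZE, insn_size=INSN_SIZE):
    """Hypothetical helper: instructions from addr up to the next page start."""
    next_page = (addr & ~(page_size - 1)) + page_size
    return (next_page - addr) // insn_size


divu = DIVOI + DIVU_OFFSET
print(hex(divu))                 # start of $$divU
print(insns_to_page_end(divu))   # instructions left in the 0x41218xxx page
```

With those assumptions this gives 0x41218ec0 for the start of $$divU and 80 instructions to the end of the page, matching the numbers in the mail.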