On Thu, May 5, 2022 at 4:43 AM Jason A. Donenfeld <Jason@xxxxxxxxx> wrote:
>
> Hi Linus,
>
> On Wed, May 4, 2022 at 8:00 PM Linus Torvalds
> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> >
> > On Wed, May 4, 2022 at 3:15 AM Jason A. Donenfeld <Jason@xxxxxxxxx> wrote:
> > >
> > > > Alignment? Compiler bug? HW issue?
> > >
> > > Probably one of those, yea. Removing the instruction addresses, the only
> > > difference between the two compiles is: https://xn--4db.cc/Rrn8usaX/diff#line-440
> >
> > Well, that address doesn't work for me at all. It turns into א.cc.
> >
> > I'd love to see the compiler problem, since I find those fascinating
> > (mainly because they scare the hell out of me), but those web
> > addresses you use are not working for me.
>
> א.cc is correct. If you can't load it, your browser or something in
> your stack is broken. Choosing a non-ASCII domain like that clearly a
> bad decision because people with broken stacks can't load it? Yea,
> maybe. But maybe it's like the arch/alpha/ reordering of dependent
> loads applied to the web... A bit of stretch.

I have uploaded a diff I created here:

https://gist.github.com/54334556f2907104cd12374872a0597c

It shows the same output.

> > It most definitely looks like an OpenRISC compiler bug - that code
> > doesn't look like it does anything remotely undefined (and with the
> > "unsigned char", nothing implementation-defined either).
>
> I'm not so certain it's in the compiler anymore, actually. The bug
> exhibits itself even when that code isn't actually called. Adding nops
> to unrelated code also makes the problem go away. And removing these
> nops [1] makes the problem go away too. So maybe it's looking more
> like a linker bug (or linker script bug) related to alignment. Or
> whatever is jumping between contexts in the preemption code and
> restoring registers and such is assuming certain things about code
> layout that doesn't always hold. More fiddling is necessary still.

Bisecting definitely pointed to this patch, which is strange. Reverting
e5be15767e7e ("hex2bin: make the function hex_to_bin constant-time") also
fixed the problem for me. But any small patch that changes the layout
could make this go away.

I have things to try:

 - a closer look at the produced assembly diff
 - a newer compiler (I fixed a few bugs in gcc 12 for openrisc, and this
   came up in testing with gcc 11)
 - trying on FPGAs

I'll report as I find things.

-Stafford