On Wed, Mar 18, 2020 at 11:35 AM George Spelvin <lkml@xxxxxxx> wrote: > > On Wed, Mar 18, 2020 at 08:26:06AM -0700, Alexander Duyck wrote: > > On Tue, Mar 17, 2020 at 6:44 PM George Spelvin <lkml@xxxxxxx> wrote: > >> + if (unlikely(rshift == 0)) { > >> + r = get_random_long(); > >> + rshift = r << 1 | 1; > > > > You might want to wrap the "r << 1" in parenthesis. Also you could > > probably use a + 1 instead of an | 1. > > I could, but what would it matter? I have just confirmed that all of: > x << 1 | 1; > (x << 1) + 1; > x + x + 1; > x + x | 1; > 2*x + 1; > 2*x | 1; > compile to > leal 1(%rdi,%rdi), %eax > > on x86, and two instructions on every other processor I can think of. > > Since this is concpetually a bit-manipulation operation where carry > propagation is undesirable, the logical operation form seems the most > natural way to write it. > > As for the parens, all C programmers are forced to remember that the > boolean operators have weirdly low precedence (below < <= == >= >), > so there's no risk of confusion. Okay, if this is coming out the same regardless I suppose it doesn't matter. I am still not a huge fan of skipping the parenthesis as I feel it is a bit uglier to read but that isn't anything that has to be fixed. > >> } > >> + WRITE_ONCE(rand, rshift); > >> > >> - if (rand & 1) > >> + if ((long)r < 0) > > > > One trick you might be able to get away with here is to actually > > compare r to rshift. "If (rshift <= r)" should give you the same > > result. This works since what you are essentially doing is just adding > > r to itself so if you overflow rshift will be equal to at most r - 1. > > However with the addition of the single bit in the rshift == 0 case it > > could potentially be equal in the unlikely case of r being all 1's. > > Er... but why would I want to? On most processors, "branch on sign bit" > is a single instruction, and that's the instruction I'm hoping the > compiler will generate. > > That's why I changed the shift direction from the original right (testing > the lsbit) to left (testing the msbit): slight code size reduction. > > Anything else produces larger and slower object code, for no benefit. I was just putting it out there as a possibility. What I have seen in the past is that under some circumstances gcc can be smart enough to interpret that as a "branch on carry". My thought was you are likely having to test the value against itself and then you might be able to make use of shift and carry flag to avoid that. In addition you could get away from having to recast a unsigned value as a signed one in order to perform the bit test.