On 2/27/20 5:50 PM, LIU Zhiwei wrote:
>> This is not what I had in mind, and looks wrong as well.
>>
>>     int idx = (index * mlen) / 64;
>>     int pos = (index * mlen) % 64;
>>     return (((uint64_t *)v0)[idx] >> pos) & 1;
>>
>> You also might consider passing log2(mlen), so the multiplication could be
>> strength-reduced to a shift.
> I don't think so. For example, when mlen is 8 bits and index is 0, it will
> reduce to
>
>     return (((uint64_t *)v0)[0]) & 1
>
> And it's not right.
>
> The right bit is first bit in vector register 0. And in host big endianess,
> it will be the first bit of the seventh byte.

You've forgotten that we've just done an 8-byte big-endian load, which means
that we *are* looking at the first bit of the byte at offset 7.

It is right.

>> You don't need to pass mlen, since it's
> Yes.

I finally remembered all of the bits that go into mlen and thought I had
deleted that sentence -- apparently I only removed half.  ;-)

r~
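
P.S. For reference, a minimal sketch of the point about the 8-byte load,
using the log2(mlen) variant so the multiply becomes a shift. The names
mask_bit and load_be64 are made up for illustration and are not taken from
the patch; load_be64 only models the value a big-endian host would get from
an 8-byte load, so the example behaves the same on any host.

```c
#include <stdint.h>

/* Models the value a big-endian host obtains from an 8-byte load at p:
 * the byte at offset 0 is most significant, the byte at offset 7 least. */
static uint64_t load_be64(const uint8_t *p)
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++) {
        v = (v << 8) | p[i];
    }
    return v;
}

/* Hypothetical helper: mask bit for element `index`, where
 * lmlen = log2(mlen), so index * mlen is computed as a shift. */
static int mask_bit(const uint64_t *v0, int lmlen, int index)
{
    int bit = index << lmlen;   /* index * mlen */
    int idx = bit >> 6;         /* which 64-bit word: bit / 64 */
    int pos = bit & 63;         /* bit within that word: bit % 64 */
    return (v0[idx] >> pos) & 1;
}
```

With mlen = 8 and index = 0 this reduces to v0[0] & 1, and after the load
that low bit is the first bit of the byte at offset 7.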