On Mon, Jun 01, 2020 at 03:22:01PM +0800, Tao pilgrim wrote: > On Fri, May 29, 2020 at 10:13 PM Jens Axboe <axboe@xxxxxxxxx> wrote: > > > > On 5/29/20 8:11 AM, Kaitao Cheng wrote: > > > There is a function named ilog2() exist which can replace blksize. > > > The generated code will be shorter and more efficient on some > > > architecture, such as arm64. And ilog2() can be optimized according > > > to different architecture. > > > > When you posted this last time, I said: > > > > "I like the simplification, but do you have any results to back up > > that claim? Is the generated code shorter? Runs faster?" > > > > Hi Jens Axboe: > > I did a test on ARM64. > unsigned int ckt_blksize(int size) > { > return blksize_bits(size); > } > unsigned int ckt_ilog2(int size) > { > return ilog2(size); > } > > When I compiled it into assembly code, I got the following result, > > 0000000000000088 <ckt_blksize>: > 88: 2a0003e8 mov w8, w0 > 8c: 321d03e0 orr w0, wzr, #0x8 > 90: 11000400 add w0, w0, #0x1 > 94: 7108051f cmp w8, #0x201 > 98: 53017d08 lsr w8, w8, #1 > 9c: 54ffffa8 b.hi 90 <ckt_blksize+0x8> > a0: d65f03c0 ret > a4: d503201f nop > > 00000000000000a8 <ckt_ilog2>: > a8: 320013e8 orr w8, wzr, #0x1f > ac: 5ac01009 clz w9, w0 > b0: 4b090108 sub w8, w8, w9 > b4: 7100001f cmp w0, #0x0 > b8: 5a9f1100 csinv w0, w8, wzr, ne > bc: d65f03c0 ret > > The generated code of ilog2 is shorter , and runs faster But does this code path actually show up anywhere that is actually measurable as mattering? If so, please show that benchmark results. thanks, greg k-h