> -----Original Message-----
> From: Florian Westphal [mailto:fw@xxxxxxxxx]
> Sent: February 25, 2019 19:54
> To: Li,Rongqing <lirongqing@xxxxxxxxx>
> Cc: Florian Westphal <fw@xxxxxxxxx>; netfilter-devel@xxxxxxxxxxxxxxx
> Subject: Re: Reply: [PATCH][nf-next] netfilter: replace modulo operation with
> bitwise AND
>
> Li,Rongqing <lirongqing@xxxxxxxxx> wrote:
> >
> >
> > > -----Original Message-----
> > > From: Florian Westphal [mailto:fw@xxxxxxxxx]
> > > Sent: February 25, 2019 19:27
> > > To: Li,Rongqing <lirongqing@xxxxxxxxx>
> > > Cc: netfilter-devel@xxxxxxxxxxxxxxx
> > > Subject: Re: [PATCH][nf-next] netfilter: replace modulo operation with
> > > bitwise AND
> > >
> > > Li RongQing <lirongqing@xxxxxxxxx> wrote:
> > > > CONNTRACK_LOCKS is 1024 and power of 2, so modulo operations can
> > > > be replaced with AND (CONNTRACK_LOCKS - 1)
> > > >
> > > > and bitwise AND operation is quicker than module operation
> > >
> > > Uh. What kind of compiler doesn't figure that out?!
> > >
> > > I would prefer to keep it as-is and let compiler do the optimization.
> >
> >
> > gcc version 7.3.0 (GCC)
> >
> >
> > main()
> > {
> >         int i=1000000000;
> >
> >         i= i % 1024;
> >
> >         return i;
> > }
> >
> > 00000000004004a7 <main>:
> >   4004a7:       55                      push   %rbp
> >   4004a8:       48 89 e5                mov    %rsp,%rbp
> >   4004ab:       c7 45 fc 00 ca 9a 3b    movl   $0x3b9aca00,-0x4(%rbp)
> >   4004b2:       8b 45 fc                mov    -0x4(%rbp),%eax
> >   4004b5:       99                      cltd
> >   4004b6:       c1 ea 16                shr    $0x16,%edx
> >   4004b9:       01 d0                   add    %edx,%eax
> >   4004bb:       25 ff 03 00 00          and    $0x3ff,%eax
>
> & 1023.
>
> With -O2 gcc emits same code regardless of % 1024 or & 1023.
>
> > Similar patch:
> >
> > commit 1a1d74d378b13ad3f93e8975a0ade0980a49d28b
> > Author: Jakub Kicinski <jakub.kicinski@xxxxxxxxxxxxx>
> > Date:   Mon Oct 31 20:43:17 2016 +0000
> >
> >     nfp: use AND instead of modulo to get ring indexes
>
> But in that case the value isn't known at compile time.
>
> gcc can't know that '= reg1 % reg2' can be rewritten as 'reg1 & (reg2 - 1)'
>
> But it can do it if reg2 is a known constant (and a power of 2).

Thanks, you are right.

I think this patch may still do two things:
1. make the code not depend on compiler optimization
2. serve as a reminder not to change CONNTRACK_LOCKS to a value that is
   not a power of 2

-RongQing
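
To illustrate point 2, here is a minimal sketch (not the actual netfilter code) of how the power-of-2 assumption could be checked at build time, so that the AND form stays safe if the constant is ever changed. The helper names lock_index_mod(), lock_index_and() and count_mismatches() are hypothetical and only exist for this example; the value 1024 is the one quoted in the thread. The equivalence only holds for unsigned operands, which is why the helpers take an unsigned hash.

```c
#include <assert.h>   /* static_assert (C11) */

/* Value quoted in the thread; everything else below is illustrative. */
#define CONNTRACK_LOCKS 1024

/* Build fails here if CONNTRACK_LOCKS is ever changed to a non-power-of-2. */
static_assert(CONNTRACK_LOCKS > 0 &&
              (CONNTRACK_LOCKS & (CONNTRACK_LOCKS - 1)) == 0,
              "CONNTRACK_LOCKS must be a power of 2");

/* Hypothetical helper: lock index via modulo. */
static inline unsigned int lock_index_mod(unsigned int hash)
{
        return hash % CONNTRACK_LOCKS;
}

/* Hypothetical helper: lock index via mask; relies on the assert above. */
static inline unsigned int lock_index_and(unsigned int hash)
{
        return hash & (CONNTRACK_LOCKS - 1);
}

/* Counts hashes in [0, limit) where the two forms disagree. */
static inline unsigned int count_mismatches(unsigned int limit)
{
        unsigned int h, bad = 0;

        for (h = 0; h < limit; h++)
                if (lock_index_mod(h) != lock_index_and(h))
                        bad++;
        return bad;
}
```

With the assert in place, either spelling gives the same index for unsigned input, for example lock_index_and(1000000000u) and lock_index_mod(1000000000u) both yield 512, so the choice between % and & becomes a matter of readability rather than correctness.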