Hi Liping, On Mon, Mar 06, 2017 at 10:27:20PM +0800, Liping Zhang wrote: > Hi Pablo, > > 2017-03-06 20:01 GMT+08:00 Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx>: > [...] > > Right, the userdata case is not handled properly. And in this case we > > have no specific flag at set creation, so element comments may come > > without previous notice via set flags. > > > > I think we have to keep a list of dummy nft_set_ext that is only used > > in the dump path, we can simplify this logic at the cost of increasing > > memory consumption. Another alternative is to keep around a structure > > to store only the set element userdata that we need for comments. > > > > Let me think. > > > > Your patches look good to me. Probably we can skip 2/2 if we introduce > > the dummy nft_set_ext list, and remove the ->flush field for > > nft_set_iter. > > > > Actually I was preparing to send v2 about this patch, then I saw your > reply:). Because I find out that in nft_bitmap_walk(), the 'key' maybe > incorrect on the big-endian machines when the key length is 1. > So the patch diff looks like this: > > static void nft_bitmap_walk(...) > key = ((idx * BITS_PER_BYTE) + off) >> 1; > - memcpy(nft_set_ext_key(ext), &key, set->klen); > + if (set->klen == 2) > + *(u16 *)nft_set_ext_key(ext) = key; > + else > + *(u8 *)nft_set_ext_key(ext) = key; > > But if we will introduce the dummy nft_set_ext list to the whole elements > in the bitmap, the above part is not needed anymore, i.e. we need not to > convert the bit to key. Right, we can just walk over the list of dummy nft_set_ext if we follow this approach. > (Now start the second part, about the byte-order in nft) > Unrelated to this patch actually, I find that there's a little messy when we > store the u8 or u16 integer to the register, which may cause miss-match in > big-endian machines (Actually I have no big-endian machine around me, > so I can't verify it). > > For example, dest pointer is declared as "u32 *dest = ®s->data[priv->dreg];", > but there are different ways to fetch the value: > 1. fetching the l4 port, we use: > *dest = 0; > *(u16 *)dest = *(u16 *)ptr; > > 2. fetching the NFT_META_IIFTYPE, we use: > *dest = 0; > *(u16 *)dest = in->type; > > 3. fetching the NFT_CT_PROTO_SRC, we use: > *dest = (__force __u16)tuple->src.u.all; > > So method 1 and method 2 will cause the value stored like this, either in > big-endian or little-endian: > 0 15 31 > +-+-+-+-+-+-+-+-+- > | Value | 0 | > +-+-+-+-+-+-+-+-+- > But method 3 will cause the value stored like this, in big-endian machine: > 0 15 31 > +-+-+-+-+-+-+-+-+- > | 0 | Value | > +-+-+-+-+-+-+-+-+- > > Later in nft_cmp, nft_set_hash, nft_set_rbtree, we use memcmp to do compare: > "memcmp(®s->data[priv->sreg], data/key, 2);" > > So method 3 is wrong in big-endian, as 0~15 bits will always be zero. Maybe we > can introduce some wrapper functions to help us, for example: > > static inline void nft_register_store16(u32 *dreg, u16 value) > { > *dreg = 0; > *(u16 *)dreg = value; > } > > static inline void nft_register_store8(u32 *dreg, u8 value) > { > *dreg = 0; > *(u8 *)dreg = value; > } I think this a good idea, send patches to add this and use them for the nf tree, please. > ... > > Am I wrong? Or I totally misunderstood this byte-order issue? This looks correct to me. Note that: *dest = 0; is just there because of concatenations, so we make sure that we zero the pad given that register allocation happens at 32-bit level. Another note: For method 3. __force is there for the sparse checker given the different endianness of both sides of the assignment. -- To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html