On Thu, Sep 14, 2023 at 5:57 PM Jacob Satterfield <jsatterfield.linux@xxxxxxxxx> wrote: > > On Tue, Sep 12, 2023 at 11:23 PM Paul Moore <paul@xxxxxxxxxxxxxx> wrote: > > > > On Sep 6, 2023 Stephen Smalley <stephen.smalley.work@xxxxxxxxx> wrote: > > > > > > The current avtab hashtable employs a separate chaining collision > > > resolution strategy where each bucket/chain holds an ordered linked list > > > of pointers to kmem_cache allocated avtab_node elements. > > > > > > On Fedora 38 (x86_64) using the default policy, avtab_node_cachep > > > uses 573 slabs each containing 170 objects totaling 2,337,840 bytes. > > > A call to kmem_cache_zalloc() is required for every single rule, which > > > in the default policy is currently 96,730 and continually rising. > > > > > > When both sets of avtab_node (regular and cond.) are turned into arrays > > > with the hash table slot indexing into it rather than a pointer, then > > > this results in only 2 additional kvcalloc() calls and the complete > > > removal of the kmem_cache itself. > > > > > > The caching characteristics of iterating a single array are better due > > > to locality of reference. Running "perf stat -r 100 -d load_policy" > > > has shown a runtime reduction of at least 10% on a Fedora 38 x86_64 VM > > > with this single patch. Future patches focused on improving the hash > > > table's collision resolution strategy and array layout (struct-of-arrays > > > vs. array-of-structs) may elicit even more caching and therefore runtime > > > performance improvements. > > > > > > On 64-bit machines, the memory usage of the hash table slots is cut in > > > half due to the use of u32 indices instead of memory pointers. Since > > > ~65k hash slots are used between the regular and cond. tables with the > > > default Fedora 38 policy, this equates to around 256KB of memory saved. > > > > > > Notes: > > > > > > A couple helper functions avtab_get_chain() and avtab_get_node() are > > > introduced to provide more standardized interfaces for use in functions > > > that need to search through the hash table. > > > > > > NULL_NODE_IDX defines a sentinel value which helps determinine if a spot > > > in the hash table or the "next" member of an avtab_node are valid. > > > > > > This patch causes the cond. rules table to waste memory as the size > > > requested for the kvcalloc() is the size of the regular rules table. > > > These tables rarely, if ever, have the same number of rules in practice. > > > The next patch addresses this shortcoming. > > > > Isn't the more important issue that of there being more conditional > > rules than regular rules? Yes, I'm sure this is very unlikely, but > > given just this patch wouldn't this cause a problem? > > > > It is important to remember that even when combined in a patchset, > > any given patch should not break the system. A patch can depend on > > prior patches in the patchset, but not upcoming patches. > > > > I've only looked very briefly at patch 2/3, but I suspect at the very > > least you may need to squash patches 1/3 and 2/3 to ensure there is > > no breakage. > > > > More comments below (all are in the context of patch 1/3, some may not > > be relevant in the context of 1/3+2/3). > > I will squash the two in the next version. > > > > Signed-off-by: Jacob Satterfield <jsatterfield.linux@xxxxxxxxx> > > > Reviewed-by: Stephen Smalley <stephen.smalley.work@xxxxxxxxx> > > > --- > > > security/selinux/ss/avtab.c | 75 +++++++++++++++---------------- > > > security/selinux/ss/avtab.h | 28 ++++++++++-- > > > security/selinux/ss/conditional.c | 13 +++--- > > > security/selinux/ss/services.c | 20 +++++---- > > > 4 files changed, 81 insertions(+), 55 deletions(-) ... 0, SLAB_PANIC, NULL); > > > diff --git a/security/selinux/ss/avtab.h b/security/selinux/ss/avtab.h > > > index 3c3904bf02b0..72708aed5fff 100644 > > > --- a/security/selinux/ss/avtab.h > > > +++ b/security/selinux/ss/avtab.h > > > @@ -77,16 +77,38 @@ struct avtab_datum { > > > struct avtab_node { > > > struct avtab_key key; > > > struct avtab_datum datum; > > > - struct avtab_node *next; > > > + u32 next; > > > > Given that one of the common avtab lookup operations is to simply walk > > the list, using the next field, why not keep the next field as a > > pointer? Sure, you loose the 64-bit to 32-bit size reduction on 64-bit > > systems (although I wonder if alignment issues rendered that moot), but > > I would expect the walk to be quicker if you only needed to do a single > > next pointer lookup as opposed to the array indexing. > > > > I may be missing something, but I believe this would also remove the > > need for the NULL_NODE_IDX sentinel. > > > > The conditional avtab is the reason an index was used instead of a > pointer. Because it doesn't have the right hint when allocating the > nodes array, we have to copy the nodes array into a smaller buffer > after reading the rules (otherwise we waste about 1.7 MB of memory). > The copy invalidates the pointers that each "next" refers to as they > would be pointing to free'd memory instead of the new nodes array. > However, I agree that the walk would be quicker with the direct > pointer reference instead of the array index and with the way the > struct alignment works, the index here doesn't save memory. > > An alternative I have prototyped is to use pointer math to calculate > an offset from the original nodes array to where "next" is pointing to > and add it from the new array to calculate the new "next" value. This > changes "next" back to a pointer and gets rid of the indirection. > > I agree that the NULL_NODE_IDX sentinel is not very ergonomic, however > it's at the cost of 256 KB of memory in relation to the htable slots. > Mainly due to the difficulty of telling when a slot is empty with an > index vs. a pointer. I could leave the htable slots as indexes, but it > would still require some form of sentinel value to indicate the > emptiness of a slot. What would you prefer? In addition to other changes we've discussed, it looks like squashing 1/3 and 2/3 is going to change things quite a bit, so I'm going to suggest you make the changes you're planning to make, including squashing the two patches, and repost. We can re-review the new revision and pick up the discussion from there. Does that sound okay? -- paul-moore.com