Re: [PATCH] Improve loading performance by 80% using dynamicresizing

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Thanks for the comments, Stephen--we'll do a rewrite to address the issues you've identified.

On Aug 2, 2010, at 8:40 AM, Stephen Smalley wrote:

> On Thu, 2010-07-29 at 17:30 -0400, Matthew Robertson wrote:
>> After spending a lot of developer time recently doing repeated relinks of
>> policies after making minor changes we put some time into tracking down avenues
>> for improving performance in the SELinux libraries. On a reasonably modern HP
>> server a 'make load' of a precompiled SELinux Reference Policy takes roughly 96
>> seconds. The 'perf' tool pointed us to some undersized and overpopulated hash
>> tables in libsepol's hashtab.c and avtab.c. By implementing dynamic resizing for
>> those structures we were able to bring that down to a 19 second runtime (80%
>> performance improvement) at the cost of ~6% more memory overhead.
>> 
>> Signed-off-by: Matthew Robertson <Matthew.L.Robertson@xxxxxxxxxx>
> 
> Thanks for the patch.  Some comments below will need to be addressed for
> merging.
> 
>> diff --git a/libsepol/src/avtab.c b/libsepol/src/avtab.c
>> index ea947cb..51abd04 100644
>> --- a/libsepol/src/avtab.c
>> +++ b/libsepol/src/avtab.c
>> @@ -78,17 +78,13 @@ avtab_insert_node(avtab_t * h, int hvalue, avtab_ptr_t prev, avtab_key_t * key,
>> 	return newnode;
>> }
>> 
>> -int avtab_insert(avtab_t * h, avtab_key_t * key, avtab_datum_t * datum)
>> +int avtab_insert__(avtab_t * h, avtab_key_t * key, avtab_datum_t * datum)
> 
> This should be made a static function, and I think the convention is to
> use leading underscores rather than trailing ones for internal
> functions.  Or use a unique suffix that has an actual meaning, e.g.
> avtab_insert_helper().
> 
>> +int avtab_insert(avtab_t * h, avtab_key_t * key, avtab_datum_t * datum)
>> +{
>> +        int oldsize, i;
>> +	avtab_ptr_t prev, cur, *oldtbl, *newtbl;
>> +
>> +	if (!h || !h->htable)
>> +		return SEPOL_ENOMEM;
>> +
>> +	/* If we don't need to resize, just insert the new entry */
>> +	if (h->nslot * 3/4 > h->nel)
> 
> The resizing threshold can be precomputed by avtab_alloc() and stored in
> the avtab as a field (e.g. h->threshold), so you can just compare
> without recomputing here.  Why 3/4?  That should likely be a #define or
> otherwise settable.
> 
>> +		return avtab_insert__(h, key, datum);
>> +
>> +	/* Attempt to allocate a bigger table */
>> +	newtbl = (avtab_ptr_t*) calloc(sizeof(avtab_ptr_t), h->nslot * 2);
> 
> Order of arguments to calloc() is reversed - this is an array with
> h->nslot*2 members of size sizeof(avtab_ptr_t) bytes each, not the other
> way around.  Yields the same effect but should be fixed.
> 
>> +	if (newtbl == NULL)
>> +		return avtab_insert__(h, key, datum);
>> +
>> +	/* Reinsert all elements into the new table */
>> +	h->nel = 0;
>> +	oldsize = h->nslot;
>> +	h->nslot *= 2;
>> +	h->mask = h->nslot - 1;
>> +	oldtbl = h->htable;
>> +	h->htable = newtbl;
>> +
>> +	for (i = 0; i < oldsize; i++) {
>> +		cur = oldtbl[i];
>> +		while (cur != NULL) {
>> +			avtab_insert(h, &(cur->key), &(cur->datum));
> 
> avtab_insert() can fail, so at the least, you need to check and handle
> errors, particularly ENOMEM, or you can lose data here.
> Also, I think you want to call __avtab_insert() rather than recursively
> calling avtab_insert() here, as you shouldn't need to check for resizing
> the table here.
> Last, we could avoid both the possibility of memory allocation failure
> and be more efficient by reusing the already existing avtab nodes and
> merely re-chaining them into the new hash table.  This would however
> require writing a bit of custom code for doing that (or factoring out a
> common helper) rather than just reusing __avtab_insert().
> 
>> diff --git a/libsepol/src/hashtab.c b/libsepol/src/hashtab.c
>> index c4be72c..943a2cd 100644
>> --- a/libsepol/src/hashtab.c
>> +++ b/libsepol/src/hashtab.c
>> @@ -63,41 +63,76 @@ hashtab_t hashtab_create(unsigned int (*hash_value) (hashtab_t h,
>> 	return p;
>> }
>> 
>> +int hashtab_insert__(hashtab_t h, hashtab_key_t key, hashtab_datum_t datum)
> 
> As above, this should be made static, and either use leading underscores
> or a meaningful suffix.
> 
>> +	/* If we don't need to resize, just insert the new entry */
>> +	if (h->size > h->nel)
> 
> resize threshold should be adjustable, at least via a #define, and can
> be precomputed during hashtab_create().
> 
>> +		return hashtab_insert__(h, key, datum);
>> +
>> +	/* Attempt to allocate a bigger table */
>> +	newtbl = (hashtab_ptr_t*) calloc(sizeof(hashtab_ptr_t), h->size * 8);
> 
> calloc order of arguments is wrong again.
> 
>> +	if (newtbl == NULL)
>> +		return hashtab_insert__(h, key, datum);
>> +
>> +	/* Reinsert all elements into the new table */
>> +	oldtbl = h->htable;
>> +	oldsize = h->size;
>> +	h->nel = 0;
>> +	h->size *= 8;
> 
> The factor should likely be definable.  Why 8 here vs 2 in the avtab
> case?
> 
>> +	h->htable = newtbl;
>> +
>> +	for (i = 0; i < oldsize; i++) {
>> +		hashtab_ptr_t prev = NULL;
>> +		hashtab_ptr_t cur = oldtbl[i];
>> +		while (cur != NULL) {
>> +			hashtab_insert__(h, cur->key, cur->datum);
> 
> This can fail.  Same as in the avtab case.  Optimal situation would be
> to re-chain the existing nodes into the new hashtab rather than
> allocating new nodes, copying the contents, and freeing the old ones.
> 
> -- 
> Stephen Smalley
> National Security Agency
> 

Matthew Robertson
eXMeritus Software

--
This message was distributed to subscribers of the selinux mailing list.
If you no longer wish to subscribe, send mail to majordomo@xxxxxxxxxxxxx with
the words "unsubscribe selinux" without quotes as the message.


[Index of Archives]     [Selinux Refpolicy]     [Linux SGX]     [Fedora Users]     [Fedora Desktop]     [Yosemite Photos]     [Yosemite Camping]     [Yosemite Campsites]     [KDE Users]     [Gnome Users]

  Powered by Linux