Re: [PATCH v3] selinux: sidtab: reverse lookup hash table

On 11/5/19 8:42 AM, Jeff Vander Stoep wrote:
This replaces the reverse table lookup and reverse cache with a
hashtable which improves cache-miss reverese-lookup times from

reverse

O(n) to O(1)* and maintains the same performance as a reverse
cache hit.

This reduces the time needed to add a new sidtab entry from ~500us
to 5us on a Pixel 3 when there are ~10,000 sidtab entries.

The implementation uses the kernel's generic hashtable API. It uses
the context's string representation as the hash source, and the
kernel's generic string hashing algorithm full_name_hash() to reduce
the string to a 32-bit value.
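
For concreteness, I assume the hashing step above boils down to something like this sketch (context_compute_hash() being the helper used later in the patch; the exact body is just my guess, not quoted from the patch):

static u32 context_compute_hash(const char *s)
{
	/* reduce the context string to a 32-bit hash value */
	return full_name_hash(NULL, s, strlen(s));
}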

This change also maintains the improvement introduced in commit
ee1a84fd which removed the need to keep the current sidtab locked

Preferred style is "commit <12+ chars of sha1> ("title line")"; checkpatch.pl would have caught it except for the line break between "commit" and the hash, I think. E.g., commit ee1a84fdfeed ("selinux: overhaul sidtab to fix bug and improve performance"). This ensures that the commit is unambiguously identified and the reader gets an idea of what it is without having to go look.

during policy reload. It does however introduce periodic locking of
the target sidtab while converting the hashtable. Sidtab entries
are never modified or removed, so the context struct stored in the
sid_to_context tree can also be used for the context_to_sid
hashtable to reduce memory usage.

This bug was reported by:
- Stephen Smally on the selinux bug tracker.

Smalley (although I don't think I'm the original reporter, came from a mailing list email IIRC)

<snip>
V3:
-Move the hash_add() and count increment to after an smp_wmb()
statement to enforce that prior objects have been written before
becoming available for lookups.

smp_wmb() is SMP-conditional; wmb() is mandatory per memory-barriers.txt. I thought we needed wmb() for the old sidtab hash table update to avoid reordering even in the UP case. And I think memory-barriers.txt recommends pairing every write barrier with a corresponding read barrier in the reader code, which my code lacked (since memory-barriers.txt didn't exist at the time and I was just following the guidance from other docs), so presumably you want/need that.
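
Roughly, I'd expect the insert side to look something like the following (sketch only; whether smp_wmb() or wmb() is the right primitive is exactly the question above):

	/* fully initialize the new entry first */
	dst->sid = index_to_sid(count);

	/*
	 * Publish: order the entry's stores before it becomes reachable
	 * via the hashtable or via the new count.
	 */
	smp_wmb();
	hash_add(s->context_to_sid, &dst->list, dst->context.hash);
	smp_store_release(&s->count, count + 1);

and then the reverse-lookup side needs a matching read-side barrier or acquire before walking the chain (see context_to_sid() below).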

-Add lock nesting notation.

Is the lock nesting truly needed, or can we get rid of it as Ondrej suggested?

Patch wouldn't apply for me (on selinux/next); not sure why, but some comments below.


Change-Id: I9b498100c94f05222c54e24f53eba76468f360fd

Don't include AOSP Change-Ids in upstream patches.

<snip>
diff --git a/security/selinux/Kconfig b/security/selinux/Kconfig
index 8af7a690eb40..5494379e1ac7 100644
--- a/security/selinux/Kconfig
+++ b/security/selinux/Kconfig
@@ -99,3 +99,15 @@ config SECURITY_SELINUX_CHECKREQPROT_VALUE
  	  via /selinux/checkreqprot if authorized by policy.
If you are unsure how to answer this question, answer 0.
+
+config SECURITY_SELINUX_SIDTAB_HASH_BITS
+	int "NSA SELinux sidtab hashtable size"
+	depends on SECURITY_SELINUX
+	range 8 13
+	default 9
+	help
+	  This option sets the number of buckets used in the sitab hashtable

sidtab

+	  to 2^SECURITY_SELINUX_SIDTAB_HASH_BITS buckets. The number of hash
+	  collisions may be viewed at /selinux/ss/sidtab_hash_stats. If chain

Since Linux 3.0, selinuxfs is mounted on /sys/fs/selinux rather than /selinux so it would be /sys/fs/selinux/ss/sidtab_hash_stats. The other config option help texts are out of date but don't worry about those in this patch.

+	  lengths are high (e.g. > 20) then selecting a higher value here will
+	  ensure that lookup times are fast and stable.
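
Presumably the Kconfig value feeds a DECLARE_HASHTABLE() in sidtab.h, something along these lines (not part of the quoted hunks, so the SIDTAB_HASH_BITS name is just my assumption):

#define SIDTAB_HASH_BITS	CONFIG_SECURITY_SELINUX_SIDTAB_HASH_BITS
#define SIDTAB_HASH_BUCKETS	(1 << SIDTAB_HASH_BITS)

struct sidtab {
	/* 2^SIDTAB_HASH_BITS buckets for context -> SID lookups */
	DECLARE_HASHTABLE(context_to_sid, SIDTAB_HASH_BITS);
	/* ... existing fields: roots, count, convert, lock, ... */
};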
<snip>
diff --git a/security/selinux/selinuxfs.c b/security/selinux/selinuxfs.c
index 7eff19c41bcb..4bb4ff027042 100644
--- a/security/selinux/selinuxfs.c
+++ b/security/selinux/selinuxfs.c
<snip>
@@ -1576,6 +1602,7 @@ static int sel_make_avc_files(struct dentry *dir)
  		{ "cache_threshold",
  		  &sel_avc_cache_threshold_ops, S_IRUGO|S_IWUSR },
  		{ "hash_stats", &sel_avc_hash_stats_ops, S_IRUGO },
+		{ "sidtab_hash_stats", &sel_sidtab_hash_stats_ops, S_IRUGO },

Still awaiting a verdict from Paul on whether we should conform to the checkpatch.pl warning on symbolic modes, or stick with them for consistency until such time as we convert them en masse?

  #ifdef CONFIG_SECURITY_SELINUX_AVC_STATS
  		{ "cache_stats", &sel_avc_cache_stats_ops, S_IRUGO },
  #endif
@@ -1601,6 +1628,35 @@ static int sel_make_avc_files(struct dentry *dir)
  	return 0;
  }
+static int sel_make_ss_files(struct dentry *dir)
+{
+	struct super_block *sb = dir->d_sb;
+	struct selinux_fs_info *fsi = sb->s_fs_info;
+	int i;
+	static struct tree_descr files[] = {
+		{ "sidtab_hash_stats", &sel_sidtab_hash_stats_ops, S_IRUGO },
+	};
+
+	for (i = 0; i < ARRAY_SIZE(files); i++) {
+		struct inode *inode;
+		struct dentry *dentry;
+
+		dentry = d_alloc_name(dir, files[i].name);
+		if (!dentry)
+			return -ENOMEM;
+
+		inode = sel_make_inode(dir->d_sb, S_IFREG|files[i].mode);
+		if (!inode)

dput(dentry); here (got fixed elsewhere recently IIRC).
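
i.e. the error path should end up looking like:

		inode = sel_make_inode(dir->d_sb, S_IFREG|files[i].mode);
		if (!inode) {
			dput(dentry);
			return -ENOMEM;
		}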

+			return -ENOMEM;
+
+		inode->i_fop = files[i].ops;
+		inode->i_ino = ++fsi->last_ino;
+		d_add(dentry, inode);
+	}
+
+	return 0;
+}
+
  static ssize_t sel_read_initcon(struct file *file, char __user *buf,
  				size_t count, loff_t *ppos)
  {
<snip>
diff --git a/security/selinux/ss/services.c b/security/selinux/ss/services.c
index 62e6a53f4ac9..9578fb3e994f 100644
--- a/security/selinux/ss/services.c
+++ b/security/selinux/ss/services.c
<snip>
@@ -1433,6 +1446,7 @@ static int string_to_context_struct(struct policydb *pol,
  	rc = -EINVAL;
  	if (!policydb_context_isvalid(pol, ctx))
  		goto out;
+
  	rc = 0;
  out:
  	if (rc)

Avoid unrelated whitespace changes

@@ -1440,6 +1454,42 @@ static int string_to_context_struct(struct policydb *pol,
  	return rc;
  }
+int context_add_hash(struct policydb *policydb,
+		     struct context *context)
+{
+	int rc;
+	char *str;
+	int len;
+
+	if (context->str) {
+		context->hash = context_compute_hash(context->str);
+	} else {
+		rc = context_struct_to_string(policydb, context,
+					      &str, &len);
+		if (rc)
+			return rc;
+		context->hash = context_compute_hash(str);
+		kfree(str);
+	}

I'm guessing that KCSAN is going to complain about this and other existing cases in our code that rely on atomic load/store of naturally-aligned word-sized values. At some point we'll probably need to convert them all to using READ_ONCE/WRITE_ONCE or similar constructs when we do this outside of a spinlock or write lock (as here). Offhand, this may only be a problem for the cases where we are passing in a context structure that is part of the active policydb, as in security_port_sid() and other callers, since it is then a shared structure that could be visible to other threads. Same issue already exists however for the sid field there so not really new.
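
If/when that needs addressing, it would presumably be a mechanical change along these lines (sketch):

	/*
	 * context may be a shared structure (e.g. part of the active
	 * policydb), so store the hash with WRITE_ONCE() ...
	 */
	WRITE_ONCE(context->hash, context_compute_hash(str));

	/* ... and read it with READ_ONCE(context->hash) at lookup sites. */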

+	return 0;
+}
+
<snip>
diff --git a/security/selinux/ss/sidtab.c b/security/selinux/ss/sidtab.c
index 9d156d5156da..2267878da626 100644
--- a/security/selinux/ss/sidtab.c
+++ b/security/selinux/ss/sidtab.c
@@ -16,26 +16,37 @@
  #include "security.h"
  #include "sidtab.h"
+#define index_to_sid(index) (index + SECINITSID_NUM + 1)
+#define sid_to_index(sid) (sid - (SECINITSID_NUM + 1))
+
  int sidtab_init(struct sidtab *s)
  {
  	u32 i;
  	memset(s->roots, 0, sizeof(s->roots));
-	/* max count is SIDTAB_MAX so valid index is always < SIDTAB_MAX */
-	for (i = 0; i < SIDTAB_RCACHE_SIZE; i++)
-		s->rcache[i] = SIDTAB_MAX;
-
  	for (i = 0; i < SECINITSID_NUM; i++)
  		s->isids[i].set = 0;
s->count = 0;
  	s->convert = NULL;
+	hash_init(s->context_to_sid);
spin_lock_init(&s->lock);
  	return 0;
  }
+static u32 context_to_sid(struct sidtab *s, struct context *context)
+{
+	struct sidtab_entry_leaf *entry;
+

I think we need a read barrier or ACQUIRE operation of some kind here or in the callers to ensure that the hash is in a good state.
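
One way to get that ordering without sprinkling explicit barriers would be the RCU variants of the same hashtable API: hash_add_rcu() on the insert side publishes the node with the needed release semantics, and the lookup could then be written roughly as follows (a sketch of that alternative, not what the patch does):

static u32 context_to_sid(struct sidtab *s, struct context *context)
{
	struct sidtab_entry_leaf *entry;
	u32 sid = 0;

	rcu_read_lock();
	hash_for_each_possible_rcu(s->context_to_sid, entry, list,
				   context->hash) {
		if (context_cmp(&entry->context, context)) {
			sid = entry->sid;
			break;
		}
	}
	rcu_read_unlock();
	return sid;
}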

+	hash_for_each_possible(s->context_to_sid, entry, list, context->hash) {
+		if (context_cmp(&entry->context, context))
+			return entry->sid;
+	}
+	return 0;
+}
+
  int sidtab_set_initial(struct sidtab *s, u32 sid, struct context *context)
  {
  	struct sidtab_isid_entry *entry;
<snip>
@@ -304,29 +274,35 @@ static int sidtab_reverse_lookup(struct sidtab *s, struct context *context,
  		rc = -ENOMEM;
  		dst_convert = sidtab_do_lookup(convert->target, count, 1);
  		if (!dst_convert) {
-			context_destroy(dst);
+			context_destroy(&dst->context);
  			goto out_unlock;
  		}
- rc = convert->func(context, dst_convert, convert->args);
+		rc = convert->func(context, &dst_convert->context,
+				convert->args);
  		if (rc) {
-			context_destroy(dst);
+			context_destroy(&dst->context);
  			goto out_unlock;
  		}
+		dst_convert->sid = index_to_sid(count);
/* at this point we know the insert won't fail */
+		spin_lock_irqsave(&convert->target->lock, flags);
  		convert->target->count = count + 1;
+		hash_add(convert->target->context_to_sid,
+			 &dst_convert->list, dst_convert->context.hash);
+		spin_unlock_irqrestore(&convert->target->lock, flags);

As Ondrej asked, do we need to take this lock?

  	}
if (context->len)
  		pr_info("SELinux:  Context %s is not valid (left unmapped).\n",
  			context->str);
- sidtab_rcache_push(s, count);
-	*index = count;
+	*sid = index_to_sid(count);
- /* write entries before writing new count */
+	/* Write entries before updating count and inserting into hashtable. */
  	smp_store_release(&s->count, count + 1);
+	hash_add(s->context_to_sid, &dst->list, dst->context.hash);

I think we need a separate acquire/release or write/read barrier pairing for the reverse hash table that is independent of the one for the forward table.

rc = 0;
  out_unlock:


