Re: Looks like issue in handling active_nodes count in 4.19 kernel .

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 12/18/19 12:58 AM, Ravi Kumar Siddojigari wrote:
Yes this is the first time that we are getting this stress tested done on v4.19 kernel .
We had not tested this prior version of kernel though . Current proposed changes seems to really help and testing is still going on .
As per the delta it looks  change  6b6bc620  seem to be missing in earlier version of kernel not sure if this was the cause.

6b6bc620 shouldn't have altered any behavior; it was purely an encapsulation of the data structures. Both of the bugs you've identified were introduced by the xperms support in fa1aa143ac4a68. Maybe they were harder to trigger when the AVC was still using GFP_ATOMIC instead of GFP_NOWAIT, but they were bugs nonetheless.


Br ,
Ravi.
-----Original Message-----
From: Stephen Smalley <sds@xxxxxxxxxxxxx>
Sent: Tuesday, December 17, 2019 9:54 PM
To: Ravi Kumar Siddojigari <rsiddoji@xxxxxxxxxxxxxx>; selinux@xxxxxxxxxxxxxxx
Cc: paul@xxxxxxxxxxxxxx; linux-security-module@xxxxxxxxxxxxxxx
Subject: Re: Looks like issue in handling active_nodes count in 4.19 kernel .

On 12/17/19 10:52 AM, Stephen Smalley wrote:
On 12/17/19 10:40 AM, Ravi Kumar Siddojigari wrote:
Yes  indeed this is a stress test on ARM64 device with multicore
where most of the cores /tasks are stuck  in avc_reclaim_node .
We still see this issue even after picking the earlier patch "
selinux: ensure we cleanup the internal AVC counters on error in
avc_insert() commit: d8db60cb23e4"
Where selinux_state  during issue was as below where all the slots
are  NULL and the count was more than threshold.
Which seem to be calling avc_reclaim_node always and as the all the
slots are empty its going for full for- loop with locks and unlock
and taking too long .
Not sure what could make the  slots null , for sure its not due to
flush() /Reset(). We think that still we need to call  avc_kill_node
in update_node function .
Adding the patch below can you please review or correct the following
patch .


    selinux_state = (
      disabled = FALSE,
      enforcing = TRUE,
      checkreqprot = FALSE,
      initialized = TRUE,
      policycap = (TRUE, TRUE, TRUE, FALSE, FALSE, TRUE),
      avc = 0xFFFFFF9BEFF1E890 -> (
        avc_cache_threshold = 512,  /* <<<<<not configured and its
with default*/
        avc_cache = (
          slots = ((first = 0x0), (first = 0x0), (first = 0x0), (first
= 0x0), (first = 0x0), (first = 0x0), (first = 0x0), (first = 0x0),
(first = 0x0), (first = 0x0), (first = 0x0), (first = 0x0), (first
/*<<<< all are NULL */
          slots_lock = ((rlock = (raw_lock = (val = (counter = 0),
locked = 0, pending = 0, locked_pending = 0, tail = 0), magic =
3735899821, owner_cpu = 4294967295, owner = 0xFFFFFFFFFFFFFFFF,
dep_map = (key = 0xFFFFFF9BEFF298A8, cla
          lru_hint = (counter = 616831529),
          active_nodes = (counter = 547),   /*<<<<< increased more
than 512*/
          latest_notif = 1)),
      ss = 0xFFFFFF9BEFF2E578)


--
In AVC update we don't call avc_node_kill() when
avc_xperms_populate() fails, resulting in the
avc->avc_cache.active_nodes counter having a false value.In last patch this changes was missed , so correcting it.

Change-Id: Ic0298162cc766c0f21be7ab232e259766654dad3
Signed-off-by: Jaihind Yadav<jaihindyadav@xxxxxxxxxxxxxx>
---
   security/selinux/avc.c | 2 +-
   1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/security/selinux/avc.c b/security/selinux/avc.c index
91d24c2..3d1cff2 100644
--- a/security/selinux/avc.c
+++ b/security/selinux/avc.c
@@ -913,7 +913,7 @@ static int avc_update_node(struct selinux_avc
*avc,
          if (orig->ae.xp_node) {
                  rc = avc_xperms_populate(node, orig->ae.xp_node);
                  if (rc) {
-                       kmem_cache_free(avc_node_cachep, node);
+                       avc_node_kill(avc, node);
                          goto out_unlock;
                  }
          }
--

That looks correct to me; I guess that one got missed by the prior fix.
Still not sure how your AVC got into that state though...

Acked-by: Stephen Smalley <sds@xxxxxxxxxxxxx>

BTW, have you been running these stress tests on earlier kernels too?
If so, what version(s) are known to pass them?  I ask because this code has been present since v4.3 and this is the first such report.





[Index of Archives]     [Selinux Refpolicy]     [Linux SGX]     [Fedora Users]     [Fedora Desktop]     [Yosemite Photos]     [Yosemite Camping]     [Yosemite Campsites]     [KDE Users]     [Gnome Users]

  Powered by Linux