On 2020-03-28 23:17, Paul Moore wrote:
> On Wed, Mar 25, 2020 at 8:29 AM Richard Guy Briggs <rgb@xxxxxxxxxx> wrote:
> > On 2020-03-20 17:56, Paul Moore wrote:
> > > On Thu, Mar 19, 2020 at 5:48 PM Richard Guy Briggs <rgb@xxxxxxxxxx> wrote:
> > > > On 2020-03-18 17:47, Paul Moore wrote:
> > > > > On Wed, Mar 18, 2020 at 5:42 PM Richard Guy Briggs <rgb@xxxxxxxxxx> wrote:
> > > > > > On 2020-03-18 17:01, Paul Moore wrote:
> > > > > > > On Fri, Mar 13, 2020 at 3:23 PM Richard Guy Briggs <rgb@xxxxxxxxxx> wrote:
> > > > > > > > On 2020-03-13 12:42, Paul Moore wrote:
> > > > > > > >
> > > > > > > ...
> > > > > > >
> > > > > > > > > The thread has had a lot of starts/stops, so I may be repeating a
> > > > > > > > > previous suggestion, but one idea would be to still emit a "death
> > > > > > > > > record" when the final task in the audit container ID does die, but
> > > > > > > > > block the particular audit container ID from reuse until it the
> > > > > > > > > SIGNAL2 info has been reported. This gives us the timely ACID death
> > > > > > > > > notification while still preventing confusion and ambiguity caused by
> > > > > > > > > potentially reusing the ACID before the SIGNAL2 record has been sent;
> > > > > > > > > there is a small nit about the ACID being present in the SIGNAL2
> > > > > > > > > *after* its death, but I think that can be easily explained and
> > > > > > > > > understood by admins.
> > > > > > > >
> > > > > > > > Thinking quickly about possible technical solutions to this, maybe it
> > > > > > > > makes sense to have two counters on a contobj so that we know when the
> > > > > > > > last process in that container exits and can issue the death
> > > > > > > > certificate, but we still block reuse of it until all further references
> > > > > > > > to it have been resolved. This will likely also make it possible to
> > > > > > > > report the full contid chain in SIGNAL2 records. This will eliminate
> > > > > > > > some of the issues we are discussing with regards to passing a contobj
> > > > > > > > vs a contid to the audit_log_contid function, but won't eliminate them
> > > > > > > > all because there are still some contids that won't have an object
> > > > > > > > associated with them to make it impossible to look them up in the
> > > > > > > > contobj lists.
> > > > > > >
> > > > > > > I'm not sure you need a full second counter, I imagine a simple flag
> > > > > > > would be okay. I think you just something to indicate that this ACID
> > > > > > > object is marked as "dead" but it still being held for sanity reasons
> > > > > > > and should not be reused.
> > > > > >
> > > > > > Ok, I see your point. This refcount can be changed to a flag easily
> > > > > > enough without change to the api if we can be sure that more than one
> > > > > > signal can't be delivered to the audit daemon *and* collected by sig2.
> > > > > > I'll have a more careful look at the audit daemon code to see if I can
> > > > > > determine this.
> > > > >
> > > > > Maybe I'm not understanding your concern, but this isn't really
> > > > > different than any of the other things we track for the auditd signal
> > > > > sender, right? If we are worried about multiple signals being sent
> > > > > then it applies to everything, not just the audit container ID.
> > > >
> > > > Yes, you are right. In all other cases the information is simply
> > > > overwritten. In the case of the audit container identifier any
> > > > previous value is put before a new one is referenced, so only the last
> > > > signal is kept. So, we only need a flag. Does a flag implemented with
> > > > a rcu-protected refcount sound reasonable to you?
> > >
> > > Well, if I recall correctly you still need to fix the locking in this
> > > patchset so until we see what that looks like it is hard to say for
> > > certain. Just make sure that the flag is somehow protected from
> > > races; it is probably a lot like the "valid" flags you sometimes see
> > > with RCU protected lists.
> >
> > This is like looking for a needle in a haystack. Can you point me to
> > some code that does "valid" flags with RCU protected lists.
>
> Sigh. Come on Richard, you've been playing in the kernel for some
> time now. I can't think of one off the top of my head as I write
> this, but there are several resources that deal with RCU protected
> lists in the kernel, Google is your friend and Documentation/RCU is
> your friend.

Ok, I thought you were talking about a specific piece of code...

> Spending time to learn how RCU works and how to use it properly is not
> time wasted. It's a tricky thing to get right (I have to refresh my
> memory on some of the more subtle details each time I write/review RCU
> code), but it's very cool when done correctly.

I review Documentation/RCU almost every time I work on RCU...

> paul moore

- RGB

--
Richard Guy Briggs <rgb@xxxxxxxxxx>
Sr. S/W Engineer, Kernel Security, Base Operating Systems
Remote, Ottawa, Red Hat Canada
IRC: rgb, SunRaycer
Voice: +1.647.777.2635, Internal: (81) 32635
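
For illustration only, the "dead but not yet reusable" idea from this thread can be modelled in plain userspace C roughly as below. This is a single-threaded sketch, not code from the patchset: the struct layout, the field names (tasks, holds, dead) and every contobj_* helper are hypothetical, and it deliberately leaves out the locking/RCU protection that the thread says is still unresolved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of an audit container object (not the kernel struct). */
struct contobj {
	uint64_t contid;     /* audit container ID */
	unsigned int tasks;  /* tasks still alive in the container */
	unsigned int holds;  /* outstanding references, e.g. pending SIGNAL2 info */
	bool dead;           /* set once the last task has exited */
};

static void contobj_init(struct contobj *obj, uint64_t contid,
			 unsigned int tasks)
{
	obj->contid = contid;
	obj->tasks = tasks;
	obj->holds = 0;
	obj->dead = false;
}

/* Record that something (e.g. the signal sender info) still refers to the ID. */
static void contobj_hold(struct contobj *obj)
{
	obj->holds++;
}

/* Drop a reference taken with contobj_hold(). */
static void contobj_release(struct contobj *obj)
{
	assert(obj->holds > 0);
	obj->holds--;
}

/*
 * A task in the container exits. Returns true only for the last task,
 * which is the point where the caller would emit the death record.
 */
static bool contobj_task_exit(struct contobj *obj)
{
	assert(obj->tasks > 0);
	if (--obj->tasks > 0)
		return false;
	obj->dead = true;
	return true;
}

/* The ID may be handed out again only once it is dead and unreferenced. */
static bool contobj_reusable(const struct contobj *obj)
{
	return obj->dead && obj->holds == 0;
}
```

In this model the death record is issued as soon as contobj_task_exit() returns true, while contobj_reusable() keeps returning false until the last hold is released. A real implementation would have to protect "dead" and the counts against concurrent readers and writers.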