Hi Mimi : The situation is a little different,'list_splice_init_rcu' don't change the list head. If "ima_rules" being changed, readers may can't reload the new value in time for cpu cache or compiler optimization. Defining "ima_rules" as a volatile variable can fix, but It is inefficient. Maybe using a temporary ima_rules variable for every "list_for_each_entry_rcu(entry, ima_rules, list)" loop is a better solution to fix the "endless loop" bug. Regards, liqiong 在 2021年08月20日 23:48, Mimi Zohar 写道: > On Fri, 2021-08-20 at 13:23 +0000, THOBY Simon wrote: >> Hi Liqiong, >> >> On 8/20/21 12:15 PM, 李力琼 wrote: >>> Hi, Simon: >>> >>> This solution is better then rwsem, a temp "ima_rules" variable should >>> can fix. I also have a another idea, with a little trick, default list >>> can traverse to the new list, so we don't need care about the read side. >>> >>> here is the patch: >>> >>> @@ -918,8 +918,21 @@ void ima_update_policy(void) >>> list_splice_tail_init_rcu(&ima_temp_rules, policy, synchronize_rcu); >>> >>> if (ima_rules != policy) { >>> + struct list_head *prev_rules = ima_rules; >>> + struct list_head *first = ima_rules->next; >>> ima_policy_flag = 0; >>> + >>> + /* >>> + * Make the previous list can traverse to new list, >>> + * that is tricky, or there is a deadly loop whithin >>> + * "list_for_each_entry_rcu(entry, ima_rules, list)" >>> + * >>> + * After update "ima_rules", restore the previous list. >>> + */ >> I think this could be rephrased to be a tad clearer, I am not quite sure >> how I must interpret the first sentence of the comment. >> >> >>> + prev_rules->next = policy->next; >>> ima_rules = policy; >>> + syncchronize_rcu(); >> I'm a bit puzzled as you seem to imply in the mail this patch was tested, >> but there is no 'syncchronize_rcu' (with two 'c') symbol in the kernel. >> Was that a copy/paste error? Or maybe you forgot the 'not' in "This >> patch has been tested"? These errors happen, and I am myself quite an >> expert in doing them :) >> >>> + prev_rules->next = first; >>> >>> >>> The side effect is the "ima_default_rules" will be changed a little while. >>> But it make sense, the process should be checked again by the new policy. >>> >>> This patch has been tested, if will do, I can resubmit this patch.> >>> How about this ? >> least >> >> Correct me if I'm wrong, here is how I think I understand you patch. >> We start with a situation like that (step 0): >> ima_rules --> List entry 0 (head node) = ima_default_rules <-> List entry 1 <-> List entry 2 <-> ... <-> List entry 0 >> >> Then we decide to update the policy for the first time, so >> 'ima_rules [&ima_default_rules] != policy [&ima_policy_rules]'. >> We enter the condition. >> First we copy the current value of ima_rules (&ima_default_rules) >> to a temporary variable 'prev_rules'. We also create a pointer dubbed >> 'first' to the entry 1 in the default list (step 1): >> prev_rules ------------- >> \/ >> ima_rules --> List entry 0 (head node) = ima_default_rules <-> List entry 1 <-> List entry 2 <-> ... <-> List entry 0 >> /\ >> first -------------------------------------------------------------- >> >> >> Then we update prev_rules->next to point to policy->next (step 2): >> List entry 1 <-> List entry 2 <-> ... -> List entry 0 >> /\ >> first >> (notice that list entry 0 no longer points backwards to 'list entry 1', >> but I don't think there is any reverse iteration in IMA, so it should be >> safe) >> >> prev_rules ------------- >> \/ >> ima_rules --> List entry 0 (head node) = ima_default_rules >> | >> | >> ------------------------------------------- >> \/ >> policy --> policy entry 0' (head node) = ima_policy_rules <-> policy entry 1' <-> policy entry 2' <-> .... <-> policy entry 0' >> >> >> We then update ima_rules to point to ima_policy_rules (step 3): >> List entry 1 <-> List entry 2 <-> ... -> List entry 0 >> /\ >> first >> >> prev_rules ------------- >> \/ >> ima_rules List entry 0 (head node) = ima_default_rules >> | | >> | | >> | ------------------------------------------ >> --------------- | >> \/ \/ >> policy --> policy entry 0' (head node) = ima_policy_rules <-> policy entry 1' <-> policy entry 2' <-> .... <-> policy entry 0' >> synchronize_rcu /\ >> first -------------------------------------------------------------- >> >> Then we run synchronize_rcu() to wait for any RCU reader to exit their loops (step 4). >> >> Finally we update prev_rules->next to point back to the ima policy and fix the loop (step 5): >> >> List entry 1 <-> List entry 2 <-> ... -> List entry 0 >> /\ >> first >> >> prev_rules ---> List entry 0 (head node) = ima_default_rules <-> List entry 1 <-> List entry 2 <-> ... <-> List entry 0 >> /\ >> first (now useless) >> ima_rules >> | >> | >> | >> --------------- >> \/ >> policy --> policy entry 0' (head node) = ima_policy_rules <-> policy entry 1' <-> policy entry 2' <-> .... <-> policy entry 0' >> >> The goal is that readers should still be able to loop >> (forward, as we saw that backward looping is temporarily broken) >> while in steps 0-4. >> >> I'm not completely sure what would happen to a client that started iterating >> over ima_rules right after step 2. >> >> Wouldn't they be able to start looping through the new policy >> as 'List entry 0 (head node) = ima_default_rules' points to ima_policy_rules? >> And if they, wouldn't they loop until the write to 'ima_rule' at step 3 (admittedly >> very shortly thereafter) completed? >> And would the compiler be allowed to optimize the read to 'ima_rules' in the >> list_for_each_entry() loop, thereby never reloading the new value for >> 'ima_rules', and thus looping forever, just what we are trying to avoid? >> >> Overall, I'm tempted to say this is perhaps a bit too complex (at least, >> my head tells me it is, but that may very well be because I'm terrible >> at concurrency issues). >> >> Honestly, in this case I think awaiting input from more experienced >> kernel devs than I is the best path forward :-) > I'm far from an expert on RCU locking, but __list_splice_init_rcu() > provides an example of how to make sure there aren't any readers > traversing the list, before two lists are spliced together. In our > case, after there aren't any readers, instead of splicing two lists > together, it should be safe to point to the new list. > > thanks, > > Mimi >