On Tue, Feb 25, 2025 at 8:05 PM Hou Tao <houtao@xxxxxxxxxxxxxxx> wrote: > > Hi, > > On 2/26/2025 11:24 AM, Alexei Starovoitov wrote: > > On Sat, Feb 8, 2025 at 2:17 AM Hou Tao <houtao@xxxxxxxxxxxxxxx> wrote: > >> Hi Toke, > >> > >> On 2/6/2025 11:05 PM, Toke Høiland-Jørgensen wrote: > >>> Hou Tao <houtao@xxxxxxxxxxxxxxx> writes: > >>> > >>>> +cc Cody Haas > >>>> > >>>> Sorry for the resend. I sent the reply in the HTML format. > >>>> > >>>> On 2/4/2025 4:28 PM, Hou Tao wrote: > >>>>> Currently, the update of existing element in hash map involves two > >>>>> steps: > >>>>> 1) insert the new element at the head of the hash list > >>>>> 2) remove the old element > >>>>> > >>>>> It is possible that the concurrent lookup operation may fail to find > >>>>> either the old element or the new element if the lookup operation starts > >>>>> before the addition and continues after the removal. > >>>>> > >>>>> Therefore, replacing the two-step update with an atomic update. After > >>>>> the change, the update will be atomic in the perspective of the lookup > >>>>> operation: it will either find the old element or the new element. > > I'm missing the point. > > This "atomic" replacement doesn't really solve anything. > > lookup will see one element. > > That element could be deleted by another thread. > > bucket lock and either two step update or single step > > don't change anything from the pov of bpf prog doing lookup. > > The point is that overwriting an existed element may lead to concurrent > lookups return ENOENT as demonstrated by the added selftest and the > patch tried to "fix" that. However, it seems using > hlist_nulls_replace_rcu() for the overwriting update is still not > enough. Because when the lookup procedure found the old element, the old > element may be reusing, the comparison of the map key may fail, and the > lookup procedure may still return ENOENT. you mean l_old == l_new ? I don't think it's possible within htab_map_update_elem(), but htab_map_delete_elem() doing hlist_nulls_del_rcu() then free_htab_elem, htab_map_update_elem, alloc, hlist_nulls_add_head_rcu and just deleted elem is reused to be inserted into another bucket. I'm not sure whether this new hlist_nulls_replace_rcu() primitive works with nulls logic. So back to the problem statement.. Are you saying that adding new to a head while lookup is in the middle is causing it to miss an element that is supposed to be updated assuming atomicity of the update? While now update_elem() is more like a sequence of delete + insert? Hmm. > I see. In v2 I will fallback to the original idea: adding a standalone > update procedure for htab of maps in which it will atomically overwrite > the map_ptr just like array of maps does. hold on. is this only for hash-of-maps ? How that non-atomic update manifested in real production?