On Mon, Jun 3, 2024 at 2:23 AM Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> wrote: > > On Sun, Jun 2, 2024 at 10:53 AM Eric W. Biederman <ebiederm@xxxxxxxxxxxx> wrote: > > > > Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> writes: > > > > > On Sat, Jun 1, 2024 at 11:57 PM Yafang Shao <laoar.shao@xxxxxxxxx> wrote: > > >> > > >> On Sun, Jun 2, 2024 at 11:52 AM Eric W. Biederman <ebiederm@xxxxxxxxxxxx> wrote: > > >> > > > >> > Yafang Shao <laoar.shao@xxxxxxxxx> writes: > > >> > > > >> > > Quoted from Linus [0]: > > >> > > > > >> > > Since user space can randomly change their names anyway, using locking > > >> > > was always wrong for readers (for writers it probably does make sense > > >> > > to have some lock - although practically speaking nobody cares there > > >> > > either, but at least for a writer some kind of race could have > > >> > > long-term mixed results > > >> > > > >> > Ugh. > > >> > Ick. > > >> > > > >> > This code is buggy. > > >> > > > >> > I won't argue that Linus is wrong, about removing the > > >> > task_lock. > > >> > > > >> > Unfortunately strscpy_pad does not work properly with the > > >> > task_lock removed, and buf_size larger that TASK_COMM_LEN. > > >> > There is a race that will allow reading past the end > > >> > of tsk->comm, if we read while tsk->common is being > > >> > updated. > > >> > > >> It appears so. Thanks for pointing it out. Additionally, other code, > > >> such as the BPF helper bpf_get_current_comm(), also uses strscpy_pad() > > >> directly without the task_lock. It seems we should change that as > > >> well. > > > > > > Hmm. What race do you see? > > > If lock is removed from __get_task_comm() it probably can be removed from > > > __set_task_comm() as well. > > > And both are calling strscpy_pad to write and read comm. > > > So I don't see how it would read past sizeof(comm), > > > because 'buf' passed into __set_task_comm is NUL-terminated. > > > So the concurrent read will find it. > > > > The read may race with a write that is changing the location > > of '\0'. Especially if the new value is shorter than > > the old value. > > so ? > strscpy_pad in __[gs]et_task_comm will read/write either long > or byte at a time. > Assume 64 bit and, say, we had comm where 2nd u64 had NUL. > Now two cpus are racing. One is writing shorter comm. > Another is reading. > The latter can read 1st u64 without NUL and will proceed > to read 2nd u64. Either it will read the old u64 with NUL in it > or it will read all zeros in 2nd u64 or some zeros in 2nd u64 > depending on how the compiler generated memset(.., 0, ..) > as part of strscpy_pad(). > _pad() part is critical here. > If it was just strscpy() then there would indeed be a chance > of reading both u64-s and not finding NUL in any of them. > > > If you are performing lockless reads and depending upon a '\0' > > terminator without limiting yourself to the size of the buffer > > there needs to be a big fat comment as to how in the world > > you are guaranteed that a '\0' inside the buffer will always > > be found. > > I think Yafang can certainly add such a comment next to > __[gs]et_task_comm. > > I prefer to avoid open coding memcpy + mmemset when strscpy_pad works. Thanks for your explanation. I will add a comment for it in the next version. -- Regards Yafang