On 11/08, Mina Almasry wrote: > On Thu, Nov 7, 2024 at 7:01 PM Stanislav Fomichev <stfomichev@xxxxxxxxx> wrote: > > > > On 11/07, Mina Almasry wrote: > > > On Thu, Nov 7, 2024 at 5:30 PM Stanislav Fomichev <stfomichev@xxxxxxxxx> wrote: > > > > > > > > On 11/07, Mina Almasry wrote: > > > > > Document new behavior when the number of frags passed is too big. > > > > > > > > > > Signed-off-by: Mina Almasry <almasrymina@xxxxxxxxxx> > > > > > --- > > > > > Documentation/networking/devmem.rst | 9 +++++++++ > > > > > 1 file changed, 9 insertions(+) > > > > > > > > > > diff --git a/Documentation/networking/devmem.rst b/Documentation/networking/devmem.rst > > > > > index a55bf21f671c..d95363645331 100644 > > > > > --- a/Documentation/networking/devmem.rst > > > > > +++ b/Documentation/networking/devmem.rst > > > > > @@ -225,6 +225,15 @@ The user must ensure the tokens are returned to the kernel in a timely manner. > > > > > Failure to do so will exhaust the limited dmabuf that is bound to the RX queue > > > > > and will lead to packet drops. > > > > > > > > > > +The user must pass no more than 128 tokens, with no more than 1024 total frags > > > > > +among the token->token_count across all the tokens. If the user provides more > > > > > +than 1024 frags, the kernel will free up to 1024 frags and return early. > > > > > + > > > > > +The kernel returns the number of actual frags freed. The number of frags freed > > > > > +can be less than the tokens provided by the user in case of: > > > > > + > > > > > > > > [..] > > > > > > > > > +(a) an internal kernel leak bug. > > > > > > > > If you're gonna respin, might be worth mentioning that the dmesg > > > > will contain a warning in case of a leak? > > > > > > We will not actually warn in the likely cases of leak. > > > > > > We warn when we find an entry in the xarray that is not a net_iov, or > > > if napi_pp_put_page fails on that net_iov. Both are very unlikely to > > > happen honestly. > > > > > > The likely 'leaks' are when we don't find the frag_id in the xarray. > > > We do not warn on that because the user can intentionally trigger the > > > warning with invalid input. If the user is actually giving valid input > > > and the warn still happens, likely a kernel bug like I mentioned in > > > another thread, but we still don't warn. > > > > In this case, maybe don't mention the leaks at all? If it's not > > actionable, not sure how it helps? > > It's good to explain what the return code of the setsockopt means, and > when it would be less than the number of passed in tokens. > > Also it's not really 'not actionable'. I expect serious users of > devmem tcp to log such leaks in metrics and try to root cause the > userspace or kernel bug causing them if they happen. Right now it reads like both (a) and (b) have a similar probability. Maybe even (a) is more probable because you mention it first? In theory, any syscall can have a bug in it where it returns something bogus, so maybe at least downplay the 'leak' part a bit? "In the extremely rare cases, kernel might free less frags than requested .... " Imagine a situation where the user inadvertently tries to free the same token twice or something and gets the unexpected return value. Why? Might be the kernel leak, right?