Re: [PATCH v2 1/1] RDMA/addr: Refresh neighbour entries upon "rdma_resolve_addr"

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Jason,

On Fri, 2022-06-24 at 21:10 -0300, Jason Gunthorpe wrote:
> On Fri, Jun 24, 2022 at 05:03:23PM -0700, Gerd Rausch wrote:
[...]
> > This in "dst_fetch_ha"?:
> > --------%<--------%<--------%<--------%<--------%<--------
> >         if (!(n->nud_state & NUD_VALID)) {
> >                 neigh_event_send(n, NULL);
> >                 ret = -ENODATA;
> > --------%<--------%<--------%<--------%<--------%<--------
> > 
> > With:
> > #define NUD_VALID	(NUD_PERMANENT|NUD_NOARP|NUD_REACHABLE|NUD_PROBE
> > > NUD_STALE|NUD_DELAY)
> > 
> > and the knowledge that the ARP entry is NUD_STALE,
> > how would that function be called and perform the necessary refresh?
> > 
> > 
> > > I think there is more going on here than this commit message is
> > > explaining.
> > > 
> > 
> > If the intention of the above mentioned "neigh_event_send" was to
> > refresh stale entries, then the "if" condition ought to change, no?
> 
> Maybe yes.
> 
> If you are saying with this patch that neigh_event_send() should be
> called unconditionally for every 'packet' why not remove the test
> above instead of calling it twice?
> 

It looks like starting from the very first time when "neigh_event_send"
was introduced here:

935ef2d7a291 RDMA/cma: Use neigh_event_send() to start neighbour discovery

it was always strictly coupled to the "-ENODATA" case.

In other words, I don't see how the STALE case was covered, and I'd have
to guess wether the -ENODATA coupling was intentional or accidental.

Now it may be perfectly possible to just make the "neigh_event_send"
call above unconditional and call it a day.

I mean, simpler fixes always win over more complex ones.

But, I personally don't know the twists & turns of this code well enough
to be able to assert that this won't leave any non-covered conditional
corner cases. I certainly hadn't tested that.


> On the other hand I see many places checking for NUD_VALID before
> calling it.
> 
> So, really the commit message needs to explain how neigh_event_send()
> is supposed to be used and explain that NUD_VALID is OK to drop.
> 

Some places do check for NUD_VALID (e.g. "__ip_do_redirect"),
many others don't (e.g. "__teql_resolve", or "neigh_resolve_output"
itself).

Even in the case of "dst_fetch_ha":
I can not state whether the check for !NUD_VALUD was done intentionally
or not. I can observe though that the side-effect of that check is that
STALE entries won't get refreshed. So something is clearly off here.

Let's look "__ip_do_redirect":
I would have to guess why that check for !NUD_VALID is there.
It's been there as far back as the GIT history goes.
But I prefer not having to put guess-work into my Commit-Log.

That said, I don't mind putting guess-work in this e-mail though:
Perhaps the author(s) knew that the code would have to go through
"ip_finish_output2" (or whatever the predecessor's name was) eventually,
and therefore it was okay to kick off a refresh only if !NUD_VALID.

At the end of the day, if the desired behavior is to always refresh
STALE entries, we should do so.
Calling "neigh_event_send" for !NUD_VALID only won't achieve that.

> I'm having a hard time guessing from a quick look around.
> 

Thanks for looking though,

  Gerd





[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux