Hi,

On Mon, May 29, 2023 at 6:18 PM Alexander Aring <aahringo@xxxxxxxxxx> wrote:
>
> Hi,
>
> On Thu, May 25, 2023 at 11:02 AM Andreas Gruenbacher
> <agruenba@xxxxxxxxxx> wrote:
> >
> > On Wed, May 24, 2023 at 6:02 PM Alexander Aring <aahringo@xxxxxxxxxx> wrote:
> > > This patch fixes a possible plock op collisions when using F_SETLKW lock
> > > requests and fsid, number and owner are not enough to identify a result
> > > for a pending request. The ltp testcases [0] and [1] are examples when
> > > this is not enough in case of using classic posix locks with threads and
> > > open filedescriptor posix locks.
> > >
> > > The idea to fix the issue here is to place all lock request in order. In
> > > case of non F_SETLKW lock request (indicated if wait is set or not) the
> > > lock requests are ordered inside the recv_list. If a result comes back
> > > the right plock op can be found by the first plock_op in recv_list which
> > > has not info.wait set. This can be done only by non F_SETLKW plock ops as
> > > dlm_controld always reads a specific plock op (list_move_tail() from
> > > send_list to recv_mlist) and write the result immediately back.
> > >
> > > This behaviour is for F_SETLKW not possible as multiple waiters can be
> > > get a result back in an random order. To avoid a collisions in cases
> > > like [0] or [1] this patch adds more fields to compare the plock
> > > operations as the lock request is the same. This is also being made in
> > > NFS to find an result for an asynchronous F_SETLKW lock request [2][3]. We
> > > still can't find the exact lock request for a specific result if the
> > > lock request is the same, but if this is the case we don't care the
> > > order how the identical lock requests get their result back to grant the
> > > lock.
> >
> > When the recv_list contains multiple indistinguishable requests, this
> > can only be because they originated from multiple threads of the same
> > process. In that case, I agree that it doesn't matter which of those
> > requests we "complete" in dev_write() as long as we only complete one
> > request. We do need to compare the additional request fields in
> > dev_write() to find a suitable request, so that makes sense as well.
> > We need to compare all of the fields that identify a request (optype,
> > ex, wait, pid, nodeid, fsid, number, start, end, owner) to find the
> > "right" request (or in case there is more than one identical request,
> > a "suitable" request).
> >
>
> In my "definition" why this works is as you said the "identical
> request". There is a more deeper definition of "when is a request
> identical" and in my opinion it is here as: "A request A is identical
> to request B when they get granted under the same 'time'" which is all
> the fields you mentioned.

s/under/at/

at the same 'time' or under the same conditions...

- Alex
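
For readers following the thread, here is a rough userspace sketch of the matching rule discussed above: compare every field that identifies a request and complete exactly one pending entry, even when several are identical. This is not the kernel patch. struct plock_req, req_matches() and complete_one() are hypothetical names made up for illustration, and a plain array stands in for recv_list.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified request record holding only the fields
 * named in the discussion; not the kernel's actual structure. */
struct plock_req {
	uint8_t  optype;
	uint8_t  ex;
	uint8_t  wait;
	uint32_t pid;
	int32_t  nodeid;
	uint32_t fsid;
	uint64_t number;
	uint64_t start;
	uint64_t end;
	uint64_t owner;
	bool     done;	/* set once a result has been assigned */
};

/* True when every identifying field of the result equals the request's. */
static bool req_matches(const struct plock_req *req,
			const struct plock_req *res)
{
	return req->optype == res->optype &&
	       req->ex     == res->ex &&
	       req->wait   == res->wait &&
	       req->pid    == res->pid &&
	       req->nodeid == res->nodeid &&
	       req->fsid   == res->fsid &&
	       req->number == res->number &&
	       req->start  == res->start &&
	       req->end    == res->end &&
	       req->owner  == res->owner;
}

/* Complete the first still-pending request that matches the result.
 * Returns its index, or -1 if nothing suitable is pending. Identical
 * requests are interchangeable, so taking the first one is enough. */
static int complete_one(struct plock_req *pending, size_t n,
			const struct plock_req *res)
{
	for (size_t i = 0; i < n; i++) {
		if (pending[i].done)
			continue;
		if (!req_matches(&pending[i], res))
			continue;
		pending[i].done = true;
		return (int)i;
	}
	return -1;
}
```

With two identical pending requests, two results complete them one at a time in list order, and a third identical result finds nothing left to complete.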