Re: corruption due to loss of lock

On Thu, 11 Jul 2013 14:19:10 +0000
"Myklebust, Trond" <Trond.Myklebust@xxxxxxxxxx> wrote:

> On Thu, 2013-07-11 at 07:13 -0400, Jeff Layton wrote:
> > On Thu, 13 Jun 2013 13:47:37 -0500
> > Malahal Naineni <malahal@xxxxxxxxxx> wrote:
> > 
> > > Hi Trond,
> > > 
> > > I saw Bryan's patches here https://patchwork.kernel.org/patch/987402/
> > > that fix issues after loss of a lock.  What is the status on this patch
> > > set? Do they need more work? We have an application that uses range
> > > locks on a file. Two threads from two different clients end up writing
> > > to the same file due to this bug after a lease expiry from a client.
> > > 
> > > Regards, Malahal.
> > 
> > (cc'ing Bryan since he did the original set)
> > 
> > Yeah, this set would be a nice thing to have. A couple of comments:
> > 
> > - I still think it would be best to make SIGLOST its own signal, but as
> >   Bryan points out, it would need to be larger than SIGRTMAX. I'm
> >   not sure that's possible on all arches with the way the RT signals
> >   were done. It's probably worth investigating that though before
> >   settling on SIGIO since it would be hard to change that retroactively.
> > 
> > - This is not really a v4.1 specific thing. It should also be done for
> >   v4.0 and v2/3, though the latter two really need to be done within
> >   lockd.
> 
> SIGLOST is not part of any standard. It is a hack that has been adopted
> by IBM and Solaris.
> 
> The POSIXly correct way to do this is to use EBADF to warn the
> application that the file descriptor is no longer valid (in the sense
> that the server is no longer honouring the lock) and EIO in order to
> warn it that data may have been lost.
> 

It is a hack; I won't argue there.

I'm not sure that returning errors is really the best approach though.
In some cases, the fd may be fine. It may only be the lock that has
been lost.
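
To make the error-based approach concrete, here is a rough sketch of what
an application would have to do with it. This is only an illustration: the
names classify_write and write_outcome are made up for this example, and
the mapping of EBADF to "lock lost" and EIO to "data may be lost" is the
convention proposed above, not something current clients are known to
implement.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical outcome codes, invented for this sketch. */
enum write_outcome {
    WRITE_OK,          /* write() succeeded (possibly a short write) */
    WRITE_LOCK_LOST,   /* EBADF: under the proposal, lock no longer honoured */
    WRITE_DATA_LOST,   /* EIO: under the proposal, data may have been lost */
    WRITE_OTHER_ERROR  /* anything else */
};

/*
 * Issue one write() and classify the result according to the proposed
 * convention. Note that EBADF alone cannot distinguish "the lock was
 * lost" from "this descriptor really is invalid".
 */
static enum write_outcome classify_write(int fd, const void *buf, size_t len)
{
    ssize_t n = write(fd, buf, len);

    if (n >= 0)
        return WRITE_OK;

    switch (errno) {
    case EBADF:
        return WRITE_LOCK_LOST;
    case EIO:
        return WRITE_DATA_LOST;
    default:
        return WRITE_OTHER_ERROR;
    }
}
```

Every write path in the program would need a check like this, and a
program written before the convention existed would most likely treat
EBADF as a fatal descriptor error.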

With a signal, the program has more of a choice as to whether it cares
about lost locks, and the notification is more immediate when the
problem occurs. An error code seems like it might cause a lot of grief
for programs that aren't expecting that sort of behavior.
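
For comparison, a program that does care about lost locks could opt in
along these lines. Again, this is just a sketch under assumptions: Linux
does not define SIGLOST, so the code falls back to SIGIO as a stand-in
(the signal mentioned earlier in this thread), and watch_for_lost_locks
and lock_lost are hypothetical names invented for the example.

```c
/* Ask for the POSIX signal API if the build has not already done so. */
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <signal.h>
#include <string.h>

/* Assumption: no SIGLOST on this platform, so borrow SIGIO as a stand-in. */
#ifndef SIGLOST
#define SIGLOST SIGIO
#endif

/* Set from the handler; checked by the code that writes under the lock. */
static volatile sig_atomic_t lock_lost = 0;

/* Handler does the minimum: record that a lock was lost. */
static void on_lock_lost(int signo)
{
    (void)signo;
    lock_lost = 1;
}

/*
 * Hypothetical helper: opt in to lost-lock notification.
 * Returns 0 on success, -1 on failure (as sigaction() does).
 */
static int watch_for_lost_locks(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_lock_lost;
    sigemptyset(&sa.sa_mask);

    return sigaction(SIGLOST, &sa, NULL);
}
```

The application would call watch_for_lost_locks() once at startup and
test lock_lost before writing to a locked range, reacquiring the lock
when the flag is set. The file descriptor itself stays usable throughout.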

-- 
Jeff Layton <jlayton@xxxxxxxxxx>