Re: [PATCH v3] nfsd: disallow file locking and delegations for NFSv4 reexport

On Wed, 2024-10-30 at 15:48 -0700, Rick Macklem wrote:
> On Wed, Oct 30, 2024 at 10:08 AM Chuck Lever III <chuck.lever@xxxxxxxxxx> wrote:
> > 
> > > On Oct 30, 2024, at 12:37 PM, Cedric Blancher <cedric.blancher@xxxxxxxxx> wrote:
> > > 
> > > On Wed, 30 Oct 2024 at 17:15, Chuck Lever III <chuck.lever@xxxxxxxxxx> wrote:
> > > > 
> > > > 
> > > > 
> > > > > On Oct 30, 2024, at 10:55 AM, Cedric Blancher <cedric.blancher@xxxxxxxxx> wrote:
> > > > > 
> > > > > On Tue, 29 Oct 2024 at 17:03, Chuck Lever III <chuck.lever@xxxxxxxxxx> wrote:
> > > > > > 
> > > > > > > On Oct 29, 2024, at 11:54 AM, Brian Cowan <brian.cowan@xxxxxxxxxxxxxxxx> wrote:
> > > > > > > 
> > > > > > > Honestly, I don't know the usecase for re-exporting another server's
> > > > > > > NFS export in the first place. Is this someone trying to share NFS
> > > > > > > through a firewall? I've seen people share remote NFS exports via
> > > > > > > Samba in an attempt to avoid paying their NAS vendor for SMB support.
> > > > > > > (I think it's "standard equipment" now, but 10+ years ago? Not
> > > > > > > always...) But re-exporting another server's NFS exports? Haven't seen
> > > > > > > anyone do that in a while.
> > > > > > 
> > > > > > The "re-export" case is where there is a central repository
> > > > > > of data and branch offices that access that via a WAN. The
> > > > > > re-export servers cache some of that data locally so that
> > > > > > local clients have a fast persistent cache nearby.
> > > > > > 
> > > > > > This is also effective in cases where a small cluster of
> > > > > > clients want fast access to a pile of data that is
> > > > > > significantly larger than their own caches. Say, HPC or
> > > > > > animation, where the small cluster is working on a small
> > > > > > portion of the full data set, which is stored on a central
> > > > > > server.
> > > > > > 
> > > > > Another use case is "isolation": IT shares a filesystem to your
> > > > > department, and you need to re-export only a subset to another
> > > > > department or home office. Part of such a scenario might also be policy
> > > > > related, e.g. IT shares you the full filesystem but will do NOTHING
> > > > > else, and any further compartmentalization must be done in your own
> > > > > department.
> > > > > This is the typical use case for gov NFS re-export.
> > > > 
> > > > It's not clear to me from this description why re-export is
> > > > the right tool for this job. Please explain why ACLs are not
> > > > used in this case -- this is exactly what they are designed
> > > > to do.
> > > 
> > > 1. IT departments want better/harder/immutable isolation than ACLs
> > 
> > So you want MAC, and the storage administrator won't set
> > that up for you on the NFS server. NFS doesn't do MAC
> > very well if at all.
> > 
> > 
> > > 2. Linux NFSv4 only implements POSIX draft ACLs, not full Windows or
> > > NFSv4 ACLs. So there is no proper way to prevent ACL editing,
> > > rendering them useless in this case.
> > 
> > Er. Linux NFSv4 stores the ACLs as POSIX draft, because
> > that's what Linux file systems can support. NFSD, via
> > NFSv4, makes these appear like NFSv4 ACLs.
> > 
> > But I think I understand.
> > 
> > 
> > > There is a reason why POSIX draft ACLs were abandoned - they are not
> > > fine-grained enough for real world usage outside the Linux universe.
> > > As soon as interoperability is required these things just bite you
> > > HARD.
> > 
> > You, of course, have the ability to run some other NFS
> > server implementation that meets your security requirements
> > more fully.
> > 
> > 
> > > Also, just running more nfsd in parallel on the origin NFS server is
> > > not a better option - remember the debate of non-2049 ports for nfsd?
> > 
> > I'm not sure where this is going. Do you mean the storage
> > administrator would provide NFS service on alternate
> > ports that each expose a separate set of exports?
> > 
> > So the only option Linux has there is using containers or
> > libvirt. We've continued to privately discuss the ability
> > for NFSD to support a separate set of exports on alternate
> > ports, but it doesn't look feasible. The export management
> > infrastructure and user space tools would need to be
> > rewritten.
> > 
> > 
> > > > And again, clients of the re-export server need to mount it
> > > > with local_lock. Apps can still use locking in that case,
> > > > but the locks are not visible to apps on other clients. Your
> > > > description does not explain why local_lock is not
> > > > sufficient or feasible.
> > > 
> > > Because:
> > > - it breaks applications running on more than one machine?
> > 
> > Yes, obviously. Your description needs to mention that is
> > a requirement, since there are a lot of applications that
> > don't need locking across multiple clients.
> > 
> > 
> > > - it breaks use cases like NFS--->SMB bridges, because without locking
> > > the typical Windows .NET application will refuse to write to a file
> > 
> > That's a quagmire, and I don't think we can guarantee that
> > will work. Linux NFS doesn't support "deny" modes, for
> > example.
> > 
> > 
> > > - it breaks even SIMPLE things like Microsoft Excel
> > 
> > If you need SMB semantics, why not use Samba?
> > 
> > The upshot appears to be that this usage is a stack of
> > mismatched storage protocols that work around a bunch of
> > local IT bureaucracy. I'm trying to be sympathetic, but
> > it's hard to say that /anyone/ would fully support this.
> > 
> > 
> > > Of course the happy echo "hello Linux-NFSv4-only world" >/nfs/file
> > > will always work.
> > > 
> > > > > Of course no one needs the gov customers, so feel free to break locking.
> > > > 
> > > > 
> > > > Please have a look at the patch description again: lock
> > > > recovery does not work now, and cannot work without
> > > > changes to the protocol. Isn't that a problem for such
> > > > workloads?
> > > 
> > > Nope, because of UPS (Uninterruptible power supply). Either everything
> > > is UP, or *everything* is DOWN. Boolean.
> > 
> > Power outages are not the only reason lock recovery might
> > be necessary. Network partitions, re-export server
> > upgrades or reboots, etc. So I'm not hearing anything
> > to suggest this kind of workload is not impacted by
> > the current lock recovery problems.
> > 
> > 
> > > > In other words, locking is already broken on NFSv4 re-export,
> > > > but the current situation can lead to silent data corruption.
> > > 
> > > Would storing the locking information in persistent files help, i.e.
> > > files which persist across nfsd server restarts?
> > 
> > Yes, but it would make things horribly slow.
> > 
> > And of course there would be a lot of coding involved
> > to get this to work.
> I suspect this suggestion might be a fair amount of code too
> (and I am certainly not volunteering to write it), but I will mention it.
> 
> Another possibility would be to have the re-exporting NFSv4 server
> just pass locking ops through to the backend NFSv4 server.
> - It is roughly the inverse of what I did when I constructed a flex files
>   pNFS server. The MDS did the locking ops and any I/O ops. were
>   passed through to the DS(s). Of course, it was hoped the client
>   would use layouts and bypass the MDS for I/O.
> 

How do you handle reclaim in this case? IOW, suppose the backend server
crashes but the reexporter stays up. How do you coordinate the grace
periods between the two so that the client can reclaim its lock on the
backend?

> 
> > 
> > What if we added an export option to allow the re-export
> > server to continue handling locking, but default it to
> > off (which is the safer option) ?
> > 
> > --
> > Chuck Lever
> > 
> > 
> 

-- 
Jeff Layton <jlayton@xxxxxxxxxx>




