Re: [PATCH v3] nfsd: disallow file locking and delegations for NFSv4 reexport

Chuck Lever III <chuck.lever@xxxxxxxxxx> · Wed, 30 Oct 2024 16:59:15 +0000

> On Oct 30, 2024, at 12:37 PM, Cedric Blancher <cedric.blancher@xxxxxxxxx> wrote:
> 
> On Wed, 30 Oct 2024 at 17:15, Chuck Lever III <chuck.lever@xxxxxxxxxx> wrote:
>> 
>> 
>> 
>>> On Oct 30, 2024, at 10:55 AM, Cedric Blancher <cedric.blancher@xxxxxxxxx> wrote:
>>> 
>>> On Tue, 29 Oct 2024 at 17:03, Chuck Lever III <chuck.lever@xxxxxxxxxx> wrote:
>>>> 
>>>>> On Oct 29, 2024, at 11:54 AM, Brian Cowan <brian.cowan@xxxxxxxxxxxxxxxx> wrote:
>>>>> 
>>>>> Honestly, I don't know the use case for re-exporting another server's
>>>>> NFS export in the first place. Is this someone trying to share NFS
>>>>> through a firewall? I've seen people share remote NFS exports via
>>>>> Samba in an attempt to avoid paying their NAS vendor for SMB support.
>>>>> (I think it's "standard equipment" now, but 10+ years ago? Not
>>>>> always...) But re-exporting another server's NFS exports? Haven't seen
>>>>> anyone do that in a while.
>>>> 
>>>> The "re-export" case is where there is a central repository
>>>> of data and branch offices that access that via a WAN. The
>>>> re-export servers cache some of that data locally so that
>>>> local clients have a fast persistent cache nearby.
>>>> 
>>>> This is also effective in cases where a small cluster of
>>>> clients want fast access to a pile of data that is
>>>> significantly larger than their own caches. Say, HPC or
>>>> animation, where the small cluster is working on a small
>>>> portion of the full data set, which is stored on a central
>>>> server.
>>>> 
>>> Another use case is "isolation": IT shares a filesystem to your
>>> department, and you need to re-export only a subset to another
>>> department or home office. Part of such a scenario might also be policy
>>> related, e.g. IT shares you the full filesystem but will do NOTHING
>>> else, and any further compartmentalization must be done in your own
>>> department.
>>> This is the typical use case for gov NFS re-export.
>> 
>> It's not clear to me from this description why re-export is
>> the right tool for this job. Please explain why ACLs are not
>> used in this case -- this is exactly what they are designed
>> to do.
> 
> 1. IT departments want better/harder/immutable isolation than ACLs

So you want MAC, and the storage administrator won't set
that up for you on the NFS server. NFS doesn't do MAC
very well if at all.

> 2. Linux NFSv4 only implements POSIX draft ACLs, not full Windows or
> NFSv4 ACLs. So there is no proper way to prevent ACL editing,
> rendering them useless in this case.

Er. Linux NFSv4 stores the ACLs as POSIX draft, because
that's what Linux file systems can support. NFSD, via
NFSv4, makes these appear like NFSv4 ACLs.

But I think I understand.

> There is a reason why POSIX draft ACLs were abandoned - they are not
> fine-grained enough for real world usage outside the Linux universe.
> As soon as interoperability is required these things just bite you
> HARD.

You, of course, have the ability to run some other NFS
server implementation that meets your security requirements
more fully.

> Also, just running more nfsd in parallel on the origin NFS server is
> not a better option - remember the debate of non-2049 ports for nfsd?

I'm not sure where this is going. Do you mean the storage
administrator would provide NFS service on alternate
ports that each expose a separate set of exports?

So the only option Linux has there is using containers or
libvirt. We've continued to privately discuss the ability
for NFSD to support a separate set of exports on alternate
ports, but it doesn't look feasible. The export management
infrastructure and user space tools would need to be
rewritten.

>> And again, clients of the re-export server need to mount it
>> with local_lock. Apps can still use locking in that case,
>> but the locks are not visible to apps on other clients. Your
>> description does not explain why local_lock is not
>> sufficient or feasible.
> 
> Because:
> - it breaks applications running on more than one machine?

Yes, obviously. Your description needs to mention that this is
a requirement, since there are a lot of applications that
don't need locking across multiple clients.
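
For reference, the locking at issue is ordinary advisory POSIX
locking. Below is a minimal Python sketch, not taken from any real
application; the helper names try_exclusive_lock and
second_process_can_lock are made up for illustration. On a single
host, the second process should be refused while the first holds
the lock. With local_lock on the clients, that refusal is only seen
by processes on the same client, which is enough for single-client
applications and not enough for multi-client ones.

```python
import fcntl
import subprocess
import sys
import tempfile


def try_exclusive_lock(fd):
    """Try to take a non-blocking exclusive POSIX lock on the whole file."""
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return True
    except OSError:
        # EAGAIN or EACCES: another process already holds a conflicting lock.
        return False


def second_process_can_lock(path):
    """Ask a separate process whether it can lock the same file."""
    probe = (
        "import fcntl, sys\n"
        "f = open(sys.argv[1], 'r+b')\n"
        "try:\n"
        "    fcntl.lockf(f, fcntl.LOCK_EX | fcntl.LOCK_NB)\n"
        "except OSError:\n"
        "    sys.exit(1)\n"
        "sys.exit(0)\n"
    )
    result = subprocess.run([sys.executable, "-c", probe, path])
    return result.returncode == 0


if __name__ == "__main__":
    # Hypothetical stand-in for a file on the mounted export.
    with tempfile.NamedTemporaryFile() as tmp:
        print("first process holds lock:", try_exclusive_lock(tmp.fileno()))
        print("second process can lock:", second_process_can_lock(tmp.name))
```

An application that runs this pattern on two different clients of
the re-export server depends on the conflict being detected across
clients, and that is the requirement that needs to be stated.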

> - it breaks use cases like NFS--->SMB bridges, because without locking
> the typical Windows .NET application will refuse to write to a file

That's a quagmire, and I don't think we can guarantee that
will work. Linux NFS doesn't support "deny" modes, for
example.

> - it breaks even SIMPLE things like Microsoft Excel

If you need SMB semantics, why not use Samba?

The upshot appears to be that this usage is a stack of
mismatched storage protocols that work around a bunch of
local IT bureaucracy. I'm trying to be sympathetic, but
it's hard to say that /anyone/ would fully support this.

> Of course the happy echo "hello Linux-NFSv4-only world" >/nfs/file
> will always work.
> 
>>> Of course no one needs the gov customers, so feel free to break locking.
>> 
>> 
>> Please have a look at the patch description again: lock
>> recovery does not work now, and cannot work without
>> changes to the protocol. Isn't that a problem for such
>> workloads?
> 
> Nope, because of UPS (Uninterruptible power supply). Either everything
> is UP, or *everything* is DOWN. Boolean.

Power outages are not the only reason lock recovery might
be necessary. Network partitions, re-export server
upgrades or reboots, etc. So I'm not hearing anything
to suggest this kind of workload is not impacted by
the current lock recovery problems.

>> In other words, locking is already broken on NFSv4 re-export,
>> but the current situation can lead to silent data corruption.
> 
> Would storing the locking information into persistent files help, i.e.
> files which persist across nfsd server restarts?

Yes, but it would make things horribly slow.

And of course there would be a lot of coding involved
to get this to work.

What if we added an export option to allow the re-export
server to continue handling locking, but default it to
off (which is the safer option)?

--
Chuck Lever