Re: MDS auth caps for cephfs

On Fri, May 22, 2015 at 2:35 PM, Sage Weil <sweil@xxxxxxxxxx> wrote:
> On Fri, 22 May 2015, John Spray wrote:
>> On 21/05/2015 01:14, Sage Weil wrote:
>> > Looking at the MDSAuthCaps again, I think there are a few things we might
>> > need to clean up first.  The way it is currently structured, the idea is
>> > that you have an array of grants (MDSCapGrant).  For any operation, you'd
>> > look at each grant until you find one that says what you're trying to do
>> > is okay.  If none match, you fail.  (i.e., they're additive only.)
>> >
>> > Each MDSCapGrant has a 'spec' and a 'match'.  The 'match' is a check
>> > to see if the current grant applies to a given operation, and the 'spec'
>> > says what you're allowed to do.
>> >
>> > Currently MDSCapMatch is just
>> >
>> >    int uid;  // Require UID to be equal to this, if !=MDS_AUTH_UID_ANY
>> >    std::string path;  // Require path to be child of this (may be "/" for
>> > any)
>> >
>> > I think path is clearly right.  UID I'm not sure makes sense here... I'm
>> > inclined to ignore it (instead of removing it) until we decide
>> > how to restrict a mount to be a single user.
>> >
>> > The spec is
>> >
>> >    bool read;
>> >    bool write;
>> >    bool any;
>> >
>> > I'm not quite sure what 'any' means, but read/write are pretty clear.
>>
>> Ah, I added that when implementing 'tell' -- 'any' is checked when handling
>> incoming MCommand in MDS, so it's effectively the admin permission.
>
> Ok!
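
(Pulling the pieces above together, the check presumably ends up looking
roughly like the sketch below. This is only an illustration of the additive
model as described, not the actual MDSAuthCaps code; the constant's value and
the prefix-style path test are guesses.)

    #include <string>
    #include <vector>

    static const int MDS_AUTH_UID_ANY = -1;  // stand-in value, for illustration

    struct MDSCapMatch {
      int uid;            // grant applies only to this uid, unless MDS_AUTH_UID_ANY
      std::string path;   // grant applies to this path and everything under it
    };

    struct MDSCapSpec {
      bool read, write, any;   // 'any' being the admin/'tell' permission
    };

    struct MDSCapGrant {
      MDSCapSpec spec;
      MDSCapMatch match;
    };

    // Additive check: an operation is allowed if at least one grant both
    // matches it (uid + path) and permits it; if none match, it is denied.
    bool is_capable(const std::vector<MDSCapGrant>& grants,
                    const std::string& path, int uid,
                    bool want_read, bool want_write)
    {
      for (const auto& g : grants) {
        bool uid_ok  = (g.match.uid == MDS_AUTH_UID_ANY || g.match.uid == uid);
        // crude "is child of" test for the sketch: a simple prefix match
        bool path_ok = path.compare(0, g.match.path.size(), g.match.path) == 0;
        if (!uid_ok || !path_ok)
          continue;
        if ((!want_read  || g.spec.read) &&
            (!want_write || g.spec.write))
          return true;
      }
      return false;
    }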
>
>> > The root_squash option clearly belongs in spec, and Nistha's first patch
>> > adds it there.  What about the other NFS options... should we mirror
>> > those too?
>> >
>> > root_squash
>> >   Map requests from uid/gid 0 to the anonymous uid/gid. Note that this does
>> >   not apply to any other uids or gids that might be equally sensitive, such
>> >   as user bin or group staff.
>> > no_root_squash
>> >   Turn off root squashing. This option is mainly useful for diskless
>> >   clients.
>> > all_squash
>> >   Map all uids and gids to the anonymous user. Useful for NFS-exported
>> >   public FTP directories, news spool directories, etc. The opposite option
>> >   is no_all_squash, which is the default setting.
>> > anonuid and anongid
>> >   These options explicitly set the uid and gid of the anonymous account.
>> >   This option is primarily useful for PC/NFS clients, where you might want
>> >   all requests to appear to be from one user. As an example, consider the
>> >   export entry for /home/joe in the example section below, which maps all
>> >   requests to uid 150 (which is supposedly that of user joe).
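
(For reference, the /home/joe entry that man page text alludes to is an
exports(5) line roughly like the one below -- reproduced from memory, so
treat the exact fields as illustrative: every request from that host gets
squashed to uid 150 / gid 100, regardless of which user issued it.)

    /home/joe    pc001(rw,all_squash,anonuid=150,anongid=100)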
>>
>> Yes, I think we should.  Part of me wants to say that people who want NFS-like
>> behaviour should be using NFS gateways.  However, these are all probably
>> straightforward enough to implement that it's worth maintaining them in cephfs
>> too.

Unfortunately not really -- the NFS semantics are very different from
the way our CephX security caps work. We grant accesses with each
permission, rather than restricting them. We can accomplish similar
things, but they'll need to be expressed in the opposite direction:

  allow anon_access
  allow uid 123, allow gid 123[,456,789,...]
  allow root

where each additional grant gives the session more access. (And I'm
not sure if these are best set up as specific things on their own or
just squashed in so that UID -1 is "anon", etc.) These let you set up
access permissions like those of NFS, but it's quite a different model
from the various mounting and config file options NFS gives you. I
want to make sure we're clear about not trying to match those
precisely, because otherwise our security capabilities are not going
to make any kind of sense. :(
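
To make that concrete, here is one way the spec could grow to express those
grants (names and shapes invented purely for illustration; nothing here is
settled):

    #include <algorithm>
    #include <vector>
    #include <sys/types.h>

    // Hypothetical additive identity grants: every field only ever *adds* access.
    struct MDSCapIdSpec {
      bool allow_root = false;           // may act as uid 0
      bool allow_anon = false;           // may act as the anonymous uid
      std::vector<uid_t> allow_uids;     // may act as these specific uids
      std::vector<gid_t> allow_gids;     // may act as these specific gids
    };

    // May this session perform an operation as 'uid'?
    // (gid checks would follow the same pattern against allow_gids.)
    bool id_allowed(const MDSCapIdSpec& s, uid_t uid)
    {
      if (uid == 0)
        return s.allow_root;             // root access only if explicitly granted
      if (s.allow_anon && uid == (uid_t)-1)
        return true;                     // treating uid -1 as "anon", per above
      return std::find(s.allow_uids.begin(), s.allow_uids.end(), uid)
               != s.allow_uids.end();
    }

In a model like this nothing is ever "squashed"; identities are only ever
granted, which is why the NFS options don't translate directly.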
What would it mean for a user who doesn't have no_root_squash to have
access to uid 0? Why should we allow random users to access any UID
*except* for root? Does a client who has no_root_squash and anon uid
123 get to access stuff as root, or else as 123? Can they access as
124?
I mean, I think it would have to mean they get access to everything as
anybody, and I'm not sure which requests would be considered
"anonymous" for the uid 123 bit to kick in. But I don't think that's
what the administrator would *mean* for them to have.

As I think about this more, I guess the point is that for multi-tenancy
we want each client to be able to do anything inside of its own
particular directory namespace, since UIDs and GIDs may not be
synchronized across tenants? I'm not sure how to address that, but
either way I think it will require a wider/different set of primitives
than we've described here. :/
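
The per-directory half of that is at least expressible with the existing
path match -- something like the following, with the syntax invented purely
for illustration:

    client.tenant-a    mds 'allow rw path=/tenants/a'
    client.tenant-b    mds 'allow rw path=/tenants/b'

-- but which identity (uid/gid) checks should apply *inside* each subtree is
exactly the part that still needs different primitives.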

>>
>> We probably need to mirror these in our mount options too, so that e.g.
>> someone with an admin key can still enable root_squash at will, rather than
>> having to craft an authentication token with the desired behaviour.

Mmmm, given that clients normally can't see their capabilities at all,
that's a bit tricky. We could maybe accomplish it by tying in with the
extra session exchange (that Sage referred to below); that will be
necessary for adding clients to an existing host session dynamically,
and we could also let a user voluntarily drop certain permissions with
it... although dropping permissions requires a client to know that they
have them. Hrm.

On Fri, May 22, 2015 at 2:35 PM, Sage Weil <sweil@xxxxxxxxxx> wrote:
> Yeah. So Greg and Josh and I sat down with Dan van der Ster yesterday and
> went over some of this.  I think we also concluded:
>
>  - We should somehow tag requests with a uid and list<gid>.  This will
> make the request path permission checks sane WRT these sorts of checks.

Well, hopefully we don't need to tag individual requests with a list
of GIDs because the group information will be in the session state?
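
Concretely, that split might look something like this (a sketch only, with
made-up names):

    #include <vector>
    #include <sys/types.h>

    // Hypothetical: credentials established once per user/session on the MDS
    // side, so individual requests don't need to carry the whole group list.
    struct SessionCreds {
      uid_t uid;
      std::vector<gid_t> gids;    // filled in at session setup / user auth time
    };

    // What a request would then need to carry: just enough to pick an identity;
    // the MDS resolves the gid list from the session's SessionCreds.
    struct ClientRequestIds {
      uid_t uid;
    };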

>
>  - We need something trickier for cap writeback.  We can simply tag the
> dirty cap on the client with the uid etc of whoever dirtied it, but if
> multiple users do that it can get messy.  I suggest forcing the client to
> flush before allowing a second dirty, although this will be slightly
> painful as we need to handle the case where the MDS fails or a subtree
> migrates, so it might mean actually blocking in that case.  (This will be
> semi-gross to code, but I don't think it will affect any real-world workload.)

Flushing *might* be the easiest solution to implement, but I actually
worry we'll hit that case a non-trivial amount of the time. Consider a
client with multiple containerized applications running on the same
host that need to share data...
I'd need to look through the writeback paths in the client pretty
carefully before I felt comfortable picking a path forward here. I'm
tempted to set up some kind of ordered flush thing similar to our
projected journal updates (but simpler!) — if the client allows
something the MDS doesn't then we've got a problem, but that basically
requires a user subverting the client so I'm not sure it's worth
worrying about?
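
(For clarity about what's being weighed here, the "flush before a second
dirtier" rule from Sage's mail boils down to roughly the following on the
client side -- a sketch only, with made-up names, not the real cap-writeback
code:)

    #include <sys/types.h>

    // Hypothetical per-inode dirty-cap bookkeeping: remember who dirtied the
    // caps, and make a *different* uid wait for a flush before dirtying again.
    struct DirtyCapState {
      bool  dirty = false;
      uid_t dirtied_by = 0;
    };

    // Returns true if 'uid' may dirty the caps now; false means "flush first
    // (possibly blocking if the MDS fails or the subtree migrates), then retry".
    bool may_dirty(DirtyCapState& st, uid_t uid)
    {
      if (!st.dirty) {
        st.dirty = true;
        st.dirtied_by = uid;
        return true;
      }
      return st.dirtied_by == uid;   // same dirtier may continue; others must wait
    }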

>
>  - For per-user kerberos, we'll need an extra exchange between client and
> MDS to establish user credentials (e.g., when a user does kinit, or a new
> user logs into the box, etc.).  Note that the kerberos credential has a
> group concept, but I'm not sure how that maps onto the Unix groups
> (perhaps that is a parallel PAM thing with the LDAP/AD server?).  In any
> case, if such an exchange will be needed there, and that session
> state is what we'll be checking against, should we create that structure
> now and use it to establish the gid list (instead of, say, including a
> potentially largish list<gid_t> in every MClientRequest)?

Like I've said, the GID list that the MDS can care about needs to be
in the session state anyway, right? So we shouldn't need to add it to
MClientRequests.
-Greg