Re: rbd locking and handling broken clients

On Wednesday, June 13, 2012 at 1:37 PM, Florian Haas wrote:
> Greg,
>  
> My understanding of Ceph code internals is far too limited to comment on
> your specific points, but allow me to ask a naive question.
>  
> Couldn't you be stealing a lot of ideas from SCSI-3 Persistent
> Reservations? If you had server-side (OSD) persistence of information of
> the "this device is in use by X" type (where anything other than X would
> get an I/O error when attempting to access data), and you had a manual,
> authenticated override akin to SCSI PR preemption, plus key
> registration/exchange for that authentication, then you would at least
> have to have the combination of a misbehaving OSD plus a malicious
> client for data corruption. A non-malicious but merely broken client
> probably wouldn't be enough on its own.
>  
> Clearly I may be totally misguided, as Ceph is fundamentally
> decentralized and SCSI isn't, but if PR-ish behavior comes even close to
> what you're looking for, grabbing those ideas would look better to me
> than reinventing the wheel.

Yeah, the problem here is exactly that Ceph (and RBD) are fundamentally decentralized. :) I'm not familiar with the SCSI PR mechanism either, but it looks to me like it deals in entirely local information — the equivalent with RBD would require performing a locking operation on every object in the RBD image before you accessed it. We could do that, but then opening an image would take time linear in its size… :(
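To make that cost concrete, here is a minimal sketch of what a SCSI-PR-style "in use by X" reservation would look like if emulated by locking every RADOS object backing an image. The object naming scheme and the lock call below are hypothetical stand-ins, not the actual librbd/librados API:

    # Illustrative only: emulating a per-device reservation by taking a lock
    # on every RADOS object that backs an RBD image. The helpers here are
    # hypothetical, not the real librbd/librados calls.
    def image_object_names(image_name, size_bytes, object_size=4 << 20):
        # RBD stripes an image over roughly size/object_size objects.
        count = (size_bytes + object_size - 1) // object_size
        return ["%s.%016d" % (image_name, i) for i in range(count)]

    locks = {}  # oid -> owner; stands in for per-object lock state on the OSDs

    def lock_object(oid, owner):
        # In reality this would be one round trip to the OSD holding oid.
        if locks.setdefault(oid, owner) != owner:
            raise IOError("object %s already reserved by %s" % (oid, locks[oid]))

    def open_image_with_reservation(image_name, size_bytes, owner):
        for oid in image_object_names(image_name, size_bytes):  # linear in image size
            lock_object(oid, owner)

    # A 1 TB image striped into 4 MB objects means ~262,144 lock round trips at open time.
    open_image_with_reservation("vm-disk-1", 1 << 40, owner="client.4242")

That loop is exactly the linear-time open described above, which is why per-object reservations don't scale for large images.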


On Wednesday, June 13, 2012 at 4:14 PM, Tommi Virtanen wrote:
> On Wed, Jun 13, 2012 at 10:40 AM, Gregory Farnum <greg@xxxxxxxxxxx> wrote:
> > 2) Client fencing. See http://tracker.newdream.net/issues/2531. There
> > is an existing "blacklist" functionality in the OSDs/OSDMap, where you
> > can specify an "entity_addr_t" (consisting of an IP, a port, and a
> > nonce — so essentially unique per-process) which is not allowed to
> > communicate with the cluster any longer. The problem with this is that
>  
> Does that work even after a TCP connection close & re-establish, where
> the client now has a new source port address? (Perhaps the port is 0
> for clients?)

Precisely — client ports are 0 since they never accept incoming connections.
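For illustration, here is a toy model of the (IP, port, nonce) identity that blacklisting keys on. The field names and the membership check are simplified and don't mirror the actual entity_addr_t/OSDMap code:

    # Simplified model of the (IP, port, nonce) identity used for blacklisting;
    # not the actual entity_addr_t implementation.
    from collections import namedtuple

    EntityAddr = namedtuple("EntityAddr", ["ip", "port", "nonce"])

    blacklist = set()

    def blacklist_add(addr):
        blacklist.add(addr)

    def connection_allowed(addr):
        return addr not in blacklist

    # Clients bind to port 0 (they never accept connections), so two client
    # processes on the same host are told apart only by the nonce.
    client_a = EntityAddr("10.0.0.5", 0, 12345)
    client_b = EntityAddr("10.0.0.5", 0, 67890)

    blacklist_add(client_a)
    assert not connection_allowed(client_a)
    assert connection_allowed(client_b)  # same IP and port, different nonce

Because the port is always 0 for clients, the nonce is what makes the blacklist entry effectively unique per process.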



> You know, I'd be really happy if this could be achieved by means of
> removing cephx keys.

Unfortunately, that wouldn't really solve the problem without dramatically decreasing the rotation interval for the shared cluster access keys that cephx distributes. Alternative (entirely theoretical) security schemes might, but they're well beyond what's feasible for us to work on any time soon...
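As a rough, deliberately simplified illustration of why deleting a client's cephx key doesn't fence it promptly: access already granted under the current shared rotating key stays valid until that key expires, so fencing this way would mean shrinking the rotation interval drastically. The model below is a toy, not the real cephx protocol:

    # Toy model of cephx-style rotating keys: revoking a client's own key does
    # not invalidate access granted under the current shared rotating key, so
    # the client keeps access until the next rotation.
    import time

    ROTATION_INTERVAL = 3600  # seconds; illustrative, the real interval is configurable

    class RotatingKey(object):
        def __init__(self, issued_at):
            self.issued_at = issued_at

        def valid(self, now):
            return now < self.issued_at + ROTATION_INTERVAL

    def cluster_accepts(shared_key, client_key_deleted, now):
        # The daemons verify the shared rotating key behind the client's ticket;
        # they never re-check whether the client's own cephx key still exists.
        return shared_key.valid(now)

    shared_key = RotatingKey(issued_at=time.time())
    # Even after an admin removes the client's cephx key, the client keeps
    # access until the shared key rotates out.
    print(cluster_accepts(shared_key, client_key_deleted=True, now=time.time()))  # True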




