Re: sparse-read OSD op guarantees

Jeff Layton <jlayton@xxxxxxxxxx> · Mon, 02 May 2022 10:47:25 -0400

On Mon, 2022-05-02 at 16:41 +0200, Ilya Dryomov wrote:
> On Mon, May 2, 2022 at 4:22 PM Jeff Layton <jlayton@xxxxxxxxxx> wrote:
> > 
> > (sorry for the resend, but the first message got rejected by the list because it was from an unsubscribed address)
> > 
> > On Mon, 2022-05-02 at 14:05 +0200, Ilya Dryomov wrote:
> > > Hi Sam,
> > > 
> > > I wanted to clarify ObjectStore::fiemap API and sparse-read OSD op
> > > guarantees as this came up in Jeff's fscrypt work and just recently in
> > > RBD as well.
> > > 
> > > In fscrypt for kcephfs, Jeff has opted to use sparse-read to ensure
> > > that file holes (which must contain all zeroes logically) don't get
> > > "decrypted" into seemingly random junk.  (Unlike ecryptfs, fscrypt
> > > framework doesn't attempt to protect the information about existence
> > > and location of holes in files, so logical holes generally correspond
> > > to physical holes.)
> > > 
> > 
> > The fscrypt client infrastructure generally prevents you from reading a
> > file when you don't have the key, but you could always analyze the
> > backing device and determine where the holes are. The situation with
> > cephfs is analogous.
> 
> Yup.
> 
> > 
> > I imagine this is the same with ecryptfs though. I don't believe it
> > fills in the holes when you do a write past the EOF either. Were you
> > thinking of LUKS? That operates at the device level, so finding holes
> > there is a much different matter.
> 
> I'm pretty sure ecryptfs always fills holes by encrypting logical zeroes and
> writing the resulting ciphertext out to the backing filesystem.  Quoting the
> FAQ:
> 
>     eCryptfs does not currently support sparse files. Sequences of encrypted
>     extents with all 0's could be interpreted as sparse regions in eCryptfs
>     without too much implementation complexity. However, this would open up
>     a possible attack vector, since the fact that certain segments of data are
>     all 0's could betray strategic information that the user does not
>     necessarily want to reveal to an attacker. For instance, if the attacker
>     knows that a certain database file with patient medical data keeps
>     information about viral infections in one region of the file and
>     information about diabetes in another section of the file, then the very
>     fact that the segment for viral infection data is populated with data at
>     all would reveal that the patient has a viral infection.
> 

I stand corrected then! That tends to be pretty horrible for performance
though. Prepare to wait for a while if you do create a file and then
start writing at the 2G offset.

In principle, we could also have the client fill in holes instead. It
may be worthwhile to have a mode where it does that. That might also give
us a way to support this on non-bluestore pools (if it's not feasible to
allow for sparseness there).
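
To make the cost of a client-side hole-filling mode concrete, here is a
rough sketch of what it would have to do when a write lands past EOF. This is
only an illustration, not how kcephfs or ecryptfs actually implement anything:
the 4 KiB block size, the helper names (gap_block_range, fill_gap) and the
toy XOR "cipher" are all assumptions made up for the example.

```python
BLOCK_SIZE = 4096  # assumed encryption block size (hypothetical)


def toy_encrypt(block_index: int, plaintext: bytes) -> bytes:
    """Placeholder for a real per-block cipher. NOT real crypto."""
    key = (block_index * 31 + 7) % 256
    return bytes(b ^ key for b in plaintext)


def gap_block_range(old_size: int, write_offset: int,
                    block_size: int = BLOCK_SIZE) -> range:
    """Indices of the whole blocks lying between the old EOF and the write.

    Partial blocks at either edge are left out; the normal write path
    would have to handle those with a read-modify-write.
    """
    first = -(-old_size // block_size)   # round old EOF up to a block
    last = write_offset // block_size    # round write offset down
    return range(first, max(first, last))


def fill_gap(old_size: int, write_offset: int,
             block_size: int = BLOCK_SIZE):
    """Yield (file_offset, ciphertext) for every block in the gap.

    Each block of logical zeroes is encrypted and would then be written
    out, so the backing object ends up with no holes in that range.
    """
    zeroes = bytes(block_size)
    for idx in gap_block_range(old_size, write_offset, block_size):
        yield idx * block_size, toy_encrypt(idx, zeroes)


if __name__ == "__main__":
    # New (empty) file, first write at the 2 GiB offset.
    blocks = gap_block_range(0, 2 * 1024**3)
    print(len(blocks), "blocks to encrypt and write before the real data")
```

With those assumptions, a first write at the 2 GiB offset of an empty file
means encrypting and writing 2 GiB / 4 KiB = 524288 blocks of zeroes before
any real data goes out, which is where the wait comes from.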
-- 
Jeff Layton <jlayton@xxxxxxxxxx>

_______________________________________________
Dev mailing list -- dev@xxxxxxx