On Thu, Mar 21, 2019 at 2:45 PM Dan van der Ster <dan@xxxxxxxxxxxxxx> wrote:
>
> On Thu, Mar 21, 2019 at 8:51 AM Gregory Farnum <gfarnum@xxxxxxxxxx> wrote:
> >
> > On Wed, Mar 20, 2019 at 6:06 PM Dan van der Ster <dan@xxxxxxxxxxxxxx> wrote:
> >>
> >> On Tue, Mar 19, 2019 at 9:43 AM Erwin Bogaard <erwin.bogaard@xxxxxxxxx> wrote:
> >> >
> >> > Hi,
> >> >
> >> > For a number of applications we use, there is a lot of file
> >> > duplication. This wastes precious storage space, which I would like
> >> > to avoid.
> >> >
> >> > When using a local disk, I can use a hard link to let all duplicate
> >> > files point to the same inode (using "rdfind", for example).
> >> >
> >> > As there isn't any deduplication in Ceph(FS), I'm wondering if I can
> >> > use hard links on CephFS in the same way as I do on 'regular' file
> >> > systems like ext4 and xfs.
> >> >
> >> > 1. Is it advisable to use hard links on CephFS? (They aren't in the
> >> > 'best practices': http://docs.ceph.com/docs/master/cephfs/app-best-practices/)
> >> >
> >> > 2. Is there any performance (dis)advantage?
> >> >
> >> > 3. When using hard links, is there an actual space saving, or is
> >> > there some trickery happening?
> >> >
> >> > 4. Are there any issues (other than the regular hard link 'gotchas')
> >> > I need to keep in mind when combining hard links with CephFS?
> >>
> >> The only issue we've seen is if you hardlink b to a, then rm a, then
> >> never stat b: the inode is added to the "stray" directory. By default
> >> there is a limit of 1 million stray entries -- so if you accumulate
> >> files in this state, eventually users will be unable to rm any files
> >> until you stat the `b` files.
> >
> > Eek. Do you know if we have any tickets about that issue? It's easy to
> > see how that happens, but it definitely isn't a good user experience!
>
> I'm not aware of a ticket -- I had thought it was just a fact of life
> with hardlinks and cephfs.

I think it is for now, but as you've demonstrated that's not really a
good situation, and I'm sure we can figure out some way of automatically
merging inodes into their remaining link parents. I've created a ticket
at http://tracker.ceph.com/issues/38849

> After hitting this issue in prod, we found the explanation here in
> this old thread (with your useful post ;) ):
>
> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2016-October/013621.html
>
> Our immediate workaround was to increase mds bal fragment size max
> (e.g. to 200000).
> In our env we now monitor num_strays in case these get out of control again.
>
> BTW, now thinking about this more... isn't directory fragmentation
> supposed to let the stray dir grow to an unlimited number of shards?
> (On our side it seems limited to 10 shards.) Maybe this is just a
> configuration issue on our side?

Sounds like I haven't missed a change here: the stray directory is a
special system directory that doesn't get fragmented like normal ones
do. We just set it up (hard-coded even, IIRC, but maybe a config
option) so that each MDS gets 10 of them, after the first time somebody
managed to make it large enough that a single stray directory object
got too large. o_0
-Greg
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
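
A rough sketch (not from the thread itself) of the workaround Dan
describes: stat the surviving links so the MDS can reintegrate the
stray inodes, and watch the num_strays counter while doing so. This is
only an illustration under assumptions -- MDS_NAME and CEPHFS_ROOT are
placeholders to adjust for your cluster, and the 'ceph daemon ... perf
dump' call reads the local admin socket, so it would need to run on the
active MDS host.

#!/usr/bin/env python3
# Sketch: walk a CephFS subtree and stat every file (the path lookup is
# what lets the MDS reintegrate a stray whose other link was removed),
# reporting the MDS num_strays perf counter before and after.
# MDS_NAME and CEPHFS_ROOT are placeholders for your environment.

import json
import os
import subprocess
import sys

MDS_NAME = "mds.a"           # placeholder: your active MDS daemon name
CEPHFS_ROOT = "/mnt/cephfs"  # placeholder: subtree where the hardlinks live


def num_strays(mds_name):
    # num_strays lives under the mds_cache section of 'perf dump'.
    out = subprocess.check_output(["ceph", "daemon", mds_name, "perf", "dump"])
    return json.loads(out)["mds_cache"]["num_strays"]


def stat_all_files(root):
    # The lstat() lookup is what matters; the count is just for reporting.
    count = 0
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            try:
                os.lstat(os.path.join(dirpath, name))
                count += 1
            except OSError:
                pass  # file vanished or unreadable; skip it
    return count


if __name__ == "__main__":
    root = sys.argv[1] if len(sys.argv) > 1 else CEPHFS_ROOT
    print("num_strays before:", num_strays(MDS_NAME))
    print("files statted:", stat_all_files(root))
    print("num_strays after:", num_strays(MDS_NAME))

The same mds_cache/num_strays counter is what you would feed into
monitoring to catch the strays building up again, as Dan mentions doing
in his environment.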