Re: Shard storage suggestions


 



Hi,

The suggestion you gave was in fact considered when the shard translator was written.
Here are some of the considerations for sticking with a single directory as opposed to a two-tier classification of shards based on the initial chars of the uuid string:
i) Even for a 4TB disk with the smallest possible shard size of 4MB, there will be at most 1048576 entries under /.shard in the worst case - a number far less than the max number of inodes supported by most backend file systems.

ii) Entry self-heal for a single directory, even in the simplest case of 1 entry deleted/created while a replica was down, used to require crawling the whole sub-directory tree, figuring out which entries are present/absent between src and sink, and then healing them to the sink. With granular entry self-heal [1], we no longer have to live under this limitation.

iii) Resolving a shard from the original file name given by the application within a single directory (/.shard in the existing case) means looking up the parent dir /.shard first, followed by a lookup on the actual shard that is to be operated on. A two-tier sub-directory structure means that we have to resolve (or look up) not only /.shard but also the directories '/.shard/d2', '/.shard/d2/18', and '/.shard/d2/18/d218cd1c-4bd9-40d7-9810-86b3f7932509' before finally looking up the shard, which is a lot of network operations. Yes, these are all one-time operations and the results can be cached in the inode table. Still, because these directories would have dynamic gfids (as opposed to just /.shard, which has a fixed gfid - be318638-e8a0-4c6d-977d-7a937aa84806), it is no longer trivial to resolve the name of the shard to its gfid, or the parent name to the parent gfid, _even_ in memory.
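
To make points i) and iii) concrete, here is a rough Python sketch. It is not GlusterFS code: the function names are made up for illustration, the disk and shard sizes are taken as binary units (TiB/MiB), and the shard naming `<gfid>.<index>` simply follows the example paths quoted in this thread.

```python
# Illustrative sketch only; names are hypothetical, not GlusterFS internals.

SHARD_ROOT = "/.shard"


def max_shard_entries(brick_bytes, shard_bytes):
    """Upper bound on the number of shard files a brick can hold."""
    # Ceiling division: a partially filled last shard still needs an entry.
    return -(-brick_bytes // shard_bytes)


def lookup_chain(base_gfid, index, tiers=0, per_file_dir=False):
    """Paths that must be resolved, in order, to reach one shard.

    tiers=0 models the existing flat layout; tiers=2 with
    per_file_dir=True models the layout proposed in this thread.
    """
    gfid = base_gfid.lower()
    parent = SHARD_ROOT
    chain = [parent]
    for t in range(tiers):
        # Each tier is named after the next two characters of the gfid.
        parent = f"{parent}/{gfid[2 * t:2 * t + 2]}"
        chain.append(parent)
    if per_file_dir:
        # Optional directory holding all shards of one file.
        parent = f"{parent}/{gfid}"
        chain.append(parent)
    chain.append(f"{parent}/{gfid}.{index}")
    return chain


if __name__ == "__main__":
    print(max_shard_entries(4 * 2**40, 4 * 2**20))
    gfid = "d218cd1c-4bd9-40d7-9810-86b3f7932509"
    for path in lookup_chain(gfid, 1, tiers=2, per_file_dir=True):
        print(path)
```

With the flat layout the chain has two entries (/.shard and the shard itself); with the proposed layout it has five, and only the first of them has a gfid known in advance.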


Are you unhappy with the performance? What's your typical VM image size, shard block size and the capacity of individual bricks?

-Krutika

On Mon, Jul 18, 2016 at 2:43 PM, Gandalf Corvotempesta <gandalf.corvotempesta@xxxxxxxxx> wrote:
2016-07-18 9:53 GMT+02:00 Oleksandr Natalenko <oleksandr@xxxxxxxxxxxxxx>:
> I'd say, like this:
>
> /.shard/d2/18/D218CD1C-4BD9-40D7-9810-86B3F7932509.1

Yes, something like this.
I was on mobile when I wrote. Your suggestion is better than mine.

Probably, using a directory for the whole shard is also better and
keeps the directory structure clear:

 /.shard/d2/18/D218CD1C-4BD9-40D7-9810-86B3F7932509/D218CD1C-4BD9-40D7-9810-86B3F7932509.1

The current shard directory structure doesn't scale at all.
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users

