On 01/12/2010 10:50 PM, Akinobu Mita wrote:
This patch makes sense, but it also raises the question of whether or not
we should move to a two-level directory scheme, e.g.
123/456/7890ABCDEF
rather than
123/4567890ABCDEF
to limit the size of the top-level directories. It really depends on the
object counts a typical chunkd node will be seeing. As with the other
patch, I will give this some thought after sleep.
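
For illustration, the two layouts above can be sketched as a small path
builder. This is a hypothetical helper, not chunkd's actual code: the name
obj_path(), the PREFIX_LEN constant and the assumption that the key is
already a hex string are all mine.

```c
#include <string.h>

/* Assumption: each directory level consumes 3 hex characters of the key,
 * giving 4096 possible directory names per level. */
#define PREFIX_LEN 3

/*
 * Hypothetical helper: build a relative object path from a hex key.
 *   levels == 1  ->  "123/4567890ABCDEF"
 *   levels == 2  ->  "123/456/7890ABCDEF"
 * Returns 0 on success, -1 if the key is too short or buf is too small.
 */
static int obj_path(const char *key, int levels, char *buf, size_t buflen)
{
	size_t klen = strlen(key);
	size_t pos = 0;
	int i;

	/* at least one character must remain for the file name */
	if (levels < 1 || klen <= (size_t)levels * PREFIX_LEN)
		return -1;

	/* key characters + one '/' per level + terminating NUL */
	if (buflen < klen + (size_t)levels + 1)
		return -1;

	for (i = 0; i < levels; i++) {
		memcpy(buf + pos, key + (size_t)i * PREFIX_LEN, PREFIX_LEN);
		pos += PREFIX_LEN;
		buf[pos++] = '/';
	}
	strcpy(buf + pos, key + (size_t)levels * PREFIX_LEN);
	return 0;
}
```

Calling obj_path("1234567890ABCDEF", 2, buf, sizeof(buf)) fills buf with
"123/456/7890ABCDEF"; passing 1 for levels gives the single-level form.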
The two-level directory scheme looks good.
I will do it unless someone thinks 536,870,912,000 (= 4096*4096*32000)
objects in one table is not enough :)
FWIW, 32000 is only the limit on directories-within-a-directory. You can
easily have millions of regular files in a single ext3 directory. So it is
really 4096*4096*millions.
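
The arithmetic can be checked with a few lines of C. This is only a sketch:
the function name max_objects() and the files-per-leaf figure passed to it
are assumptions for illustration, not measured limits.

```c
#include <stdint.h>

/*
 * Hypothetical helper: upper bound on objects per table when each of
 * `levels` directory levels has 4096 possible names (3 hex characters)
 * and every leaf directory holds `files_per_leaf` regular files.
 * No overflow checking; fine for the magnitudes discussed here.
 */
static uint64_t max_objects(unsigned levels, uint64_t files_per_leaf)
{
	uint64_t n = files_per_leaf;

	while (levels--)
		n *= 4096;
	return n;
}
```

max_objects(2, 32000) reproduces the 536,870,912,000 figure, while
max_objects(1, 1000000) already gives 4,096,000,000 with a single level.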
Oops, how embarrassing... so a 1-level directory scheme with a 3-byte prefix
is nearly unlimited in its maximum object count.
Yes. It mainly becomes a question of balancing lookup costs, at that point:
With a 1-level directory scheme, millions of objects could imply
prohibitively long directory-lookup times as those directories grow
[although super-large directories are better handled in ext3+htree, ext4,
btrfs and XFS].
On the other hand, a 2-level directory scheme would reduce or eliminate
the occurrence of large directories, at the cost of having to perform
many more mkdir(2) calls during object creation. Additional costs
include a larger dcache footprint and added fs_list_objs() complexity.
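
To make the mkdir(2) cost concrete, here is a minimal sketch of creating the
parent directories for an object path. make_parents() is a hypothetical
helper, not chunkd's implementation; it simply issues one mkdir(2) per path
component and treats EEXIST as success.

```c
#include <errno.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Hypothetical helper: create every parent directory of `path`.
 * Returns the number of mkdir(2) calls issued, or -1 on error.
 * A 2-level layout issues one more call per object than a 1-level one.
 */
static int make_parents(const char *path, mode_t mode)
{
	char tmp[4096];
	size_t len = strlen(path);
	int calls = 0;
	char *p;

	if (len == 0 || len >= sizeof(tmp))
		return -1;
	memcpy(tmp, path, len + 1);

	/* start at tmp + 1 so a leading '/' is not treated as a component */
	for (p = tmp + 1; *p; p++) {
		if (*p != '/')
			continue;
		*p = '\0';		/* temporarily cut the path here */
		calls++;
		if (mkdir(tmp, mode) < 0 && errno != EEXIST)
			return -1;
		*p = '/';		/* restore and keep scanning */
	}
	return calls;
}
```

A real server would likely remember which prefix directories already exist
rather than calling mkdir(2) for every object, which is part of the
trade-off described above.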
BTW, chunkd cannot have more than 32000 tables on ext3 for the same reason
(EXT3_MAX_LINK). So, should we use a two- or three-level directory scheme
for table_id in the object pathname?
At this point, I think it is unlikely that people will create more than
32000 tables on a single server. If I am wrong, we can eliminate this
limit at a later date.
Jeff