> If I understand correctly the proposed data-classification
> architecture, each server will have a number of bricks that will be
> dynamically modified as needed: as more data-classifying conditions
> are defined, a new layer of translators will be added (a new DHT or
> AFR, or something else) and some or all existing bricks will be split
> to accommodate the new and, maybe, overlapping condition.

Correct.

> How will space be allocated to each new sub-brick?  Some sort of thin
> provisioning, or will it be distributed evenly on each split?

That's left to the user.  The latest proposal, based on discussion of
the first, is here:

https://docs.google.com/presentation/d/1e8tuh9DKNi9eCMrdt5vetppn1D3BiJSmfR7lDW2wRvA/edit?usp=sharing

That has an example of assigning percentages to the sub-bricks created
by a rule (i.e. a subvolume in a potentially multi-tiered
configuration).  Other possibilities include relative weights used to
determine percentages, or total thin provisioning where sub-bricks
compete freely for available space.  It's certainly a fruitful area
for discussion.

> If using thin provisioning, it will be hard to determine the real
> available space.  If using a fixed amount, we can get to scenarios
> where a file cannot be written even if there seems to be enough free
> space.  This can already happen today when using very big files on
> almost full bricks.  I think brick splitting can accentuate this.

Is this really common outside of test environments, given the sizes of
modern disks and files?  Even in cases where it might happen, doesn't
striping address it?

We have a whole bunch of problems in this area.  If multiple bricks
are on the same local file system, their capacity will be
double-counted.  If a second local file system is mounted over part of
a brick, the additional space won't be counted at all.  We do need a
general solution to this, but I don't think that solution needs to be
part of data classification unless there's a specific real-world
scenario that DC makes worse.

> Also, the addition of multiple layered DHT translators, as it's
> implemented today, could add a lot more latency, especially on
> directory listings.

With http://review.gluster.org/#/c/7702/ this should be less of a
problem.  Also, lookups across multiple tiers are likely to be rare in
most use cases.  For example, for the name-based filtering (sanlock)
case, a given file should only *ever* be in one tier, so only that
tier would need to be searched.  For the activity-based tiering case,
the vast majority of lookups will be for hot files, which are (not
accidentally) in the first tier.  The only real problem is with
*failed* lookups, e.g. during create.  We can address that by adding
"stubs" (similar to linkfiles) in the upper tier, but I'd still want
to wait until it's proven necessary.

What I would truly resist is any solution that involves building tier
awareness directly into (one instance of) DHT.  Besides requiring a
much larger development effort in the present, it would throw away the
benefit of modularity and hamper other efforts in the future.  We need
tiering and brick splitting *now*, especially as a complement to
erasure coding, which many won't be able to use otherwise.  As far as
I can tell, stacking translators is the fastest way to get there.
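To make the lookup-order argument concrete, here's a rough Python
sketch.  This is not GlusterFS code; the tier layout, rule format, and
file placement below are all invented, purely to illustrate why a
name-based rule confines a lookup to a single tier and why only
*failed* lookups have to visit every tier.

# Toy model of tier-ordered lookup.  Each tier has an optional
# name-based rule and a set of files it holds (names are made up).
tiers = [
    {"name": "hot", "rule": None, "files": {"db.bin", "index"}},
    {"name": "sanlock", "rule": lambda f: f.endswith(".lock"),
     "files": {"vm1.lock"}},
    {"name": "cold", "rule": None, "files": {"old.tar"}},
]

def lookup(fname):
    """Return (tier holding fname or None, number of tiers searched)."""
    # Name-based filtering: if a rule claims the file, only that tier
    # can ever hold it, so only that tier is searched.
    for t in tiers:
        if t["rule"] and t["rule"](fname):
            return (t["name"] if fname in t["files"] else None), 1
    # Otherwise search tiers in order; hot files are found on the
    # first probe, and only a failed lookup touches every tier.
    searched = 0
    for t in tiers:
        if t["rule"]:
            continue  # a rule-owned tier can't hold a non-matching name
        searched += 1
        if fname in t["files"]:
            return t["name"], searched
    return None, searched

for f in ("db.bin", "vm1.lock", "old.tar", "no-such-file"):
    print(f, lookup(f))

The cost scales with the number of stacked tiers only for the misses,
which is why stubs in the upper tier (if they ever prove necessary)
would only matter for the create/failed-lookup path.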
> Another problem I see is that splitting bricks will require a
> rebalance, which is a costly operation.  It doesn't seem right to
> require such an expensive operation every time you add a new
> condition to an already-created volume.

Yes, rebalancing is expensive, but that's no different for split
bricks than for whole ones.  Any time you change the definition of
what should go where, you'll have to move some data into compliance,
and that's expensive.  However, such operations are likely to be very
rare.  It's highly likely that most uses of this feature will consist
of a simple two-tier setup defined when the volume is created and
never changed thereafter, so the only rebalancing would be within a
tier - i.e. the exact same thing we do today in homogeneous volumes
(maybe even slightly better).  The only use case I can think of that
would involve *frequent* tier-config changes is multi-tenancy, but
adding a new tenant should only affect new data and not require
migration of old data.
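Going back to the space-allocation question at the top of this mail,
here's a back-of-the-envelope sketch of the relative-weights option.
The brick size, rule names, and weights are made-up numbers, not
anything from the proposal itself; the only point is that weights
normalize to percentages and then to per-sub-brick byte quotas.

# Turn relative weights into sub-brick byte quotas (illustrative only).
def split_brick(brick_bytes, weights):
    total = sum(weights.values())
    return {rule: brick_bytes * w // total for rule, w in weights.items()}

brick = 4 * 1024**4                                   # a 4 TiB brick
weights = {"ssd-tier": 1, "tenant-a": 2, "tenant-b": 5}

for rule, quota in split_brick(brick, weights).items():
    pct = 100.0 * weights[rule] / sum(weights.values())
    print("%-9s %6.2f TiB (%4.1f%%)" % (rule, quota / 1024.0**4, pct))

Total thin provisioning would be the degenerate case where every
sub-brick's quota is simply the whole brick and they compete for the
space at runtime.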