On Thu, 24 Sep 2015, Igor Fedotov wrote:
> On 24.09.2015 18:34, Sage Weil wrote:
> > I was also assuming each stripe unit would be independently compressed,
> > but I didn't think about the efficiency.  This approach implies that
> > you'd want a relatively large stripe size (100s of KB or more).  Hmm, a
> > quick google search suggests the zlib compression window is only 32KB
> > anyway, which isn't so big.  The more aggressive algorithms probably
> > aren't what people would reach for anyway, for CPU utilization
> > reasons... I guess?
> >
> > sage
>
> There is probably no need for strict alignment with the stripe size. We
> can dynamically use the block sizes that clients provide on write. If a
> client writes in stripes, then we compress that block. If another uses
> larger blocks (e.g. a caching agent on flush), we can use that size or
> split the provided block into several smaller chunks (e.g. up to a max of
> N*stripe_size) to reduce overhead on random reads. Even if a client uses
> dynamic block sizes (low-level RADOS use?), we can still rely on them in
> some way without a static binding to the stripe size. Admittedly, this is
> much easier when only appends are permitted; the general "random writes"
> case will be more complex.

Dynamic stripe sizes are possible, but they are a significant change from
the way the EC pool currently works. I would make that a separate project
(as it's useful in its own right) and not complicate the compression
situation. Or, if it simplifies the compression approach, then I'd make
that change first.

sage
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
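[Editorial note: the per-chunk scheme discussed above — compress each client-provided block independently, splitting large blocks into chunks so a random read only decompresses one chunk — can be sketched roughly as below. This is an illustrative sketch only, not Ceph code; the chunk size, helper names, and storage layout are assumptions introduced for the example.]

```python
import zlib

# Hypothetical stripe size for illustration; not a Ceph constant.
CHUNK_SIZE = 128 * 1024

def compress_in_chunks(data: bytes, chunk_size: int = CHUNK_SIZE):
    """Compress each chunk independently, so reading one chunk later
    does not require decompressing the whole object."""
    chunks = []
    for off in range(0, len(data), chunk_size):
        raw = data[off:off + chunk_size]
        comp = zlib.compress(raw)
        # Keep the compressed blob only when it actually saves space;
        # otherwise store the raw chunk and mark it as uncompressed.
        if len(comp) < len(raw):
            chunks.append((off, True, comp))
        else:
            chunks.append((off, False, raw))
    return chunks

def read_chunk(chunks, index):
    """Random read of a single chunk: decompress just that one blob."""
    off, compressed, blob = chunks[index]
    return zlib.decompress(blob) if compressed else blob
```

The per-chunk independence is what bounds the random-read overhead: a read touching one logical chunk costs at most one decompression of chunk_size bytes, at the price of a slightly worse compression ratio than compressing the whole object in one stream.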