On 07.12.2015 20:25, Vasiliy Tolstov wrote:
> On 7 Dec 2015 at 18:13, "Daniel P. Berrange" <berrange@xxxxxxxxxx> wrote:
>>
>> On Mon, Dec 07, 2015 at 04:04:40PM +0100, Michal Privoznik wrote:
>>> On 07.12.2015 14:51, Daniel P. Berrange wrote:
>>>> On Mon, Dec 07, 2015 at 02:46:59PM +0100, Michal Privoznik wrote:
>>>>> Dear list,
>>>>>
>>>>> I'd like to hear your opinion on the following bug:
>>>>>
>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1282859
>>>>>
>>>>> Long story short, imagine the following scenario:
>>>>>
>>>>> 1. Create a 4GB file full of zeroes
>>>>> 2. virsh vol-download it
>>>>>
>>>>> What happens is that all those 4GB are transferred byte after byte
>>>>> through our RPC system. Not only does this put needless pressure on
>>>>> our event loop, it's suboptimal for the network and other resources
>>>>> too.
>>>>>
>>>>> I'd like to explore our options here, keeping in mind that the
>>>>> original volume might have been sparse and we ought to keep it
>>>>> sparse on the destination too.
>>>>>
>>>>> In the bug the reporter (Matthew Booth) suggests introducing a new
>>>>> type of RPC message that will let us keep our APIs unchanged. The
>>>>> source will scan the file for windows of zeroes bigger than some
>>>>> value. When one is found, the new type of message is passed to the
>>>>> client without the need to copy those zeroes. Yes, this is very
>>>>> similar to RLE.
>>>>>
>>>>> If we are going that way, should we enable users to put a
>>>>> compression program in between read()/write() and our RPC? And
>>>>> should we let users choose which compression program we put there?
>>>>> Because there are better compression algorithms than RLE.
>>>>
>>>> It only looks like compression if you're solely looking at the
>>>> network data transfer. A key feature of sparse support is that we
>>>> preserve the sparseness on both sides.
>>>>
>>>> I.e., if I have a sparse raw file locally and vol-upload it, it
>>>> should remain a sparse file on the server. Likewise, vol-downloading
>>>> a sparse file should let me create a sparse file locally. For this
>>>> reason the RPC program must explicitly represent data holes, and not
>>>> merely consider them a type of compression algorithm, as that would
>>>> not let us preserve the holes on both ends of the stream.
>>>
>>> Right. But how could we apply both our RLE algorithm and an external
>>> program on the same stream? Should we multiplex and send holes to the
>>> other side as they are and run the rest through the external
>>> compression program? Otherwise I don't see how we could preserve
>>> sparseness.
>>
>> I think we should just focus on sending holes in the RPC protocol
>> right now, and not try to do compression at the same time, as we need
>> to be able to represent holes in the protocol regardless of whether
>> compression is present.
>>
>
> Some time ago I already asked about this and about adding a compress
> flag to vol-upload and vol-download (I didn't have time to complete
> it). For my use case the best option is being able to send a
> compressed stream to libvirt. That would effectively solve the sparse
> file problem and also let us transfer less data; all my tests with lz4
> compression show at least about a 20% benefit compared to the original
> volume size.

Right. And as Dan pointed out, these two approaches are orthogonal to
each other.
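
For the hole part, the source side doesn't necessarily have to scan the
payload for zero windows at all: on hosts that support it, lseek() with
SEEK_DATA/SEEK_HOLE can ask the kernel directly where the allocated
sections and the holes are. Just to illustrate the idea (this is a
standalone sketch, not a patch against our stream code, and the output
format is made up), something like this walks a file and prints its
data/hole extents, which is exactly the (offset, length) information a
hole-aware RPC message would need to carry:

  #define _GNU_SOURCE
  #include <fcntl.h>
  #include <stdio.h>
  #include <stdlib.h>
  #include <unistd.h>

  int main(int argc, char **argv)
  {
      int fd;
      off_t end, pos, data, hole;

      if (argc != 2) {
          fprintf(stderr, "usage: %s FILE\n", argv[0]);
          return EXIT_FAILURE;
      }

      if ((fd = open(argv[1], O_RDONLY)) < 0) {
          perror("open");
          return EXIT_FAILURE;
      }

      end = lseek(fd, 0, SEEK_END);
      pos = 0;

      while (pos < end) {
          /* Find the start of the next data section; ENXIO means only
           * a hole is left between pos and EOF. */
          data = lseek(fd, pos, SEEK_DATA);
          if (data < 0)
              data = end;
          if (data > pos)
              printf("hole: offset=%lld length=%lld\n",
                     (long long) pos, (long long) (data - pos));
          if (data >= end)
              break;

          /* Every file has an implicit hole at EOF, so this always
           * finds the end of the current data section. */
          hole = lseek(fd, data, SEEK_HOLE);
          printf("data: offset=%lld length=%lld\n",
                 (long long) data, (long long) (hole - data));
          pos = hole;
      }

      close(fd);
      return EXIT_SUCCESS;
  }

On filesystems without SEEK_HOLE support the kernel simply reports the
whole file as one data section, so a zero-window scan would still be a
useful fallback there; but where it works, it reports only real holes,
which also sidesteps the detection question below.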
Compressing a stream of data to reduce its size is a nice feature to
have; preserving the sparseness of a file is something different
(although the way I'm intending to implement it will reduce the amount
of data sent through virStream too).

One thing that I am still wondering about is sparseness detection.
Finding a window full of zeroes in a file does not necessarily mean
that those zeroes come from a read() over a segment that's not
allocated on disk. We can certainly have a raw file that is sparse and
also contains an allocated window full of zeroes. But I guess it's okay
if we sparsify (if that's even a verb) the file even more on
volDownload or volUpload.

Michal

--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list