On Thu, May 12, 2022 at 05:58:46PM +0100, Dr. David Alan Gilbert wrote:
> * Daniel P. Berrangé (berrange@xxxxxxxxxx) wrote:
> > On Wed, May 11, 2022 at 07:31:45PM +0200, Claudio Fontana wrote:
> > > That's great, I love when things are simple.
> > >
> > > If indeed we want to remove the copy in libvirt (which will also
> > > mean explicitly fsyncing elsewhere, as the iohelper would not be
> > > there anymore to do that for us on image creation),
> > > with QEMU having a "file" protocol support for migration,
> > >
> > > do we plan to have libvirt and QEMU both open the file for
> > > writing concurrently, with QEMU opening O_DIRECT?
> >
> > For non-libvirt users, I expect QEMU would open the
> > file directly. For libvirt usage, it is likely
> > preferable to pass the pre-opened FD, because that
> > simplifies file permission handling.
> >
> > > The alternative being having libvirt open the file with
> > > O_DIRECT, write some libvirt stuff in a new, O_DIRECT-friendly
> > > format, and then pass the fd to qemu to migrate to,
> > > and QEMU sending its new O_DIRECT friendly stream there.
> >
> > Yep.
> >
> > > In any case, the expectation here is to have a new
> > > "file://pathname" or "file:://fdname" as an added feature in QEMU,
> > > where QEMU would write a new O_DIRECT friendly stream
> > > directly into the file, taking care of both optional
> > > parallelization and compression.
> >
> > I could see several distinct building blocks
> >
> >  * First a "file:/some/path" migration protocol
> >    that can just do "normal" I/O, but still writing
> >    in the traditional migration data stream
> >
> >  * Modify existing 'fd:' protocol so that it fstat()s
> >    and passes over to the 'file' protocol handler if
> >    it sees the FD is not a socket/pipe
>
> We used to have that at one point.
>
> >  * Add a migration capability "direct-mapped" to
> >    indicate we want the RAM data written/read directly
> >    to/from fixed positions in the file, as opposed to
> >    a stream. Obviously only valid with a subset
> >    of migration protocols (file, and fd: if a seekable
> >    FD).
>
> This worries me about how you're going to cleanly glue this into the
> migration code; it sounds like what you want it to do is very different
> to what it currently does.

I've only investigated it lightly, but I see the key bit of code
is this method, which emits the header + RAM page content:

  static int save_normal_page(RAMState *rs, RAMBlock *block,
                              ram_addr_t offset, uint8_t *buf, bool async)
  {
      ram_transferred_add(save_page_header(rs, rs->f, block,
                                           offset | RAM_SAVE_FLAG_PAGE));
      if (async) {
          qemu_put_buffer_async(rs->f, buf, TARGET_PAGE_SIZE,
                                migrate_release_ram() &&
                                migration_in_postcopy());
      } else {
          qemu_put_buffer(rs->f, buf, TARGET_PAGE_SIZE);
      }
      ram_transferred_add(TARGET_PAGE_SIZE);
      ram_counters.normal++;
      return 1;
  }

My (perhaps wishful) thinking was that we just have an alternative
impl of this which doesn't save the page header, and puts the
page content at a fixed offset. I'm fuzzy on how we figure out
the right offset - I was hoping that "RAMState" or "RAMBlock"
somehow gives us enough info to figure out a deterministic mapping
to a file location.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
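
To make the "fixed offset" idea a bit more concrete, here is a rough standalone sketch of one possible deterministic mapping. It is not QEMU code, and everything in it is an assumption for illustration: `SketchBlock`, its `file_base` field, `SKETCH_PAGE_SIZE` and the `sketch_*` functions are hypothetical stand-ins, not existing QEMU types or APIs. The idea is that each RAM block gets a page-aligned region of the file, laid out back to back after a header area, so a page's file position is just the block's base plus the page's offset within the block. The page is then written with plain POSIX `pwrite()` and no per-page header.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* for pwrite() under strict -std= modes */
#endif
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for TARGET_PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096u

/* Hypothetical stand-in for the bits of a RAM block we would need. */
typedef struct SketchBlock {
    const char *idstr;      /* block name, informational only */
    uint64_t used_length;   /* bytes of guest RAM in this block */
    off_t file_base;        /* file offset where this block's pages start */
} SketchBlock;

/* Round a byte count up to the next page boundary. */
static uint64_t sketch_align_up(uint64_t v)
{
    return (v + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE * SKETCH_PAGE_SIZE;
}

/*
 * Give every block a page-aligned region, placed back to back after a
 * header area of header_size bytes. Returns the total file size needed.
 */
static off_t sketch_layout_blocks(SketchBlock *blocks, size_t n,
                                  uint64_t header_size)
{
    uint64_t pos = sketch_align_up(header_size);

    for (size_t i = 0; i < n; i++) {
        blocks[i].file_base = (off_t)pos;
        pos += sketch_align_up(blocks[i].used_length);
    }
    return (off_t)pos;
}

/* Deterministic mapping: (block, offset within block) -> file offset. */
static off_t sketch_page_file_offset(const SketchBlock *block, uint64_t offset)
{
    assert(offset % SKETCH_PAGE_SIZE == 0);
    assert(offset < block->used_length);
    return block->file_base + (off_t)offset;
}

/*
 * Header-less counterpart of a "save page" step: write one page at its
 * fixed position. Returns 1 on success (one page written), -1 on error.
 */
static int sketch_save_page_direct(int fd, const SketchBlock *block,
                                   uint64_t offset, const uint8_t *buf)
{
    off_t pos = sketch_page_file_offset(block, offset);
    size_t done = 0;

    while (done < SKETCH_PAGE_SIZE) {
        ssize_t n = pwrite(fd, buf + done, SKETCH_PAGE_SIZE - done,
                           pos + (off_t)done);
        if (n < 0 && errno == EINTR) {
            continue;       /* interrupted before writing, retry */
        }
        if (n <= 0) {
            return -1;      /* real error, or no progress */
        }
        done += (size_t)n;
    }
    return 1;
}
```

Because every position is page-aligned, a layout like this would at least not conflict with O_DIRECT alignment requirements, and rewriting a dirtied page simply overwrites the same location instead of appending to a stream. Whether the real RAMBlock data gives a stable enough ordering to compute such bases is exactly the open question above.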