On Mon, Oct 16, 2017 at 7:47 AM, Stefan Hajnoczi <stefanha@xxxxxxxxx> wrote:
> On Fri, Oct 13, 2017 at 06:48:15AM -0400, Pankaj Gupta wrote:
>> > On Thu, Oct 12, 2017 at 09:20:26PM +0530, Pankaj Gupta wrote:
>> > > +static blk_qc_t virtio_pmem_make_request(struct request_queue *q,
>> > > +                                         struct bio *bio)
>> > > +{
>> > > +        blk_status_t rc = 0;
>> > > +        struct bio_vec bvec;
>> > > +        struct bvec_iter iter;
>> > > +        struct virtio_pmem *pmem = q->queuedata;
>> > > +
>> > > +        if (bio->bi_opf & REQ_FLUSH)
>> > > +                //todo host flush command
>> >
>> > This detail is critical to the device design. What is the plan?
>>
>> yes, this is good point.
>>
>> was thinking of guest sending a flush command to Qemu which
>> will do a fsync on file fd.
>
> Previously there was discussion about fsyncing a specific file range
> instead of the whole file. This could perform better in cases where
> only a subset of dirty pages need to be flushed.
>
> One possibility is to design the virtio interface to communicate ranges
> but the emulation code simply fsyncs the fd for the time being. Later
> on, if the necessary kernel and userspace interfaces are added, we can
> make use of the interface.

Range based is not a natural storage cache management mechanism. All
that is typically available is a full write-cache-flush mechanism,
and upper layers would need to be customized for range-based flushing.

>> If we do a async flush and move the task to wait queue till we receive
>> flush complete reply from host we can allow other tasks to execute
>> in current cpu.
>>
>> Any suggestions you have or anything I am not foreseeing here?
>
> My main thought about this patch series is whether pmem should be a
> virtio-blk feature bit instead of a whole new device. There is quite a
> bit of overlap between the two.

I'd be open to that... there are already provisions in the pmem driver
for platforms where cpu caches are flushed on power-loss, so a virtio
mode for this shared-memory case seems reasonable.
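
For illustration only, here is a rough sketch of the option Stefan
describes above: the request carries a range, but the host side ignores
it and fsyncs the whole fd for the time being. Everything here is
hypothetical (the struct layout, the vpmem_* names, the dev_size
parameter); none of it comes from the posted patch or from any virtio
specification.

```c
#include <errno.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical request layout; not from the patch or any virtio spec. */
struct vpmem_flush_req {
    uint64_t offset;  /* start of the range to flush, in bytes */
    uint64_t len;     /* length of the range, in bytes */
};

/* Return 1 if the range lies within the backing file, 0 otherwise. */
static int vpmem_flush_req_valid(const struct vpmem_flush_req *req,
                                 uint64_t dev_size)
{
    if (req->offset > dev_size)
        return 0;
    /* Written as a subtraction so that offset + len cannot overflow. */
    return req->len <= dev_size - req->offset;
}

/*
 * Host-side handler sketch: validate the range, then flush the whole
 * file. The range is accepted but otherwise unused for now.
 * Returns 0 on success or a negative errno value.
 */
static int vpmem_host_flush(int fd, const struct vpmem_flush_req *req,
                            uint64_t dev_size)
{
    if (!vpmem_flush_req_valid(req, dev_size))
        return -EINVAL;
    if (fsync(fd) < 0)
        return -errno;
    return 0;
}
```

Keeping the range in the request costs little, and the fsync() call is
the only place that would need to change if a range-based host interface
became usable later. Until then the behaviour is the full flush
discussed above.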