Re: raid5-cache I/O path improvements

Hello,

On Tue, Sep 08, 2015 at 10:07:36AM -0700, Shaohua Li wrote:
> I need to double-check. But for write + flush, we aggregate several
> writes and do a flush; for FUA, we do every meta write with FUA. So this
> is not an apples-to-apples comparison.

Does that mean that upper layers are taking different actions
depending on whether the underlying device supports FUA?  That at
least wasn't the original model that I had in mind when implementing
the current incarnation of REQ_FLUSH and FUA.  The only difference FUA
was expected to make was optimizing out the flush after REQ_FUA and
the upper layers were expected to issue REQ_FLUSH/FUA the same whether
the device supports FUA or not.
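
To make that model concrete, here is a minimal userspace sketch of the
decomposition I'm describing.  It is not the kernel code: the names
(RQ_FLUSH, RQ_FUA, struct dev_caps, expand_write) are made up for
illustration, and the device model is an assumption.  The point is only
that the caller always passes the same flags and the lower layer decides
which commands the device actually sees.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag names for this sketch; they mirror, but are not, the kernel's. */
enum { RQ_FLUSH = 1u << 0, RQ_FUA = 1u << 1 };

/* Assumed model of what the device's cache can do. */
struct dev_caps {
    bool volatile_cache; /* write-back cache that must be flushed for durability */
    bool fua;            /* device honours FUA natively */
};

enum step { STEP_PREFLUSH, STEP_WRITE, STEP_WRITE_FUA, STEP_POSTFLUSH };

/*
 * Expand one write carrying rq_flags into the commands sent to the device.
 * Writes at most 3 entries to out and returns how many were written.
 */
size_t expand_write(struct dev_caps caps, unsigned rq_flags, enum step out[3])
{
    size_t n = 0;
    bool want_fua = (rq_flags & RQ_FUA) != 0;

    if (!caps.volatile_cache) {
        /* Write-through: a completed write is already durable. */
        out[n++] = STEP_WRITE;
        return n;
    }

    if (rq_flags & RQ_FLUSH)
        out[n++] = STEP_PREFLUSH;

    if (want_fua && caps.fua) {
        /* Native FUA: the trailing flush is optimized out. */
        out[n++] = STEP_WRITE_FUA;
    } else {
        out[n++] = STEP_WRITE;
        if (want_fua)
            out[n++] = STEP_POSTFLUSH; /* emulate FUA with a flush */
    }
    return n;
}
```

With RQ_FLUSH | RQ_FUA set, a write-back device with FUA gets a preflush
plus a FUA write, one without FUA gets preflush, write, postflush, and a
write-through device gets a plain write.  The caller's request is
identical in all three cases.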

Hmmm... grep tells me that dm and md actually are branching on whether
the underlying device supports FUA.  This is tricky.  I didn't even
mean flush_flags to be used directly by upper layers.  For rotational
devices, doing multiple FUAs compared to multiple writes followed by
REQ_FLUSH is probably a lot worse - the head gets moved multiple times,
likely skipping over data which could be written out while traversing,
and it's not like stalling the write pipeline and draining the write
queue has much impact on hard drives.

Maybe it's different on SSDs.  I'm not sure how expensive the flush
itself would be given that a lot of the write cost is paid asynchronously
anyway during gc, but a flush stalls the IO pipeline and that could be
very noticeable on high iops devices.  Also, unless the implementation is
braindead, FUA IOs are unlikely to be expensive on SSDs, so maybe what
we should do is make the block layer hint upper layers regarding what's
likely to perform better.
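
As a rough sketch of what such a hint could look like from the upper
layer's side, assuming the guesses above hold: everything here is
hypothetical (struct queue_hint, pick_commit_strategy, commands_issued
do not exist anywhere), and the policy just encodes the "probably" and
"maybe" from the two paragraphs above rather than anything measured.

```c
#include <stdbool.h>

/* Hypothetical per-queue hint; none of these names exist in the kernel. */
struct queue_hint {
    bool volatile_cache; /* device has a write-back cache */
    bool fua;            /* device honours FUA natively */
    bool rotational;     /* spinning disk rather than SSD */
};

enum commit_strategy {
    COMMIT_PLAIN_WRITES,     /* write-through cache: nothing extra needed */
    COMMIT_BATCH_THEN_FLUSH, /* aggregate writes, then one flush */
    COMMIT_PER_WRITE_FUA     /* issue each metadata write with FUA */
};

/* Pick how an upper layer might commit a batch of metadata writes. */
enum commit_strategy pick_commit_strategy(struct queue_hint h)
{
    if (!h.volatile_cache)
        return COMMIT_PLAIN_WRITES;
    /* Assumption: per-write FUA costs extra head movement on hard drives. */
    if (h.rotational || !h.fua)
        return COMMIT_BATCH_THEN_FLUSH;
    /* Assumption: on SSDs FUA is cheap and a flush stalls the pipeline. */
    return COMMIT_PER_WRITE_FUA;
}

/* Device commands needed to commit n_writes metadata writes. */
unsigned commands_issued(enum commit_strategy s, unsigned n_writes)
{
    return s == COMMIT_BATCH_THEN_FLUSH ? n_writes + 1 : n_writes;
}
```

The command count is the same or nearly the same either way; the
difference is whether durability is forced once per batch or once per
write, which is why the right choice likely depends on the device.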

But, ultimately, I don't think it'd be too difficult for high-depth
SSD devices to report write-through w/o losing any performance one way
or the other and it'd be great if we eventually can get there.

Thanks.

-- 
tejun