Re: [RESEND PATCH 0/5] Setting write hint in MD RAID

On Tue, Feb 20, 2018 at 02:59:55PM +0100, Mariusz Dabrowski wrote:
> On 02/18/2018 06:59 PM, Shaohua Li wrote:
> > On Wed, Feb 14, 2018 at 02:23:29PM +0100, Mariusz Dabrowski wrote:
> > > This patchset adds support for write hints in the MD driver. This is a new
> > > feature for NVMe drives compliant with the 1.3 specification, and it was
> > > introduced to Linux in kernel 4.13. The write hint has to be copied from the
> > > bio containing user data to the bios sent to RAID members. Additionally, a
> > > write hint can be set for internal data like parity and PPL in RAID 5.
> > > 
> > > Setting the write hint for parity is done with a simple classification
> > > algorithm which works for sequential IO workloads. It tries to predict which
> > > parity requests are going to be overwritten in a moment and sets a write hint
> > > for them. This algorithm uses the stripe cache to count updates of each parity
> > > chunk. A parity request will be predicted as "soon-overwritten" if the number
> > > of parity updates is smaller than the number of data chunks in the stripe.
> > > 
> > > For PPL there is no special algorithm. It is updated very frequently, so we
> > > can set a write hint for each PPL write.
> > > 
> > > We have performed internal tests which prove that setting a write hint
> > > for parity and PPL can significantly reduce write amplification.
> > 
> > I can apply the first 2 patches first.
> > 
> > For the other patches, I'm not confident. A write hint just means a write
> > stream, or a stream ID. Userspace doesn't need to assign short-lived data to
> > RWH_WRITE_LIFE_SHORT. It could assign long-lived data to RWH_WRITE_LIFE_SHORT
> > but short-lived data to RWH_WRITE_LIFE_LONG. Nothing prevents userspace from
> > doing this. A fixed policy like the one in these patches isn't flexible and is
> > sometimes harmful for performance, depending on the specific application.
> > 
> > Thanks,
> > Shaohua
> > 
> 
> I agree that this fixed policy is not the best we can do. I can change it
> and allow setting which hint will be used for parity/PPL. I am thinking of 2
> approaches:
> 1) setting hint ID at the same time as policy, for example:
> 	echo parity=2 > /sys/block/md126/md/write_hint_policy
> 2) new sysfs attributes for setting hint ID:
> 	echo parity > /sys/block/md126/md/write_hint_policy
> 	echo ppl > /sys/block/md126/md/write_hint_policy
> 	echo 1 > /sys/block/md126/md/parity_write_hint
> 	echo 2 > /sys/block/md126/md/ppl_write_hint
> 
> What are your thoughts about this? Is either of these proposals acceptable
> to you? Maybe you have a better idea of how to make this more flexible?
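
For reference, the parity classification rule quoted above (a parity write is
"soon-overwritten" while the number of parity updates is smaller than the
number of data chunks in the stripe) can be modelled in a few lines. This is
only a userspace toy in Python, not the code from the patches: the class and
method names are made up for illustration, the counter is assumed to include
the update currently being classified, and stripe cache eviction is not
modelled at all.

```python
from collections import defaultdict


class ParityHintModel:
    """Toy model of the quoted rule; hypothetical names, not md code."""

    def __init__(self, data_chunks):
        # Number of data chunks per stripe (e.g. 4 for a 5-disk RAID 5).
        self.data_chunks = data_chunks
        # Parity chunk identifier -> number of updates seen so far.
        self.updates = defaultdict(int)

    def classify_parity_write(self, parity_chunk):
        """Count this update; return True if predicted soon-overwritten."""
        # Assumption: the count includes the update being classified.
        self.updates[parity_chunk] += 1
        return self.updates[parity_chunk] < self.data_chunks


# Sequential workload: one parity update per data chunk written to stripe 0.
model = ParityHintModel(data_chunks=4)
predictions = [model.classify_parity_write(0) for _ in range(4)]
print(predictions)  # [True, True, True, False]
```

Under these assumptions, only the last parity update of a sequentially
written stripe is treated as long-lived; the earlier ones would carry the
write hint.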

Frankly, I have no idea how this should be done. Adding an interface before we
think through the problem is blind too; we can't be confident that the new
interface (part of the ABI) will not have to change in the future. And the
interface is system-wide: how does it work if we have two workloads running with
different write hint policies?
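
To make the two-workload concern concrete, here is a rough Python sketch of
how applications pick write hints per file with fcntl(2). The command and hint
values are copied from the Linux UAPI header <linux/fcntl.h>; the helper
names, the temporary files and the "journal"/"archive" roles are hypothetical
and only stand in for two workloads. Whether the call succeeds depends on the
running kernel, so the helper reports failure instead of raising.

```python
import fcntl
import struct
import tempfile

# Values from <linux/fcntl.h> (available since kernel 4.13).
F_LINUX_SPECIFIC_BASE = 1024
F_GET_RW_HINT = F_LINUX_SPECIFIC_BASE + 11
F_SET_RW_HINT = F_LINUX_SPECIFIC_BASE + 12

RWH_WRITE_LIFE_NOT_SET = 0
RWH_WRITE_LIFE_NONE = 1
RWH_WRITE_LIFE_SHORT = 2
RWH_WRITE_LIFE_MEDIUM = 3
RWH_WRITE_LIFE_LONG = 4
RWH_WRITE_LIFE_EXTREME = 5


def set_write_hint(fd, hint):
    """Try to set the write hint on fd's inode; return True on success.

    The kernel expects a pointer to a 64-bit integer, so the value is
    passed as an 8-byte buffer. Returns False if the request is rejected.
    """
    try:
        fcntl.fcntl(fd, F_SET_RW_HINT, struct.pack("=Q", hint))
        return True
    except OSError:
        return False


def demo():
    # Two hypothetical workloads on the same array give the same hint value
    # opposite meanings; nothing in the kernel checks either choice.
    with tempfile.TemporaryFile() as journal, tempfile.TemporaryFile() as archive:
        a = set_write_hint(journal.fileno(), RWH_WRITE_LIFE_SHORT)  # short-lived data
        b = set_write_hint(archive.fileno(), RWH_WRITE_LIFE_SHORT)  # long-lived data
        return a, b


if __name__ == "__main__":
    print(demo())
```

Since each application chooses its own mapping, a single array-wide setting
for parity or PPL cannot know which of the two meanings it is sharing a
stream with.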

Thanks,
Shaohua


