On Wed, Feb 10, 2016 at 6:53 PM, Song Liu <songliubraving@xxxxxx> wrote:
> Summary:
>
> Resending the patch to see whether we can get another chance...
>
> When testing current SATA SSDs as the journal device, we have
> seen 2 challenges: throughput of long sequential writes, and
> SSD lifetime.
>
> To ease the wear on the SSD, we tested bypassing the journal for
> full-stripe writes. We understand that bypassing the journal will
> re-introduce the write hole at the md layer. However, with a
> well-designed application and file system, such write holes
> should not result in any data loss.

To me, the probability of data loss during a full-stripe write is
higher than during a partial-stripe write. I understand your
motivation for doing this; however, as Neil mentioned, this
trade-off and your assumption about a "well-designed application
and file system" put a question mark over the general usage of
the MD journal.

> Our test systems have 2 RAID-6 arrays per server and 15 HDDs
> per array. The 2 arrays share 1 SSD as the journal (2
> partitions). Btrfs is created on both arrays.
>
> For sequential write benchmarks, we observe a significant
> performance gain (250MB/s per volume vs. 150MB/s) from
> bypassing the journal for full stripes.
>
> We also performed power-cycle tests on these systems while
> running a write workload. Over more than 50 power cycles,
> we have seen zero data loss.

Is it possible to share more details about your power-cycle test
procedure and your data-loss detection method?

> To configure the bypass feature:
>
> echo 1 > /sys/block/mdX/md/r5l_bypass_full_stripe
>
> and
>
> echo 0 > /sys/block/mdX/md/r5l_bypass_full_stripe
>
> For file system integrity, the code does not bypass any write
> with REQ_FUA.
>
> Signed-off-by: Song Liu <songliubraving@xxxxxx>
> Signed-off-by: Shaohua Li <shli@xxxxxx>
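
For clarity, here is how I read the bypass decision from the cover
letter. This is only a sketch, not the actual patch: the function
name and the conf->bypass_full_stripe field are placeholders I
derived from the sysfs knob name, and sh->overwrite_disks may or
may not be how the patch really detects a full-stripe write.

/*
 * Sketch only -- something like this would presumably live in
 * drivers/md/raid5-cache.c.
 */
static bool r5l_should_bypass_journal(struct r5conf *conf,
				      struct stripe_head *sh)
{
	int i;

	/* knob exposed as /sys/block/mdX/md/r5l_bypass_full_stripe */
	if (!conf->bypass_full_stripe)
		return false;

	/* never bypass a write that carries REQ_FUA */
	for (i = 0; i < conf->raid_disks; i++)
		if (test_bit(R5_WantFUA, &sh->dev[i].flags))
			return false;

	/* only full-stripe writes (every data disk overwritten) qualify */
	return sh->overwrite_disks ==
	       conf->raid_disks - conf->max_degraded;
}

If the real check differs from this, it would help to spell that
out in the changelog.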