On Fri, May 18, 2018 at 3:00 PM, Mikulas Patocka <mpatocka@xxxxxxxxxx> wrote:
>
>
> On Fri, 18 May 2018, Dan Williams wrote:
>
>> >> ...and I wonder what the benefit is of the 16-byte case? I would
>> >> assume the bulk of the benefit is limited to the 4 and 8 byte copy
>> >> cases.
>> >
>> > dm-writecache uses 16-byte writes frequently, so it is needed for that.
>> >
>> > If we split 16-byte write to two 8-byte writes, it would degrade
>> > performance for architectures where memcpy_flushcache needs to flush the
>> > cache.
>>
>> My question was how measurable it is to special case 16-byte
>> transfers? I know Ingo is going to ask this question, so it would
>> speed things along if this patch included performance benefit numbers
>> for each special case in the changelog.
>
> I tested it some times ago - and the movnti instruction has 2% better
> throughput than the existing memcpy_flushcache function.
>
> It is doing one 16-byte write for every sector written and one 8-byte
> write for every sector clean-up. So, the overhead is measurable.

Awesome, include those measured numbers in the changelog for the next
spin of the patch.

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel
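
For readers following the thread, here is a rough userspace sketch of the
kind of size special-casing being discussed: 4-, 8- and 16-byte copies go
through non-temporal (movnti) stores, and everything else falls back to a
generic routine. This is not the patch under review. The names
copy_flushcache_sketch and generic_copy_flushcache are hypothetical, the
fallback is a plain memcpy stand-in, the destination is assumed to be
naturally aligned, and the trailing sfence is a choice made for this sketch
(real code might leave ordering to the caller).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__x86_64__)
#include <emmintrin.h>  /* _mm_stream_si32, _mm_stream_si64, _mm_sfence */
#endif

/* Hypothetical stand-in for the generic, out-of-line copy routine. */
static void generic_copy_flushcache(void *dst, const void *src, size_t cnt)
{
    memcpy(dst, src, cnt);
}

/*
 * Hypothetical sketch, not the kernel's memcpy_flushcache.
 * Assumes dst is naturally aligned for the 4/8/16-byte cases.
 */
static inline void copy_flushcache_sketch(void *dst, const void *src, size_t cnt)
{
#if defined(__x86_64__)
    switch (cnt) {
    case 4: {
        int v;
        memcpy(&v, src, sizeof v);            /* load without aliasing issues */
        _mm_stream_si32((int *)dst, v);       /* one 32-bit movnti */
        _mm_sfence();
        return;
    }
    case 8: {
        long long v;
        memcpy(&v, src, sizeof v);
        _mm_stream_si64((long long *)dst, v); /* one 64-bit movnti */
        _mm_sfence();
        return;
    }
    case 16: {
        long long v[2];
        memcpy(v, src, sizeof v);
        /* two 64-bit movnti stores, no intermediate cache flush */
        _mm_stream_si64((long long *)dst, v[0]);
        _mm_stream_si64((long long *)dst + 1, v[1]);
        _mm_sfence();
        return;
    }
    default:
        break;
    }
#endif
    /* Other sizes, or non-x86-64 builds, take the generic path. */
    generic_copy_flushcache(dst, src, cnt);
}
```

A caller would use it like memcpy, e.g. copy_flushcache_sketch(dst, src, 16)
for a 16-byte metadata entry. Whether each special case pays off is exactly
the question raised above, so any such variant should be benchmarked against
the generic path before the numbers go into a changelog.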