On 12.4.2013 2:22, Tejun Heo wrote:
> On Thu, Apr 11, 2013 at 08:06:10PM -0400, Mikulas Patocka wrote:
>> All that I can tell you is that adding an empty atomic operation
>> "cmpxchg(&bio->bi_css->refcnt, bio->bi_css->refcnt, bio->bi_css->refcnt);"
>> to bio_clone_context and bio_disassociate_task increases the time to run a
>> benchmark from 23 to 40 seconds.
>
> Right, linear target on ramdisk, very realistic, and you know what,
> hell with dm, let's just hand code everything into submit_bio(). I'm
> sure it will speed up your test case significantly.
>
> If this actually matters, improve it in *sane* way. Make the refcnts
> per-cpu and not use atomic ops. In fact, we already have proposed
> implementation of percpu refcnt which is being used by aio restructure
> patches and likely to be included in some form. It's not quite ready
> yet, so please work on something useful like that instead of
> continuing this non-sense.

Hey, what's going on here?

It seems the dmcrypt problem has turned into a block-level refcount flame :)

Mikulas, please, can you talk to Tejun and find a better way to solve
the DM and block-level bio context problem here? (Ideally on some
realistic scenario; you have enough hardware at Red Hat to try, and some
RAID0 SSDs with linear on top should be a good example.) And later, once
agreed, implement it in dmcrypt?

I definitely do not want dmcrypt to become a guinea pig here. It should
remain as simple as possible and should do transparent _encryption_, not
any inline device-mapper super-optimizing games.

Thanks,
Milan

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel