On 2012-08-09 09:32 NeilBrown <neilb@xxxxxxx> Wrote:
>On Thu, 9 Aug 2012 09:20:05 +0800 "Jianpeng Ma" <majianpeng@xxxxxxxxx> wrote:
>
>> On 2012-08-08 20:53 Shaohua Li <shli@xxxxxxxxxx> Wrote:
>> >2012/8/8 Jianpeng Ma <majianpeng@xxxxxxxxx>:
>> >> On 2012-08-08 10:58 Shaohua Li <shli@xxxxxxxxxx> Wrote:
>> >>>2012/8/7 Jianpeng Ma <majianpeng@xxxxxxxxx>:
>> >>>> On 2012-08-07 13:32 Shaohua Li <shli@xxxxxxxxxx> Wrote:
>> >>>>>2012/8/7 Jianpeng Ma <majianpeng@xxxxxxxxx>:
>> >>>>>> On 2012-08-07 11:22 Shaohua Li <shli@xxxxxxxxxx> Wrote:
>> >>>>>>>My directIO randomwrite 4k workload shows a 10~20% regression caused by commit
>> >>>>>>>895e3c5c58a80bb. directIO usually is random IO and if request size isn't big
>> >>>>>>>(which is the common case), delay handling of the stripe hasn't any advantages.
>> >>>>>>>For big size request, delay can still reduce IO.
>> >>>>>>>
>> >>>>>>>Signed-off-by: Shaohua Li <shli@xxxxxxxxxxxx>
>> >>>> [snip]
>> >>>>>>>--
>> >>>>>> May be used size to judge is not a good method.
>> >>>>>> I firstly sent this patch, only want to control direct-write-block, not for regular file.
>> >>>>>> Because i think if someone used direct-write-block for raid5, he should know the feature of raid5 and he can control
>> >>>>>> for write to full-write.
>> >>>>>> But at that time, i did not know how to differentiate between regular file and block-device.
>> >>>>>> I think we should do something to do this.
>> >>>>>
>> >>>>>I don't think it's possible user can control his write to be a
>> >>>>>full-write even for raw disk IO. Why regular file and block
>> >>>>>device io matters here?
>> >>>>>
>> >>>>>Thanks,
>> >>>>>Shaohua
>> >>>> Another problem is the size. How to judge the size is large or not?
>> >>>> A syscall write is a dio and a dio may be split more bios.
>> >>>> For my workload, i usually write chunk-size.
>> >>>> But your patch is judge by bio-size.
>> >>>
>> >>>I'd ignore workload which does sequential directIO, though
>> >>>your workload is, but I bet no real workloads are. So I'd like
>> >> Sorry, my explain maybe not correct. I write data once which size is almost chunks-size * devices, in order to full-write
>> >> and as possible as to no pre-read operation.
>> >>>only to consider big size random directio. I agree the size
>> >>>judge is arbitrary. I can optimize it to be only consider stripe
>> >>>which hits two or more disks in one bio, but not sure if it's
>> >>>worthy doing. Not aware big size directio is common, and even
>> >>>is, big size request IOPS is low, a bit delay maybe not a big
>> >>>deal.
>> >> If add a acc_time for 'stripe_head' to control?
>> >> When get_active_stripe() is ok, update acc_time.
>> >> For some time, stripe_head did not access and it should pre-read.
>> >
>> >Do you want to add a timer for each stripe? This is even ugly.
>> >How do you choose the expire time? A time works for harddisk
>> >definitely will not work for a fast SSD.
>> A time is like the size which is arbitrary.
>> How about add a interface in sysfs to control by user?
>> Only user can judge the workload, which sequential write or random write.
>
>This is getting worse by the minute. A sysfs interface for this is
>definitely not a good idea.
>
>The REQ_NOIDLE flag is a pretty clear statement that no more requests that
>merge with this one are expected. If some use cases sends random requests,
>maybe it should be setting REQ_NOIDLE.
>
>Maybe someone should do some research and find out why WRITE_ODIRECT doesn't
>include REQ_NOIDLE. Understanding that would help understand the current
>problem.
>
>NeilBrown
>
Hi Neil:
	Thanks for your suggestion.
A direct write can set REQ_NOIDLE, because the next write can only start after this write operation has finished.
But a direct write (struct dio) can be broken up into several bios (struct bio).
Those bios are related to each other, so they should not set REQ_NOIDLE, except for the last bio.
I think this may improve performance, because a random direct write is at most only one bio?
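
To make the idea concrete, here is a rough userspace sketch, not kernel code and not a patch. It only models the rule "split one direct write into bios and flag just the last one". The names model_bio, model_split_dio and MODEL_NOIDLE are made up for illustration; MODEL_NOIDLE merely stands in for REQ_NOIDLE and does not use the kernel's value.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bit for this model only; it stands in for REQ_NOIDLE. */
#define MODEL_NOIDLE 0x1u

/* Hypothetical, minimal stand-in for one bio of a split direct write. */
struct model_bio {
    size_t bytes;
    unsigned int flags;
};

/*
 * Split a direct write of total_bytes into bios of at most max_bio_bytes.
 * Only the final bio gets MODEL_NOIDLE, since more bios of the same write
 * are still expected after every earlier one.
 * Returns the number of bios stored in out (never more than out_cap).
 */
static size_t model_split_dio(size_t total_bytes, size_t max_bio_bytes,
                              struct model_bio *out, size_t out_cap)
{
    size_t n = 0;

    assert(max_bio_bytes > 0);

    while (total_bytes > 0 && n < out_cap) {
        size_t len = total_bytes < max_bio_bytes ? total_bytes : max_bio_bytes;

        out[n].bytes = len;
        out[n].flags = 0;
        total_bytes -= len;
        n++;
    }

    /* Flag the last bio only if the whole write fit into out. */
    if (n > 0 && total_bytes == 0)
        out[n - 1].flags |= MODEL_NOIDLE;

    return n;
}
```

In this model a 4k random write with a 64k bio limit becomes a single flagged bio, while a write of three full bios leaves the first two unflagged and flags only the third.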