2018-04-22 0:23 GMT+08:00 Jens Axboe <axboe@xxxxxxxxx>:
> On 4/21/18 8:07 AM, Zhengyuan Liu wrote:
>> 2018-04-20 22:34 GMT+08:00 Jens Axboe <axboe@xxxxxxxxx>:
>>> On 4/19/18 9:51 PM, Zhengyuan Liu wrote:
>>>> Hi, Shaohua
>>>>
>>>> I found it indeed doesn't do front merge when two threads flush plug list concurrently. To
>>>> reappear , I prepared two IO threads , named a0.io and a1.io .
>>>> Thread a1.io uses libaio to write 5 requests :
>>>>     sectors: 16 + 8, 40 + 8, 64 + 8, 88 + 8, 112 + 8
>>>> Thread a0.io uses libaio to write other 5 requests :
>>>>     sectors: 8+ 8, 32 + 8, 56 + 8, 80 + 8, 104 + 8
>>>
>>> I'm cutting some of the below.
>>>
>>> Thanks for the detailed email. It's mostly on purpose that we don't
>>> spend cycles and memory on maintaining a separate front merge hash,
>>> since it's generally not something that happens very often. If you have
>>> a thread pool doing IO and split sequential IO such that you would
>>> benefit a lot from front merging, then I would generally claim that
>>> you're not writing your app in the most optimal manner.
>>>
>>
>> Thanks for explanation, I only consider the problem through the code's
>> perspective and ignore the reality situation of app.
>
> That's quite by design and not accidental.
>
>>> So I'm curious, what's the big interest in front merging?
>>
>> If it's not something that happens so much often, I think it's not worth to
>> support front merging too.
>>
>> By the way, I got another question that why not blktrace tracing the back
>> merging of requests while flushing plugged requests to queue, if it does
>> we may get a more clear view about IO merging.
>
> Not sure I follow, exactly where is a back merge trace missing?
>

I mean blktrace only traces bio merging, not request merging. Let me give
an example. I use thread a.out to write three bios, as shown below:

    a.out:  0 + 8, 16 + 8, 8 + 8

The blktrace result is shown below:

  8,16   1        7     0.292069180  1222  Q  WS 0 + 8 [a.out]
  8,16   1        8     0.292073960  1222  G  WS 0 + 8 [a.out]
  8,16   1        9     0.292074440  1222  P   N [a.out]
  8,16   1       10     0.292079380  1222  Q  WS 16 + 8 [a.out]
  8,16   1       11     0.292081840  1222  G  WS 16 + 8 [a.out]
  8,16   1       12     0.292085860  1222  Q  WS 8 + 8 [a.out]
  8,16   1       13     0.292087240  1222  F  WS 8 + 8 [a.out]
  8,16   1       14     0.292089100  1222  I  WS 0 + 8 [a.out]
  8,16   1       15     0.292095200  1222  I  WS 8 + 16 [a.out]
  8,16   1       16     0.295931920  1222  U   N [a.out] 2
  8,16   1       17     0.298528980  1222  D  WS 0 + 24 [a.out]
  8,16   0        3     0.302617360     3  C  WS 0 + 24 [0]

Total (8,16):
 Reads Queued:           0,        0KiB   Writes Queued:           3,       12KiB
 Read Dispatches:        0,        0KiB   Write Dispatches:        1,       12KiB
 Reads Requeued:         0                Writes Requeued:         0
 Reads Completed:        0,        0KiB   Writes Completed:        1,       12KiB
 Read Merges:            0,        0KiB   Write Merges:            1,        4KiB
 PC Reads Queued:        0,        0KiB   PC Writes Queued:        0,        0KiB
 PC Read Disp.:          3,        0KiB   PC Write Disp.:          0,        0KiB
 PC Reads Req.:          0                PC Writes Req.:          0
 PC Reads Compl.:        3                PC Writes Compl.:        0
 IO unplugs:             1                Timer unplugs:           0

We merge bio (8 + 8) into request (16 + 8) at the plug stage, and that is
properly traced as F. At the unplug stage, request (0 + 8) and
request (8 + 16) merge into a single request (0 + 24), but there is no
trace information for that operation.

So I'm just a bit curious; please forgive my ignorance.

Thanks.

> --
> Jens Axboe
>
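
To make the sector arithmetic in the a.out example easier to follow, here is a
minimal sketch that replays the two merges by hand. It is only a toy model and
not kernel code: the (start, length) tuples and the helper names front_merge
and back_merge are made up for illustration, and the merge rule is assumed to
be nothing more than "the two ranges are exactly contiguous".

```python
# Toy model of the merges in the a.out example; NOT kernel code.
# A request or bio is a hypothetical (start_sector, nr_sectors) tuple.

def front_merge(req, bio):
    """Return req extended at the front by bio, or None if not contiguous."""
    r_start, r_len = req
    b_start, b_len = bio
    if b_start + b_len == r_start:
        return (b_start, b_len + r_len)
    return None

def back_merge(req, nxt):
    """Return req extended at the back by nxt, or None if not contiguous."""
    r_start, r_len = req
    n_start, n_len = nxt
    if r_start + r_len == n_start:
        return (r_start, r_len + n_len)
    return None

# Plug stage: bios 0 + 8 and 16 + 8 each get a request (the G events),
# then bio 8 + 8 is front-merged into request 16 + 8 (the F event).
plugged = [(0, 8), front_merge((16, 8), (8, 8))]

# Unplug stage: the two plugged requests are contiguous, so they collapse
# into one request. This is the step with no trace event in the output.
merged = back_merge(plugged[0], plugged[1])

print(plugged)  # [(0, 8), (8, 16)]
print(merged)   # (0, 24)
```

The two printed values match the I lines (0 + 8 and 8 + 16) and the D line
(0 + 24) in the trace.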