On Tuesday August 10, paul.clements@xxxxxxxxxxxx wrote: > Neil, > > I've implemented the improvements that you've suggested below, and > preliminary tests are showing some very good results! > > The latest patches are available at: > > http://www.parisc-linux.org/~jejb/md_bitmap/ > This is definitely heading in the right direction. However I think you have missed some of the subtleties of plugging. Plugging should be seen as entirely the responsibility of each device. Each device will plug it's own queue if or when it thinks that is a good idea, and will possibly unplug it spontaneously. Thus if you ever call generic_make_request, you have to assume that the request could get services at any moment. The only interest that clients of a device have in plugging is that they may request the device to unplug, in case it still has some old requests in a plugged queue. When a raid1 write request comes, your code passes that request to generic_make_request, and schedules a bitmap update for later. Presumably you assume that the underlying devices are or will-be plugged and only get unplugged when ->unplug_fn is called. However this isn't true. What you need to do is: In the WRITE branch of make_request, schedule the bitmap update as you currently do, but *don't* call generic_make_request on the bio. Instead, you should add the bio to a queue using ->bi_next for linkage. Then in raid1_unplug, you should: Call bitmap_unplug (this waits for the bitmaps to be safe) Walk through the queue of bio and submit them with make_request. As the bit(s) will already be marked dirty, the request will go straight through. Finally call unplug_slaves. That should then have the bitmap updates happening before the data updates, and should cluster bitmap updates better. The next step would be to stop write_page from being synchronous - I'm not comfortable about ->unplug-fn having to wait for disk io. This looks to be quite awkward as there is not call-back that happens when writeback on a page is completed. You need to have a kernel thread to wait for the pages to complete. I would get the bitmap_daemon to do this. write_page puts the page on a queue and wakes up bitmap_write_daemon bitmap_write_daemon calls prepare_write/commit_write and puts the page on another queue and wakes up bitmap_daemon. bitmap_daemon whenever it is woken up: - checks if any clean bitmaps needs to be written and puts them on the queue for bitmap_write_daemon - checks if there is a page in the queue from bitmap_write_daemon and, if there is, waits for it (wait_on_page_writeback). When that completes, put the page and decrease a count, and go round the loop. - if the count of pending bitmap updates hits zero, bitmap_daemon walks the queue of pending write requests of the array (as I suggested above could be done in raid1_unplug) and submits them. Does that make sense? I would suggest doing the first version first (leaving write_page synchronous), and then have a go at making write_page synchronous. NeilBrown - To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html