On Thu, 24 Jan 2013 14:02:52 +0800 majianpeng <majianpeng@xxxxxxxxx> wrote:

> There is an iobarrier in raid1 for two reasons: resync/recovery, and
> reconfiguration of the array.
> At present it suspends all normal IO during resync/recovery, but
> normal IO that falls outside the resync/recovery window does not need
> the iobarrier.  So I have rewritten the iobarrier.
> Because there are two reasons for the barrier, I use two different
> methods.
>
> First, for resync/recovery there is a resync window.  Its end position
> is 'next_resync'.  Because the resync depth is RESYNC_DEPTH (32), its
> start is 'next_resync - RESYNC_SECTORS * RESYNC_DEPTH'.
> Normal IO is divided into three categories by location:
> a: before the start of the resync window
> b: inside the resync window
> c: after the end of the resync window
> For a, no barrier is needed.
> For b, the barrier is needed, using the original method.
> For c, no barrier is needed, but the minimum position must be
> recorded: if the next resync point would pass that position, the
> resync action waits; otherwise it proceeds.
> I used an rbtree to order those IOs.
>
> For reconfiguration of the array, I propose the concept of a
> "force_barrier": when there is a force_barrier, all normal IO must be
> suspended.
>
> NOTE:
> This problem also exists for raid10, but I have only done raid1.  The
> patch is posted mainly to make sure it is going in the correct
> direction, and I hope to get some helpful comments from other guys.
> If the method is accepted, I will send the patch for raid10.
>
> Signed-off-by: Jianpeng Ma <majianpeng@xxxxxxxxx>

Hi,
 thanks for this, and sorry for the delay in replying.

The patch is reasonably good, but there is room for improvement.  I
would break it up into several patches which are easier to review.

- Firstly, don't worry about the barrier for 'read' requests - it
  really isn't relevant for them (your patch didn't do this).

- Secondly, find those places where raise_barrier() is used to
  reconfigure the array and replace them with freeze_array().  I think
  that is safe, and it means that we don't need to pass a 'force'
  argument to raise_barrier() and lower_barrier().  (There is a sketch
  of this in the postscript below.)

- The rest might have to be all one patch, though if it could be done
  in a couple of distinct stages that would be good.

For the different positions (before, during, after), which you
currently call 0, 1, and 2, you should use an enum so they all have
names (also sketched below).

I don't really like the rbtree.  It adds an extra place where you take
the spinlock, it causes more work to be done inside the spinlock, and
it isn't really needed.

Instead, you can have two R1BIO flags for "AFTER_RESYNC": one for
requests that are more than 2 windows after the resync point, and one
for requests that are less than 2 windows after (or maybe 3 windows,
or maybe 8...).  Each of these has a corresponding count (as you
already have: nr_after).  Then, when resync gets to the end of a
window, you wait until the count of "less than 2 windows after"
reaches zero, then atomically swap the meaning of the two bits (toggle
a bit somewhere).  This should give you nearly the same effect with
constant (not log-n) effort.

Finally, have you done any testing, either to ensure there is no
slow-down, or to demonstrate some improvement?

Thanks,
NeilBrown
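P.S. Some rough sketches to make the above concrete.  First, the
freeze_array() point: in whichever reconfiguration paths currently
raise the barrier, the change would look roughly like this (purely
illustrative, not a real diff against your patch):

	-	raise_barrier(conf);
	+	freeze_array(conf);
		/* ... reconfigure the array ... */
	-	lower_barrier(conf);
	+	unfreeze_array(conf);

freeze_array() waits for all pending requests to complete, which is
what reconfiguration actually needs, and it keeps the barrier path
free of any 'force' special case.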
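For the enum, something like the following is what I mean.  This is
compile-untested, and the names (r1_window_pos, classify_sectors) are
invented here rather than taken from your patch; a real version would
also need to guard the subtraction while next_resync is still inside
the first window:

	enum r1_window_pos {
		WINDOW_BEFORE,	/* ends before the window: already resynced */
		WINDOW_INSIDE,	/* overlaps the window: must wait on barrier */
		WINDOW_AFTER,	/* starts after the window: no barrier, counted */
	};

	static enum r1_window_pos classify_sectors(struct r1conf *conf,
						   sector_t lo, sector_t hi)
	{
		sector_t win_end = conf->next_resync;
		sector_t win_start = win_end - RESYNC_SECTORS * RESYNC_DEPTH;

		if (hi <= win_start)
			return WINDOW_BEFORE;
		if (lo >= win_end)
			return WINDOW_AFTER;
		return WINDOW_INSIDE;
	}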
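And for the two-flag scheme, again a sketch with invented names
(near_idx, nr_after[2], R1BIO_AfterA/B, and the bit numbers); the
completion path, which must decrement the matching counter and wake
conf->wait_barrier, is not shown:

	#define R1BIO_AfterA	10	/* invented bit numbers */
	#define R1BIO_AfterB	11

	/* Hypothetical fields in struct r1conf:
	 *	atomic_t nr_after[2];
	 *	int near_idx;	selects which flag currently means
	 *			"less than 2 windows after the resync point"
	 */
	static void tag_after_window(struct r1conf *conf, struct r1bio *r1_bio)
	{
		sector_t far_start = conf->next_resync +
			2 * RESYNC_SECTORS * RESYNC_DEPTH;
		int idx = (r1_bio->sector >= far_start) ? !conf->near_idx
							: conf->near_idx;

		set_bit(idx ? R1BIO_AfterB : R1BIO_AfterA, &r1_bio->state);
		atomic_inc(&conf->nr_after[idx]);
	}

	/* Called when resync reaches the end of a window. */
	static void resync_window_done(struct r1conf *conf)
	{
		/* Wait for every "near" request to finish ... */
		wait_event(conf->wait_barrier,
			   !atomic_read(&conf->nr_after[conf->near_idx]));
		/* ... then toggle which flag means "near": the old "far"
		 * set becomes the new "near" set, in constant time. */
		conf->near_idx = !conf->near_idx;
	}

The swap is just flipping near_idx, so resync pays a constant cost per
window instead of walking a tree under the spinlock.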