On Thursday April 19, raziebe@xxxxxxxxx wrote:
>
> Neil Hello
> I have been doing some thinking. I feel we should take a different path here.
> In my tests I actually accumulate the user's buffers and when ready I submit
> them, an elevator like algorithm.
>
> The main problem is the amount of IO's the stripe cache can hold which is
> too small. My suggestion is to add an elevator of bios before moving them to the
> stripe cache, trying to postpone as much as needed allocation of a new stripe.
> This way we will be able to move as much as IOs to the "raid logic"
> without congesting it and still filling stripes if possible.
>
> Psuedo code;
>
> make_request()
> ...
>     if IO direction is WRITE and IO not in stripe cache
>         add IO to raid elevator
> ..
>
> raid5d()
> ...
>     Is there a set of IOs in raid elevator such that they make a full stripe
>         move IOs to raid handling
>     while oldest IO in raid elevator is deadlined( 3ms ? )
>         move IO to raid handling
> ....
>
> Does it make any sense ?

Yes.

The "Is there a set of IOs in raid elevator such that they make a full
stripe" test would be hard to calculate. However, the concept is still
fine.

In make_request, if we cannot get a stripe_head without blocking, just
add the request to a list. Once the number of active stripes drops
below 75% (or 50% or whatever), raid5 reprocesses all the bios on the
list; some will get added, some might not until next time.

The fact that a single bio can require many stripe_heads adds an
awkwardness. You would have to be able to store a partially-processed
request on the list, but we do that in retry_aligned_read, so we know
it is possible. Possibly the same code can be used for
retry_aligned_read and for retry_delayed_write.

And we can treat writes and reads the same: if no stripe_head is
available, stick it on a queue.

Another issue to be aware of is that write-throttling in the VM
depends on the fact that each device has a limited queue. Just
sticking everything on a list defeats that.
So we do need to impose some limit on the number of requests in the
queue. Possibly we limit the requests on the queue to some multiple of
a full stripe.

NeilBrown
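
For illustration only, here is a minimal userspace sketch of the scheme
discussed in this message: park a request when no stripe is free, keep
its partial progress, retry parked requests once active stripes fall
below 75%, and bound the list so the submitter is still throttled. It
is not kernel code. Every name in it (struct raid_model, struct request,
attach_stripes, model_make_request, model_release_stripes) and both
constants are hypothetical, and a "stripe" is modelled as a plain
counter rather than a real stripe_head.

```c
#include <stddef.h>

#define MAX_STRIPES 8  /* assumed size of the modelled stripe cache */
#define QUEUE_LIMIT 4  /* assumed bound on parked requests, so the
                          submitter is still throttled */

/* Hypothetical stand-in for a bio that may span several stripes. */
struct request {
    int stripes_needed; /* stripes this request covers */
    int stripes_done;   /* progress so far; allows parking a
                           partially processed request */
};

/* Hypothetical stand-in for the raid5 state relevant here. */
struct raid_model {
    int active_stripes;
    struct request *delayed[QUEUE_LIMIT]; /* FIFO ring of parked requests */
    int head;
    int count;
};

/* Take free stripes for rq until it is complete or the cache is full.
 * Returns 1 if rq is now fully attached, 0 if it is still partial. */
static int attach_stripes(struct raid_model *m, struct request *rq)
{
    while (rq->stripes_done < rq->stripes_needed &&
           m->active_stripes < MAX_STRIPES) {
        m->active_stripes++;
        rq->stripes_done++;
    }
    return rq->stripes_done == rq->stripes_needed;
}

/* Returns 0 if rq was fully attached, 1 if it was parked on the
 * delayed list, -1 if the list is full and the caller has to wait. */
static int model_make_request(struct raid_model *m, struct request *rq)
{
    /* Only try immediately when nothing is parked, to keep FIFO order. */
    if (m->count == 0 && attach_stripes(m, rq))
        return 0;
    if (m->count == QUEUE_LIMIT)
        return -1;
    m->delayed[(m->head + m->count) % QUEUE_LIMIT] = rq;
    m->count++;
    return 1;
}

/* Release n stripes. Once fewer than 75% of the cache is active,
 * retry parked requests in order; a request that is still partial
 * stays at the head of the list until the next call. */
static void model_release_stripes(struct raid_model *m, int n)
{
    m->active_stripes -= n;
    if (m->active_stripes * 4 >= MAX_STRIPES * 3)
        return;
    while (m->count > 0) {
        struct request *rq = m->delayed[m->head];
        if (!attach_stripes(m, rq))
            break;
        m->head = (m->head + 1) % QUEUE_LIMIT;
        m->count--;
    }
}
```

In this model a request that returns -1 is the point where the caller
would block, which is what keeps the queue limited. Reads and writes go
through the same path, and QUEUE_LIMIT here counts requests; a limit
expressed as a multiple of a full stripe would be a variation on the
same idea.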