[dm-devel] [RFC][PATCH] Fix BIO reordering when resuming devices

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi!

We should not never reorder BIOs (at least not around barriers),
as Jens Axboe affirmed. This will cause problems as soon as there
will be BIO users that set BIO_RW_BARRIER.

This is an attempt to fix this problem in the device-mapper core.

Currently this can happen while the device is suspended since
md->deferred implements a stack instead of a queue.

Also we must block incoming requests while the queue is flushed
because this could also lead to a reordering.

I thought of two solutions:
1. While the queue is being flushed new incoming requests are added
   to the tail of the queue.
2. When the queue is being flushed just block dm_request until flushing
   is finished.

The first has a problem: Under heavy IO we might never leave dm_resume
because under heavy IO dm_request keeps flooding the queue.

So I implemented the second.

The queue gets flushed while md->lock is held (for writing) so
dm_request blocks. Because we then cannot requeue the deferred requests
via generic_make_request (would deadlock in dm_request when trying to
acquire the read lock) it directly starts processing of the request.

Therefore the last few lines of dm_request are moved into a
__dm_request.

BTW: There was a up_read(&md->lock); missing in dm_request in the
(!dm->map) error case.


diff -Nur linux.orig/drivers/md/dm.c linux/drivers/md/dm.c
--- linux.orig/drivers/md/dm.c	2004-01-01 21:49:16.118235408 +0100
+++ linux/drivers/md/dm.c	2004-01-01 21:54:35.537676272 +0100
@@ -63,7 +63,8 @@
 	 */
 	atomic_t pending;
 	wait_queue_head_t wait;
-	struct bio *deferred;
+	struct bio *deferred_head;
+	struct bio *deferred_tail;
 
 	/*
 	 * The current mapping.
@@ -231,8 +232,13 @@
 		return 1;
 	}
 
-	bio->bi_next = md->deferred;
-	md->deferred = bio;
+	bio->bi_next = NULL;
+
+	if (md->deferred_tail)
+		md->deferred_tail->bi_next = bio;
+	else
+		md->deferred_head = bio;
+	md->deferred_tail = bio;
 
 	up_write(&md->lock);
 	return 0;		/* deferred successfully */
@@ -512,6 +518,16 @@
  *---------------------------------------------------------------*/
 
 
+static inline void __dm_request(struct mapped_device *md, struct bio *bio)
+{
+		if (!md->map) {
+			bio_io_error(bio, bio->bi_size);
+			return;
+		}
+
+		__split_bio(md, bio);
+}
+
 /*
  * The request function that just remaps the bio built up by
  * dm_merge_bvec.
@@ -550,12 +566,7 @@
 		down_read(&md->lock);
 	}
 
-	if (!md->map) {
-		bio_io_error(bio, bio->bi_size);
-		return 0;
-	}
-
-	__split_bio(md, bio);
+	__dm_request(md, bio);
 	up_read(&md->lock);
 	return 0;
 }
@@ -788,16 +799,16 @@
 }
 
 /*
- * Requeue the deferred bios by calling generic_make_request.
+ * Process the deferred bios
  */
-static void flush_deferred_io(struct bio *c)
+static void flush_deferred_io(struct mapped_device *md, struct bio *c)
 {
 	struct bio *n;
 
 	while (c) {
 		n = c->bi_next;
 		c->bi_next = NULL;
-		generic_make_request(c);
+		__dm_request(md, c);
 		c = n;
 	}
 }
@@ -891,12 +902,14 @@
 
 	dm_table_resume_targets(md->map);
 	clear_bit(DMF_SUSPENDED, &md->flags);
+	def = md->deferred_head;
+	md->deferred_head = md->deferred_tail = NULL;
+
+	flush_deferred_io(md, def);
+
 	clear_bit(DMF_BLOCK_IO, &md->flags);
-	def = md->deferred;
-	md->deferred = NULL;
 	up_write(&md->lock);
 
-	flush_deferred_io(def);
 	blk_run_queues();
 
 	return 0;




[Index of Archives]     [DM Crypt]     [Fedora Desktop]     [ATA RAID]     [Fedora Marketing]     [Fedora Packaging]     [Fedora SELinux]     [Yosemite Discussion]     [KDE Users]     [Fedora Docs]

  Powered by Linux