Hi,
On 06/26/2017 11:27 PM, Eric Ren wrote:
Hi Mike,
On 06/26/2017 10:37 PM, Mike Snitzer wrote:
On Mon, Jun 26 2017 at 9:47am -0400,
Eric Ren <zren@xxxxxxxx> wrote:
[...snip...]
"""
Revert "dm mirror: use all available legs on multiple
failures"
dm io: fix duplicate bio completion due to missing ref count
I have a confusion about this "dm io..." fix. The fix itself is good.
Without it, a mkfs.ext4 on a mirrored dev whose primary mirror dev has
failed will crash the kernel on the discard operation from mkfs.ext4.

However, mkfs.ext4 can succeed on a healthy mirrored device. This is the
thing I don't understand, because no matter whether the mirrored device
is good or not, there would always be a duplicate bio completion before
this fix, thus write_callback() would be called twice and the crash
would occur on the second write_callback():
No, there is only a duplicate bio completion if the error path is taken
(e.g. underlying device doesn't support discard).
Hmm, when "op == REQ_OP_DISCARD", please see comments in
do_region():
"""
static void do_region(int op, int op_flags, unsigned region,
		      struct dm_io_region *where, struct dpages *dp,
		      struct io *io)
{
	...
	if (op == REQ_OP_DISCARD)
		special_cmd_max_sectors = q->limits.max_discard_sectors;
	...
	if ((op == REQ_OP_DISCARD || op == REQ_OP_WRITE_ZEROES ||
	     op == REQ_OP_WRITE_SAME) &&
	    special_cmd_max_sectors == 0) {
		atomic_inc(&io->count);              ===> [1]
		dec_count(io, region, -EOPNOTSUPP);  ===> [2]
		return;
	}
"""
[1] ref count fixed by the "dm io: ..." patch;
[2] we won't come here if "special_cmd_max_sectors != 0", which is true
when both sides of the mirror are good.

So only when a mirror device fails is "max_discard_sectors" on its queue
0, and thus this error path will be taken, right?
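To make [1] and [2] concrete, here is a toy userspace model of the
counting (not the actual kernel code; the names toy_io, toy_dec_count,
toy_dispatch are made up for illustration). It assumes the usual dm-io
pattern: dispatch_io() holds one base reference until all regions have
been submitted, and each real bio takes its own reference before
submit_bio(). Without the atomic_inc() at [1], the unsupported-discard
branch drops the base reference early, so the completion callback can
fire a second time once the good leg's bio is accounted for:

"""
#include <stdio.h>

/*
 * Toy model of dm-io's io->count handling for one mirrored discard
 * across two legs: leg 0 does not support discard (failed leg with
 * max_discard_sectors == 0), leg 1 does.  Illustration only.
 */
struct toy_io {
	int count;        /* stands in for atomic_t io->count                */
	int completions;  /* times the notify callback (write_callback) ran  */
};

static void toy_dec_count(struct toy_io *io)
{
	if (--io->count == 0) {
		io->completions++;
		printf("callback fired, completions=%d\n", io->completions);
	}
}

static void toy_dispatch(struct toy_io *io, int with_fix)
{
	io->count = 1;           /* base reference held across dispatch       */

	/* leg 0: discard unsupported -> error branch in do_region() */
	if (with_fix)
		io->count++;     /* the atomic_inc() added at [1]             */
	toy_dec_count(io);       /* dec_count(io, region, -EOPNOTSUPP) at [2] */

	/* leg 1: discard supported -> a real bio is submitted */
	io->count++;             /* reference taken before submit_bio()       */

	toy_dec_count(io);       /* dispatch_io() drops its base reference    */
	toy_dec_count(io);       /* endio of the leg-1 bio                     */
}

int main(void)
{
	struct toy_io io = { 0, 0 };

	printf("without fix:\n");
	toy_dispatch(&io, 0);    /* callback fires twice                      */

	io.completions = 0;
	printf("with fix:\n");
	toy_dispatch(&io, 1);    /* callback fires exactly once               */
	return 0;
}
"""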
Yes, I checked this as below:
- print info
"""
@@ -310,8 +310,10 @@ static void do_region(int op, int op_flags, unsigned region,
 	/*
 	 * Reject unsupported discard and write same requests.
 	 */
-	if (op == REQ_OP_DISCARD)
+	if (op == REQ_OP_DISCARD) {
 		special_cmd_max_sectors = q->limits.max_discard_sectors;
+		DMERR("do_region: op=DISCARD, q->limits.max_discard_sectors=%d\n", special_cmd_max_sectors);
+	}
"""
- log messages when running mkfs on the mirror device whose primary leg has failed:

[  169.672880] device-mapper: io: do_region: op=DISCARD, q->limits.max_discard_sectors=0        ===> for the failed leg
[  169.672885] device-mapper: io: do_region: op=DISCARD, q->limits.max_discard_sectors=8388607  ===> for the good leg
Thanks,
Eric
"""
static void write_callback(unsigned long error, void *context)
{
	unsigned i;
	struct bio *bio = (struct bio *) context;
	struct mirror_set *ms;
	int should_wake = 0;
	unsigned long flags;

	ms = bio_get_m(bio)->ms;   ====> NULL pointer at the duplicate completion
	bio_set_m(bio, NULL);
"""
Without this fix, I expected the DISCARD IO would always crash the
kernel, but that's not true when the mirrored device is good. I hope
someone who happens to know the reason can give some hints ;-P
If the mirror is healthy then only one completion is returned to
dm-mirror (via write_callback). The problem was the error path wasn't
managing the reference count as needed, whereas dm-io's normal discard
IO path does.
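For comparison, a quick count for the healthy case (same simplified
model as the sketch earlier in this thread, assuming both legs support
discard so every decrement is paired with an earlier increment) shows
why write_callback() can only run once there, even without the fix:

"""
count = 1   base reference set up before dispatch
count = 2   leg 0 bio submitted (atomic_inc before submit_bio)
count = 3   leg 1 bio submitted
count = 2   dispatch_io() drops its base reference
count = 1   endio of the leg-0 bio
count = 0   endio of the leg-1 bio -> write_callback() fires exactly once
"""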
Yes, I can understand this.
Thanks a lot,
Eric
Mike
--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel