Hi,
On 06/26/2017 11:27 PM, Eric Ren wrote:
Hi Mike,
On 06/26/2017 10:37 PM, Mike Snitzer wrote:
On Mon, Jun 26 2017 at 9:47am -0400,
Eric Ren <zren@xxxxxxxx> wrote:
[...snip...]
"""
Revert "dm mirror: use all available legs on multiple
failures"
dm io: fix duplicate bio completion due to missing ref count
I have a confusion about this "dm io..." fix. The fix itself is good.
Without it, a mkfs.ext4 on a mirrored dev whose primary mirror dev has
failed will crash the kernel on the discard operation from mkfs.ext4.

However, mkfs.ext4 can succeed on a healthy mirrored device. This is the
thing I don't understand, because no matter whether the mirrored device
is good or not, there would always be a duplicate bio completion before
this fix, thus write_callback() would be called twice and the crash
would occur on the second write_callback():
No, there is only a duplicate bio completion if the error path is taken
(e.g. underlying device doesn't support discard).
Hmm, when "op == REQ_OP_DISCARD", please see comments in
do_region():
"""
static void do_region(int op, int op_flags, unsigned region,
		      struct dm_io_region *where, struct dpages *dp,
		      struct io *io)
{
	...
	if (op == REQ_OP_DISCARD)
		special_cmd_max_sectors = q->limits.max_discard_sectors;
	...
	if ((op == REQ_OP_DISCARD || op == REQ_OP_WRITE_ZEROES ||
	     op == REQ_OP_WRITE_SAME) &&
	    special_cmd_max_sectors == 0) {
		atomic_inc(&io->count);              ===> [1]
		dec_count(io, region, -EOPNOTSUPP);  ===> [2]
		return;
	}
"""
[1] ref count fixed by the "dm io: ..." patch;
[2] we won't come here if "special_cmd_max_sectors != 0", which is true
when both sides of the mirror are good.

So only when a mirror device fails is "max_discard_sectors" on its queue
0, and thus this error path will be taken, right?
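To make [1] and [2] concrete, here is a toy userspace model of the
counting (not the actual kernel code; the names toy_io, toy_dec_count,
toy_dispatch are made up for illustration). It assumes the usual dm-io
pattern: dispatch_io() holds one base reference until all regions have
been submitted, and each real bio takes its own reference before
submit_bio(). Without the atomic_inc() at [1], the unsupported-discard
branch drops the base reference early, so the completion callback can
fire a second time once the good leg's bio is accounted for:

"""
#include <stdio.h>

/*
 * Toy model of dm-io's io->count handling for one mirrored discard
 * across two legs: leg 0 does not support discard (failed leg with
 * max_discard_sectors == 0), leg 1 does.  Illustration only.
 */
struct toy_io {
	int count;        /* stands in for atomic_t io->count                */
	int completions;  /* times the notify callback (write_callback) ran  */
};

static void toy_dec_count(struct toy_io *io)
{
	if (--io->count == 0) {
		io->completions++;
		printf("callback fired, completions=%d\n", io->completions);
	}
}

static void toy_dispatch(struct toy_io *io, int with_fix)
{
	io->count = 1;           /* base reference held across dispatch       */

	/* leg 0: discard unsupported -> error branch in do_region() */
	if (with_fix)
		io->count++;     /* the atomic_inc() added at [1]             */
	toy_dec_count(io);       /* dec_count(io, region, -EOPNOTSUPP) at [2] */

	/* leg 1: discard supported -> a real bio is submitted */
	io->count++;             /* reference taken before submit_bio()       */

	toy_dec_count(io);       /* dispatch_io() drops its base reference    */
	toy_dec_count(io);       /* endio of the leg-1 bio                     */
}

int main(void)
{
	struct toy_io io = { 0, 0 };

	printf("without fix:\n");
	toy_dispatch(&io, 0);    /* callback fires twice                      */

	io.completions = 0;
	printf("with fix:\n");
	toy_dispatch(&io, 1);    /* callback fires exactly once               */
	return 0;
}
"""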
Yes, I checked this as below:
- print info
"""
@@ -310,8 +310,10 @@ static void do_region(int op, int op_flags, unsigned region,
 	/*
 	 * Reject unsupported discard and write same requests.
 	 */
-	if (op == REQ_OP_DISCARD)
+	if (op == REQ_OP_DISCARD) {
 		special_cmd_max_sectors = q->limits.max_discard_sectors;
+		DMERR("do_region: op=DISCARD, q->limits.max_discard_sectors=%d\n", special_cmd_max_sectors);
+	}
"""
- log messages when running mkfs on the mirror device whose primary leg has failed:

[  169.672880] device-mapper: io: do_region: op=DISCARD, q->limits.max_discard_sectors=0        ===> for the failed leg
[  169.672885] device-mapper: io: do_region: op=DISCARD, q->limits.max_discard_sectors=8388607  ===> for the good leg
Thanks,
Eric
"""
static void write_callback(unsigned long error, void *context)
{
	unsigned i;
	struct bio *bio = (struct bio *) context;
	struct mirror_set *ms;
	int should_wake = 0;
	unsigned long flags;

	ms = bio_get_m(bio)->ms;   ====> NULL pointer at the duplicate completion
	bio_set_m(bio, NULL);
"""
Without this fix, I expected the DISCARD IO would always crash the
kernel, but that's not true when the mirrored device is good. I hope
someone who happens to know the reason can give some hints ;-P
If the mirror is healthy then only one completion is returned to
dm-mirror (via write_callback). The problem was the error path wasn't
managing the reference count as needed, whereas dm-io's normal discard
IO path does.
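For comparison, a quick count for the healthy case (same simplified
model as the sketch earlier in this thread, assuming both legs support
discard so every decrement is paired with an earlier increment) shows
why write_callback() can only run once there, even without the fix:

"""
count = 1   base reference set up before dispatch
count = 2   leg 0 bio submitted (atomic_inc before submit_bio)
count = 3   leg 1 bio submitted
count = 2   dispatch_io() drops its base reference
count = 1   endio of the leg-0 bio
count = 0   endio of the leg-1 bio -> write_callback() fires exactly once
"""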
Yes, I can understand this.
Thanks a lot,
Eric
Mike
--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel