Re: rbd: null pointer dereferenced during osd_reset

> The patch below (also pushed to ceph-client.git master) should fix this.
> Can you give it a test?
>

The exception still occurs with this patch applied. From the log below,
the sequence seems to be:

===================
kernel: libceph: osd0 192.168.101.134:6800 socket closed
kernel: libceph:  fault ffff880002284830 state 69 to peer 192.168.101.134:6800
kernel: libceph:  fault on LOSSYTX channel
kernel: libceph:  osd_reset osd0
kernel: libceph:  __kick_osd_requests osd0
kernel: libceph:  __reset_osd ffff880002284800 osd0
kernel: libceph:  con_close ffff880002284830 peer 192.168.101.134:6800
kernel: libceph:  get_osd ffff880002284800 2 -> 3
kernel: libceph:  queue_con ffff880002284830 - already BUSY
kernel: libceph:  put_osd ffff880002284800 3 -> 2
kernel: libceph:  con_open ffff880002284830 192.168.101.134:6800
kernel: libceph:  get_osd ffff880002284800 2 -> 3
kernel: libceph:  queue_con ffff880002284830 - already BUSY
kernel: libceph:  put_osd ffff880002284800 3 -> 2
kernel: libceph:  __unregister_linger_request ffff880002511e00
kernel: libceph:  moving osd to ffff880002284800 lru
kernel: libceph:  __move_osd_to_lru ffff880002284800
===================

The linger request should have already succeeded, so it was removed from
osd0's o_requests list and put on o_linger_requests during handle_reply().
Since it is no longer on o_requests, req->r_osd is set to NULL even with
the patch. The request is then requeued and sent with r_osd still NULL,
which would explain the crash in __send_request:
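
To make the sequence concrete, here is a minimal userspace sketch of the
state changes described above. It is not the libceph code: the model_*
names, the boolean list-membership flags and the counters are all
hypothetical stand-ins, and it assumes (as the paragraph above implies)
that the patch keeps r_osd only while the request is still on o_requests.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of an OSD; counters stand in for the two lists. */
struct model_osd {
	int id;
	int n_requests;        /* stands in for o_requests */
	int n_linger_requests; /* stands in for o_linger_requests */
};

/* Hypothetical model of a request; flags stand in for list membership. */
struct model_req {
	struct model_osd *r_osd;
	bool on_requests; /* linked on the osd's o_requests */
	bool on_linger;   /* linked on the osd's o_linger_requests */
};

/* A successful reply moves a lingering request from o_requests to
 * o_linger_requests, as the email describes for handle_reply(). */
static void model_handle_reply(struct model_req *req)
{
	assert(req->r_osd && req->on_requests);
	req->on_requests = false;
	req->r_osd->n_requests--;
	req->on_linger = true;
	req->r_osd->n_linger_requests++;
}

/* Unregister from the linger list. ASSUMPTION: r_osd is cleared unless
 * the request is still on o_requests (the guard attributed to the patch). */
static void model_unregister_linger(struct model_req *req)
{
	if (!req->r_osd)
		return;
	if (req->on_linger) {
		req->on_linger = false;
		req->r_osd->n_linger_requests--;
	}
	if (!req->on_requests)
		req->r_osd = NULL;
}

/* The real send path would dereference r_osd; the model only reports
 * whether that dereference would be safe. */
static bool model_can_send(const struct model_req *req)
{
	return req->r_osd != NULL;
}
```

In this model, a request that has gone through model_handle_reply() and
then model_unregister_linger() ends up with r_osd == NULL, so
model_can_send() reports false unless something remaps the request to an
osd before it is sent.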

===================
kernel: libceph:  register_request ffff880002511e00 tid 178
kernel: libceph:   first request, scheduling timeout
kernel: libceph:  requeued lingering ffff880002511e00 tid 178 osd0
kernel: libceph:  send_queued
kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
kernel: IP: [<ffffffffa01494ac>] __send_request+0x27/0xd5 [libceph]
===================

I wonder whether we should stop requeueing linger requests that have
already succeeded in __kick_osd_requests, as below.

@@ -576,15 +576,6 @@ static void __kick_osd_requests(struct ceph_osd_client *osdc,
                if (!req->r_linger)
                        req->r_flags |= CEPH_OSD_FLAG_RETRY;
        }
-
-       list_for_each_entry_safe(req, nreq, &osd->o_linger_requests,
-                                r_linger_osd) {
-               __unregister_linger_request(osdc, req);
-               __register_request(osdc, req);
-               list_move(&req->r_req_lru_item, &osdc->req_unsent);
-               dout("requeued lingering %p tid %llu osd%d\n", req, req->r_tid,
-                    osd->o_osd);
-       }
 }

--
Henry