Re: [PATCH] rbd: Clear ceph_msg->bio_iter for retransmitted message

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 06/06/2012 03:03 AM, Yan, Zheng wrote:
> From: "Yan, Zheng" <zheng.z.yan@xxxxxxxxx>
> 
> The bug can cause NULL pointer dereference in write_partial_msg_pages
> 
> Signed-off-by: Zheng Yan <zheng.z.yan@xxxxxxxxx>
> ---
>  net/ceph/messenger.c |    1 +
>  1 files changed, 1 insertions(+), 0 deletions(-)
> 
> diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
> index 1a80907..785b953 100644
> --- a/net/ceph/messenger.c
> +++ b/net/ceph/messenger.c
> @@ -598,6 +598,7 @@ static void prepare_write_message(struct ceph_connection *con)
>  	     le32_to_cpu(con->out_msg->footer.front_crc),
>  	     le32_to_cpu(con->out_msg->footer.middle_crc));
>  
> +	m->bio_iter = NULL;
>  	/* is there a data payload? */
>  	if (le32_to_cpu(m->hdr.data_len) > 0) {
>  		/* initialize page iterator */

OK, now that I've looked at this a little bit I think I have a small
preference for doing this assignment earlier in that function, where
there is already special handling that is dependent on whether the
message being processed is being re-sent or not:

        /*
         * only assign outgoing seq # if we haven't sent this message
         * yet.  if it is requeued, resend with it's original seq.
         */
        if (m->needs_out_seq) {
                m->hdr.seq = cpu_to_le64(++con->out_seq);
                m->needs_out_seq = false;
        }

So I'd suggest it look like:

        /*
         * only assign outgoing seq # if we haven't sent this message
         * yet.  if it is requeued, resend with it's original seq.
         */
        if (m->needs_out_seq) {
                m->hdr.seq = cpu_to_le64(++con->out_seq);
                m->needs_out_seq = false;
        }
#ifdef CONFIG_BLOCK
	else {
		/* Need to reinitialize the iterator on resends */
		m->bio_iter = NULL;
	}
#endif

Please let me know if you concur with my proposed modification or
whether you feel strongly we should go with your original fix.

In any case, I think your fix looks like the right thing to do.

Reviewed-by: Alex Elder <elder@xxxxxxxxxxx>



Now let me know if you see anything wrong in the following explanation
of the problem.

The first time a message is sent, it will have been allocated via
ceph_msg_new(), which will initialize both the bio and bio_iter
fields to NULL (and bio_seg to 0).

rbd is an osd client.  It sets up its request queue with
request_queue->request_fn = rbd_rq_fn()
So here's the path that leads to a bio in a message:
  rbd_rq_fn()
  ->rbd_req_{read,write}()
    ->rbd_do_op()
      ->rbd_do_request()
        ->ceph_osdc_alloc_request()

And rbd_osdc_alloc_request() assigns the passed-in bio to req->r_bio
and grabs a reference to it.

Later, ceph_osdc_start_request() will copy the osd request's r_bio
pointer value into the request message that will be sent to the osd
server.  It does not update either of the other two bio-related fields
in that message--in particular, bio_iter is still NULL.

Then, in write_partial_msg_pages(), this code uses the null bio_iter
as an indicator that we're just starting with processing the message
that is currently referred to in con->out_msg, and initializes the
iteration fields.

#ifdef CONFIG_BLOCK
        if (msg->bio && !msg->bio_iter)
                init_bio_iter(msg->bio, &msg->bio_iter, &msg->bio_seg);
#endif

So...

In order for this to work properly, msg->bio_iter needs to be
a null pointer when we're first processing a message in
write_partial_msg_pages().  And that's fine the first time through,
but if a connection failure occurs in the middle of processing
a bio request, and ceph_fault() gets called, all sent messages get
re-queued to be re-sent, and at that point, msg->bio_iter is not
guaranteed to be a null pointer.

(I worked through describing all this because in doing so I can
convince myself I understand the problem enough to know this is
the right fix.)

					-Alex
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [CEPH Users]     [Ceph Large]     [Information on CEPH]     [Linux BTRFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]
  Powered by Linux