On 2020-03-10 11:40, Ilya Dryomov wrote:
[skip]
And seems if the ownership of the ->pages is transferred to
the handle_watch_notify() and freed there, then it should be
fixed by having release in one place: here or there.
The problem is that at least at one point CEPH_MSG_DATA_PAGES needed
a reference count -- it couldn't be freed it in one place. pagelists
are reference counted, but can't be easily swapped in, hence that TODO.
Thanks for reminding me about this. I'll take a look -- perhaps the
reference count is no longer required and we can get away with a simple
flag.
To my shallow understanding handle_watch_notify() also has an error
path which eventually does not free or take ownership of data->pages,
e.g. callers of 'goto out_unlock_osdc'. Probably code relies on the
fact, that sender knows what is doing and never sends ->data with
wrong cookie or opcode, but looks very suspicious to me.
Seems for this particular DATA_PAGES case it is possible just to
take an ownership by zeroing out data->pages and data->length,
which prevents double free, something as the following:
diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
index b68b376d8c2f..15ae6377c461 100644
--- a/net/ceph/osd_client.c
+++ b/net/ceph/osd_client.c
@@ -4440,6 +4440,8 @@ static void handle_watch_notify(struct
ceph_osd_client *osdc,
ceph_release_page_vector(data->pages,
calc_pages_for(0,
data->length));
}
+ data->pages = NULL;
+ data->length = 0;
}
lreq->notify_finish_error = return_code;
and keeping the current patch with explicit call of
ceph_release_page_vector from ceph_msg_data_destroy() from
messenger.c.
--
Roman