On Tue, Mar 10, 2020 at 1:44 PM Roman Penyaev <rpenyaev@xxxxxxx> wrote: > > On 2020-03-10 11:40, Ilya Dryomov wrote: > > [skip] > > >> And seems if the ownership of the ->pages is transferred to > >> the handle_watch_notify() and freed there, then it should be > >> fixed by having release in one place: here or there. > > > > The problem is that at least at one point CEPH_MSG_DATA_PAGES needed > > a reference count -- it couldn't be freed it in one place. pagelists > > are reference counted, but can't be easily swapped in, hence that TODO. > > > > Thanks for reminding me about this. I'll take a look -- perhaps the > > reference count is no longer required and we can get away with a simple > > flag. > > To my shallow understanding handle_watch_notify() also has an error > path which eventually does not free or take ownership of data->pages, > e.g. callers of 'goto out_unlock_osdc'. Probably code relies on the > fact, that sender knows what is doing and never sends ->data with > wrong cookie or opcode, but looks very suspicious to me. > > Seems for this particular DATA_PAGES case it is possible just to > take an ownership by zeroing out data->pages and data->length, > which prevents double free, something as the following: > > diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c > index b68b376d8c2f..15ae6377c461 100644 > --- a/net/ceph/osd_client.c > +++ b/net/ceph/osd_client.c > @@ -4440,6 +4440,8 @@ static void handle_watch_notify(struct > ceph_osd_client *osdc, > > ceph_release_page_vector(data->pages, > calc_pages_for(0, > data->length)); > } > + data->pages = NULL; > + data->length = 0; > } > lreq->notify_finish_error = return_code; > > > and keeping the current patch with explicit call of > ceph_release_page_vector from ceph_msg_data_destroy() from > messenger.c. Not quite -- that would break all of OSD client, which supplies its own page vectors. I'll send a patch in a few. Thanks, Ilya