сб, 19 окт. 2019 г. в 04:10, David Wysochanski <dwysocha@xxxxxxxxxx>: > Right but look at it this way. If we conditionally set the state, > then what is preventing a duplicate list_del_init call? Let's say we > get into the special case that you're not sure it could happen > (mid_entry->mid_state == MID_REQUEST_SUBMITTED is false), and so the > mid_state does not get set to MID_RETRY_NEEDED inside cifs_reconnect > but yet the mid gets added to retry_list. In that case both the > cifs_reconnect code path will call list_del_init as well as the other > code paths which we're adding the conditional tests and that will > cause a blowup again because cifs_reconnect retry_list loop will end > up in a singleton list and exhaust the refcount, leading to the same > crash. This is exactly why the refcount only patch crashed again - > it's erroneous to think it's ok to modify mid_entry->qhead without a) > taking globalMid_Lock and b) checking mid_state is what you think it > should be. But if you're really concerned about that 'if' condition > and want to leave it, and you want a stable patch, then the extra flag > seems like the way to go. But that has the downside that it's only > being done for stable, so a later patch will likely remove it > (presumably). I am not sure what such policy is or if that is even > acceptable or allowed. This is acceptable and it is a good practice to fix the existing issue with the smallest possible patch and then enhance the code/fix for the current master branch if needed. This simplify backporting a lot. Actually looking at the code: cifsglob.h: 1692 #define MID_DELETED 2 /* Mid has been dequeued/deleted */ ^^^ Isn't "deqeueued" what we need? It seems so because it serves the same purpose: to indicate that a request has been deleted from the pending queue. So, I think we need to just make use of this existing flag and mark the mid with MID_DELETED every time we remove the mid from the pending list. Also assume moving mids from the pending lists to the local lists in cleanup_demultiplex_info and cifs_reconnect as a deletion too because those lists are not exposed globally and mids are removed from those lists before the functions exit. I made a patch which is using MID_DELETED logic and merging DeleteMidQEntry and cifs_mid_q_entry_release into one function to avoid possible use-after free of mid->resp_buf. David, could you please test the attached patch in your environment? I only did sanity testing of it. -- Best regards, Pavel Shilovsky
Attachment:
mid_dequeue.patch
Description: Binary data