Good analysis, that is exactly what is happening. The multiplex_thread will race with compound_send_recv() where the latter may call list_del() on a mid which the multiplex_loop has not yet processed with standard_receive3, the crash happens. I have verified by adding new mid_flags to track when we call list_del_init() after we have already called list_del() on a mid. That also explains why my kludge removed the crashes. For compounds, I think it would make sense to only invoke the callback for the last component. But instead of the "fix" I used the multiRsp/multiEnd flags should be set accordingly inside compound_send_recv() with a comment on why we can / should only invoke the callbacks for the last item in a compound. regards ronnie sahlberg ----- Original Message ----- > From: "Pavel Shilovsky" <piastryyy@xxxxxxxxx> > To: "ronnie sahlberg" <ronniesahlberg@xxxxxxxxx> > Cc: "Aurélien Aptel" <aaptel@xxxxxxxx>, "Ronnie Sahlberg" <lsahlber@xxxxxxxxxx>, "linux-cifs" > <linux-cifs@xxxxxxxxxxxxxxx>, "Steve French" <smfrench@xxxxxxxxx> > Sent: Saturday, 25 August, 2018 4:48:30 AM > Subject: Re: WIP more compounding > > Tue, 21 Aug 2018 1:54, ronnie sahlberg <ronniesahlberg@xxxxxxxxx>: > > > > I think this is probably not due to these patches themselves but as > > they may make compounding more common they probably uncover bugs > > introduced in the previous patch sets of mine > > (that did touch a lot of very hairy code I struggled to unwind.) > > I suspect that in some conditions the following part of > compound_send_recv is being executed once we process a part of the > compounding chain: > > 910 out: > 911 >---for (i = 0; i < num_rqst; i++) > 912 >--->---cifs_delete_mid(midQ[i]); > 913 >---add_credits(ses->server, credits, optype); > 914 > 915 >---return rc; > 916 } > > cifs_delete_mid() calls list_del() for a mid. So a subsequent > dequeue_mid() may generate the general protection fault reported by > Aurelien. > > Actually cifs_delete_mid() and dequeue_mid() doing the same thing > (removing the mid from the MID list) looks scary and this needs to be > revised. The easiest fix would be to change list_del() to > list_del_init() in cifs_delete_mid() but this may just hide the > problem. > > -- > Best regards, > Pavel Shilovsky >