Re: WIP more compounding

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Good analysis, that is exactly what is happening.

The multiplex_thread will race with compound_send_recv() where the latter may call list_del()
on a mid which the multiplex_loop has not yet processed with standard_receive3, the crash happens.


I have verified by adding new mid_flags to track when we call list_del_init() after we have already called list_del()
on a mid.

That also explains why my kludge removed the crashes.


For compounds, I think it would make sense to only invoke the callback for the last component.
But instead of the "fix" I used the multiRsp/multiEnd flags should be set accordingly inside compound_send_recv() with a comment
on why we can / should only invoke the callbacks for the last item in a compound.


regards
ronnie sahlberg

----- Original Message -----
> From: "Pavel Shilovsky" <piastryyy@xxxxxxxxx>
> To: "ronnie sahlberg" <ronniesahlberg@xxxxxxxxx>
> Cc: "Aurélien Aptel" <aaptel@xxxxxxxx>, "Ronnie Sahlberg" <lsahlber@xxxxxxxxxx>, "linux-cifs"
> <linux-cifs@xxxxxxxxxxxxxxx>, "Steve French" <smfrench@xxxxxxxxx>
> Sent: Saturday, 25 August, 2018 4:48:30 AM
> Subject: Re: WIP more compounding
> 
> Tue, 21 Aug 2018 1:54, ronnie sahlberg <ronniesahlberg@xxxxxxxxx>:
> >
> > I think this is probably not due to these patches themselves but as
> > they may make compounding more common they probably uncover bugs
> > introduced in the previous patch sets of mine
> > (that did touch a lot of very hairy code I struggled to unwind.)
> 
> I suspect that in some conditions the following part of
> compound_send_recv is being executed once we process a part of the
> compounding chain:
> 
>  910 out:
>  911 >---for (i = 0; i < num_rqst; i++)
>  912 >--->---cifs_delete_mid(midQ[i]);
>  913 >---add_credits(ses->server, credits, optype);
>  914
>  915 >---return rc;
>  916 }
> 
> cifs_delete_mid() calls list_del() for a mid. So a subsequent
> dequeue_mid() may generate the general protection fault reported by
> Aurelien.
> 
> Actually cifs_delete_mid() and dequeue_mid() doing the same thing
> (removing the mid from the MID list) looks scary and this needs to be
> revised. The easiest fix would be to change list_del() to
> list_del_init() in cifs_delete_mid() but this may just hide the
> problem.
> 
> --
> Best regards,
> Pavel Shilovsky
> 



[Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux