On 10.03.23 05:14, Ye, Xiang wrote:
Hi Oliver,
Hi,
sorry for the delayed answer.
Thanks for your review.
On Thu, Mar 09, 2023 at 01:53:28PM +0100, Oliver Neukum wrote:
On 09.03.23 08:10, Ye Xiang wrote:
+static int ljca_stub_write(struct ljca_stub *stub, u8 cmd, const void *obuf, unsigned int obuf_len,
+ void *ibuf, unsigned int *ibuf_len, bool wait_ack, unsigned long timeout)
Why do you make ibuf_len a pointer?
Because ibuf_len is also used as output of this function here.
It stores the actual length of ibuf receive from LJCA device.
Yes, I understand that now, thank you for the explanation, yet
that is problematic, if we look at another issue. See further down:
+ ret = -ENODEV;
+ goto error_put;
+ }
+
+ mutex_lock(&dev->mutex);
+ stub->cur_cmd = cmd;
+ stub->ipacket.ibuf = ibuf;
+ stub->ipacket.ibuf_len = ibuf_len;
Here you store the pointer into the stub. Hence we must make sure
that the location it points to stays valid.
+ stub->acked = false;
+ ret = usb_bulk_msg(dev->udev, usb_sndbulkpipe(dev->udev, dev->out_ep), header, msg_len,
+ &actual, LJCA_USB_WRITE_TIMEOUT_MS);
+ kfree(header);
+ if (ret) {
+ dev_err(&dev->intf->dev, "bridge write failed ret:%d\n", ret);
+ goto error_unlock;
+ }
+
+ if (actual != msg_len) {
+ dev_err(&dev->intf->dev, "bridge write length mismatch (%d vs %d)\n", msg_len,
+ actual);
+ ret = -EINVAL;
+ goto error_unlock;
+ }
+
+ if (wait_ack) {
+ ret = wait_event_timeout(dev->ack_wq, stub->acked, msecs_to_jiffies(timeout));
+ if (!ret) {
+ dev_err(&dev->intf->dev, "acked wait timeout\n");
+ ret = -ETIMEDOUT;
The function will return an error in the timeout case.
This may be a trivial observation but it becomes important.
If that triggers, you may have a pending URB.
I misspoke. Pending IO would have been correct.
You must kill it.
which URB? I guess what you mean is dev->in_urb?
But the in_urb should always be up to waiting for message from firmware,
even through this timeout happen.
Now let's look at ljca_mng_reset_handshake(). I am afraid I have to quote
its first part in full:
+static int ljca_mng_reset_handshake(struct ljca_stub *stub)
+{
+ struct ljca_mng_priv *priv;
+ __le32 reset_id;
+ __le32 reset_id_ret = 0;
+ unsigned int ilen = sizeof(__le32);
This is on the _stack_
Highly important !!!
+ int ret;
+
+ priv = ljca_priv(stub);
+ reset_id = cpu_to_le32(priv->reset_id++);
+ ret = ljca_stub_write(stub, LJCA_MNG_RESET_NOTIFY, &reset_id, sizeof(reset_id),
+ &reset_id_ret, &ilen, true, LJCA_USB_WRITE_ACK_TIMEOUT_MS);
If we run into the timeout error case, ret will be -ETIMEDOUT.
+ if (ret)
+ return ret;
And thus here we return and free the stack _including_ "ilen", which we
still have a pointer to. That means if the operation concludes after
a timeout, we _will_ follow a rogue pointer.
A couple of functions have this race condition.
What happens to stub in the error case?
ljca_add_mfd_cell only failed when krealloc_array failing. When
ljca_add_mfd_cell fails, the related stub just be left alone here.
Maybe I should free the stub here when fails? what is your advice?
Yes, that is the cleanest solution.
+ }
+
+ return 0;
+}
+
+static void ljca_disconnect(struct usb_interface *intf)
+{
+ struct ljca_dev *dev = usb_get_intfdata(intf);
+
+ ljca_stop(dev);
What prevents restarting the device here?
Sorry, you are calling ljca_start() only in probe(9 and resume()
Your code is correct.
Regards
Oliver