On Wed, Mar 27, 2019 at 11:56:51AM +0100, Martin Christian wrote: > Hi, > > I've written a linux kernel module for an USB device. The USB driver > provides 2 read-only character devices, which can be opened only > exclusively by one process: > - `/dev/cdev_a` > - `/dev/cdev_b` "exclusive" opening really isn't that, unless you go through _HUGE_ gyrations to try to control this. It's not worth it in the end, so just let userspace deal with it. If it wants to interleave data to a device node, let it, it can handle the fallout. As an example of this, serial ports are not "exclusively owned", right? > The USB device can only handle one request at a time. > > The test setup is a follows: > - Processes A reads data from 1st device: `dd if=/dev/cdev_a of=/tmp/a > bs=X` > - Processes B reads data from 2nd device: `dd if=/dev/cdev_b of=/tmp/b > bs=X` > - Process A and B run in parallel > - After 10 seconds both processes are killed and size of both output > files is compared. > > For certain values of `X` there is a significant difference in size > between the two files, which I don't expect. > > A read call to the driver does the following: > 1. `mutex_lock_interruptible(iolock)` > 2. `usb_bulk_msg(dev, pipe, buf, X, timeout)` > 3. `mutex_unlock(iolock)` > 4. `copy_to_user(buf)` What are these values of X that cause differences here? > What I would expect is the following: > 1. Proc A: `mutex_lock_interruptible(iolock)` > 2. Proc A: `usb_bulk_msg(dev, pipe, buf, X, timeout)` > 3. Scheduling: A -> B > 4. Proc B: `mutex_lock_interruptible(iolock)` -> blocks > 5. Scheduling: B -> A > 6. Proc A: `mutex_unlock(iolock)` > 7. Proc A: `copy_to_user(buf)` > 8. Proc A: `mutex_lock_interruptible(iolock)` -> blocks > 9. Scheduling: A -> B > 10. Proc B: `usb_bulk_msg(dev, pipe, buf, X, timeout)` > > But what I see with ftrace is that in step 8, process A still continues. > And it seems that for certain values of X the time inside the critical > region is a multiple of the time slice, so that process B always gets > the time slice when the critical region is blocked. What would be a best > practise solution for this? I was thinking of calling `schedule()` each > time after copying to user space or playing with nice values or using > wait_queues? Step back a second and let me ask what exactly you are trying to solve here? If you are just playing around and want to watch mutexes being grabbed and passed of, that's fine, and fun. You are getting a good education with how scheduling works and of course, the hell^Wmess that USB really is. But if you are trying to somehow create a real api that you have to enforce the passing off of writing data from two different character devices in an interleaved format, you are doing this totally wrong, as this is not going to work with a simple mutex, as you have found out. There are so many different variables in play here, that you are trying to somehow sync up with just a single lock. You have, just off the top of my head: - scheduling issues of the different userspace programs - variability in USB transports (usb_bulk_msg() is just about the most inefficient way of ever sending data on a USB device, you end up wasting loads of time sleeping and waking up and waiting for things to happen, and the USB pipeline is almost totally empty for most of it.) - scheduling issues of the USB wakequeue that handles the data to be sent - sleeping/memory fault issues when copying data to/from userspace can play a huge factor for some systems. and I am sure there are more. A mutex isn't always "fair" in that it instantly gives up and passes control to someone else who is holding it at the same time. Sometimes it can be grabbed by someone else, like the person who just dropped it, based on a whole raft of factors that have been worked out over the years to provide a robust and scalable general purpose operating system. Schedulers too are not always "fair", that depends on a whole raft of things like what CPU is running when, where your task happens to be living at the moment, and of course, what else is happening in the system at the exact same time (I'm sure other things are happening, right?) Again, this is all due to the way CPUs work, and how Linux manages tasks in order to try to keep all resources used best at the moment. So, I really haven't answered your question here except to say, "it's complicated" and "you aren't measuring what you think you are measuring" :) Try to take USB out of the picture as well as userspace, and try running two kernel threads trying to grab a mutex and then print out "A" or "B" to the kernel log and then give it up. Is that output nicely interleaved or is there some duplicated messages.[1] Again, what are you really trying to determine here? Odds are there is a better way to do it, given that your above sequence of events is highly variable for a whole raft of reasons. thanks, greg k-h [1] Extra bonus points for those that recognize this task... _______________________________________________ Kernelnewbies mailing list Kernelnewbies@xxxxxxxxxxxxxxxxx https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies