Re: [PATCH 5/7] USB: cdc-wdm: adding multi-reader support

Bjørn Mork <bjorn@xxxxxxx> · Fri, 20 Jan 2012 12:23:11 +0100

Oliver Neukum <oliver@xxxxxxxxxx> writes:
> Am Freitag, 20. Januar 2012, 01:50:01 schrieb Bjørn Mork:
>
>> +static ssize_t wdm_copy_to_ubuf(struct wdm_device *desc, u8 *data, size_t len)
>> +{
>> +	int rv = -EFAULT;
>> +	off_t offset;
>> +	size_t n;
>> +
>> +	if (len > desc->bufsize)
>> +		goto err;
>
> Why bother? Either you can deal with a full buffer or not. Even in this
> case why not take as much data as you can?

True.  This should not happen, so it can be removed.

>> +	/* move current position */
>> +	desc->pos += len;
>
> And what keeps this from overflowing?

Time. Transferring 2^64 bytes over a DM interface is not going to
happen.  OK, understand that is not an argument and that the unlikely
errors *will* happen to someone someday....  But as the effect of an
overflow only is that readers will stop reading, I was prepared to
ignore it.

I guess your question means that I need to fix this?

>> @@ -360,30 +385,34 @@ static ssize_t wdm_read
>>  {
>>  	int rv, cntr = 0;
>>  	int i = 0;
>> +	size_t len, offset;
>>  	struct wdm_device *desc = file->private_data;
>>  
>> +	/* cannot read more than a full buffer */
>> +	if (count > desc->bufsize)
>> +		count = desc->bufsize;
>
> Again, why bother? Either you can deal with requests to read more than
> available or you cannot.

True.  This is meaningless.

>> -	rv = mutex_lock_interruptible(&desc->rlock); /*concurrent reads */
>> -	if (rv < 0)
>> -		return -ERESTARTSYS;
>> +	/*
>> +	 * if we are more than a bufsize behind, then we've inevitably
>> +	 * lost data - nothing to do but to do a forced llseek to the
>> +	 * last known boundary
>> +	 */
>> +	if (*ppos + desc->bufsize < desc->pos)
>> +		*ppos = desc->valid;
>
> This races with the callback

Yes, the thought was that it didn't matter.  But I see that it does, and
that this should be included as part of a locked "get-bufstart-and-len"
operation below, with a fallback to restart if it ends up with len == 0.

>> -	if (desc->length == 0) {
>> -		desc->read = 0;
>> -retry:
>> +	if (*ppos == desc->pos) {
>>  		if (test_bit(WDM_DISCONNECTING, &desc->flags)) {
>>  			rv = -ENODEV;
>>  			goto err;
>>  		}
>>  		i++;
>
> What is the function of i?

Beats me :-)  

I believe it has been like that since the driver was included.  Unless
I'm missing something the function is the same in commit afba937e
Well, nearly the same:  "i" used to count the retries, but that seems
equally pointless unless the count is used.  I might be overlooking
something but I believe it never was...

Anyway,  I did noticie this and planned on removing it.  Just forgot.
Will fix.  I did remove a lot of other stuff, though ;-)

>>  		if (file->f_flags & O_NONBLOCK) {
>> -			if (!test_bit(WDM_READ, &desc->flags)) {
>> -				rv = cntr ? cntr : -EAGAIN;
>> -				goto err;
>> -			}
>> -			rv = 0;
>> +			rv = -EAGAIN;
>> +			goto err;
>
> No way. If you remove the loop, you must be able to handle read errors
> in the O_NONBLOCK case. You cannot unconditionally return -EAGAIN

Right.  Yes.  That's... embarrassing.  Must be fixed.

>
>>  		} else {
>>  			rv = wait_event_interruptible(desc->wait,
>> -				test_bit(WDM_READ, &desc->flags));
>> +						(*ppos < desc->pos) ||
>> +						test_bit(WDM_DISCONNECTING, &desc->flags));
>
> Again, this is wrong.
>
> 1. You fail to test for an error condition as a reason for a wakeup

Yes, needs fixing

> 2. As you test for a disconnection here, you must be able to handle it

I do, by returning through the normal path.

> 3. As you have removed the loop and you are rightly keeping the interruptible
>    sleep, you must be able to handle signals.

Yes, needs fixing.

>>  		/* may have happened while we slept */
>> @@ -397,50 +426,32 @@ retry:
>>  			goto err;
>>  		}
>>  
>> -		spin_lock_irq(&desc->iuspin);
>> -
>> -		if (desc->rerr) { /* read completed, error happened */
>> -			desc->rerr = 0;
>
> No, you cannot simply remove unsetting the error.

It's unset (or actually set) by the callback.  The errors are shared
among the readers and we cannot have one of them resetting it.  They all
need to see the error.

> And you need to hold the spinlock to handle the race with the callback.

Only if setting it, I presume?

>> -			spin_unlock_irq(&desc->iuspin);
>> +		/* read completed, error happened */
>> +		if (desc->rerr) {
>>  			rv = -EIO;
>>  			goto err;
>>  		}
>> -		/*
>> -		 * recheck whether we've lost the race
>> -		 * against the completion handler
>> -		 */
>> -		if (!test_bit(WDM_READ, &desc->flags)) { /* lost race */
>> -			spin_unlock_irq(&desc->iuspin);
>> -			goto retry;
>> -		}
>> -		if (!desc->reslength) { /* zero length read */
>> -			spin_unlock_irq(&desc->iuspin);
>> -			goto retry;
>> -		}
>> -		clear_bit(WDM_READ, &desc->flags);
>> -		spin_unlock_irq(&desc->iuspin);
>>  	}
>>  
>> -	cntr = count > desc->length ? desc->length : count;
>> -	rv = copy_to_user(buffer, desc->ubuf, cntr);
>> +	offset = *ppos % desc->bufsize;	/* buffer offset */
>> +	len = desc->pos - *ppos;	/* data length */
>> +	cntr = min(count, len);		/* read length */
>> +	len = min((int)(desc->bufsize - offset), cntr);	/* copy length */
>
> So you will happily give the same data to multiple readers?

Yes, that is much of the intention of the whole patch.  Let me explain
why: 

Unless you start doing protocol specific matching, then you cannot tell
which message is a reply to which client.  So the choices are
1) allow only one client at the time
2) one lucky first-comer receives the reply, the others (which quite
   possibly included the one sending the request) gets nothing
3) everyone receives everything

I have chosen 3 because I see absolutely no harm in allowing that, and
it makes the interface so much more flexible.  If required, userspace
can implement its own locking mechanism.  Or we could abuse O_EXCL or
something to make a client request exclusiveness.

My second choice would have been 1, allowing only one open() to
succeed at a time.  It would give the clients with predictable results.
But I still believe that it is very limiting for no good reason.

Alternative 2, which is what the driver used to do, is the absolutely
worst IMHO.  The will be no way to predict which client receives the
data in case of contention, and in the real world you are pretty much
guaranteed that it will be the "wrong" one.  This will cause clients
to block forever waiting for a reply which was eaten by another client
who was waiting for something else.

So:  Giving the same data to everyone is the goal of this patch.

> And what happens if the callback arrives here? You'll deliver
> partially old and partially new data.

True.  I need to ensure that the part of the buffer being updated is
locked against reading.  Will fix that.

>> +
>> +	rv = copy_to_user(buffer, desc->ubuf + offset, len);
>> +	if (len < cntr) /* wrap around? */
>> +		rv += copy_to_user(buffer + len, desc->ubuf, cntr - len);
>> +
>>  	if (rv > 0) {
>>  		rv = -EFAULT;
>>  		goto err;
>>  	}
>>  
>> -	for (i = 0; i < desc->length - cntr; i++)
>> -		desc->ubuf[i] = desc->ubuf[i + cntr];
>> -
>> -	spin_lock_irq(&desc->iuspin);
>> -	desc->length -= cntr;
>> -	spin_unlock_irq(&desc->iuspin);
>> -	/* in case we had outstanding data */
>> -	if (!desc->length)
>> -		clear_bit(WDM_READ, &desc->flags);
>> +	/* move file position */
>> +	*ppos += cntr;
>
> And this races in case of multiple readers.

It does?  How?

>>  	rv = cntr;
>>  
>>  err:
>> -	mutex_unlock(&desc->rlock);
>>  	return rv;
>>  }
>>  
>
> I am afraid this needs some serious rework. It is currently in no shape
> for inclusion.

I was sort of expecting that, but I needed to get it out to get a
feeling on exactly how much work.  Thanks a lot for taking the time to
review this.  I do expect to take some time to try to clean it up
properly before sending a new version.

Bjørn
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html