Re: [PATCH RESEND v11] hwrng: core - let sleep be interrupted when unregistering hwrng

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Herbert,

On Tue, Jul 26, 2022 at 05:44:41PM +0800, Herbert Xu wrote:
> Thanks for all your effort in resolving this issue.
> 
> I think Valentin's concern is valid though.  The sleep/wakeup paradigm
> in this patch-set is slightly unusual.
> 
> So what I've done is taken your latest patch, and incorporated
> Valentin's suggestions on top of it.  I don't think there is an
> issue with other drivers as neither approach really changes them.

Thanks so much for taking charge of this patch. I really, really
appreciate it. I'm also glad that we now have a working implementation
of Valentin's suggestion.

Just two small notes:
- I had a hard time testing everything because I don't actually have an
  ath9k, so I wound up playing with variations on
  https://xn--4db.cc/vnRj8zQw/diff in case that helps. I assume you've
  got your own way of testing things, but in case not, maybe that diff
  is useful.
- I'll mark this patch as "other tree" in the wireless tree's patchwork
  now that you're on board so Kalle doesn't have to deal with it.

> In between the setting and clearing of current_waiting_reader,
> a reader that gets a new RNG may fail simply because it detected
> the value of UNREGITERING_READER.

Yea, I actually didn't consider this a bug, but just "nothing starts
until everything else is totally done" semantics. Not wanting those
semantics is understandable. I haven't looked in detail at how you've
done that safely without locking issues, but if you've tested it, then I
trust it works. When I was playing around with different things, I
encountered a number of locking issues, depending on what the last
locker was - a user space thread, the rng kthread, or something opening
up a reader afresh. So just be sure the various combinations still work.

Also, I like the encapsulation you did with hwrng_msleep(). That's a
very clean way of accomplishing what Valentin suggested, a lot cleaner
than the gunk I had in mind would be required. It also opens up a
different approach for fixing this, which is what I would have preferred
from the beginning, but the prereq kernel work hadn't landed yet.

Specifically, Instead of doing all of this task interruption stuff and
keeping track of readers and using RCU and all that fragile code, we
can instead just use wait_completion_interruptible_timeout() inside of
hwrng_msleep(), using a condition variable that's private to the hwrng.
Then, when it's time for all the readers to quit, we just set the
condition! Ta-da, no need for keeping track of who's reading, the
difference between a kthread and a normal thread, and a variety of other
concerns.

That won't be possible until 5.20, though, and this patch is needed to
fix earlier kernels, so the intermediate step here is still required.
But please keep this on the back burner of your mind. The 5.20
enablement patch for that is here:

https://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace.git/commit/?h=for-next&id=a7c01fa93aeb03ab76cd3cb2107990dd160498e6

When developing that patch, I had used my little test harness above and
hwrng to test that it worked as intended. But I hadn't thought of an
elegant way of plumbing the condition through without changing up tons
of callsites. However, now you've come up with this hwrng_msleep()
solution to that. So now all the puzzle pieces are ready for a 5.20 fix
moving this toward the nicer condition setup.

A few odd unimportant comments below:

> +/* rng_dying without a barrier.  Use this if there is a barrier elsewhere. */
> +static inline bool __rng_dying(struct hwrng *rng)
> +{
> +	return list_empty(&rng->list);
> +}
> +
> +static inline bool rng_dying(struct hwrng *rng)
> +{
> +	/*
> +	 * This barrier pairs with the one in
> +	 * hwrng_unregister.  This ensures that
> +	 * we see any attempt to unregister rng.
> +	 */
> +	smp_mb();
> +	return list_empty(&rng->list);

Unimportant nit: could call __rng_dying() instead so these don't diverge
by accident.

> +}
> +
>  static size_t rng_buffer_size(void)
>  {
>  	return SMP_CACHE_BYTES < 32 ? 32 : SMP_CACHE_BYTES;
> @@ -204,6 +225,7 @@ static inline int rng_get_data(struct hwrng *rng, u8 *buffer, size_t size,
>  static ssize_t rng_dev_read(struct file *filp, char __user *buf,
>  			    size_t size, loff_t *offp)
>  {
> +	int synch = false;
>  	ssize_t ret = 0;
>  	int err = 0;
>  	int bytes_read, len;
> @@ -225,9 +247,16 @@ static ssize_t rng_dev_read(struct file *filp, char __user *buf,
>  			goto out_put;
>  		}
>  		if (!data_avail) {
> +			bool wait = !(filp->f_flags & O_NONBLOCK);
> +
> +			if (wait)
> +				current_waiting_reader = current;
>  			bytes_read = rng_get_data(rng, rng_buffer,
> -				rng_buffer_size(),
> -				!(filp->f_flags & O_NONBLOCK));
> +				rng_buffer_size(), wait);
> +			if (wait) {
> +				current_waiting_reader = NULL;
> +				synch |= rng_dying(rng);
> +			}
>  			if (bytes_read < 0) {
>  				err = bytes_read;
>  				goto out_unlock_reading;
> @@ -269,6 +298,9 @@ static ssize_t rng_dev_read(struct file *filp, char __user *buf,
>  		}
>  	}
>  out:
> +	if (synch)
> +		synchronize_rcu();

The synch usage no longer makes sense to me here. In my version,
synchronize_rcu() would almost never be called. It's only purpose was to
prevent rng_dev_read() from returning if the critical section inside of
hwrng_unregister() was in flight. But now it looks like it will be
called everytime there are no RNGs left? Or maybe I understood how this
works. Either way, please don't feel that you have to write back an
explanation; just make sure it has those same sorts of semantics when
testing.

>                 if (rc <= 0) {
> -                       pr_warn("hwrng: no data available\n");
> -                       msleep_interruptible(10000);
> +                       set_current_state(TASK_INTERRUPTIBLE);
> +                       if (kthread_should_stop())
> +                               __set_current_state(TASK_RUNNING);
> +                       schedule_timeout(10 * HZ);
>                         continue;
>                 }

Here you made a change whose utility I don't understand. My original
hunk was:

+                       if (kthread_should_stop())
+                               break;
+                       schedule_timeout_interruptible(HZ * 10);

Which I think is a bit cleaner, as schedule_timeout_interruptible sets
the state to INTERRUPTIBLE and such.

Regards,
Jason



[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux