Hi Paul,

On Wed, Jan 18, 2012 at 2:43 PM, Paul E. McKenney
<paulmck@xxxxxxxxxxxxxxxxxx> wrote:
> On Wed, Jan 18, 2012 at 02:15:59PM -0800, Simon Glass wrote:
>> Hi Paul,
>>
>> On Wed, Jan 18, 2012 at 1:42 PM, Paul E. McKenney
>> <paulmck@xxxxxxxxxxxxxxxxxx> wrote:
>> > On Wed, Jan 18, 2012 at 01:08:13PM -0800, Simon Glass wrote:
>> >> [+cc Rafael J. Wysocki <rjw@xxxxxxx>, who I think wrote the wakeup.c code]
>> >>
>> >> Hi Alan, Paul,
>> >>
>> >> On Tue, Jan 17, 2012 at 8:17 PM, Paul E. McKenney
>> >> <paulmck@xxxxxxxxxxxxxxxxxx> wrote:
>> >> > On Tue, Jan 17, 2012 at 08:10:36PM +0000, Alan Cox wrote:
>> >> >> On Tue, 17 Jan 2012 10:56:03 -0800
>> >> >> Simon Glass <sjg@xxxxxxxxxxxx> wrote:
>> >> >>
>> >> >> > Since serial_core now does not make serial ports wake-up capable by
>> >> >> > default, add a parameter to support this feature in the 8250 UART.
>> >> >> > This is the only UART where I think this feature is useful.
>> >> >>
>> >> >> NAK
>> >> >>
>> >> >> Things should just work for users. Magic parameters are not an
>> >> >> improvement. If it's a performance problem, someone needs to fix the
>> >> >> RCU sync overhead or stop using RCU on that path.
>> >>
>> >> OK, fair enough, I agree. Every level I move down the source tree
>> >> affects more people, though.
>> >>
>> >> > I must say that I lack context here, even after looking at the patch,
>> >> > but the synchronize_rcu_expedited() primitive can be used if the
>> >> > latency of synchronize_rcu() is too large.
>> >>
>> >> Let me provide a bit of context. The serial_core code seems to be the
>> >> only place in the kernel that does this:
>> >>
>> >>     device_init_wakeup(tty_dev, 1);
>> >>     device_set_wakeup_enable(tty_dev, 0);
>> >>
>> >> The first call makes the device wakeup capable and enables wakeup; the
>> >> second call disables wakeup.
>> >>
>> >> The code that removes the wakeup source looks like this:
>> >>
>> >>     void wakeup_source_remove(struct wakeup_source *ws)
>> >>     {
>> >>             if (WARN_ON(!ws))
>> >>                     return;
>> >>
>> >>             spin_lock_irq(&events_lock);
>> >>             list_del_rcu(&ws->entry);
>> >>             spin_unlock_irq(&events_lock);
>> >>             synchronize_rcu();
>> >>     }
>> >>
>> >> The sync is there because we are about to destroy the actual ws
>> >> structure (in wakeup_source_destroy()). I wonder if it should be in
>> >> wakeup_source_destroy() instead, but that wouldn't help me anyway.
>> >>
>> >> synchronize_rcu_expedited() is a bit faster, but not really fast
>> >> enough. In any case, people will surely complain if I put this in the
>> >> wakeup code - it will affect all wakeup users. It seems to me that the
>> >> right solution is to avoid enabling and then immediately disabling
>> >> wakeup.
>> >
>> > Hmmm... What hardware are you running this on? Normally,
>> > synchronize_rcu_expedited() will be a couple of orders of magnitude
>> > faster than synchronize_rcu().
>> >
>> >> I assume we can't and shouldn't change device_init_wakeup(). We could
>> >> add a call like device_init_wakeup_disabled() which makes the device
>> >> wakeup capable but does not actually enable it. Does that work?
>> >
>> > If the only reason for the synchronize_rcu() is to defer the pair of
>> > kfree()s in wakeup_source_destroy(), then another possible approach
>> > would be to remove the synchronize_rcu() from wakeup_source_remove()
>> > and then use call_rcu() to defer the two kfree()s.
>> >
>> > If this is a reasonable change to make, the approach is as follows:
>> >
>> > 1.  Add a struct rcu_head to wakeup_source, call it "rcu".
>> >     Or adjust the following to suit your choice of name.
>> >
>> > 2.  Replace the pair of kfree()s with:
>> >
>> >         call_rcu(&ws->rcu, wakeup_source_destroy_rcu);
>> >
>> > 3.  Create wakeup_source_destroy_rcu() as follows:
>> >
>> >         static void wakeup_source_destroy_rcu(struct rcu_head *head)
>> >         {
>> >                 struct wakeup_source *ws =
>> >                         container_of(head, struct wakeup_source, rcu);
>> >
>> >                 kfree(ws->name);
>> >                 kfree(ws);
>> >         }
>> >
>> > Of course, this assumes that it is OK for wakeup_source_unregister()
>> > to return before the memory is freed up. This often is OK, but there
>> > are some cases where the caller requires that there be no further
>> > RCU readers with access to the old data. In these cases, you really
>> > do need the wait.
>>
>> Thanks very much for that. I'm not sure if it is a reasonable change,
>> but it does bug me that we add it to a data structure knowing that we
>> will immediately remove it!
>>
>> From what I can see, making a device wakeup-enabled mostly happens on
>> init or in response to a request to the driver (presumably from user
>> space). In the latter case I suspect the synchronize_rcu() is fine. In
>> the former it feels like we should make up our minds which of the
>> three options is required (incapable, capable but not enabled, capable
>> and enabled).
>>
>> I will try a patch first based on splitting the two options (capable
>> and enabled) and see if that gets a NAK.
>>
>> Then I will come back to your solution - it seems fine to me and not a
>> lot of code. Do we have to worry about someone enabling, disabling,
>> enabling and then disabling wakeup quickly? Will this method break in
>> that case if the second call to call_rcu() uses the same ws->rcu?
>
> There are a couple of questions here; let me take them one at a time:
>
> 1.  If you just disabled, can you immediately re-enable?
>
>     The answer is "yes". The reason that this works is that you
>     allocate a new structure for the re-enabling, and that new
>     structure has its own rcu_head field.
>
> 2.  If you repeatedly disable and re-enable in a tight loop,
>     can this cause problems?
>
>     The answer to this is also "yes" -- you can run the system
>     out of memory doing that. However, there are a number of
>     simple ways to avoid this problem:
>
>     a.  Do a synchronize_rcu() on every (say) thousandth
>         disable operation.
>
>     b.  As above, but only do the synchronize_rcu() if
>         all 1,000 disable operations occurred within
>         (say) a second of each other.
>
>     c.  As above, but actually count the number of
>         pending call_rcu() callbacks.
>
>     Both (a) and (b) can be carried out on a per-CPU basis if there
>     is no convenient locked structure in which to track the state.
>     You cannot carry out (c) on a per-CPU basis because RCU callbacks
>     can sometimes be invoked on a different CPU from the one that
>     call_rcu()ed them. Rare, but it can happen.
>
>     I would expect that option (a) would work in almost all cases.
>
> If this can be exercised freely from user space, then you probably
> really do need #2 above.

OK, I see - thank you. It does sound a bit complicated, although the
chances of anyone actually doing this are probably remote. I will send my
patch to avoid getting into this situation and see what you think.

Regards,
Simon
--
To unsubscribe from this list: send the line "unsubscribe linux-serial" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
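
For readers who want to see the ideas above in code: below is the
serial_core sequence Simon quotes, next to a sketch of his "capable but not
enabled" idea expressed with the existing device_set_wakeup_capable()
helper. Whether serial_core can simply switch to that helper is exactly
what his proposed patch has to establish, so take this as an illustration
of the idea rather than the actual fix:

    /* What serial_core does today: device_init_wakeup() both marks the
     * device wakeup capable and registers a wakeup source; the second
     * call then removes that wakeup source again, which is what runs
     * into the synchronize_rcu() in wakeup_source_remove(). */
    device_init_wakeup(tty_dev, 1);
    device_set_wakeup_enable(tty_dev, 0);

    /* The "capable but not enabled" alternative, sketched with the
     * existing device_set_wakeup_capable() helper: the port is marked
     * as able to wake the system, but no wakeup source is registered,
     * so nothing has to be torn down (and no grace period waited for)
     * immediately afterwards. */
    device_set_wakeup_capable(tty_dev, true);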
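
Next, a rough sketch of what Paul's three steps might look like once
applied to the wakeup code. The struct layout is abbreviated and partly
assumed (only name, entry and the new rcu member come from the thread),
and the real wakeup_source_destroy() does more than free memory, so this
shows only the shape of the change:

    /*
     * Sketch only: Paul's steps 1-3 applied to the wakeup code. The
     * struct is abbreviated -- fields other than "name", "entry" and
     * the new "rcu" member are omitted here.
     */
    struct wakeup_source {
            const char              *name;
            struct list_head        entry;
            /* ... other fields unchanged ... */
            struct rcu_head         rcu;    /* step 1: per-object callback head */
    };

    /*
     * Step 3: runs only after a grace period, so no RCU reader that
     * found this entry on the list can still be using it.
     */
    static void wakeup_source_destroy_rcu(struct rcu_head *head)
    {
            struct wakeup_source *ws =
                    container_of(head, struct wakeup_source, rcu);

            kfree(ws->name);
            kfree(ws);
    }

    void wakeup_source_remove(struct wakeup_source *ws)
    {
            if (WARN_ON(!ws))
                    return;

            spin_lock_irq(&events_lock);
            list_del_rcu(&ws->entry);
            spin_unlock_irq(&events_lock);
            /* No synchronize_rcu() here any more; the caller no longer blocks. */
    }

    /* Step 2: defer the frees instead of waiting for the grace period. */
    void wakeup_source_destroy(struct wakeup_source *ws)
    {
            if (!ws)
                    return;

            call_rcu(&ws->rcu, wakeup_source_destroy_rcu);
    }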
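
Finally, one possible shape for Paul's option (a). The helper name, the
counter and the 1000 threshold are made up for the sketch, and as Paul
notes the count could just as well be kept per-CPU:

    /*
     * Sketch of option (a), with invented names and threshold: force a
     * full grace period on every 1000th removal so that a tight
     * enable/disable loop cannot queue an unbounded number of
     * call_rcu() callbacks.
     */
    #define WAKEUP_SOURCE_RCU_SYNC_EVERY    1000

    static atomic_t wakeup_source_removals = ATOMIC_INIT(0);

    static void wakeup_source_throttle_rcu(void)
    {
            /*
             * Roughly one caller in a thousand pays for synchronize_rcu(),
             * which bounds the callback backlog without slowing the
             * common case.
             */
            if (atomic_inc_return(&wakeup_source_removals) %
                WAKEUP_SOURCE_RCU_SYNC_EVERY == 0)
                    synchronize_rcu();
    }

The removal path (wherever the call_rcu() ends up) would call this helper
once per removal; Paul's option (b) would additionally check timestamps
before deciding to wait.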