Re: [PATCH 2/8] introduce the device async action mechanism

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Friday 17 July 2009, Zhang Rui wrote:> On Fri, 2009-07-17 at 09:11 +0800, Rafael J. Wysocki wrote:> > On Thursday 16 July 2009, Zhang Rui wrote:> > > On Wed, 2009-07-15 at 21:00 +0800, Arjan van de Ven wrote:> > > > On Wed, 15 Jul 2009 15:38:36 +0800> > > > Zhang Rui <rui.zhang@xxxxxxxxx> wrote:> > > > > > > > > Introduce the device async action mechanism.> > > > > > > > > > In order to speed up Linux suspend/resume/shutdown process,> > > > > we introduce the device async action mechanism that allow devices> > > > > to suspend/resume/shutdown asynchronously.> > > > > > > > > > The basic idea is that,> > > > > if the suspend/resume/shutdown process of a device set,> > > > > including a root device and its child devices, are independent of> > > > > other devices, we create an async domain for this device set,> > > > > and make them suspend/resume/shutdown asynchronously.> > > > > > > > Hi,> > > > > > > > I have some concerns about having an async domain per device(group)> > > > rather than having one async domain for all of this, > > > > > > we create an async domain ONLY if we are sure that the device group is> > > independent with the other devices.> > > > > > and IMO, using multiple async domains brings real device async actions.> > > For example, in S3 resume case, there are two device groups:> > > device group1: device1, device2, device3> > > device group2: device4, device5, device6> > > > > > If they share the global domain, we may get:> > > device group1: device1(cookie 1), device2(cookie 4), device3(cookie 5)> > > device group2: device4(cookie 2), device5(cookie 3), device6(cookie 6)> > > > > > this is not real asynchronous resume because> > > device3 needs to call async_synchronize_cookie(5) before resume itself.> > > which means that device4 and device5 must be resumed before device3.> > > > > > But if multiple async domain is used, we get:> > > device group1: device1(cookie 1), device2(cookie 2), device3(cookie 3)> > > device group2: device4(cookie 1), device5(cookie 2), device6(cookie 3)> > > > > > device group1 and group2 can be resumed asynchronously.> > > > > > > > > Another example, in my previous test,> > > 1. sata controller. takes 0.4s to resume.> > > 2. usb, including uchi and ehci controller takes 1.4s to resume> > > 3. ACPI battery takes 0.3s to resume.> > > 3. all the other devices take 0.2s to resume.> > > > > > sata, usb and ACPI battery are independent device groups.> > > If we use multiple async domains, we can reduce the total device resume> > > time from 2.3s to a little more than 1.4s because there are a lot of> > > sleep in usb resume process.> > > But if we share the global async domain, the total resume time can only> > > be reduced to about 2.1s because sata, usb and ACPI battery are actually> > > resumed synchronously.> > > > Well, first, I'm not really sure that using the async _boot_ infrastructure for> > that is a good choice.> > IMO, kernel/async.c provides good interfaces that can be used not only> in boot phrase.> it's generic enough for suspend/resume.
Well, if that were not your opinion, you certainly wouldn't post patches usingit. :-)
> >  During suspend-resume we know dependencies between> > devices beforehand, at least in theory, so we can use them.> > > that's why I use multiple async domains. :)> One domain for a device group.
And how exactly the device groups are defined?
> > In particular, we have to make sure that parent devices will not be suspended> > until all of their children have been suspended and children devices will not> > be resumed before the parents.> > that's not enough.> For examples,> ACPI battery and EC are independent devices, but EC must be resumed> before battery because battery driver may access some EC address space> during resume time.> Of course the problem I describe above doesn't exist because ACPI> battery driver doesn't have .resume method right now.> But this is the case that works well in the current code but can not be> converted to async device resume.> > > The current code handles this quite> > efficiently, so I wonder what you're going to do not to break it.> > > sorry I don't quite understand.> > > Second, you seem to think that it only makes sense to execute ->suspend()> > and ->resume() asynchronously (or in parallel), while for example in the case> > of PCI ->suspend_noirq() and ->resume_noirq() also contain code that> > potentially can take quite some time to execute.> > > IMO, this patch set just provides a framework that can be used for> suspend/resume/shutdown optimization, and it doesn't solve all the> problem at one time.> > IMO, the next step we should do is:> 1. analyze the suspend/resume/shutdown process and find out the hotspot,>    i.e. which drivers suspend/resume slowly> 2. If it's software problem that we can fix in the driver, fix it.>    like commit 24920c8a6358bf5532f1336b990b1c0fe2b599ee> 3. If the hardware is slow, try to do it asynchronously.>    like I did in patch 8/8.> > This framework makes it quite easy to register an async device group.> > And even the suspend_noirq and resume_noirq can be covered easily with> this framework.> For example,> 1. introduce two new device async actions.>    DEV_ASYNC_SUSPEND_NOIRQ/DEV_ASYNC_RESUME_NOIRQ>    just like what I did in patch 4/8, 5/8 and 6/8.> 2. find out which device is slow in ->suspend_noirq and ->resume_noirq> 3. see if we can find an async device group that including this device.> 4. if yes, register this new async device group,>    with the type DEV_ASYNC_SUSPEND_NOIRQ/DEV_ASYNC_RESUME_NOIRQ> > > Finally, I don't really understand what the code in the $subject patch is> > supposed to do.  In particular, what's the purpose of dev_action()?> > It only seems to check if func is not NULL right now.  Also, you define> > DEV_ASYNC_ACTIONS_ALL as 0, so the condition> > if (!(DEV_ASYNC_ACTIONS_ALL & type)) in dev_async_register() is always> > satisfied.> > please refer to patch 4/8 and 5/8 and 6/8> > Patch 2/8 is just a framework. No device async action is support yet.
OK
So please remove the redundant return statements from there and please don'tuse (void *) for passing function pointers.
> And in Patch 4, 5, 6, we introduced three different types of device> async actions, DEV_ASYNC_SUSPEND, DEV_ASYNC_RESUME and> DEV_ASYNC_SHUTDOWN.> I tried to split a big patch into several small patches.> And it suggests how to adding a new device async type, like> DEV_ASYNC_SUSPEND_NOIRQ, DEV_ASYNC_RESUME_NO_IRQ, etc. :)> > > Can we please discuss this thoroughly before any new patches are sent?> > > do I still need to resend the patch?
Yes, please.
> If yes, I'll resend patch 2/8, 3/8, 4/8, 5/8, 6/8 as a new big one. :)
Yes, please, I think that would help to understand the design.
Best,Rafael_______________________________________________linux-pm mailing listlinux-pm@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx://lists.linux-foundation.org/mailman/listinfo/linux-pm


[Index of Archives]     [Linux ACPI]     [Netdev]     [Ethernet Bridging]     [Linux Wireless]     [CPU Freq]     [Kernel Newbies]     [Fedora Kernel]     [Security]     [Linux for Hams]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux RAID]     [Linux Admin]     [Samba]

  Powered by Linux