Re: [linux-pm] [PATCH] PCI PM: Restore standard config registers of all devices early (was: Re: EeePC resume failure - timers)

On Fri, 16 Jan 2009, Alan Stern wrote:
> 
> USB does this.  However I admit that fixing every PCI driver would be a 
> big chore...

I suspect USB is one of the very few drivers that may get suspend/resume 
very close to right, partly because _everybody_ has a USB controller these 
days, partly because USB interrupts are very commonly shared, and partly 
because there are several core people who work on USB.

That's actually very rare (especially the last point - most drivers hardly 
have _any_ maintainer, much less anybody who can see the big picture and 
knows about the rest of the system).

So if everybody did what USB does, we'd indeed be fine, and suspend and 
resume would probably work on practically every machine out there. But as 
you note, it's not really an option. The common case for 99% of all 
drivers is probably that none of the core kernel people can even test 
them, and they likely do not exist even in some distro test area.

So I'm the one who pushed Rafael towards his patch, because we've worked 
on suspend/resume for years, but while we're _much_ better than we used to 
be (especially on machines that are really bog-standard and only have the 
common chips), I don't think we'll ever really "get there" if we rely on 
the drivers always getting things right.

And getting things right does mean changing every single PCI interrupt 
handler to know about suspend/resume - not to mention do the 
suspend/resume sequence right in the first place. Neither of the two is 
very likely to really happen.
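
To make "know about suspend/resume" concrete, here is a minimal userspace sketch (not kernel code, and not taken from Rafael's patch) of the check each shared-interrupt handler would need. Every name in it (fake_dev, fake_irq_handler, fake_suspend, fake_resume, and the enum values) is hypothetical; a real handler would return irqreturn_t (IRQ_NONE or IRQ_HANDLED) and read a real device register. It assumes a powered-down device reads back as all ones.

```c
#include <stdbool.h>
#include <stdint.h>

/* Userspace model only: all names here are hypothetical stand-ins. */
enum irq_result { IRQ_NOT_MINE = 0, IRQ_WAS_HANDLED = 1 };

struct fake_dev {
	bool suspended;      /* set by the driver's own suspend path */
	uint32_t irq_status; /* stands in for a device status register */
	unsigned int handled; /* number of interrupts actually serviced */
};

/*
 * On a shared line this can be called because some *other* device
 * raised the interrupt, possibly while our device is powered down.
 */
static enum irq_result fake_irq_handler(struct fake_dev *dev)
{
	/* Suspended: do not trust (or touch) the registers at all. */
	if (dev->suspended)
		return IRQ_NOT_MINE;

	/* Awake but nothing pending: the interrupt was for someone else. */
	if (dev->irq_status == 0)
		return IRQ_NOT_MINE;

	dev->irq_status = 0; /* acknowledge */
	dev->handled++;
	return IRQ_WAS_HANDLED;
}

static void fake_suspend(struct fake_dev *dev)
{
	dev->suspended = true;
}

static void fake_resume(struct fake_dev *dev)
{
	dev->suspended = false;
}
```

A handler without the first check would see the bogus all-ones status while suspended and "handle" an interrupt that never happened, which is the class of bug being discussed. The point of the sketch is how easy that check is to forget, not that it is hard to write.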

I did suggest that we could also add some test-infrastructure like the 
DEBUG_SHIRQ thing we already have, which sends an interrupt immediately on 
resume after interrupts have been enabled. That will likely uncover a 
_lot_ of problems, and it's almost certainly worth doing regardless.

But even then, it would just be so much _simpler_ if the generic layer 
just took care of this issue, so that drivers could just ignore it, and a 
driver would only have to worry about its _own_ resume issues. 

So yes, doing it in the generic PCI layer clearly has some problems, and 
no, it's not "perfect". In a perfect world, doing it in the driver has 
real advantages - it's the most flexible approach, and it allows the 
driver to do what it wants. But from all I've ever seen, I'm personally 
pretty convinced that when it comes to drivers, the less the driver writer 
has to worry about, the better off we are.

I'm frankly hoping that most PCI drivers would not need to have a 
suspend/resume method at all - the PCI layer should get the normal PCI 
suspend parts right, and the upper layers should do the rest. For example, 
a network driver shouldn't have to close itself down - the network layer 
should just do it for it (all the netif_stop_queue() etc crud to make 
sure that the network layer doesn't try to touch it). 

So we're clearly not there yet. But after having yet another random 
machine that just didn't resume because of yet another new random driver, 
I'm just not seeing anything else that is viable long-term.

IOW, we can ask driver writers to write perfect code, or we can spend some 
effort on trying to make things work even when they don't. I know which 
approach I consider to be realistic.

			Linus