On 8/27/2014 4:20 PM, Andrew Morton wrote:
> On Wed, 27 Aug 2014 16:15:28 -0700 Mike Travis <travis@xxxxxxx> wrote:
>
>>
>>>
>>>> There are two causes for requiring a restart/reload of the drivers.
>>>> First is periodic preventive maintenance (PM) and the second is if
>>>> any of the devices experience a fatal error.  Both of these trigger
>>>> this excessively long delay in bringing the system back up to full
>>>> capability.
>>>>
>>>> The problem was tracked down to a very slow IOREMAP operation and
>>>> the excessively long ioresource lookup to ensure that the user is
>>>> not attempting to ioremap RAM.  These patches provide a speed up
>>>> to that function.
>>>
>>> With what result?
>>>
>>
>> Early measurements on our in-house lab system (with far fewer cpus
>> and memory) show about a 60-75% increase.  They have 31 devices,
>> 3000+ cpus, 10+Tb of memory.  We have 20 devices, 480 cpus, ~2Tb of
>> memory.  I expect their ioresource list to be about 5-10 times longer.
>> [But their system is in production so we have to wait for the next
>> scheduled PM interval before a live test can be done.]
>
> So you expect 1+ hours?  That's still nuts.
>

Actually, I expect a much better improvement.  We are removing cycles
through the I/O resource list, and the longer the list, the longer it
takes to pass completely through it.  As mentioned, for a 128M I/O BAR
region that is 32 passes, so we are removing 31 of them.  Removing 31
passes over a list 5-10 times longer should give a much better overall
improvement in the ioremap time.

The startup time of the devices will still be there, though we are
encouraging the vendor to look at starting them up in parallel.
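
To put rough numbers on the argument above, here is a back-of-the-envelope
sketch.  It is only a model, not the kernel code: it assumes every pass
walks the whole I/O resource list, and the baseline list length
(BASE_LIST_LEN) is a made-up figure for illustration, not a measurement
from either system.  The 32 passes and the 5-10x scaling are the figures
quoted in this thread.

```python
# Rough cost model: each pass is assumed to walk the full I/O resource list.
# BASE_LIST_LEN is a hypothetical value, not a measured list length.
BASE_LIST_LEN = 1000
PASSES_BEFORE = 32   # passes for a 128M I/O BAR region, as quoted in the thread
PASSES_AFTER = 1     # a single pass once the other 31 are removed


def entries_visited(passes: int, list_len: int) -> int:
    """Resource entries touched when each pass walks the whole list."""
    return passes * list_len


def entries_saved(list_len: int) -> int:
    """Entries no longer visited per ioremap after dropping the extra passes."""
    return (entries_visited(PASSES_BEFORE, list_len)
            - entries_visited(PASSES_AFTER, list_len))


if __name__ == "__main__":
    # 1x stands in for the lab system, 5x and 10x for the production system.
    for scale in (1, 5, 10):
        list_len = BASE_LIST_LEN * scale
        print(f"list x{scale}: before={entries_visited(PASSES_BEFORE, list_len)} "
              f"after={entries_visited(PASSES_AFTER, list_len)} "
              f"saved={entries_saved(list_len)}")
```

Under this model the work removed per ioremap grows linearly with the
list length, which is why the saving should be larger on the bigger
system.  It says nothing about the device startup time itself.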