On Tue, Mar 27, 2012 at 08:09:31AM -0700, Ben Widawsky wrote:
> On Tue, 27 Mar 2012 16:50:39 +0200
> Daniel Vetter <daniel at ffwll.ch> wrote:
>
> > On Tue, Mar 27, 2012 at 07:19:43AM -0700, Ben Widawsky wrote:
> > > I wanted to run this by folks before I start doing any actual work.
> > >
> > > This is primarily for GPGPU, or perhaps *really* accurate rendering
> > > requirements.
> > >
> > > IVB+ has an interrupt to tell us when a cacheline seems to be going
> > > bad. There is also a mechanism to remap the bad cachelines. The
> > > implementation details aren't quite clear to me yet, but I'd like to
> > > enable this feature for userspace.
> > >
> > > Here is my current plan, but it involves filesystem access, so it's
> > > probably going to get a lot of flames.
> > >
> > > 1. Handle cache line going bad interrupt.
> > > <After n number of these interrupts to the same line,>
> > > 2. send a uevent
> > > 2.5 reset the GPU (docs tell us to)
> > > <On module load>
> > > 3. Read a module parameter with a path in the filesystem
> > > of the list of bad lines. It's not clear to me yet exactly what I
> > > need to store, but it should be a relatively simple list.
> >
> > .... path in filesystem is no-go for kernel interface. So bad
> > cachelines need to go into the module parameter itself. Or we add a
> > sysfs interface and reset the gpu (because if my understanding is
> > right, we can't disable cachelines once the gpu has used them).
>
> I think we have to assume the list could get quite long. So long in
> fact, I imagine the user may often want to reset it and try his/her
> luck again with some lines.
>
> Could you elaborate more on why it's a no-go? The module parameter
> setting itself is limited to root. I was trying to clearly understand
> exactly why this can't be done, and some of the lore behind why file
> access in the kernel is such a bad thing (assuming the files being
> accessed are set at module load time). I wouldn't want to go the route
> of loading an arbitrary path - which seems like a terrible idea;
> though it works for firmware blobs, and I half thought we could load
> this like a firmware blob.
>
> Anyway, assuming a gpu reset is sufficient to remap (docs only clearly
> state reset works for disabling, iirc) then I would like to do that.
> What is the appropriate interface for that? The dev node? Sysfs?

I personally prefer sysfs for this. Albeit you might have some issues with
the one value per file limit ... I guess a list of hex values is ok
though.

> > > 4. Parse list on driver load, and handle as necessary.
> > > 5. goto 1.
> > >
> > > Probably the biggest unanswered question is exactly when in the HW
> > > loading do we have to finish remapping. If it can happen at any time
> > > while the card is running, I don't need the filesystem stuff, but I
> > > believe I need to remap the lines quite early in the device
> > > bootstrap.
> >
> > I believe so, too ;-)
> >
> > > The only alternative I have is a huge comma separated string for a
> > > module parameter, but I kind of like reading the file better.
> >
> > Well, you can't read a file from the kernel because we might init the
> > driver without any userspace present (when the driver is built-in).
>
> Userspace should still be present in this case, right? The kernel
> command line should suffice, I think.

Sometime later on, but only after the hw is initialized. But if you're
going the runtime interface route anyway, it doesn't matter.
-Daniel
-- 
Daniel Vetter
Mail: daniel at ffwll.ch
Mobile: +41 (0)79 365 57 48
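
To make the "list of hex values" idea concrete, here is a minimal userspace sketch of the parsing step such an interface would imply, whether the list arrives through a sysfs write or as a comma separated module parameter string. The name `parse_bad_lines` and the accepted separators are assumptions made purely for illustration; an in-kernel version would use the kernel's own string helpers instead of libc, and nothing here reflects the real format the hardware needs for remapping.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not from any driver): parse a whitespace- or
 * comma-separated list of hex values, e.g. "0x1f 2a0,3C", into out[].
 * Returns the number of entries parsed, or -1 on malformed input or
 * when the list holds more than max entries.
 */
static int parse_bad_lines(const char *buf, unsigned long *out, int max)
{
    int n = 0;

    while (*buf) {
        char *end;
        unsigned long val;

        /* Skip separators between entries. */
        while (*buf == ',' || isspace((unsigned char)*buf))
            buf++;
        if (!*buf)
            break;

        /* Reject signs and other junk that strtoul() would tolerate. */
        if (!isxdigit((unsigned char)*buf))
            return -1;

        errno = 0;
        val = strtoul(buf, &end, 16);   /* base 16 accepts an optional "0x" */
        if (end == buf || errno == ERANGE)
            return -1;
        if (n == max)
            return -1;                  /* list longer than the caller allows */

        out[n++] = val;
        buf = end;
    }
    return n;
}
```

Accepting both commas and whitespace means the same routine would cover the sysfs case and the module parameter case; rejecting the whole input on the first malformed entry avoids applying a partially parsed list.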