On Tue, 27 Mar 2012 16:50:39 +0200
Daniel Vetter <daniel at ffwll.ch> wrote:

> On Tue, Mar 27, 2012 at 07:19:43AM -0700, Ben Widawsky wrote:
> > I wanted to run this by folks before I start doing any actual work.
> >
> > This is primarily for GPGPU, or perhaps *really* accurate rendering
> > requirements.
> >
> > IVB+ has an interrupt to tell us when a cacheline seems to be going
> > bad. There is also a mechanism to remap the bad cachelines. The
> > implementation details aren't quite clear to me yet, but I'd like to
> > enable this feature for userspace.
> >
> > Here is my current plan, but it involves filesystem access, so it's
> > probably going to get a lot of flames.
> >
> > 1. Handle cache line going bad interrupt.
> > <After n number of these interrupts to the same line,>
> > 2. send a uevent
> > 2.5 reset the GPU (docs tell us to)
> > <On module load>
> > 3. Read a module parameter with a path in the filesystem
> > of the list of bad lines. It's not clear to me yet exactly what I
> > need to store, but it should be a relatively simple list.
>
> .... path in filesystem is no-go for kernel interface. So bad
> cachelines need to go into the module parameter itself. Or we add a
> sysfs interface and reset the gpu (because if my understanding is
> right, we can't disable cachelines once the gpu has used them).

I think we have to assume the list could get quite long. So long, in
fact, that I imagine the user may often want to reset it and try his/her
luck again with some lines.

Could you elaborate more on why it's a no-go? The module parameter
setting itself is limited to root. I was trying to clearly understand
exactly why this can't be done, and some of the lore behind why file
access in the kernel is such a bad thing (assuming the files being
accessed are set at module load time).
I wouldn't want to go the route of loading an arbitrary path - which
seems like a terrible idea; though it works for firmware blobs, and I
half thought we could load this like a firmware blob.

Anyway, assuming a gpu reset is sufficient to remap (docs only clearly
state reset works for disabling, iirc) then I would like to do that.
What is the appropriate interface for that? The dev node? Sysfs?

>
> > 4. Parse list on driver load, and handle as necessary.
> > 5. goto 1.
> >
> > Probably the biggest unanswered question is exactly when in the HW
> > loading do we have to finish remapping. If it can happen at any time
> > while the card is running, I don't need the filesystem stuff, but I
> > believe I need to remap the lines quite early in the device
> > bootstrap.
>
> I believe so, too ;-)
>
> > The only alternative I have is a huge comma separated string for a
> > module parameter, but I kind of like reading the file better.
>
> Well, you can't read a file from the kernel because we might init the
> driver without any userspace present (when the driver is built-in).

Userspace should still be present in this case, right? The kernel
command line should suffice, I think.

>
> > Any feedback is highly appreciated. I couldn't really find much
> > precedent for doing this in other drivers, so pointers to similar
> > things would also be highly welcome.
> -Daniel

Thanks
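
For illustration only, here is a rough userspace sketch of what parsing
the "huge comma separated string" alternative might look like. It is not
driver code: the function name parse_bad_lines, the MAX_BAD_LINES bound,
and the idea that each entry is a plain decimal line index are all
assumptions on my part, since it isn't clear yet what actually needs to
be stored. In the kernel proper, something like module_param_array would
probably be the more natural tool.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical upper bound on entries; the real limit is unknown. */
#define MAX_BAD_LINES 64

/*
 * Parse a comma-separated list such as "12,40,1023" into out[].
 * Returns the number of entries parsed, or -1 on malformed input
 * or if the list holds more than max entries.
 */
static int parse_bad_lines(const char *s, unsigned long *out, size_t max)
{
    size_t n = 0;

    if (*s == '\0')
        return 0;               /* empty list: nothing to remap */

    for (;;) {
        char *end;
        unsigned long v;

        if (*s < '0' || *s > '9')
            return -1;          /* reject signs, spaces, empty fields */

        errno = 0;
        v = strtoul(s, &end, 10);
        if (errno != 0 || n == max)
            return -1;          /* out of range, or list too long */
        out[n++] = v;

        if (*end == '\0')
            return (int)n;      /* clean end of string */
        if (*end != ',')
            return -1;          /* unexpected separator */
        s = end + 1;
    }
}
```

With this sketch, "12,40,1023" yields three entries, an empty string
yields zero, and inputs like "1,,2" or a list longer than the bound are
rejected rather than silently truncated, which seems the safer choice
when the result decides which cachelines get remapped.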