On Mon, Sep 9, 2013 at 5:21 AM, Markus Trippelsdorf <markus at trippelsdorf.de> wrote: > On 2013.09.08 at 17:32 -0700, Eric W. Biederman wrote: >> Markus Trippelsdorf <markus at trippelsdorf.de> writes: >> >> > Here are a couple of patches that get kexec working with radeon devices. >> > I've tested this on my RS780. >> > Comments or flames are welcome. >> > Thanks. >> >> A couple of high level comments. >> >> This looks promising for the usual case. >> >> Removing the printk at the end of the kexec path seems a little dubious, >> what of other cpus, interrupt handlers, etc. Basically estabilishing a >> new rule on when printk is allowed seems a little dubious at this point, >> even if it is a useful debugging trick. > > OK. I will drop this patch. It doesn't seem to be necessary, because I > cannot reproduce the printk related hang anymore. > >> Having a clean shutdown of the radeon definitely seems worth doing, >> because the cases where we care abouty video are when a person is in >> front of the system. > > Yes. But please note that even with radeon_pci_shutdown implemented, I > still get ring test failures on roughly every eighth kexec boot: > > [drm:r600_dma_ring_test] *ERROR* radeon: ring 3 test failed (0xCAFEDEAD) > radeon 0000:01:05.0: disabling GPU acceleration > > That's definitely better than the current state of affairs, with ring > test failures on every second boot. But I haven't figured out the reason > for these failures yet. It's curious that once a ring test failure > occurs, it will reliably fail after each kexec invocation, no matter how > often repeated. Only a reboot brings the machine back to normal. > >> I don't know if you want to remove the sanity checks. They seem cheap >> and safe regardless. Are they expensive or ineffective? Moreover if >> they work a reasonable amount of the time that means that the kexec on >> panic case (where we don't shut anything down) can actually use the >> video, and that in general the driver will be more robust. I don't >> expect anyone much cares as kexec on panic is mostly used to just write >> a core file to the network, or the local disk. But if it is easy to >> keep that case working most of the time, why not. > > IIRC Alex said the sanity checks are expensive and boot-time could be > improved by dropping them. Maybe he can chime in? They shouldn't be necessary with a proper shutdown, but in this particular case, they are not very expensive. What is expensive is having a separate sanity check functions for all the various hw blocks to teardown everything on startup prior to starting it up in case kexec, etc. left the system in a bad state. It ends up amounting to a full tear down sequence followed by a full start up sequence every time you load the driver. I can't really comment on the first patch, but the rest seem fine. Alex