From: Linus Torvalds
> Sent: 01 May 2020 19:29
...
> And as DavidL pointed out - if you ever have "iomem" as a source or
> destination, you need yet another case. Not because they can take
> another kind of fault (although on some platforms you have the machine
> checks for that too), but because they have *very* different
> performance profiles (and the ERMS "rep movsb" sucks baby donkeys
> through a straw).

I was actually thinking that the nvdimm accesses need to be treated
much more like (cached) memory mapped io space than normal system
memory.
So treating them the same as "iomem" and then having access functions
that report access failures (which the current readq() doesn't)
might make sense.

If you are using memory that 'might fail' for kernel code or data
you really get what you deserve.

OTOH system response to PCIe errors is currently rather problematic.
Mostly reads time out and return ~0u.
This can be checked for and, if the value could be valid, a second
location can be read.

However we have an x86 server box (I've forgotten whether it is HP or
Dell) that generates an NMI whenever a PCIe link goes down.
(The 'platform' takes the AER interrupt and uses an NMI to pass it
to the kernel - whose bright idea was it to use an NMI???)
This happens even after we've done an 'echo 1 >remove'.

The system is supposed to be NEBS (I think that is the term) compliant,
which is supposed to make it suitable for telephony work (including
emergency calls), but any PCIe failure crashes the box!

I've another system here that sometimes fails to bring the PCIe link
back up.
I guess these code paths don't get regular testing.
In my case the PCIe slave is an FPGA; reloading the FPGA image (either
over JTAG or after rewriting the EEPROM) doesn't always work.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
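
The "check for ~0u, then read a second location" idea mentioned above could be sketched roughly as follows. This is a userspace simulation only, not kernel code: struct fake_dev, fake_read32(), checked_read32() and the REG_* indices are all hypothetical names invented for the illustration, and it assumes the device has some register (here an ID register) that never legitimately reads as all-ones.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical simulated device; a real driver would read an __iomem mapping. */
struct fake_dev {
	bool link_up;       /* false simulates a dead PCIe link */
	uint32_t data_reg;  /* may legitimately hold 0xffffffff */
	uint32_t id_reg;    /* assumed never to be all-ones on a live device */
};

enum { REG_DATA = 0, REG_ID = 1 };

/* Stand-in for an MMIO read: a dead link reads back as all-ones. */
static uint32_t fake_read32(const struct fake_dev *dev, int reg)
{
	assert(dev != NULL);
	if (!dev->link_up)
		return UINT32_MAX;
	return reg == REG_ID ? dev->id_reg : dev->data_reg;
}

/*
 * Read REG_DATA and report a failure instead of silently returning ~0u.
 * Returns 0 on success, -1 if the device appears to be gone.
 */
static int checked_read32(const struct fake_dev *dev, uint32_t *val)
{
	uint32_t v = fake_read32(dev, REG_DATA);

	/* All-ones might be real data, so confirm with a second location. */
	if (v == UINT32_MAX && fake_read32(dev, REG_ID) == UINT32_MAX)
		return -1;

	*val = v;
	return 0;
}
```

The second read only happens when the first value is all-ones, so the common path costs a single read. Whether a suitable "never all-ones" register exists depends on the device.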