On 7/8/21 10:11 AM, Brijesh Singh wrote:
> On 7/8/21 11:58 AM, Dave Hansen wrote:
>>> Logically its going to be tricky to figure out which exact entry
>>> caused the fault, hence I dump any non-zero entry. I understand it
>>> may dump some useless.
>>
>> What's tricky about it?
>>
>> Sure, there's a possibility that more than one entry could contribute to
>> a fault.  But, you always know *IF* an entry could contribute to a fault.
>>
>> I'm fine if you run through the logic, don't find a known reason
>> (specific RMP entry) for the fault, and dump the whole table in that
>> case.  But, unconditionally polluting the kernel log with noise isn't
>> very nice for debugging.
>
> The tricky part is to determine which undocumented bit to check to know
> that we should stop dump. I can go with your suggestion that first try
> with the known reasons and fallback to dump whole table for unknown
> reasons only.

You *can't* stop because of undocumented bits.  Fundamentally.  You
literally don't know if the bit means "this caused a fault" versus "this
definitely couldn't cause a fault".

Basically, if we get to the point of dumping the whole table, we should
also spit out an error message saying that the kernel is dazed and
confused and can't figure out why the hardware caused a fault.  Then,
dump out the whole table so that the "hardware" folks can have a look.
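
For illustration only, the flow discussed above (check the known reasons first, and only fall back to dumping everything, with an explicit error message, when none is found) could be sketched in userspace C roughly as follows. This is not the kernel's RMP code: struct rmp_entry_sketch, entry_explains_fault(), dump_entry() and report_rmp_fault() are hypothetical names, the entry layout is made up, and the single "assigned page hit by a write" rule merely stands in for whatever documented checks actually apply.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified entry; not the hardware RMP layout. */
struct rmp_entry_sketch {
	uint64_t pfn;		/* page frame the entry describes */
	bool assigned;		/* documented bit: page is owned by a guest */
	uint64_t raw[2];	/* raw contents, undocumented bits included */
};

/*
 * Placeholder for the documented rules: returns true only when this
 * entry could have contributed to the fault. Undocumented bits are
 * deliberately never consulted, since their meaning is unknown.
 */
static bool entry_explains_fault(const struct rmp_entry_sketch *e,
				 bool is_write)
{
	return is_write && e->assigned;
}

static void dump_entry(FILE *out, const struct rmp_entry_sketch *e)
{
	fprintf(out, "RMP entry pfn 0x%" PRIx64 ": %016" PRIx64 " %016" PRIx64 "\n",
		e->pfn, e->raw[1], e->raw[0]);
}

/*
 * Dump only the entries with a known reason for the fault. If there are
 * none, say so loudly and dump the whole table without filtering.
 * Returns the number of entries dumped; *cause_found reports which path
 * was taken.
 */
size_t report_rmp_fault(FILE *out, const struct rmp_entry_sketch *table,
			size_t n, bool is_write, bool *cause_found)
{
	size_t i, dumped = 0;

	assert(out && cause_found && (table || n == 0));

	/* Pass 1: known reasons only. */
	for (i = 0; i < n; i++) {
		if (entry_explains_fault(&table[i], is_write)) {
			dump_entry(out, &table[i]);
			dumped++;
		}
	}

	*cause_found = dumped > 0;
	if (*cause_found)
		return dumped;

	/* Pass 2: no known reason, so admit it and dump everything. */
	fprintf(out, "RMP fault with no known cause, dumping all %zu entries\n", n);
	for (i = 0; i < n; i++)
		dump_entry(out, &table[i]);

	return n;
}
```

Note that the fallback path does not filter on any bit, documented or otherwise: once no known reason is found, every entry is printed, since an unknown bit cannot be used to decide whether an entry is relevant.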