Dear experts,
My name is Makoto Harada, and I work for a manufacturer of ARM-based boards.
Our product uses a single 800 MHz Cortex-A9 processor running the Linux 3.4
kernel.
We are currently debugging an unexpected boot hang, which happens about once
in every 20-100 boots.
In short, the issue is as follows.
1. A data abort or prefetch abort exception is raised for a page fault on a
certain page.
2. The page fault handler tries to fix the cause of the fault, but does
nothing because nothing is wrong with the PTE (the page is valid and the
AP (access permission) field is correct).
3. After returning to the user process, the page is accessed again.
4. Since the page fault handler changed nothing in step 2, the access faults
again. The system thus falls into an infinite page fault loop between
steps 2 and 4, and the boot process never completes.
5. The loop can be broken by invalidating the TLB entry for the page
(we implemented a special routine for this for debugging purposes).
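
To illustrate how the loop between steps 2 and 4 could be detected, here is a
minimal sketch in plain C. It is not our actual debug routine and not kernel
code: the names fault_tracker, fault_tracker_record and FAULT_LOOP_THRESHOLD
are hypothetical, and the threshold value is arbitrary. The idea is only that
the same faulting address and PC repeating many times in a row is the point at
which state could be dumped or the TLB entry invalidated.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical threshold: how many identical faults in a row count as a loop. */
#define FAULT_LOOP_THRESHOLD 16u

/* Hypothetical per-task record of the most recent fault. */
struct fault_tracker {
    uint32_t last_addr;    /* faulting address of the previous fault */
    uint32_t last_pc;      /* PC of the previous fault */
    unsigned int repeats;  /* consecutive faults with the same addr and PC */
};

/*
 * Record one fault. Returns 1 once the same (addr, pc) pair has faulted
 * FAULT_LOOP_THRESHOLD times in a row, otherwise 0.
 */
static int fault_tracker_record(struct fault_tracker *t,
                                uint32_t addr, uint32_t pc)
{
    assert(t != NULL);

    if (t->repeats > 0 && t->last_addr == addr && t->last_pc == pc) {
        t->repeats++;
    } else {
        /* A different fault: start counting again from this one. */
        t->last_addr = addr;
        t->last_pc = pc;
        t->repeats = 1;
    }
    return t->repeats >= FAULT_LOOP_THRESHOLD;
}
```

A tracker would start zero-initialized and be fed the faulting address and PC
on every fault; a return value of 1 marks the moment to capture diagnostics.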
Based on these symptoms, we suspect that the TLB entry is corrupted for some
unknown reason.
We would like to identify the root cause of this TLB entry corruption.
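
One check that might help narrow this down is to compare the fault status
reported by the CPU with the hardware PTE in memory. The following is a
minimal sketch in plain C, assuming the ARMv7 short-descriptor format (fault
status in DFSR bits [3:0] and bit 10, WnR in bit 11; small page descriptor with
AP[1:0] in bits [5:4] and AP[2] in bit 9), SCTLR.AFE = 0, and a fault taken
from user mode. All function names are hypothetical. Note that on ARM the
Linux PTE and the hardware PTE are separate entries, so the value passed in
must be the hardware one.

```c
#include <assert.h>
#include <stdint.h>

/* FS[4:0] of a short-descriptor DFSR/IFSR: FS[3:0] = bits [3:0], FS[4] = bit 10. */
static unsigned int fsr_status(uint32_t fsr)
{
    return (fsr & 0xfu) | ((fsr >> 6) & 0x10u);
}

/* DFSR.WnR (bit 11): 1 if the aborted access was a write. */
static int fsr_is_write(uint32_t dfsr)
{
    return (int)((dfsr >> 11) & 1u);
}

/* Human-readable name for the most common fault status values. */
static const char *fsr_status_name(unsigned int fs)
{
    assert(fs <= 0x1fu);
    switch (fs) {
    case 0x01: return "alignment fault";
    case 0x05: return "translation fault (section)";
    case 0x07: return "translation fault (page)";
    case 0x09: return "domain fault (section)";
    case 0x0b: return "domain fault (page)";
    case 0x0d: return "permission fault (section)";
    case 0x0f: return "permission fault (page)";
    default:   return "other (see the architecture manual)";
    }
}

/* A second-level descriptor is a small page when bit 1 is set. */
static int pte_is_small_page(uint32_t hw_pte)
{
    return (hw_pte & 0x2u) != 0;
}

/* AP[2:0] of a small page descriptor: AP[1:0] = bits [5:4], AP[2] = bit 9. */
static unsigned int pte_ap(uint32_t hw_pte)
{
    return ((hw_pte >> 4) & 0x3u) | ((hw_pte >> 7) & 0x4u);
}

/* With AFE = 0, only AP = 0b011 grants write access to user mode. */
static int ap_user_can_write(unsigned int ap)
{
    return ap == 0x3u;
}

/*
 * Returns 1 when the reported fault disagrees with the hardware PTE in
 * memory, i.e. the symptom described above:
 *  - page translation fault, but the descriptor is a valid small page
 *  - page permission fault on a write, but the PTE allows user writes
 */
static int fault_contradicts_pte(uint32_t dfsr, uint32_t hw_pte)
{
    unsigned int fs = fsr_status(dfsr);

    if (fs == 0x07)
        return pte_is_small_page(hw_pte);
    if (fs == 0x0f && fsr_is_write(dfsr))
        return ap_user_can_write(pte_ap(hw_pte));
    return 0;
}
```

Logging the result of such a check on each repeated fault would show whether
the hardware is acting on a translation that differs from what is in memory.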
Since I am new to memory management, I would like to ask the experts for
advice.
How would you approach this kind of issue? Any comments are highly appreciated.
P.S. We know that Linux 3.4 is rather old, but we have to keep using
this version for internal reasons.
Kind Regards,
Makoto Harada