12.02.2014, 15:29, "Allen Pais" <allen.pais@xxxxxxxxxx>:
>>>>> [ 1487.027884] I7: <rt_mutex_setprio+0x3c/0x2c0>
>>>>> [ 1487.027885] Call Trace:
>>>>> [ 1487.027887] [00000000004967dc] rt_mutex_setprio+0x3c/0x2c0
>>>>> [ 1487.027892] [00000000004afe20] task_blocks_on_rt_mutex+0x180/0x200
>>>>> [ 1487.027895] [0000000000819114] rt_spin_lock_slowlock+0x94/0x300
>>>>> [ 1487.027897] [0000000000817ebc] __schedule+0x39c/0x53c
>>>>> [ 1487.027899] [00000000008185fc] schedule+0x1c/0xc0
>>>>> [ 1487.027908] [000000000048fff4] smpboot_thread_fn+0x154/0x2e0
>>>>> [ 1487.027913] [000000000048753c] kthread+0x7c/0xa0
>>>>> [ 1487.027920] [00000000004060c4] ret_from_syscall+0x1c/0x2c
>>>>> [ 1487.027922] [0000000000000000] (null)
>>> Now, consistently I've been getting sun4v_data_access_exception.
>>> Here's the trace:
>>> [ 4673.360121] sun4v_data_access_exception: ADDR[0000080000000000] CTX[0000] TYPE[0004], going.
>> I've never dived at sparc's tlb before, but it seems now I'm understanding.
>>
>> arch_enter_lazy_mmu_mode() makes possible delayed tlb flushing. In !RT kernel
>> you collect flush requests before you really flush all of them.
>>
>> In RT you collect them too, but you are able to be preempted in any moment.
>> So, you may switch to other process with unflushed tlb, which is very bad.
>>
>> Try to not to set tb->active = 1; in arch_enter_lazy_mmu_mode(). Set it to zero.
>> We will look if this robust fix helps.
>
> Kirill, Well the change works. So far the machine is up and no stall or crashes
> with Hackbench. I'll run it for longer period and check.

OK, good. But I don't know whether this is the best fix. Maybe we have to
implement another optimization for RT.

For example, we could collect only the batches which do not require an SMP
function call. Or was the main goal of lazy TLB flushing to prevent exactly
those SMP calls? It would be good to find that out.

The other serious thing is to find out whether __set_pte_at() executes in a
preemption-disabled context on a !RT kernel, because that place looks
interesting.
If yes, we have to do the same for RT. If not, then no.

Kirill

> Thanks,
>
> Allen
>
> --
> To unsubscribe from this list: send the line "unsubscribe sparclinux" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
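
To illustrate the batching problem discussed above, here is a minimal
userspace sketch in C. It is not the sparc64 kernel code. Only the `active`
flag and the enter/leave idea come from this thread; the struct layout,
TLB_BATCH_NR, batch_add(), flush_pending(), enter_lazy(), leave_lazy() and
stale_at_preempt() are hypothetical names made up for the illustration, and
the "flush" is just a counter. It shows how many flush requests are still
queued at an imagined preemption point inside the lazy section, with and
without `active` set.

```c
#include <assert.h>

#define TLB_BATCH_NR 8	/* hypothetical batch capacity */

/* Toy stand-in for a per-CPU TLB batch (assumed layout, not the kernel's). */
struct tlb_batch {
	int active;			/* 1: queue requests, 0: flush at once */
	unsigned long tlb_nr;		/* number of queued requests */
	unsigned long vaddrs[TLB_BATCH_NR];
};

static unsigned long flushed;		/* pages "flushed" so far */

/* Pretend to flush everything that is queued. */
static void flush_pending(struct tlb_batch *tb)
{
	flushed += tb->tlb_nr;
	tb->tlb_nr = 0;
}

/* Record one changed mapping: queue it in lazy mode, else flush it now. */
static void batch_add(struct tlb_batch *tb, unsigned long vaddr)
{
	if (!tb->active) {
		flushed++;		/* immediate flush, nothing stays stale */
		return;
	}
	assert(tb->tlb_nr < TLB_BATCH_NR);
	tb->vaddrs[tb->tlb_nr++] = vaddr;
	if (tb->tlb_nr >= TLB_BATCH_NR)
		flush_pending(tb);
}

/* lazy = 1 models the current behaviour, lazy = 0 the workaround tried here. */
static void enter_lazy(struct tlb_batch *tb, int lazy)
{
	tb->active = lazy;
}

static void leave_lazy(struct tlb_batch *tb)
{
	if (tb->tlb_nr)
		flush_pending(tb);
	tb->active = 0;
}

/*
 * Update three mappings inside a lazy section and report how many flush
 * requests are still queued at the point where a preemptible (RT) kernel
 * could switch to another task.
 */
unsigned long stale_at_preempt(int lazy)
{
	struct tlb_batch tb = { 0 };
	unsigned long stale;
	int i;

	flushed = 0;
	enter_lazy(&tb, lazy);
	for (i = 0; i < 3; i++)
		batch_add(&tb, 0x1000UL * (i + 1));
	stale = tb.tlb_nr;		/* imagined preemption point */
	leave_lazy(&tb);
	return stale;
}
```

In this model stale_at_preempt(1) leaves three requests queued at the
preemption point, while stale_at_preempt(0) leaves none, at the price of one
flush per update. Whether that price is acceptable is the open question above.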