On 10/06/2013 10:26 PM, Geert Uytterhoeven wrote:
> On Sun, Oct 6, 2013 at 10:08 PM, Toralf Förster <toralf.foerster@xxxxxx> wrote:
>> On 10/06/2013 08:38 PM, Geert Uytterhoeven wrote:
>>> On Sun, Oct 6, 2013 at 4:17 PM, Toralf Förster <toralf.foerster@xxxxxx> wrote:
>>>> The UML stopped here :
>>>> ...
>>>>         if (unlikely(task_ratelimit == 0)) {
>>>>                 period = max_pause;
>>>>                 pause = max_pause;
>>>>                 BUG_ON(pause < 0);
>>>>                 goto pause;
>>>>         }
>>>>         BUG_ON(pages_dirtied < 0);
>>>>         BUG_ON(task_ratelimit < 0);
>>>>         period = HZ * pages_dirtied / task_ratelimit;
>>>>         BUG_ON(period < 0);     <----------------------here
>>>
>>> So pages_dirtied becomes that big compared to task_ratelimit (both are
>>> "unsigned long"), that period (which is "long", just like "pause") overflows
>>> into a negative number.
>>>
>>> This is indeed much more likely to happen on 32-bit.
>>>
>>>> The back trace is :
>>>
>>>> #9 0x08411c64 in balance_dirty_pages (pages_dirtied=9, mapping=<optimized out>) at mm/page-writeback.c:1471
>>>
>>> But here pages_dirtied is only 9??
>
>> Well, this points to an overflow or ? :
>
> Negative indicates an overflow, but pages_dirtied doesn't.
>
>> tfoerste@n22 ~/devel/linux $ nl -ba mm/page-writeback.c | grep -A 5 -B 5 1468
>>   1463                  BUG_ON(pause < 0);
>>   1464                  goto pause;
>>   1465          }
>>   1466          period = HZ * pages_dirtied / task_ratelimit;
>>   1467          pause = period;
>>   1468          BUG_ON(pause < 0 && pages_dirtied > 0 && task_ratelimit > 0);
>>   1469          if (current->dirty_paused_when)
>>   1470                  pause -= now - current->dirty_paused_when;
>>   1471          /*
>>   1472           * For less than 1s think time (ext3/4 may block the dirtier
>>   1473           * for up to 800ms from time to time on 1-HDD; so does xfs,
>>
>>
>> and the back trace is :
>>
>> #9 0x08411c6c in balance_dirty_pages (pages_dirtied=0, mapping=<optimized out>) at mm/page-writeback.c:1468
>
> Hmm, now pages_dirtied is zero, according to the backtrace, but the BUG_ON()
> asserts its strict positive?!?
>
> Can you please try the following instead of the BUG_ON():
>
>         if (pause < 0) {
>                 printk("pages_dirtied = %lu\n", pages_dirtied);
>                 printk("task_ratelimit = %lu\n", task_ratelimit);
>                 printk("pause = %ld\n", pause);
>         }
>
> Gr{oetje,eeting}s,
>
>                         Geert

I have already tried it in several ways, but I cannot get any printk output.
As soon as the issue happens, I get

BUG: soft lockup - CPU#0 stuck for 22s! [trinity-child0:1521]

on the stderr of the UML, and after that no further input is accepted.
With uml_mconsole, however, I can still run very basic commands such as a
crash dump, sysrq and so on.

>
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@xxxxxxxxxxxxxx
>
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like that.
>                                 -- Linus Torvalds
>

--
MfG/Sincerely
Toralf Förster
pgp finger print: 7B1A 07F4 EC82 0F90 D4C2 8936 872A E508 7DB6 9DA3
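
For reference, the overflow Geert describes can be modelled in user space. This is only a sketch, not the kernel code: it assumes HZ = 100 (the real value depends on the kernel config), models the 32-bit "unsigned long" operands with uint32_t and the 32-bit "long" result with int32_t, and the helper name period_32bit is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define HZ 100u /* assumed tick rate, not taken from the failing config */

/*
 * Hypothetical user-space model of
 *     period = HZ * pages_dirtied / task_ratelimit;
 * on a 32-bit target: the operands are unsigned, the result is signed.
 */
static int32_t period_32bit(uint32_t pages_dirtied, uint32_t task_ratelimit)
{
    /* The kernel handles task_ratelimit == 0 in a separate branch. */
    assert(task_ratelimit != 0);

    /* Unsigned arithmetic: the product wraps modulo 2^32 without warning. */
    uint32_t quotient = HZ * pages_dirtied / task_ratelimit;

    /*
     * Converting a value above INT32_MAX to a signed type is
     * implementation-defined; gcc and clang wrap it to a negative number.
     */
    return (int32_t)quotient;
}
```

Under these assumptions, period_32bit(9, 1000) is 0, while period_32bit(21474837, 1) comes out negative, because 100 * 21474837 already exceeds the largest 32-bit signed value. So a negative period needs a large pages_dirtied together with a very small task_ratelimit, which does not match pages_dirtied=9 or pages_dirtied=0 from the back traces.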