[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 10/02/2013 09:55 PM, Richard Weinberger wrote:
> On Wed, Oct 2, 2013 at 8:30 PM, Toralf Förster <toralf.foerster@xxxxxx> wrote:
>> Running trinity (1 process, no victim files, just "$>trinity -C1) for a longer time
>> within a 32 bit user mode linux image with a recent git kernel (host: 3.11.3 guest 3.12-rc3-g...)
>> yields into this konsole message :
>>
>>  * Starting local
>> net.core.warnings = 0                                                                         [ ok ]
>> BUG: soft lockup - CPU#0 stuck for 23s! [trinity-child0:2031]
>>
>>
>> and at the host t1 of the "linux"-processes eats all CPU cycles at 1 CPU core.
>> 2 subsequent made back traces made with
>>
>> $> sudo gdb /home/tfoerste/devel/linux/linux 28144 -n -batch -ex bt
>>
>> shows nearly a similar position around __get_user_pages() - both are attached.
>>
>> I'm not surprised that trinity harms a systems - I'm just wondering whether this particular picture is
>> expected or if it points to an issue.
>>
>>
>> FWIW the last lines of trinity log were :
>>
>>
>> [2031] [94] setsid() = 2031
>> [2031] [95] setresgid(rgid=0xffff33e3, egid=0xffffff93, sgid=0x22000040) = -1 (Operation not permitted)
>> [2031] [96] vmsplice(fd=5, iov=0x85501e0, nr_segs=300, flags=9) = 0x3000
>> [2031] [97] setresuid(ruid=0x80549193, euid=0xc61041e0, suid=0xff19b6fa) = -1 (Operation not permitted)
>> [2031] [98] setpriority(which=0xff010000, who=0xf3737373, niceval=0x8088960c) = -1 (Invalid argument)
>> [2031] [99] socketcall(call=1, args=0x8550200) = -1 (Address family not supported by protocol)
>> [2031] [100] access(filename="�", mode=2017) = -1 (Invalid argument)
>> [2031] [101] getgroups(gidsetsize=0, grouplist=0x80d0000[page_rand]) = 3
>> [2031] [102] msync(start=0xc0100220, len=0, flags=3) = -1 (Invalid argument)
>> [2031] [103] sigpending(set=0x40025000) = 0
>> [2031] [104] signalfd4(ufd=383, user_mask=1, sizemask=0xa4200000, flags=0x80800) = -1 (Invalid argument)
>> [2031] [105] sendfile(out_fd=383, in_fd=382, offset=0, count=4096) = -1 (Invalid argument)
>> [2031] [106] fanotify_mark(fanotify_fd=382, flags=5, mask=0x8000023, dfd=382, pathname="/proc/1092/task/1092/fdinfo/68") = -1 (Invalid argument)
>> [2031] [107] wait4(upid=1, stat_addr=4, options=0xd761979b, ru=8) = -1 (Invalid argument)
>> [2031] [108] sigpending(set=0x80ca000[page_zeros]) = 0
>> [2031] [109] setresuid(ruid=0xefffd6fc, euid=0x1bf4c92f, suid=0xffff2e33) = -1 (Operation not permitted)
>> [2031] [110] munlock(addr=0x40025000, len=34) = 0
>> [2031] [111] timer_delete(timer_id=0xffffffdc) = -1 (Invalid argument)
>> [2031] [112] sched_get_priority_max(policy=0x10000040) = -1 (Invalid argument)
>> [2031] [113] syslog(type=0xc1000000, buf=1, len=0x82a5) = -1 (Operation not permitted)
>> [2031] [114] setpriority(which=0xc4c806c6, who=0xffffff01, niceval=0xffff0682) = -1 (Invalid argument)
>> [2031] [115] getgroups16(gidsetsize=0xfffe, grouplist=1) = -1 (Bad address)
>> [2031] [116] rename(oldname=4, newname=8) = -1 (Bad address)
>> [2031] [117] inotify_init() = 654
>> [2031] [118] getgid() = 100
>> [2031] [119] fstatat64(dfd=382, filename="/sys/devices/virtual/net/sit0/duplex", statbuf=0, flag=0xb545d727) = -1 (Invalid argument)
>> [2031] [120] unlinkat(dfd=382, pathname="/proc/sys/net/ipv4/neigh/default/retrans_time", flag=0xc00ef76) = -1 (Invalid argument)
>> [2031] [121] timerfd_create(clockid=0, flags=0) = 655
>> [2031] [122] munlock(addr=4, len=0x3fff) = -1 (Cannot allocate memory)
>> [2031] [123] fremovexattr(fd=382, name=0) = -1 (Bad address)
>> [2031] [124] sched_get_priority_min(policy=0xff58bfef) = -1 (Invalid argument)
>> [2031] [125] mq_timedreceive(mqdes=397, u_msg_ptr=4, msg_len=5245, u_msg_prio=0xc0100220, u_abs_timeout=0xc0100220) = -1 (Bad address)
>> [2031] [126] chdir(filename="/proc/116/net/ptype") = -1 (Not a directory)
>> [2031] [127] ssetmask(newmask=0x88000092) = 0
>> [2031] [128] statfs(pathname="/proc/6/mounts", buf=0) = -1 (Bad address)
>> [2031] [129] fchown16(fd=397, user=104, group=0x94100000) = -1 (Operation not permitted)
>> [2031] [130] fchdir(fd=397) = -1 (Not a directory)
>> [2031] [131] mkdir(pathname="/proc/1092/task/1092/fdinfo/316", mode=525) = -1 (File exists)
>> [2031] [132] fsetxattr(fd=386, name=0x856f158, value=0x8571160, size=0, flags=0) = -1 (Numerical result out of range)
>> [2031] [133] io_setup(nr_events=4095, ctxp=0x40266000) ^CKilled by signal 2.
> 
> Reading your gdb backtraces show that schedule_timeout() got called
> with a negative value.
> Looks like an integer overflow.
> The soft-lockup might also origin from that (very big integer which
> did not overflow jet)
> 

If the culprit is solved by this patch I'd like to send it out. But I'm
unsure whether it catches the culprit or if it just covers the root cause.

ommit 7b367d5478dfcba1635e8cde3c663515fee129ec
Author: Toralf Förster <toralf.foerster@xxxxxx>
Date:   Thu Oct 3 20:46:38 2013 +0200

    mm/page-writeback.c: avoid integer overflow in balance_dirty_pages()

    avoid crash of a 32 bit UML (found using fuzz tool trinity)

    Signed-off-by: Toralf Förster <toralf.foerster@xxxxxx>

diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index f5236f8..f5a3096 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -1462,7 +1462,8 @@ static void balance_dirty_pages(struct
address_space *mapping,
                        pause = max_pause;
                        goto pause;
                }
-               period = HZ * pages_dirtied / task_ratelimit;
+               period = ((long) HZ) * pages_dirtied / task_ratelimit;
+
                pause = period;
                if (current->dirty_paused_when)
                        pause -= now - current->dirty_paused_when;


-- 
MfG/Sincerely
Toralf Förster
pgp finger print: 7B1A 07F4 EC82 0F90 D4C2 8936 872A E508 7DB6 9DA3
--
To unsubscribe from this list: send the line "unsubscribe trinity" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux SCSI]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux