Re: core dump / segfault after 48 hour run

On Mon, Sep 30, 2013 at 12:07 PM, Jens Axboe <axboe@xxxxxxxxx> wrote:
> On 09/30/2013 07:04 AM, Roger Sibert wrote:
>> Hello Everyone,
>>
>> I was looking to use fio to run full-disk writes to an SSD after doing
>> a secure erase, to see how long it takes before the performance
>> stabilizes.  After roughly 48 hours I see this on the
>> screen.
>>
>> B2-058:~/longtermruntime # ./fio.64bit.static longtermruntime-192h.fio
>> seqwrite-phase: (g=0): rw=write, bs=512K-512K/512K-512K/512K-512K,
>> ioengine=libaio, iodepth=16
>> fio-2.1.2-15-gd5603
>> Starting 1 process
>> fio: pid=6895, got signal=11ne] [0KB/0KB/0KB /s] [0/0/0 iops] [eta
>> 06d:07h:05m:31s]
>>
>> seqwrite-phase: (groupid=0, jobs=1): err= 0: pid=6895: Sun Sep 29 03:40:38 2013
>>     lat (usec) : 1000=0.01%
>>     lat (msec) : 2=0.01%, 4=0.01%, 10=0.01%, 20=0.01%, 50=99.15%
>>     lat (msec) : 100=0.56%, 250=0.28%, 500=0.01%, 750=0.01%
>>   cpu          : usr=0.00%, sys=0.00%, ctx=0, majf=0, minf=0
>>   IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=100.0%, 32=0.0%, >=64=0.0%
>>      submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>>      complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
>>      issued    : total=r=0/w=67108865/d=0, short=r=0/w=0/d=0
>>
>> Run status group 0 (all jobs):
>>   WRITE: io=0KB, aggrb=0KB/s, minb=0KB/s, maxb=0KB/s,
>> mint=144006511329msec, maxt=144006511329msec
>>
>> Disk stats (read/write):
>>   sdb: ios=0/67108865, merge=0/0, ticks=0/2354077568,
>> in_queue=2353971492, util=100.00%
>> fio: file hash not empty on exit
>>
>> I took a look at one of the core files
>>
>> B2-057:~/longtermruntime # gdb core core
>> GNU gdb (GDB) SUSE (7.0-0.4.16)
>> Copyright (C) 2009 Free Software Foundation, Inc.
>> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
>> This is free software: you are free to change and redistribute it.
>> There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
>> and "show warranty" for details.
>> This GDB was configured as "x86_64-suse-linux".
>> For bug reporting instructions, please see:
>> <http://www.gnu.org/software/gdb/bugs/>...
>> "/root/longtermruntime/core": not in executable format: File format
>> not recognized
>> Missing separate debuginfo for the main executable file
>> Try: zypper install -C
>> "debuginfo(build-id)=559375f8a046f376897b4923007bff5b07ecd8d4"
>> Core was generated by `./fio.64bit.static longtermruntime-216h.fio'.
>> Program terminated with signal 11, Segmentation fault.
>> #0  0x000000000040a6c9 in ?? ()
>>
>> Is there anything else I can do with gdb to pull out more debug
>> information before restarting/retasking this system?  My gdb skills
>> aren't that great.
>
> I know it's a pain to reproduce (especially after a 48h run), but it
> would help if you could edit the Makefile and remove the -O3 from the
> OPTFLAGS, then make clean, make all, and reproduce. Then the core files
> will be of more use.
>
> For the core files you have now, try and do a 'bt' when you open them so
> I can see a backtrace. That might be enough to see what is going on.
>
> --
> Jens Axboe
>
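
On the rebuild: a minimal sketch of the Makefile edit, assuming the flag
sits on a line that starts with OPTFLAGS (I have not checked how every fio
version spells that line).  The helper name strip_o3 is made up for
illustration; it keeps a .bak copy of the original file.

```shell
# Hypothetical helper: remove -O3 from the OPTFLAGS line of a Makefile.
# Assumption: the optimization flag lives on a line beginning with OPTFLAGS.
strip_o3() {
    makefile=$1
    # Edit in place, only on OPTFLAGS lines; the original is saved as <file>.bak.
    sed -i.bak '/^OPTFLAGS/ s/ *-O3//' "$makefile"
}
```

After that it would be something like:
strip_o3 Makefile && make clean && make all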

Let me try that again...  My gdb skills may be bad, but that doesn't mean
I shouldn't have recognized I was missing something.

I changed how I called gdb on the core file, which should give you what
you were actually asking for.
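
For the other core files, a small wrapper along these lines might save
some typing.  It is only a sketch: the function name core_backtrace is
made up, and it assumes a gdb that accepts -batch and -ex.  The point is
that gdb wants the executable first and the core file second.

```shell
# Hypothetical helper: print a backtrace from a core file without an
# interactive gdb session.  Arguments: <executable> <core>.
core_backtrace() {
    exe=$1
    core=$2
    # Refuse to run unless both files exist, so a swapped or missing
    # argument is caught before gdb is started.
    if [ ! -f "$exe" ] || [ ! -f "$core" ]; then
        echo "usage: core_backtrace <executable> <core>" >&2
        return 2
    fi
    # -batch exits after the commands run; -ex bt prints the backtrace.
    gdb -batch -ex bt "$exe" "$core"
}
```

Usage would be: core_backtrace ./fio.64bit.static ./core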

B2-057:~/longtermruntime # gdb ./fio.64bit.static ./core
GNU gdb (GDB) SUSE (7.0-0.4.16)
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-suse-linux".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /root/longtermruntime/fio.64bit.static...done.

warning: core file may not match specified executable file.
Core was generated by `./fio.64bit.static longtermruntime-216h.fio'.
Program terminated with signal 11, Segmentation fault.
#0  0x000000000040a6c9 in __add_log_sample (iolog=0x872510, val=62,
ddir=<value optimized out>, bs=<value optimized out>,
    t=<value optimized out>) at stat.c:1517
1517    stat.c: No such file or directory.
        in stat.c
(gdb) bt
#0  0x000000000040a6c9 in __add_log_sample (iolog=0x872510, val=62,
ddir=<value optimized out>, bs=<value optimized out>,
    t=<value optimized out>) at stat.c:1517
#1  0x0000000000440b05 in fio_libaio_queued (nr=1, io_us=0x8929a0,
td=0x7fe5e312b000) at engines/libaio.c:199
#2  fio_libaio_commit (nr=1, io_us=0x8929a0, td=0x7fe5e312b000) at
engines/libaio.c:218
#3  0x0000000000405385 in td_io_commit (td=0x7fe5e312b000) at ioengines.c:379
#4  0x000000000040572a in td_io_queue (td=0x7fe5e312b000,
io_u=0x891f20) at ioengines.c:329
#5  0x000000000043692f in do_io (td=0x7fe5e312b000) at backend.c:701
#6  thread_main (td=0x7fe5e312b000) at backend.c:1314
#7  0x0000000000438447 in fork_main (offset=0, shmid=<value optimized
out>) at backend.c:1464
#8  run_threads (offset=0, shmid=<value optimized out>) at backend.c:1726
#9  0x000000000043889d in fio_backend () at backend.c:1912
#10 0x00000000004702a4 in __libc_start_main ()
#11 0x0000000000000000 in ?? ()

Thanks,
Roger