On Fri, Apr 27, 2012 at 1:09 PM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> On Fri, Apr 27, 2012 at 11:04:33AM +0200, Juerg Haefliger wrote:
>> On Fri, Apr 27, 2012 at 1:07 AM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
>> > On Fri, Apr 27, 2012 at 01:00:08AM +0200, Juerg Haefliger wrote:
>> >> On Fri, Apr 27, 2012 at 12:44 AM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
>> >> > On Thu, Apr 26, 2012 at 02:37:50PM +0200, Juerg Haefliger wrote:
>> >> >> > I'm assuming it is the event trace
>> >> >> > that is causing it to blow out? If so, just the 30-60s either side of
>> >> >> > the hang first showing up is probably necessary, and that should cut
>> >> >> > the size down greatly....
>> >> >>
>> >> >> Can I shorten the existing trace.dat?
>
> Looks like you can - the "trace-cmd split" option.
>
>> >> >
>> >> > No idea, but that's likely the problem - I don't want the binary
>> >> > trace.dat file. I want the text output of the report command
>> >> > generated from the binary trace.dat file...
>> >>
>> >> Well yes. I did RTFM :-) trace.dat is 15GB.
>> >
>> > OK, that's a lot larger than I expected for a hung filesystem....
>> >
>> >> >> I stopped the trace
>> >> >> automatically 10 secs after the xlog_... trace showed up in syslog,
>> >> >> so effectively some 130+ secs after the hang occurred.
>> >
>> > Can you look at the last timestamp in the report file, and trim off
>> > anything from the start that is older than, say, 180s before that?
>>
>> Cut the trace down to 180 secs, which brought the filesize down to
>> 93MB: https://region-a.geo-1.objects.hpcloudsvc.com:443/v1.0/AUTH_9630ead2-6194-40df-afd3-7395448d4536/xfs-hang/report-2012-04-24-180secs.tgz
>
> I see the problem - the trace.dat file is hosted on an XFS
> filesystem, so all the writes to the trace.dat file are causing
> events to be logged, which causes writes to the trace.dat file....
>
> Taking out all the trace-cmd events:
>
> $ grep -v trace-cmd trace_report_180secs.txt > t.t
> $ ls -l trace_report_180secs.txt t.t
> -rw-r--r-- 1 dave dave 2136941443 Apr 27 18:52 trace_report_180secs.txt
> -rw-r--r-- 1 dave dave 3280629 Apr 27 20:12 t.t
>
> brings the event trace for that 180s down from 2.1GB to 3.2MB,
> which is much more like I'd expect from a hung filesystem....
>
> OK, so it looks like there's lots of noise from other XFS
> filesystems too, and from the info.log, the xfs-hang filesystem is on
> device 252:2 (/dev/vg00/tmp):
>
> $ grep "dev 252:2" t.t
> $
>
> And there are no events from that filesystem in the log at all. OK,
> so what you need to do is find out if there are *any* events from
> that device in the larger log file you have.....
>
> If not, then it is time for advanced trace-cmd mojo. We can tell it
> to filter events only from the PID of the test script and all its
> children using:
>
> # trace-cmd record -e xfs\* -P <parent-pid> -c
>
> But better would be to use the device number of the relevant
> filesystem to filter the events. The device is 252:2, which in
> kernel terms means:
>
> dev = (major << 20) | minor
>     = 0xfc00002
>
> So you should be able to get just the xfs-hang events via:
>
> # trace-cmd record -e xfs\* -d 'dev == 0xfc00002'
>
> and as long as you don't host log files on /xfs-hang, it'll only
> record the workload running on the xfs-hang filesystem.

Third try: https://region-a.geo-1.objects.hpcloudsvc.com:443/v1.0/AUTH_9630ead2-6194-40df-afd3-7395448d4536/xfs-hang/report-2012-04-27-180secs.tgz

This one is filtered by device, and the trace events now go to a different filesystem.
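
FWIW, a quick sanity check of the device-number arithmetic (plain bash
arithmetic, with 252:2 being the major/minor numbers from info.log) comes
out to the same value used in the filter above:

$ printf '0x%x\n' $(( (252 << 20) | 2 ))
0xfc00002

That is the kernel's internal dev_t encoding, MKDEV(major, minor), so the
same calculation should work for filtering on any other filesystem's device
as well.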
> BTW, how often do you see this sort of thing:
>
> [ 220.571551] ------------[ cut here ]------------
> [ 220.571562] WARNING: at fs/inode.c:280 drop_nlink+0x49/0x50()
> [ 220.571564] Hardware name: SE2170s
> [ 220.571565] Modules linked in: ipmi_devintf ipmi_si ipmi_msghandler ip6table_filter ip6_tables ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ipt_REJECT xt_CHECKSUM iptable_mangle xt_tcpudp iptable_filter ip_tables x_tables bridge 8021q garp stp coretemp ghash_clmulni_intel aesni_intel cryptd usbhid i7core_edac lp edac_core hid aes_x86_64 parport serio_raw microcode xfs igb hpsa dca
> [ 220.571594] Pid: 4637, comm: copy-files Not tainted 3.4.0-rc4 #2
> [ 220.571595] Call Trace:
> [ 220.571603] [<ffffffff810508cf>] warn_slowpath_common+0x7f/0xc0
> [ 220.571605] [<ffffffff8105092a>] warn_slowpath_null+0x1a/0x20
> [ 220.571607] [<ffffffff81193319>] drop_nlink+0x49/0x50
> [ 220.571628] [<ffffffffa00701ef>] xfs_droplink+0x2f/0x60 [xfs]
> [ 220.571640] [<ffffffffa0072d58>] xfs_remove+0x2e8/0x3c0 [xfs]
> [ 220.571645] [<ffffffff8163aeee>] ? _raw_spin_lock+0xe/0x20
> [ 220.571656] [<ffffffffa0068248>] xfs_vn_unlink+0x48/0x90 [xfs]
> [ 220.571659] [<ffffffff8118684f>] vfs_unlink+0x9f/0x100
> [ 220.571662] [<ffffffff811893ef>] do_unlinkat+0x1af/0x1e0
> [ 220.571668] [<ffffffff810a8eab>] ? sys_futex+0x7b/0x180
> [ 220.571670] [<ffffffff8118a9a6>] sys_unlink+0x16/0x20
> [ 220.571675] [<ffffffff816431a9>] system_call_fastpath+0x16/0x1b
>
> You might want to run xfs_repair over your filesystems to find out
> how many inodes have bad link counts....

The first time I saw it was when I started using 3.4-rc4. I repaired the
fs before I reran the test that produced the above data.

...Juerg

> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@xxxxxxxxxxxxx
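
For anyone following along, a read-only xfs_repair pass (no-modify mode) is
a reasonable first step to see how many bad link counts there are before
actually repairing anything; a rough sketch, assuming the xfs-hang
filesystem on /dev/vg00/tmp (from the earlier info.log) and that it can be
unmounted:

# umount /xfs-hang
# xfs_repair -n /dev/vg00/tmp

The -n flag only reports what would be fixed; rerunning without -n does the
actual repair, after which the filesystem can be remounted.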