On Fri, Apr 27, 2012 at 1:09 PM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> On Fri, Apr 27, 2012 at 11:04:33AM +0200, Juerg Haefliger wrote:
>> On Fri, Apr 27, 2012 at 1:07 AM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
>> > On Fri, Apr 27, 2012 at 01:00:08AM +0200, Juerg Haefliger wrote:
>> >> On Fri, Apr 27, 2012 at 12:44 AM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
>> >> > On Thu, Apr 26, 2012 at 02:37:50PM +0200, Juerg Haefliger wrote:
>> >> >> > I'm assuming it is the event trace
>> >> >> > that is causing it to blow out? If so, just the 30-60s either side of
>> >> >> > the hang first showing up is probably necessary, and that should cut
>> >> >> > the size down greatly....
>> >> >>
>> >> >> Can I shorten the existing trace.dat?
>
> Looks like you can - the "trace-cmd split" option.
>
>> >> >
>> >> > No idea, but that's likely the problem - I don't want the binary
>> >> > trace.dat file. I want the text output of the report command
>> >> > generated from the binary trace.dat file...
>> >>
>> >> Well yes. I did RTFM :-) trace.dat is 15GB.
>> >
>> > OK, that's a lot larger than I expected for a hung filesystem....
>> >
>> >> >> I stopped the trace
>> >> >> automatically 10 secs after the xlog_... trace showed up in syslog,
>> >> >> so effectively some 130+ secs after the hang occurred.
>> >
>> > Can you look at the last timestamp in the report file, and trim off
>> > anything from the start that is older than, say, 180s before that?
>>
>> Cut the trace down to 180 secs, which brought the filesize down to
>> 93MB: https://region-a.geo-1.objects.hpcloudsvc.com:443/v1.0/AUTH_9630ead2-6194-40df-afd3-7395448d4536/xfs-hang/report-2012-04-24-180secs.tgz
>
> I see the problem - the trace.dat file is hosted on an XFS
> filesystem, so all the writes to the trace.dat file are causing
> events to be logged, which causes writes to the trace.dat file....
>
> Taking out all the trace-cmd events:
>
> $ grep -v trace-cmd trace_report_180secs.txt > t.t
> $ ls -l trace_report_180secs.txt t.t
> -rw-r--r-- 1 dave dave 2136941443 Apr 27 18:52 trace_report_180secs.txt
> -rw-r--r-- 1 dave dave 3280629 Apr 27 20:12 t.t
>
> brings the event trace for that 180s down from 2.1GB to 3.2MB,
> which is much more like I'd expect from a hung filesystem....
>
> OK, so it looks like there's lots of noise from other XFS
> filesystems too, and from the info.log, the xfs-hang filesystem is on
> device 252:2 (/dev/vg00/tmp):
>
> $ grep "dev 252:2" t.t
> $
>
> And there are no events from that filesystem in the log at all. OK,
> so what you need to do is find out if there are *any* events from
> that device in the larger log file you have.....
>
> If not, then it is time for advanced trace-cmd mojo. We can tell it
> to filter events only from the PID of the test script and all its
> children using:
>
> # trace-cmd record -e xfs\* -P <parent-pid> -c
>
> But better would be to use the device number of the relevant
> filesystem to filter the events. The device is 252:2, which in
> kernel terms means:
>
> dev = (major << 20) | minor
>     = 0xfc00002
>
> So you should be able to get just the xfs-hang events via:
>
> # trace-cmd record -e xfs\* -d 'dev == 0xfc00002'
>
> and as long as you don't host log files on /xfs-hang, it'll only
> record the workload running on the xfs-hang filesystem.

Third try: https://region-a.geo-1.objects.hpcloudsvc.com:443/v1.0/AUTH_9630ead2-6194-40df-afd3-7395448d4536/xfs-hang/report-2012-04-27-180secs.tgz

This one is filtered by device, and the trace events now go to a different filesystem.
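
FWIW, a quick sanity check of the device-number arithmetic (plain bash
arithmetic, with 252:2 being the major/minor numbers from info.log) comes
out to the same value used in the filter above:

$ printf '0x%x\n' $(( (252 << 20) | 2 ))
0xfc00002

That is the kernel's internal dev_t encoding, MKDEV(major, minor), so the
same calculation should work for filtering on any other filesystem's device
as well.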
> BTW, how often do you see this sort of thing:
>
> [ 220.571551] ------------[ cut here ]------------
> [ 220.571562] WARNING: at fs/inode.c:280 drop_nlink+0x49/0x50()
> [ 220.571564] Hardware name: SE2170s
> [ 220.571565] Modules linked in: ipmi_devintf ipmi_si ipmi_msghandler ip6table_filter ip6_tables ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ipt_REJECT xt_CHECKSUM iptable_mangle xt_tcpudp iptable_filter ip_tables x_tables bridge 8021q garp stp coretemp ghash_clmulni_intel aesni_intel cryptd usbhid i7core_edac lp edac_core hid aes_x86_64 parport serio_raw microcode xfs igb hpsa dca
> [ 220.571594] Pid: 4637, comm: copy-files Not tainted 3.4.0-rc4 #2
> [ 220.571595] Call Trace:
> [ 220.571603] [<ffffffff810508cf>] warn_slowpath_common+0x7f/0xc0
> [ 220.571605] [<ffffffff8105092a>] warn_slowpath_null+0x1a/0x20
> [ 220.571607] [<ffffffff81193319>] drop_nlink+0x49/0x50
> [ 220.571628] [<ffffffffa00701ef>] xfs_droplink+0x2f/0x60 [xfs]
> [ 220.571640] [<ffffffffa0072d58>] xfs_remove+0x2e8/0x3c0 [xfs]
> [ 220.571645] [<ffffffff8163aeee>] ? _raw_spin_lock+0xe/0x20
> [ 220.571656] [<ffffffffa0068248>] xfs_vn_unlink+0x48/0x90 [xfs]
> [ 220.571659] [<ffffffff8118684f>] vfs_unlink+0x9f/0x100
> [ 220.571662] [<ffffffff811893ef>] do_unlinkat+0x1af/0x1e0
> [ 220.571668] [<ffffffff810a8eab>] ? sys_futex+0x7b/0x180
> [ 220.571670] [<ffffffff8118a9a6>] sys_unlink+0x16/0x20
> [ 220.571675] [<ffffffff816431a9>] system_call_fastpath+0x16/0x1b
>
> You might want to run xfs_repair over your filesystems to find out
> how many inodes have bad link counts....

The first time I saw it was when I started using 3.4-rc4. I repaired the
fs before I reran the test that produced the above data.

...Juerg

> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@xxxxxxxxxxxxx
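
For anyone following along, a read-only xfs_repair pass (no-modify mode) is
a reasonable first step to see how many bad link counts there are before
actually repairing anything; a rough sketch, assuming the xfs-hang
filesystem on /dev/vg00/tmp (from the earlier info.log) and that it can be
unmounted:

# umount /xfs-hang
# xfs_repair -n /dev/vg00/tmp

The -n flag only reports what would be fixed; rerunning without -n does the
actual repair, after which the filesystem can be remounted.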