Re: Kernel crashes with trace ending in XFS code on RHEL6 variant kernel

Hi Eric,

On 28 October 2014 18:42, Eric Sandeen <sandeen@xxxxxxxxxxx> wrote:
On 10/28/14 10:38 AM, Jan Kokoska wrote:
> Hi,
>
> I'm running OpenVZ (OS container) kernel variant of RHEL6 kernel on

... for which we have no source code? ;)

Right, I'm sorry. The source code patch against the vanilla kernel is linked from
http://openvz.org/Download/kernel/rhel6/042stab084.20
and
http://openvz.org/Download/kernel/rhel6/042stab092.3
for the two kernel versions.

xfs_aops.c differs a bit between the older and the newer version (released 6 months apart), but both kernels crash.

1031,1032c1031,1034
< * Just skip the page if it is fully outside i_size, e.g. due
< * to a truncate operation that is in progress.
---
> * Skip the page if it is fully outside i_size, e.g. due to a
> * truncate operation that is in progress. We must redirty the
> * page so that reclaim stops reclaiming it. Otherwise
> * xfs_vm_releasepage() is called on it and gets confused.
1034,1037c1036,1037
< if (page->index >= end_index + 1 || offset_into_page == 0) {
< unlock_page(page);
< return 0;
< }
---
> if (page->index >= end_index + 1 || offset_into_page == 0)
> goto redirty;

OpenVZ devs unfortunately don't publish their git tree anymore.


I don't know what's in "2.6.32-openvz-amd64" so can't help much.

What is at line 86 of xfs_aops.c in that kernel?

Stefan is right in that it's the line
bh = head = page_buffers(page);
from 
xfs_count_page_state()

Eric, thanks for the pointer to ef5d437f71afdf4afdbab99213add99f4b1318fd. I'll raise it with the OpenVZ devs, or with RHEL so that the fix trickles downstream to OpenVZ. I simply didn't know how much the XFS parts of the kernel trees you maintain differ from those in e.g. RHEL, and thought it could be a generally occurring bug. I also wanted to get in touch with the mailing list, as I've been using XFS mostly happily for a decade.

Jan
 

-Eric

> several amd64 machines by different manufacturers (HP and Supermicro)
> and different RAID cards (HP and Areca).
>
> I've started seeing kernel crashes in October, as per the netconsole
> logs attached, on two of the machines (one HP, one Supermicro). The
> traces look quite similar, the machine in question cannot write
> anything to its own filesystem when this happens so the logs are made
> over the network. The XFS filesystem is not root (that's ext4), but
> one for data (OS containers), on both machines. When I run xfs_check
> and xfs_repair on the filesystem after the kernel crash & reboot, no
> issue is ever found.
>
> This may very well have nothing to do with XFS kernel code you wrote
> and maintain, but in that case, could you, from looking at the traces,
> tell me whether it maybe looks like an issue related to
> vm/paging that just ends up in an XFS-related code path?
>
> I'm happy to test any suggestions/fixes for this if it is XFS related.
>
> Thank you,
> --
> Jan Kokoska
> Glow Internet s.r.o.
>
>
> _______________________________________________
> xfs mailing list
> xfs@xxxxxxxxxxx
> http://oss.sgi.com/mailman/listinfo/xfs
>

--
Regards,

Jan Kokoska
Glow Internet s.r.o.