On Wed, Mar 22, 2017 at 8:24 AM, Marcus Furlong <furlongm@xxxxxxxxx> wrote:
> Hi,
>
> I'm experiencing the same issue as outlined in this post:
>
> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2016-September/013330.html
>
> I have also deployed this jewel cluster using ceph-deploy.
>
> This is the message I see at boot (it happens for all drives, on all OSD nodes):
>
> [   92.938882] XFS (sdi1): Mounting V5 Filesystem
> [   93.065393] XFS (sdi1): Ending clean mount
> [   93.175299] attempt to access beyond end of device
> [   93.175304] sdi1: rw=0, want=19134412768, limit=19134412767
>
> and again while the cluster is in operation:
>
> [429280.254400] attempt to access beyond end of device
> [429280.254412] sdi1: rw=0, want=19134412768, limit=19134412767
>

We see these as well, and I'm also curious what's causing it. Perhaps
sgdisk is doing something wrong when creating the ceph-data partition?

> Eventually, there is a kernel oops (see below for full details).
>

We also see that oops, but very, very rarely (only twice in the current
uptime epoch), and it's not correlated with the "access beyond end of
device" errors.

Cheers, Dan

> This happens for all drives on all OSD nodes (eventually), so it's
> consistent at least.
>
> Similar to the original post, this RH article has some relevant info:
>
> https://access.redhat.com/solutions/2833571
>
> The article suggests looking at the following values:
>
> Error message disk size (EMDS) = "limit" value in error message * 512
> Current device size (CDS) = `cat /proc/partitions | grep sdi1 | awk '{print $3}'` * 1024
> Filesystem size (FSS) = blocks * bsize (from xfs_info)
>
> # xfs_info /dev/sdi1 | grep data | grep blocks
> data = bsize=4096 blocks=2391801595, imaxpct=5
>
> I end up with these values:
>
> EMDS = 19134412767 * 512 = 9796819336704
> CDS  = 9567206383 * 1024 = 9796819336192 (512 bytes less than EMDS)
> FSS  = 2391801595 * 4096 = 9796819333120 (3072 bytes less than CDS)
>
> FSS < CDS, so that's fine, but EMDS != CDS. Apparently this shouldn't
> be the case; however, these devices have not been renamed, and this is
> 100% reproducible upon reinstallation, so I'm not sure why it happens.
>
> The drives are 10TB 512e drives, so they have a logical sector size of
> 512 and a physical sector size of 4096:
>
> # blockdev --getsz /dev/sdi
> 19134414848
> # blockdev --getsz /dev/sdi1
> 19134412767
> # blockdev --getss /dev/sdi
> 512
> # blockdev --getpbsz /dev/sdi
> 4096
> # blockdev --getbsz /dev/sdi
> 4096
>
> I'm not sure if that's relevant, but thought it might be worth a
> mention. (FSS + 4096 would exceed EMDS, whereas FSS + 512 would not.)
>
> My question would be (as with the OP) - can these errors be ignored?
> Given the oops, I would think not?
>
> Has anybody else experienced this issue?
>
> Could it be related to the mkfs options used by ceph-disk in
> ceph-deploy? I didn't change these, so it used the defaults of:
>
> /usr/sbin/mkfs -t xfs -f -i size=2048 -- /dev/sdi1
>
> Any pointers for how to debug it further and/or fix it?
>
> Cheers,
> Marcus.
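
For anyone who wants to repeat the check from the RH article without doing
the arithmetic by hand, here is a rough, untested shell sketch (the sdi1
name is just the one from the report above; adjust for your own drives).
One thing worth keeping in mind: /proc/partitions reports sizes in 1 KiB
blocks, so a partition with an odd number of 512-byte sectors gets rounded
down there, which on its own would account for a 512-byte gap between EMDS
and CDS.

#!/bin/bash
# Sketch: recompute EMDS, CDS and FSS per https://access.redhat.com/solutions/2833571
# PART is assumed to be an XFS-formatted partition; sdi1 is just an example.
PART=sdi1

# EMDS: the "limit" value in the dmesg error is the partition size in
# 512-byte sectors, so take it straight from the block layer.
limit=$(blockdev --getsz /dev/${PART})
emds=$((limit * 512))

# CDS: /proc/partitions lists sizes in 1 KiB blocks.
blocks_1k=$(awk -v p="${PART}" '$4 == p {print $3}' /proc/partitions)
cds=$((blocks_1k * 1024))

# FSS: data-section block count * block size, from xfs_info.
read bsize fsblocks < <(xfs_info /dev/${PART} | awk -F'[= ,]+' '/^data.*blocks=/ {print $3, $5}')
fss=$((fsblocks * bsize))

echo "EMDS=${emds} CDS=${cds} FSS=${fss}"
[ "${fss}" -le "${cds}" ]  && echo "FSS <= CDS: ok" || echo "FSS > CDS: filesystem larger than device"
[ "${emds}" -eq "${cds}" ] || echo "EMDS != CDS: differs by $((emds - cds)) bytes"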
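
And since the drives are 512e, one more thing that might be worth checking
is whether the partition length is a whole number of 4 KiB physical
sectors. With the numbers quoted above it isn't: 19134412767 is not
divisible by 8, so a 4096-byte request at the very end of the partition
would overrun it by exactly one 512-byte sector, which matches
want = limit + 1 in the dmesg output. A small sketch of that check (again
untested, sdi/sdi1 taken from the report above):

#!/bin/bash
# Sketch: does the partition end on a physical-sector (e.g. 4 KiB) boundary?
DISK=sdi
PART=sdi1

pbsz=$(blockdev --getpbsz /dev/${DISK})        # physical sector size, e.g. 4096
size=$(cat /sys/block/${DISK}/${PART}/size)    # partition length in 512-byte sectors
spp=$((pbsz / 512))                            # 512-byte sectors per physical sector

echo "${PART}: ${size} sectors, physical sector size ${pbsz} bytes"
if [ $((size % spp)) -ne 0 ]; then
    echo "length is not a multiple of ${spp} sectors; a ${pbsz}-byte I/O at the"
    echo "tail of the partition would run $((spp - size % spp)) sector(s) past the end"
else
    echo "length is aligned to the physical sector size"
fi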
>
> [435339.965817] ------------[ cut here ]------------
> [435339.965874] WARNING: at fs/xfs/xfs_aops.c:1244 xfs_vm_releasepage+0xcb/0x100 [xfs]()
> [435339.965876] Modules linked in: vfat fat uas usb_storage mpt3sas mpt2sas raid_class scsi_transport_sas mptctl mptbase iptable_filter dell_rbu team_mode_loadbalance team rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm mlx5_ib ib_core intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd ipmi_devintf iTCO_wdt iTCO_vendor_support mxm_wmi dcdbas pcspkr ipmi_ssif sb_edac edac_core sg mei_me mei lpc_ich shpchp ipmi_si ipmi_msghandler wmi acpi_power_meter nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs sd_mod crc_t10dif crct10dif_generic mgag200 i2c_algo_bit
> [435339.965942]  crct10dif_pclmul crct10dif_common drm_kms_helper crc32c_intel syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm bnx2x ahci libahci mlx5_core i2c_core libata mdio ptp megaraid_sas nvme pps_core libcrc32c fjes dm_mirror dm_region_hash dm_log dm_mod
> [435339.965991] CPU: 8 PID: 223 Comm: kswapd0 Not tainted 3.10.0-514.10.2.el7.x86_64 #1
> [435339.965993] Hardware name: Dell Inc. PowerEdge R730xd/072T6D, BIOS 2.3.4 11/08/2016
> [435339.965994]  0000000000000000 000000006ea9561d ffff881ffc2c7aa0 ffffffff816863ef
> [435339.965998]  ffff881ffc2c7ad8 ffffffff81085940 ffffea00015d4e20 ffffea00015d4e00
> [435339.966000]  ffff880f4d7c5af8 ffff881ffc2c7da0 ffffea00015d4e00 ffff881ffc2c7ae8
> [435339.966003] Call Trace:
> [435339.966010]  [<ffffffff816863ef>] dump_stack+0x19/0x1b
> [435339.966015]  [<ffffffff81085940>] warn_slowpath_common+0x70/0xb0
> [435339.966018]  [<ffffffff81085a8a>] warn_slowpath_null+0x1a/0x20
> [435339.966060]  [<ffffffffa03be56b>] xfs_vm_releasepage+0xcb/0x100 [xfs]
> [435339.966120]  [<ffffffff81180662>] try_to_release_page+0x32/0x50
> [435339.966128]  [<ffffffff811965e6>] shrink_active_list+0x3d6/0x3e0
> [435339.966133]  [<ffffffff811969e1>] shrink_lruvec+0x3f1/0x770
> [435339.966138]  [<ffffffff81196dd6>] shrink_zone+0x76/0x1a0
> [435339.966143]  [<ffffffff8119807c>] balance_pgdat+0x48c/0x5e0
> [435339.966147]  [<ffffffff81198343>] kswapd+0x173/0x450
> [435339.966155]  [<ffffffff810b17d0>] ? wake_up_atomic_t+0x30/0x30
> [435339.966158]  [<ffffffff811981d0>] ? balance_pgdat+0x5e0/0x5e0
> [435339.966161]  [<ffffffff810b06ff>] kthread+0xcf/0xe0
> [435339.966165]  [<ffffffff810b0630>] ? kthread_create_on_node+0x140/0x140
> [435339.966170]  [<ffffffff81696958>] ret_from_fork+0x58/0x90
> [435339.966173]  [<ffffffff810b0630>] ? kthread_create_on_node+0x140/0x140
> [435339.966175] ---[ end trace 58233bbca77fd5e2 ]---
>
> --
> Marcus Furlong

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com