The question:
Is this something I need to investigate further, or am I being paranoid? Seems bad to me.
============================================================
I recently noticed alarming dmesg log entries on every one of my OSD nodes, appearing for each OSD on some kind of periodic basis:
attempt to access beyond end of device
sda1: rw=0, want=11721043088, limit=11721043087
For instance, one node had entries at these times:
Sep 27 05:40:34
Sep 27 07:10:32
Sep 27 08:10:30
Sep 27 09:40:28
Sep 27 12:40:24
Sep 27 15:40:19
In every case, the "want" is 1 sector greater than the "limit". My first thought was: could this be an off-by-one bug somewhere in Ceph? But after thinking about how this works, and looking at the data below, that seems unlikely.
Digging around, I found and followed this Red Hat article:
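The off-by-one relationship can be sanity-checked in a few lines (a sketch, assuming as the kernel's block-layer check does that "limit" is the partition size in 512-byte sectors, so valid sector indices run 0 .. limit-1, and "want" is the sector the read tried to reach):

```python
# Values copied from the dmesg entry above.
want = 11721043088   # sector the read (rw=0) tried to reach
limit = 11721043087  # sda1 size in 512-byte sectors

# Valid sectors are 0 .. limit-1, so this request runs exactly
# one sector past the end of the partition.
overshoot = want - limit
print(overshoot)  # 1
```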
----------------------------------------------
Error Message Device Size:
11721043087 * 512 = 6001174060544
Current Device Size:
cat /proc/partitions | grep sda1
8 1 5860521543 sda1
5860521543 * 1024 = 6001174060032
Filesystem Size:
sudo xfs_info /dev/sda1 | grep data | grep blocks
data     =              bsize=4096   blocks=1465130385, imaxpct=5
1465130385 * 4096 = 6001174056960
----------------------------------------------
(EMDS != CDS) == true
Red Hat says the device naming may have changed. All but 2 disks in the node are identical; those 2 disks are md-RAIDed and not exhibiting the issue. So, I don't think this is the cause.
(FSS > CDS) == false
My filesystem is not larger than the device size or the error message device size.
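The size comparisons above can be redone in a few lines (a sketch using the numbers quoted from the Red Hat article's three views of the device, each converted to bytes):

```python
# Three views of the device size, all in bytes.
emds = 11721043087 * 512   # error-message device size (512-byte sectors)
cds  = 5860521543 * 1024   # /proc/partitions size (1 KiB blocks)
fss  = 1465130385 * 4096   # XFS data section size (4 KiB blocks)

print(emds)         # 6001174060544
print(cds)          # 6001174060032
print(fss)          # 6001174056960
print(emds != cds)  # True  -> kernel and partition table disagree by one sector
print(fss > cds)    # False -> the filesystem fits within the device
```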
Thanks,
Brady
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com