The question:
Is this something I need to investigate further, or am I being paranoid? Seems bad to me.
============================================================
I recently noticed alarming dmesg log entries on every one of my OSD nodes, appearing for each OSD on some kind of periodic basis:
attempt to access beyond end of device
sda1: rw=0, want=11721043088, limit=11721043087
For instance, one node had entries at these times:
Sep 27 05:40:34
Sep 27 07:10:32
Sep 27 08:10:30
Sep 27 09:40:28
Sep 27 12:40:24
Sep 27 15:40:19
In every case, the "want" is 1 sector greater than the "limit". My first thought was: could this be an off-by-one bug somewhere in Ceph? But after thinking about how this works, and looking at the data below, that seems unlikely.
Digging around, I found and followed this Red Hat article:
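The off-by-one relationship can be sanity-checked in a few lines (a sketch, assuming as the kernel's block-layer check does that "limit" is the partition size in 512-byte sectors, so valid sector indices run 0 .. limit-1, and "want" is the sector the read tried to reach):

```python
# Values copied from the dmesg entry above.
want = 11721043088   # sector the read (rw=0) tried to reach
limit = 11721043087  # sda1 size in 512-byte sectors

# Valid sectors are 0 .. limit-1, so this request runs exactly
# one sector past the end of the partition.
overshoot = want - limit
print(overshoot)  # 1
```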
----------------------------------------------
Error Message Device Size:
11721043087 * 512 = 6001174060544
Current Device Size:
cat /proc/partitions | grep sda1
8 1 5860521543 sda1
5860521543 * 1024 = 6001174060032
Filesystem Size:
sudo xfs_info /dev/sda1 | grep data | grep blocks
data     =              bsize=4096   blocks=1465130385, imaxpct=5
1465130385 * 4096 = 6001174056960
----------------------------------------------
(EMDS != CDS) == true
Red Hat says the device naming may have changed. All but 2 disks in the node are identical; those 2 disks are md-RAIDed and not exhibiting the issue. So, I don't think this is the cause.
(FSS > CDS) == false
My filesystem is not larger than the device size or the error message device size.
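The size comparisons above can be redone in a few lines (a sketch using the numbers quoted from the Red Hat article's three views of the device, each converted to bytes):

```python
# Three views of the device size, all in bytes.
emds = 11721043087 * 512   # error-message device size (512-byte sectors)
cds  = 5860521543 * 1024   # /proc/partitions size (1 KiB blocks)
fss  = 1465130385 * 4096   # XFS data section size (4 KiB blocks)

print(emds)         # 6001174060544
print(cds)          # 6001174060032
print(fss)          # 6001174056960
print(emds != cds)  # True  -> kernel and partition table disagree by one sector
print(fss > cds)    # False -> the filesystem fits within the device
```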
Thanks,
Brady
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com