Hello,
I'm using Ubuntu 12.04-x64 and Dumpling from Ceph's deb-repository.
I have a server with hot-swappable SATA disks. When I remove an HDD from a running server, the OSD does not notice that the HDD is missing: the ceph health status shows HEALTH_OK and all OSDs are "in" and "up". When I run a swift client on another server to get an object that has one of its chunks on the removed disk, radosgw returns 404 Not Found. The OSD's log shows:
2013-09-05T15:32:21+02:00 stor1 ceph-osd: 2013-09-05 15:32:21.907507 7fd415d93700 -1 filestore(/var/lib/ceph/osd/ceph-0) could not find dd997afb/default.6125.2__shadow__r2NQ0fgMPvMXi2SC8kd1E0IFrbjw-5g_2/head//12 in index: (19) No such device
I can reproduce this every time; the swift client gets a false response. During this test the cluster receives no write operations at all from radosgw. Why does the OSD not notice that its HDD is missing?
When I try to upload via swift, the OSD tries to write a chunk to the HDD, hits an error (missing HDD), and the ceph-osd daemon terminates; the mons notice the lost OSD pings and update the monmap. So it seems the OSD can detect a missing HDD only when it tries to write, not when it only reads.
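To make the read-versus-write difference concrete, here is a minimal sketch of an external probe I could run against the OSD data directory (/var/lib/ceph/osd/ceph-0 in the log above). It is not part of Ceph; the function name probe_dir and the scratch-file prefix are my own assumptions. It tries a directory read first and then a small write with fsync, and reports the errno name of the first failure. I would expect the read step to sometimes succeed from cache even after the disk is pulled, so the write plus fsync is probably the more reliable half.

```python
import errno
import os
import tempfile


def probe_dir(path):
    """Probe `path` with a read and a small synced write.

    Returns (True, "ok") if both succeed, otherwise (False, <errno name>).
    Hypothetical helper for illustration; not a Ceph API.
    """
    try:
        # Read probe: list the directory. This may be answered from the
        # kernel's cache, so success here does not prove the disk is present.
        os.listdir(path)

        # Write probe: create a scratch file, force it to the device, remove it.
        fd, scratch = tempfile.mkstemp(prefix=".probe-", dir=path)
        try:
            os.write(fd, b"probe")
            os.fsync(fd)  # forces the data to the device, so a missing disk should surface as an error
        finally:
            os.close(fd)
            os.unlink(scratch)
    except OSError as exc:
        # Map the numeric errno (e.g. 19) to its symbolic name (e.g. ENODEV).
        return False, errno.errorcode.get(exc.errno, str(exc.errno))
    return True, "ok"


if __name__ == "__main__":
    # Path taken from the OSD log above; adjust per OSD.
    print(probe_dir("/var/lib/ceph/osd/ceph-0"))
```

Something like this could run periodically for each OSD directory on the host, but it only works around the problem from outside; my question about why the OSD itself does not notice remains.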
Thank you,
Mihaly
_______________________________________________ ceph-users mailing list ceph-users@xxxxxxxxxxxxxx http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com