On Fri, Sep 6, 2013 at 12:06 AM, Mihály Árva-Tóth <mihaly.arva-toth@xxxxxxxxxxxxxxxxxxxxxx> wrote:
> Hello,
>
> I have a server with hot-swappable SATA disks. When I remove an HDD from a
> running server, the OSD does not notice that the HDD is missing. The ceph
> health status reports HEALTH_OK and all OSDs are "in" and "up". When I run a
> swift client on another server to get an object that has a chunk on the
> removed disk, radosgw returns 404 Not Found. The OSD's log shows:
>
> 2013-09-05T15:32:21+02:00 stor1 ceph-osd: 2013-09-05 15:32:21.907507
> 7fd415d93700 -1 filestore(/var/lib/ceph/osd/ceph-0) could not find
> dd997afb/default.6125.2__shadow__r2NQ0fgMPvMXi2SC8kd1E0IFrbjw-5g_2/head//12
> in index: (19) No such device
>
> I can reproduce this every time: the swift client gets a false response. In
> this test the cluster receives no write operations at all from radosgw. Why
> does the OSD not notice that its HDD is missing?
>
> When I try to upload via swift, the OSD tries to write a chunk to the HDD,
> hits an error (missing HDD), and the ceph-osd daemon terminates; the mons
> notice the OSD's ping loss and update the monmap. So it seems the OSD can
> detect a missing HDD only when it tries to write, not on both reads and
> writes.

Well, that's interesting. I thought we'd set up proper filters on all the
error codes we can get back from the FS, but some combination of this error
and the read path must have gotten missed. I've created a ticket:
http://tracker.ceph.com/issues/6250
-Greg

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
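
To make the idea of "filters on the error codes we can get back from the FS" concrete, here is a minimal sketch in Python. It is not Ceph's code and does not reflect how the filestore actually handles errors. The names classify_fs_error, read_object and BackingStoreGone are hypothetical, and the set of errno values treated as fatal is an assumption chosen for illustration. The point is only that a read hitting "No such device" could be escalated as a device failure instead of being reported as a missing object.

```python
import errno

# Assumption for illustration: errno values that suggest the backing device
# itself is gone or failing, as opposed to a single object being absent.
FATAL_DEVICE_ERRNOS = {errno.ENODEV, errno.ENXIO, errno.EIO}


class BackingStoreGone(Exception):
    """Hypothetical exception: the storage behind this daemon is unusable."""


def classify_fs_error(err):
    """Map an errno value to 'fatal', 'not_found', or 'other'."""
    if err in FATAL_DEVICE_ERRNOS:
        return "fatal"
    if err == errno.ENOENT:
        return "not_found"
    return "other"


def read_object(path, opener=open):
    """Read a file, escalating device-level errors on the read path.

    `opener` is injectable so the behaviour can be exercised without
    physically removing a disk.
    """
    try:
        with opener(path, "rb") as f:
            return f.read()
    except OSError as e:
        if classify_fs_error(e.errno) == "fatal":
            # Escalate rather than let the caller treat this as "not found".
            raise BackingStoreGone(f"{path}: {e.strerror}") from e
        raise
```

With a filter like this, a caller could stop serving requests and let itself be marked down when BackingStoreGone is raised, while an ordinary missing file would still surface as FileNotFoundError.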