On 5/7/14 13:40 , Sergey Malinin wrote: > Check dmesg and SMART data on both nodes. This behaviour is similar to > failing hdd. > > It does sound like a failing disk... but there's nothing in dmesg, and smartmontools hasn't emailed me about a failing disk. The same thing is happening to more than 50% of my OSDs, in both nodes. smartctl for osd.5 says: SMART Attributes Data Structure revision number: 16 Vendor Specific SMART Attributes with Thresholds: ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE 1 Raw_Read_Error_Rate 0x000b 100 100 016 Pre-fail Always - 0 2 Throughput_Performance 0x0005 136 136 054 Pre-fail Offline - 81 3 Spin_Up_Time 0x0007 100 100 024 Pre-fail Always - 606 4 Start_Stop_Count 0x0012 100 100 000 Old_age Always - 5 5 Reallocated_Sector_Ct 0x0033 100 100 005 Pre-fail Always - 0 7 Seek_Error_Rate 0x000b 100 100 067 Pre-fail Always - 0 8 Seek_Time_Performance 0x0005 119 119 020 Pre-fail Offline - 35 9 Power_On_Hours 0x0012 100 100 000 Old_age Always - 4028 10 Spin_Retry_Count 0x0013 100 100 060 Pre-fail Always - 0 12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 5 192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 166 193 Load_Cycle_Count 0x0012 100 100 000 Old_age Always - 166 194 Temperature_Celsius 0x0002 166 166 000 Old_age Always - 36 (Min/Max 21/39) 196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always - 0 197 Current_Pending_Sector 0x0022 100 100 000 Old_age Always - 0 198 Offline_Uncorrectable 0x0008 100 100 000 Old_age Offline - 0 199 UDMA_CRC_Error_Count 0x000a 200 200 000 Old_age Always - 0 The weekly scheduled tests have all completed successfully: SMART Self-test log structure revision number 1 Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error # 1 Short offline Completed without error 00% 3922 - # 2 Short offline Completed without error 00% 3754 - ... -- *Craig Lewis* Senior Systems Engineer Office +1.714.602.1309 Email clewis at centraldesktop.com <mailto:clewis at centraldesktop.com> *Central Desktop. Work together in ways you never thought possible.* Connect with us Website <http://www.centraldesktop.com/> | Twitter <http://www.twitter.com/centraldesktop> | Facebook <http://www.facebook.com/CentralDesktop> | LinkedIn <http://www.linkedin.com/groups?gid=147417> | Blog <http://cdblog.centraldesktop.com/> -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.ceph.com/pipermail/ceph-users-ceph.com/attachments/20140507/6f117095/attachment.htm>