Okay, thanks for the information. Sam has walked through trying to fix
this before and I don't know if he came up with anything, but the use
of btrfs compression has been a common theme among those who have
reproduced this bug. I updated the ticket, but for now I'd recommend
leaving it off, as on the rest of your machines.

John, can you add a warning to whatever install/configuration docs are
appropriate?
-Greg

On Tue, Oct 9, 2012 at 12:50 PM, Dave (Bob) <dave@xxxxxxxxxxxxxxxxxx> wrote:
> Greg,
>
> Thank you very much for your prompt reply.
>
> Yes, I am using lzo compression, and autodefrag.
>
> David
>
>
> On 09/10/2012 20:45, Gregory Farnum wrote:
>> I'm going to have to leave most of these questions for somebody else,
>> but I do have one question. Are you using btrfs compression on your
>> OSD backing filesystems?
>> -Greg
>>
>> On Tue, Oct 9, 2012 at 12:43 PM, Dave (Bob) <dave@xxxxxxxxxxxxxxxxxx> wrote:
>>> I have a problem with this leveldb corruption issue. My logs show
>>> the same failure as is shown in Ceph's Redmine as bug #2563.
>>>
>>> I am using linux-3.6.0 (x86_64) and ceph-0.52.
>>>
>>> I am using btrfs on my 4 OSDs. Each OSD uses a partition on a disk
>>> drive; there are 4 disk drives, all in the same machine.
>>>
>>> Each of these OSD partitions takes up the bulk of its disk. There
>>> are also partitions that provide for booting and a root filesystem
>>> from which Linux runs.
>>>
>>> The mon and mds are running on the same machine.
>>>
>>> I have been tracking Ceph releases for about a year; this is my
>>> Ceph test machine.
>>>
>>> Ceph clearly hammers the disk system, btrfs, and Linux. Things have
>>> come a long way over the past six months, from a time when
>>> everything would crash horribly in short order to the point where
>>> it almost works.
>>>
>>> I have had a lot of trouble with the 'slow response' messages
>>> associated with the OSDs, but linux-3.6.0 seems to have brought
>>> noticeable improvements in btrfs. I am also tuning
>>> 'dirty_background_ratio', and I think that this will help.
>>>
>>> With my current configuration, I can leave Ceph and my OSDs
>>> churning data for days on end, and the only errors that I get are
>>> the leveldb 'std::__throw_length_error' pattern. The OSDs go down
>>> and can't be brought back up.
>>>
>>> I have compiled the 'check.cc' program that I found by following
>>> the bug #2563 links. When I copy the omap directory from my broken
>>> OSD (current or snaps) and run the check on it, I get:
>>>
>>> terminate called after throwing an instance of 'std::length_error'
>>>
>>> In the past, I've had only one OSD at a time go down in this way,
>>> and I've re-created a btrfs filesystem and allowed Ceph to
>>> regenerate. Now I have been working with only 3 OSDs and two have
>>> gone down simultaneously. I've been amazed at Ceph's ability to
>>> repair itself, but I think that this is not going to be
>>> recoverable.
>>>
>>> On the Ceph Redmine, it says:
>>>
>>> * *Status* changed from /New/ to /Can't reproduce/
>>>
>>> I can reproduce this time and time again. From my perspective it
>>> looks like the final obstacle to my being confident that all I have
>>> to do is optimise my hardware and configuration to make things
>>> faster.
>>>
>>> What can we do to fix this problem?
>>>
>>> Is there anything that I can do to recover my broken OSDs without
>>> recreating them afresh and losing the data?
>>>
>>> David Humphreys
>>> Datatone Ltd
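
A note for anyone following along: with btrfs, the compression and
autodefrag behaviour David describes comes from mount options, so
"leaving it off" just means dropping those options from the OSD
mounts. A minimal sketch, assuming the OSD data sits on /dev/sdb1
mounted at /var/lib/ceph/osd/ceph-0 (device and path are illustrative):

    # /etc/fstab entry matching David's setup (lzo compression + autodefrag):
    #/dev/sdb1  /var/lib/ceph/osd/ceph-0  btrfs  noatime,compress=lzo,autodefrag  0 2

    # Entry with compression and autodefrag left off, per Greg's advice:
    /dev/sdb1  /var/lib/ceph/osd/ceph-0  btrfs  noatime  0 2

    # Apply without recreating the filesystem:
    umount /var/lib/ceph/osd/ceph-0
    mount /var/lib/ceph/osd/ceph-0

Note that remounting without compress= does not decompress data
already on disk; it only stops new extents from being compressed.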
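
David also mentions tuning dirty_background_ratio. For reference,
that's the vm sysctl controlling when background writeback kicks in,
expressed as a percentage of memory; lowering it makes writeback start
earlier, so dirty data drains in smaller, steadier bursts instead of
big stalls. A sketch (the value 5 is illustrative, not a
recommendation from this thread):

    # Check the current value (the kernel default is typically 10):
    cat /proc/sys/vm/dirty_background_ratio

    # Start background writeback once dirty pages exceed 5% of memory:
    sysctl vm.dirty_background_ratio=5

    # Persist across reboots by adding to /etc/sysctl.conf:
    #   vm.dirty_background_ratio = 5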
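
For anyone else wanting to reproduce the check David ran: check.cc
presumably links against leveldb and walks the store. A sketch of
compiling and running it against a copy of the omap directory (the
paths, the OSD id, and the assumption that it takes the omap path as
its argument are mine, not from the bug tracker):

    # Build (assumes the leveldb headers and library are installed):
    g++ -o check check.cc -lleveldb -lpthread

    # Always work on a copy, so the broken store is preserved as evidence:
    cp -a /var/lib/ceph/osd/ceph-0/current/omap /tmp/omap-copy
    ./check /tmp/omap-copy

On a corrupted store this aborts with the same 'std::length_error'
David reports.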
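
And for completeness, the recreate-and-regenerate procedure David
describes for a single failed OSD looks roughly like this in the 0.5x
era (the device, OSD id, and init invocation are assumptions; treat
this as a sketch, not a tested recipe):

    # Mark the OSD out so the cluster rebalances away from it:
    ceph osd out 0
    /etc/init.d/ceph stop osd.0

    # Recreate the backing filesystem and remount it:
    umount /var/lib/ceph/osd/ceph-0
    mkfs.btrfs /dev/sdb1
    mount /dev/sdb1 /var/lib/ceph/osd/ceph-0

    # Reinitialise the OSD's data directory, then bring it back in:
    ceph-osd -i 0 --mkfs --mkkey
    /etc/init.d/ceph start osd.0
    ceph osd in 0

This only works while enough replicas survive elsewhere, which is
exactly why two of three OSDs failing simultaneously is likely
unrecoverable.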