On 21/07/2012 2:12 AM, Tommi Virtanen wrote:
On Fri, Jul 20, 2012 at 9:17 AM, Vladimir Bashkirtsev
<vladimir@xxxxxxxxxxxxxxx> wrote:
not running. So I ended up rebooting the hosts, and that's where the fun
began: btrfs failed to unmount, and on boot it spat out "btrfs: free space
inode generation (0) did not match free space cache generation (177431)". I
had not started ceph; I attempted to unmount, and umount just froze.
Another reboot: same thing. I rebooted the second host and it came back
with the same error. So in effect I was unable to mount btrfs and read it:
no wonder that ceph was unable to run. Actually according to mons ceph was
The btrfs developers tend to be good about bug reports that severe --
I think you should email that mailing list and ask if that sounds like
a known bug, and ask what information you should capture if it happens
again (assuming the workload is complex enough that you can't easily
capture/reproduce all of that).
Well... The workload was fairly high, not something MySQL usually sees.
Our client keeps imagery in MySQL and his system was regenerating
images (it takes a hi-res image and produces five or six smaller images
plus a watermark). The job runs imagemagick, which keeps its temporary
data on disk (and to ceph it is not really temporary data: it is data
which must be committed to the osds), and then innodb in MySQL stores
the results, which of course creates a number of pages and so appears as
random writes to the underlying file system. From what I have seen, the
write traffic created by this process was in the TB range (my whole ceph
cluster is just 3.3TB). So it was a considerable amount of change on the
filesystem. I guess if we start that process again we will end up with a
similar result in a few days, but for some reason I don't want to try it
on a production system :)
I can scavenge something from the logs and post it to the btrfs devs.
Thanks for the tip.
But it leaves me with one final question: should we rely on btrfs at this
point, given that it has such major faults? What if I use the time-tested
ext4 instead?
You might want to try xfs. We hear/see problems with all three, but
xfs currently seems to have the best long-term performance and
reliability.
I'm not sure if anyone's run detailed tests with ext4 after the
xattrs-in-leveldb feature; before that, we ran into fs limitations.
That's what I was thinking: before xattrs-in-leveldb I did not even
consider ext4 a viable alternative, but now it may be reasonable to give
it a go. Or maybe even have a mix of osds backed by different file
systems? What is the devs' opinion on this?