We recently hit an issue with xattrs on different underlying filesystems. The basic problem is that we use xattrs for various metadata that we attach to objects, and in some cases that metadata grows relatively large. The underlying filesystems that we currently use (btrfs, ext3, ext4) all place limits on xattr sizes. On ext3/4 the total size of all xattrs on a single file is limited and can't go beyond a single block (typically 4k), whereas on btrfs the limit applies to each xattr individually, not to the total.

This problem was discovered when we changed some error handling code, which previously just ignored those failures and now crashed the osds in a very noticeable way.

For btrfs we've worked around the problem by splitting larger xattrs into chunks, so that we never write a single xattr larger than the underlying filesystem can digest. For ext3/4 this can't really work: the limit is on the total xattr size for a single file, so splitting doesn't help there.

The best solution would be to have this fixed in ext4 itself, but at this point that's not something we want to dive into.

Another solution would be a separate file that holds the metadata, or at least the parts of the metadata that don't fit in the xattrs. This is not optimal, as it adds complexity to the filestore layer and slows it down. There is more to it than reading and writing xattrs for a single object: we'd also need to handle operations like object cloning, snapshotting, removal, etc. There would be another file to take care of for each object, and that brings its own complexities.

A possible workaround would be to identify the specific cases where the xattrs inflate and work around just those cases, but that isn't an optimal solution either.
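To make the btrfs chunking workaround above a bit more concrete, here is a minimal sketch of the idea. It is not the actual filestore code: the chunk naming scheme (`name`, `name@1`, `name@2`, ...), the 2048-byte per-xattr limit, and the use of a plain dict standing in for a file's xattrs are all assumptions made for illustration.

```python
# Hypothetical sketch: the "@N" naming scheme and CHUNK_SIZE are assumptions
# for illustration, not Ceph's real on-disk format. A dict stands in for the
# xattrs of a single file.
CHUNK_SIZE = 2048  # assumed per-xattr limit, in bytes


def chunk_name(name, index):
    # The first chunk keeps the original name; later chunks get an "@N" suffix.
    return name if index == 0 else f"{name}@{index}"


def remove_chunked(store, name):
    """Delete every chunk belonging to name."""
    index = 0
    while chunk_name(name, index) in store:
        del store[chunk_name(name, index)]
        index += 1


def set_chunked(store, name, value, chunk_size=CHUNK_SIZE):
    """Write value as one or more xattrs, each at most chunk_size bytes."""
    remove_chunked(store, name)  # drop stale chunks left by a longer old value
    chunks = [value[i:i + chunk_size]
              for i in range(0, len(value), chunk_size)] or [b""]
    for index, chunk in enumerate(chunks):
        store[chunk_name(name, index)] = chunk


def get_chunked(store, name):
    """Reassemble a value by reading chunks until one is missing."""
    if name not in store:
        raise KeyError(name)
    parts = []
    index = 0
    while chunk_name(name, index) in store:
        parts.append(store[chunk_name(name, index)])
        index += 1
    return b"".join(parts)


def total_value_bytes(store):
    """Sum of all stored value sizes, i.e. what a per-file limit would count."""
    return sum(len(v) for v in store.values())
```

Note that chunking keeps every individual xattr under the per-xattr limit but leaves the total number of bytes stored on the file unchanged, which is why the same trick does not help on ext3/4, where the limit applies to the sum.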
As Sage pointed out in the 0.22.1 release notes, at this point we have partially reverted to the silent-ignore handling on ext3/4 so that we don't crash the osds when we hit the limit. It seems we only hit it in specific cases where we think ignoring it is safe, as we only saw it on stray directories that aren't going to be read anyway. But there is still a lingering problem: a user who sets large xattrs via librados or through the rados gateway will hit the same issue.

Ted, is this a fixed ext4 limitation that isn't going to change in the future? Knowing that will help us decide which solution to focus our efforts on.

Yehuda