Hello,

On Fri, 06 Oct 2017 03:30:41 +0000 David Turner wrote:

> You're missing almost all of the important bits: what the OSDs in your
> cluster look like, your tree, and your cache pool settings.
>
> ceph df
> ceph osd df
> ceph osd tree
> ceph osd pool get cephfs_cache all
>
Especially the last one.
My money is on not having set target_max_objects and target_max_bytes to
sensible values along with the ratios.
In short, not having read the (albeit spotty) documentation.
(An illustrative example of setting these is appended at the very end of
this mail.)

> You have your writeback cache on 3 nvme drives. It looks like you have
> 1.6TB available between them for the cache. I don't know the behavior of a
> writeback cache tier on cephfs for large files, but I would guess that it
> can only hold full files and not flush partial files.
>
I VERY much doubt that; if so, it would be a massive flaw.
One assumes that cache operations work on the RADOS object level, no
matter what.

> That would mean your
> cache needs to have enough space for any file being written to the cluster.
> In this case a 1.3TB file with 3x replication would require 3.9TB (more
> than double what you have available) of available space in your writeback
> cache.
>
> There are very few use cases that benefit from a cache tier. The docs for
> Luminous warn as much.
>
You keep repeating that like a broken record.
And while that is certainly not false, I for one wouldn't be able to use
(justify using) Ceph w/o cache tiers in our main use case.

In this case I assume they were following an old cheat sheet or such,
suggesting the previously required cache tier with EC pools.

Christian

> What is your goal by implementing this cache? If the
> answer is to utilize extra space on the nvmes, then just remove it and say
> thank you. The better use of nvmes in that case is as part of the
> bluestore stack, giving your osds larger DB partitions. Keeping your
> metadata pool on nvmes is still a good idea.
>
> On Thu, Oct 5, 2017, 7:45 PM Shawfeng Dong <shaw@xxxxxxxx> wrote:
>
> > Dear all,
> >
> > We just set up a Ceph cluster, running the latest stable release Ceph
> > v12.2.0 (Luminous):
> > # ceph --version
> > ceph version 12.2.0 (32ce2a3ae5239ee33d6150705cdb24d43bab910c) luminous (rc)
> >
> > The goal is to serve the Ceph filesystem, for which we created 3 pools:
> > # ceph osd lspools
> > 1 cephfs_data,2 cephfs_metadata,3 cephfs_cache,
> > where
> > * cephfs_data is the data pool (36 OSDs on HDDs), which is erasure-coded;
> > * cephfs_metadata is the metadata pool;
> > * cephfs_cache is the cache tier (3 OSDs on NVMes) for cephfs_data. The
> >   cache-mode is writeback.
> >
> > Everything had worked fine, until today when we tried to copy a 1.3TB file
> > to the CephFS. We got the "No space left on device" error!
> >
> > 'ceph -s' says some OSDs are full:
> > # ceph -s
> >   cluster:
> >     id:     e18516bf-39cb-4670-9f13-88ccb7d19769
> >     health: HEALTH_ERR
> >             full flag(s) set
> >             1 full osd(s)
> >             1 pools have many more objects per pg than average
> >
> >   services:
> >     mon: 3 daemons, quorum pulpo-admin,pulpo-mon01,pulpo-mds01
> >     mgr: pulpo-mds01(active), standbys: pulpo-admin, pulpo-mon01
> >     mds: pulpos-1/1/1 up {0=pulpo-mds01=up:active}
> >     osd: 39 osds: 39 up, 39 in
> >          flags full
> >
> >   data:
> >     pools:   3 pools, 2176 pgs
> >     objects: 347k objects, 1381 GB
> >     usage:   2847 GB used, 262 TB / 265 TB avail
> >     pgs:     2176 active+clean
> >
> >   io:
> >     client: 19301 kB/s rd, 2935 op/s rd, 0 op/s wr
> >
> > And indeed the cache pool is full:
> > # rados df
> > POOL_NAME       USED  OBJECTS CLONES COPIES MISSING_ON_PRIMARY UNFOUND DEGRADED RD_OPS   RD    WR_OPS  WR
> > cephfs_cache    1381G 355385  0      710770 0                  0       0        10004954 1522G 1398063 1611G
> > cephfs_data     0     0       0      0      0                  0       0        0        0     0       0
> > cephfs_metadata 8515k 24      0      72     0                  0       0        3        3072  3953    10541k
> >
> > total_objects    355409
> > total_used       2847G
> > total_avail      262T
> > total_space      265T
> >
> > However, the data pool is completely empty! So it seems that data has only
> > been written to the cache pool, but not written back to the data pool.
> >
> > I am really at a loss whether this is due to a setup error on my part, or
> > a Luminous bug. Could anyone shed some light on this? Please let me know if
> > you need any further info.
> >
> > Best,
> > Shaw
> > _______________________________________________
> > ceph-users mailing list
> > ceph-users@xxxxxxxxxxxxxx
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >

-- 
Christian Balzer        Network/Systems Engineer
chibi@xxxxxxx           Rakuten Communications

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
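
For reference, a minimal sketch of the cache-tier limits referred to above.
This is an illustration only: the pool name cephfs_cache is taken from the
thread, the byte/object values are assumptions (not recommendations) and
would need to be sized to the actual NVMe capacity; target_max_bytes counts
data stored in the pool before replication, and the ratios are relative to
these targets:

  ceph osd pool set cephfs_cache target_max_bytes 400000000000   # example value only
  ceph osd pool set cephfs_cache target_max_objects 1000000      # example value only
  ceph osd pool set cephfs_cache cache_target_dirty_ratio 0.4    # start flushing dirty objects at 40% of target
  ceph osd pool set cephfs_cache cache_target_dirty_high_ratio 0.6
  ceph osd pool set cephfs_cache cache_target_full_ratio 0.8     # start evicting clean objects at 80% of target

Without target_max_bytes/target_max_objects set, the ratios have nothing to
be relative to, so the tiering agent never flushes or evicts and the cache
simply fills up until the full flag is set. An already-full cache can be
drained by hand with:

  rados -p cephfs_cache cache-flush-evict-all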