Hi,

just for the record:

A reboot of the OSD node solved the issue, the WAL is now fully purged and
the extra 790 MB are gone.

Sorry for the noise.

Dietmar

On 01/27/2018 11:08 AM, Dietmar Rieder wrote:
> Hi,
>
> replying to my own message.
>
> After I restarted the OSD it seems some of the WAL partition got purged.
> However, there are still ~790 MB used. As far as I understand, it should get
> completely emptied; at least this is what happens when I restart another
> OSD, where its associated WAL gets completely flushed.
> Is it somehow possible to reinitialize the WAL for the OSD in question?
>
> Thanks
> Dietmar
>
>
> On 01/26/2018 05:11 PM, Dietmar Rieder wrote:
>> Hi all,
>>
>> I have a question regarding the BlueStore WAL/DB:
>>
>> We are running a cluster of 10 OSD nodes + 3 MON/MDS nodes (Luminous 12.2.2).
>> Each OSD node has 22x HDD (8 TB) OSDs, 2x SSD (1.6 TB) OSDs and 2x NVMe
>> (800 GB) for the BlueStore WAL and DB.
>>
>> We have separate WAL and DB partitions:
>> WAL partitions are 1 GB
>> DB partitions are 64 GB
>>
>> The cluster is providing CephFS from one HDD (EC 6+3) and one SSD
>> (3x replicated) pool.
>> Since the cluster is "new" we do not have much data stored on it yet,
>> ~30 TB (HDD EC) and ~140 GB (SSD rep).
>>
>> I just noticed that the WAL usage for the SSD OSDs is all more or less
>> equal, ~518 MB. The WAL usage for the HDD OSDs is also quite balanced
>> at 284-306 MB, however there is one OSD whose WAL usage is ~1 GB:
>>
>> "bluefs": {
>>     "gift_bytes": 0,
>>     "reclaim_bytes": 0,
>>     "db_total_bytes": 68719468544,
>>     "db_used_bytes": 1114636288,
>>     "wal_total_bytes": 1073737728,
>>     "wal_used_bytes": 1072693248,
>>     "slow_total_bytes": 320057901056,
>>     "slow_used_bytes": 0,
>>     "num_files": 16,
>>     "log_bytes": 862326784,
>>     "log_compactions": 0,
>>     "logged_bytes": 850575360,
>>     "files_written_wal": 2,
>>     "files_written_sst": 9,
>>     "bytes_written_wal": 744469265,
>>     "bytes_written_sst": 568855830
>> },
>>
>> and I got the following log entries:
>>
>> 2018-01-26 16:31:05.484284 7f65ea28a700  1 bluefs _allocate failed to
>> allocate 0x400000 on bdev 0, free 0xff000; fallback to bdev 1
>>
>> Is there any reason for this difference, ~300 MB vs. ~1 GB?
>> My understanding is that 1 GB of WAL should be enough, and that old logs
>> should be purged to free space. (Can this be triggered manually?)
>>
>> Could this be related to the fact that the HDD OSD in question failed a
>> few weeks ago and we replaced it with a new HDD?
>>
>> Do we have to expect problems or performance degradation from the
>> fallback to bdev 1?
>>
>> Thanks for any clarifying comment
>> Dietmar
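
For anyone wanting to check the same counters on their own cluster: the
"bluefs" block quoted above is part of the output of
"ceph daemon osd.<id> perf dump" on the OSD node. Below is a minimal sketch
(not from the original thread) of how the WAL/DB utilisation could be
summarised per OSD; it assumes the script runs on the OSD node with access
to the admin sockets, and the osd_ids list is a placeholder to be adjusted
to the OSDs hosted locally.

    #!/usr/bin/env python
    # Sketch: summarise BlueFS WAL/DB usage for locally hosted OSDs by
    # reading "ceph daemon osd.<id> perf dump" via the admin socket.
    import json
    import subprocess

    osd_ids = [0, 1, 2]  # placeholder: set to the OSD ids on this node

    for osd in osd_ids:
        out = subprocess.check_output(
            ["ceph", "daemon", "osd.{}".format(osd), "perf", "dump"])
        bluefs = json.loads(out.decode())["bluefs"]
        wal_pct = 100.0 * bluefs["wal_used_bytes"] / bluefs["wal_total_bytes"]
        db_pct = 100.0 * bluefs["db_used_bytes"] / bluefs["db_total_bytes"]
        print("osd.{}: wal {:.0f}% used, db {:.0f}% used, slow_used_bytes={}".format(
            osd, wal_pct, db_pct, bluefs["slow_used_bytes"]))

A WAL that stays near 100% used (as in the quoted output) or a non-zero
slow_used_bytes would be the kind of condition this check is meant to flag.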