Hi,

> I don't think the default osd_min_pg_log_entries has changed recently.
> In https://tracker.ceph.com/issues/47775 I proposed that we limit the
> pg log length by memory -- if it is indeed possible for log entries to
> get into several MB, then this would be necessary IMHO.

I've had a surprising crash course on pg_log in the last 36 hours. But about the size of each entry, you're right: I counted pg log * OSDs, and did not factor in pg log * OSDs * pgs per OSD. Still, the total memory an OSD process was using for pg_log was ~22 GB.

> But you said you were trimming PG logs with the offline tool? How long
> were those logs that needed to be trimmed?

The logs we are trimming were ~3000 entries; we trimmed them to the new size of 500. After restarting the OSDs, this dropped the pg_log memory usage from ~22 GB to what we guess is 2-3 GB, but with the cluster in this state it's hard to be specific.

Cheers,
Kalle

> -- dan
>
>
> On Tue, Nov 17, 2020 at 11:58 AM Kalle Happonen <kalle.happonen@xxxxxx> wrote:
>>
>> Another idea, which I don't know if it has any merit.
>>
>> If 8 MB is a realistic log size (or has this grown for some reason?), did the
>> enforcement (or default) of the minimum value change lately
>> (osd_min_pg_log_entries)?
>>
>> If the minimum were set to 1000, at 8 MB per log, we would have
>> issues with memory.
>>
>> Cheers,
>> Kalle
>>
>>
>>
>> ----- Original Message -----
>> > From: "Kalle Happonen" <kalle.happonen@xxxxxx>
>> > To: "Dan van der Ster" <dan@xxxxxxxxxxxxxx>
>> > Cc: "ceph-users" <ceph-users@xxxxxxx>
>> > Sent: Tuesday, 17 November, 2020 12:45:25
>> > Subject: Re: osd_pglog memory hoarding - another case
>>
>> > Hi Dan & co.,
>> > Thanks for the support (moral and technical).
>> >
>> > That sounds like a good guess, but it seems there is nothing alarming here.
>> > In all our pools, some pgs are a bit over 3100, but not at any exceptional
>> > values.
>> >
>> > cat pgdumpfull.txt | jq '.pg_map.pg_stats[] |
>> > select(.ondisk_log_size > 3100)' | egrep "pgid|ondisk_log_size"
>> >   "pgid": "37.2b9",
>> >   "ondisk_log_size": 3103,
>> >   "pgid": "33.e",
>> >   "ondisk_log_size": 3229,
>> >   "pgid": "7.2",
>> >   "ondisk_log_size": 3111,
>> >   "pgid": "26.4",
>> >   "ondisk_log_size": 3185,
>> >   "pgid": "33.4",
>> >   "ondisk_log_size": 3311,
>> >   "pgid": "33.8",
>> >   "ondisk_log_size": 3278,
>> >
>> > I also have no idea what the average size of a pg log entry should be; in our
>> > case it seems to be around 8 MB (22 GB / 3000 entries).
>> >
>> > Cheers,
>> > Kalle
>> >
>> > ----- Original Message -----
>> >> From: "Dan van der Ster" <dan@xxxxxxxxxxxxxx>
>> >> To: "Kalle Happonen" <kalle.happonen@xxxxxx>
>> >> Cc: "ceph-users" <ceph-users@xxxxxxx>, "xie xingguo" <xie.xingguo@xxxxxxxxxx>,
>> >> "Samuel Just" <sjust@xxxxxxxxxx>
>> >> Sent: Tuesday, 17 November, 2020 12:22:28
>> >> Subject: Re: osd_pglog memory hoarding - another case
>> >
>> >> Hi Kalle,
>> >>
>> >> Do you have active PGs now with huge pglogs?
>> >> You can do something like this to find them:
>> >>
>> >> ceph pg dump -f json | jq '.pg_map.pg_stats[] |
>> >> select(.ondisk_log_size > 3000)'
>> >>
>> >> If you find some, could you increase to debug_osd = 10 and then share the osd log?
>> >> I am interested in the debug lines from calc_trim_to_aggressively (or
>> >> calc_trim_to if you didn't enable pglog_hardlimit), but the whole log
>> >> might show other issues.
>> >>
>> >> Cheers, dan
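A minimal sketch of one way to capture those debug lines on a 14.2.x cluster -- the osd id and log path below are placeholders, and injectargs is just one way to raise the level at runtime:

    # raise the osd debug level on one affected OSD, at runtime only
    ceph tell osd.123 injectargs '--debug_osd 10'

    # look for the pg_log trim decisions mentioned above
    grep calc_trim_to /var/log/ceph/ceph-osd.123.log | tail -n 50

    # restore the default level afterwards
    ceph tell osd.123 injectargs '--debug_osd 1/5'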
>> >>
>> >>
>> >> On Tue, Nov 17, 2020 at 9:55 AM Dan van der Ster <dan@xxxxxxxxxxxxxx> wrote:
>> >>>
>> >>> Hi Kalle,
>> >>>
>> >>> Strangely and luckily, in our case the memory explosion didn't reoccur
>> >>> after that incident. So I can mostly only offer moral support.
>> >>>
>> >>> But if this bug indeed appeared between 14.2.8 and 14.2.13, then I
>> >>> think this is suspicious:
>> >>>
>> >>> b670715eb4 osd/PeeringState: do not trim pg log past last_update_ondisk
>> >>>
>> >>> https://github.com/ceph/ceph/commit/b670715eb4
>> >>>
>> >>> Given that it adds a case where the pg_log is not trimmed, I wonder if
>> >>> there could be an unforeseen condition where `last_update_ondisk`
>> >>> isn't being updated correctly, and therefore the osd stops trimming
>> >>> the pg_log altogether.
>> >>>
>> >>> Xie or Samuel: does that sound possible?
>> >>>
>> >>> Cheers, Dan
>> >>>
>> >>> On Tue, Nov 17, 2020 at 9:35 AM Kalle Happonen <kalle.happonen@xxxxxx> wrote:
>> >>> >
>> >>> > Hello all,
>> >>> > wrt:
>> >>> > https://lists.ceph.io/hyperkitty/list/ceph-users@xxxxxxx/thread/7IMIWCKIHXNULEBHVUIXQQGYUDJAO2SF/
>> >>> >
>> >>> > Yesterday we hit a problem with osd_pglog memory, similar to the thread above.
>> >>> >
>> >>> > We have a 56-node object storage (S3+SWIFT) cluster with 25 OSD disks per node.
>> >>> > We run 8+3 EC for the data pool (metadata is on a replicated nvme pool).
>> >>> >
>> >>> > The cluster has been running fine, and (as relevant to this post) the memory
>> >>> > usage has been stable at 100 GB / node. We've had the default pg_log of 3000.
>> >>> > The user traffic doesn't seem to have been exceptional lately.
>> >>> >
>> >>> > Last Thursday we updated the OSDs from 14.2.8 -> 14.2.13. On Friday the memory
>> >>> > usage on the OSD nodes started to grow. On each node it grew steadily by about
>> >>> > 30 GB/day, until the servers started OOM killing OSD processes.
>> >>> >
>> >>> > After a lot of debugging we found that the pg_logs were huge. Each OSD
>> >>> > process's pg_log had grown to ~22 GB, which we naturally didn't have memory
>> >>> > for, and then the cluster was in an unstable situation. This is significantly
>> >>> > more than the 1.5 GB in the post above. We do have ~20k pgs, which may
>> >>> > directly affect the size.
>> >>> >
>> >>> > We've reduced the pg_log to 500 and started offline trimming it where we can,
>> >>> > and also just waited. The pg_log size dropped to ~1.2 GB on at least some
>> >>> > nodes, but we're still recovering, and still have a lot of OSDs down and out.
>> >>> >
>> >>> > We're unsure if version 14.2.13 triggered this, or if the osd restarts
>> >>> > triggered this (or something unrelated we don't see).
>> >>> >
>> >>> > This mail is mostly to figure out if there are good guesses as to why the
>> >>> > pg_log size per OSD process exploded. Any technical (and moral) support is
>> >>> > appreciated. Also, as we're currently not sure whether 14.2.13 triggered
>> >>> > this, this is also to put a data point out there for other debuggers.
>> >>> >
>> >>> > Cheers,
>> >>> > Kalle Happonen
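As a concrete sketch of the two mitigations described above -- capping the pg log length and trimming it offline -- something like the following, assuming the trim-pg-log op in ceph-objectstore-tool; the osd id, data path and pgid are placeholders, and the OSD must be stopped for the offline trim:

    # cap the pg log length cluster-wide (the "500" mentioned above)
    ceph config set osd osd_max_pg_log_entries 500
    ceph config set osd osd_min_pg_log_entries 500

    # offline-trim the log of one pg on a stopped OSD
    systemctl stop ceph-osd@123
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-123 \
        --pgid 37.2b9 --op trim-pg-log
    systemctl start ceph-osd@123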
>> >>> > _______________________________________________
>> >>> > ceph-users mailing list -- ceph-users@xxxxxxx
>> >>> > To unsubscribe send an email to ceph-users-leave@xxxxxxx
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx