Hi Marc,

1. If you can, yes, please create a tracker issue on tracker.ceph.com.

2. You might be able to get more throughput with (some number of)
additional threads; the first thing I would try is prioritization (nice).
A rough sketch of both ideas follows below.
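
Untested sketch of the pattern I mean: one dedicated thread that does
nothing but drain the socket, worker threads that do the actual
processing, and the process re-niced so the drain loop is less likely to
be starved. The socket path, worker count, and queue size are examples
only, not recommendations:

    import os
    import queue
    import socket
    import threading

    SOCKET_PATH = "/tmp/ops/rgw-ops.socket"  # example path
    NUM_WORKERS = 4                          # example worker count

    work_q = queue.Queue(maxsize=1024)       # drain -> worker hand-off

    def worker():
        # The slow part (parsing, forwarding, ...) happens off the
        # drain path, so recv() is never blocked behind processing.
        while True:
            chunk = work_q.get()
            # ... process the ops-log data here ...
            work_q.task_done()

    def main():
        try:
            os.nice(-5)   # same idea as renice(1); needs privileges
        except PermissionError:
            pass          # fall back to the default priority
        for _ in range(NUM_WORKERS):
            threading.Thread(target=worker, daemon=True).start()
        s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        s.connect(SOCKET_PATH)
        # Drain loop: nothing slow ever runs between two recv() calls,
        # so radosgw's ops-log backlog can't build up behind us.
        while True:
            data = s.recv(1 << 16)
            if not data:
                break
            work_q.put(data)

    if __name__ == "__main__":
        main()
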
regards,

Matt

On Fri, Jan 26, 2024 at 6:08 AM Marc Singer <marc@singer.services> wrote:
>
> Hi Matt
>
> Thanks for your answer.
>
> Should I open a bug report then?
>
> How would I be able to read more from it? Have multiple threads access
> it and read from it simultaneously?
>
> Marc
>
> On 1/25/24 20:25, Matt Benjamin wrote:
> > Hi Marc,
> >
> > No, the only thing you need to do with the Unix socket is to keep
> > reading from it. So it probably is getting backlogged. And while you
> > could arrange things to make that less likely, you likely can't make
> > it impossible, so there's a bug here.
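> >
> > Concretely, "keep reading from it" means nothing more than a
> > connect-and-recv loop; there is no acknowledgement or flush command on
> > the consumer side. Untested sketch, with an example socket path:
> >
> >     import socket
> >
> >     # Drain the ops-log socket until EOF; receiving the data is the
> >     # only "acknowledgement" there is, nothing is signalled back.
> >     s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
> >     s.connect("/tmp/ops/rgw-ops.socket")  # example path
> >     while True:
> >         data = s.recv(1 << 16)
> >         if not data:
> >             break
> >         # parse / forward the ops-log records here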
> >
> > Matt
> >
> > On Thu, Jan 25, 2024 at 10:52 AM Marc Singer <marc@singer.services> wrote:
> > >
> > > Hi
> > >
> > > I am using a unix socket client to connect with it and read the data
> > > from it. Do I need to do anything like signal the socket that this
> > > data has been read? Or am I not reading fast enough and data is
> > > backing up?
> > >
> > > What I am also noticing is that at some point (probably after
> > > something with the ops socket happens), the log level seems to
> > > increase for some reason. I did not find anything in the logs yet
> > > that explains why this would be the case.
> > >
> > > Normal:
> > >
> > > 2024-01-25T15:47:58.444+0000 7fe98a5c0b00 1 ====== starting new request req=0x7fe98712c720 =====
> > > 2024-01-25T15:47:58.548+0000 7fe98b700b00 1 ====== req done req=0x7fe98712c720 op status=0 http_status=200 latency=0.104001537s ======
> > > 2024-01-25T15:47:58.548+0000 7fe98b700b00 1 beast: 0x7fe98712c720: redacted - redacted [25/Jan/2024:15:47:58.444 +0000] "PUT /redacted/redacted/chunks/27/27242/27242514_10_4194304 HTTP/1.1" 200 4194304 - "redacted" - latency=0.104001537s
> > >
> > > Close before crashing:
> > >
> > > -509> 2024-01-25T14:54:31.588+0000 7f5186648b00 1 ====== starting new request req=0x7f517ffca720 =====
> > > -508> 2024-01-25T14:54:31.588+0000 7f5186648b00 2 req 2568229052387020224 0.000000000s initializing for trans_id = tx0000023a42eb7515dcdc0-0065b27627-823feaa-central
> > > -507> 2024-01-25T14:54:31.588+0000 7f5186648b00 2 req 2568229052387020224 0.000000000s getting op 1
> > > -506> 2024-01-25T14:54:31.588+0000 7f5186648b00 2 req 2568229052387020224 0.000000000s s3:put_obj verifying requester
> > > -505> 2024-01-25T14:54:31.588+0000 7f5186648b00 2 req 2568229052387020224 0.000000000s s3:put_obj normalizing buckets and tenants
> > > -504> 2024-01-25T14:54:31.588+0000 7f5186648b00 2 req 2568229052387020224 0.000000000s s3:put_obj init permissions
> > > -503> 2024-01-25T14:54:31.588+0000 7f5186648b00 2 req 2568229052387020224 0.000000000s s3:put_obj recalculating target
> > > -502> 2024-01-25T14:54:31.588+0000 7f5186648b00 2 req 2568229052387020224 0.000000000s s3:put_obj reading permissions
> > > -501> 2024-01-25T14:54:31.588+0000 7f5186648b00 2 req 2568229052387020224 0.000000000s s3:put_obj init op
> > > -500> 2024-01-25T14:54:31.588+0000 7f5186648b00 2 req 2568229052387020224 0.000000000s s3:put_obj verifying op mask
> > > -499> 2024-01-25T14:54:31.588+0000 7f5186648b00 2 req 2568229052387020224 0.000000000s s3:put_obj verifying op permissions
> > > -498> 2024-01-25T14:54:31.588+0000 7f5186648b00 5 req 2568229052387020224 0.000000000s s3:put_obj Searching permissions for identity=rgw::auth::SysReqApplier -> rgw::auth::LocalApplier(acct_user=redacted, acct_name=redacted, subuser=, perm_mask=15, is_admin=0) mask=50
> > > -497> 2024-01-25T14:54:31.588+0000 7f5186648b00 5 req 2568229052387020224 0.000000000s s3:put_obj Searching permissions for uid=redacted
> > > -496> 2024-01-25T14:54:31.588+0000 7f5186648b00 5 req 2568229052387020224 0.000000000s s3:put_obj Found permission: 15
> > > -495> 2024-01-25T14:54:31.588+0000 7f5186648b00 5 req 2568229052387020224 0.000000000s s3:put_obj Searching permissions for group=1 mask=50
> > > -494> 2024-01-25T14:54:31.588+0000 7f5186648b00 5 req 2568229052387020224 0.000000000s s3:put_obj Permissions for group not found
> > > -493> 2024-01-25T14:54:31.588+0000 7f5186648b00 5 req 2568229052387020224 0.000000000s s3:put_obj Searching permissions for group=2 mask=50
> > > -492> 2024-01-25T14:54:31.588+0000 7f5186648b00 5 req 2568229052387020224 0.000000000s s3:put_obj Permissions for group not found
> > > -491> 2024-01-25T14:54:31.588+0000 7f5186648b00 5 req 2568229052387020224 0.000000000s s3:put_obj -- Getting permissions done for identity=rgw::auth::SysReqApplier -> rgw::auth::LocalApplier(acct_user=redacted, acct_name=redacted, subuser=, perm_mask=15, is_admin=0), owner=redacted, perm=2
> > > -490> 2024-01-25T14:54:31.588+0000 7f5186648b00 2 req 2568229052387020224 0.000000000s s3:put_obj verifying op params
> > > -489> 2024-01-25T14:54:31.588+0000 7f5186648b00 2 req 2568229052387020224 0.000000000s s3:put_obj pre-executing
> > > -488> 2024-01-25T14:54:31.588+0000 7f5186648b00 2 req 2568229052387020224 0.000000000s s3:put_obj check rate limiting
> > > -487> 2024-01-25T14:54:31.588+0000 7f5186648b00 2 req 2568229052387020224 0.000000000s s3:put_obj executing
> > > -486> 2024-01-25T14:54:31.624+0000 7f5183898b00 5 req 2568229052387020224 0.036000550s s3:put_obj NOTICE: call to do_aws4_auth_completion
> > > -485> 2024-01-25T14:54:31.624+0000 7f5183898b00 5 req 2568229052387020224 0.036000550s s3:put_obj NOTICE: call to do_aws4_auth_completion
> > > -484> 2024-01-25T14:54:31.680+0000 7f5185bc8b00 2 req 2568229052387020224 0.092001401s s3:put_obj completing
> > > -483> 2024-01-25T14:54:31.680+0000 7f5185bc8b00 2 req 2568229052387020224 0.092001401s s3:put_obj op status=0
> > > -482> 2024-01-25T14:54:31.680+0000 7f5185bc8b00 2 req 2568229052387020224 0.092001401s s3:put_obj http status=200
> > > -481> 2024-01-25T14:54:31.680+0000 7f5185bc8b00 1 ====== req done req=0x7f517ffca720 op status=0 http_status=200 latency=0.092001401s ======
> > >
> > > Thanks for your help.
> > >
> > > Marc Singer
> > >
> > > On 1/25/24 16:22, Matt Benjamin wrote:
> > > > Hi Marc,
> > > >
> > > > The ops log code is designed to discard data if the socket is
> > > > flow-controlled, iirc. Maybe we just need to handle the signal.
> > > >
> > > > Of course, you should have something consuming data on the socket,
> > > > but it's still a problem if radosgw exits unexpectedly.
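> > > >
> > > > What I mean by "handle the signal", in generic POSIX terms (this
> > > > is an illustration, not the actual radosgw code): if SIGXFSZ is
> > > > ignored, a write past RLIMIT_FSIZE fails with EFBIG instead of the
> > > > default signal action killing the process. Untested Python demo:
> > > >
> > > >     import os, resource, signal
> > > >
> > > >     # With SIGXFSZ ignored, exceeding RLIMIT_FSIZE turns a fatal
> > > >     # signal into an ordinary EFBIG error from write().
> > > >     signal.signal(signal.SIGXFSZ, signal.SIG_IGN)
> > > >     resource.setrlimit(resource.RLIMIT_FSIZE, (4096, 4096))
> > > >     fd = os.open("/tmp/xfsz-demo", os.O_CREAT | os.O_WRONLY, 0o600)
> > > >     os.lseek(fd, 8192, os.SEEK_SET)  # position past the limit
> > > >     try:
> > > >         os.write(fd, b"x")           # raises OSError (EFBIG)
> > > >     except OSError as e:
> > > >         print("write failed instead of crashing:", e)
> > > >     os.close(fd)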
> > > >
> > > > Matt
> > > >
> > > > On Thu, Jan 25, 2024 at 10:08 AM Marc Singer <marc@singer.services> wrote:
> > > > >
> > > > > Hi Ceph Users
> > > > >
> > > > > I am encountering a problem with the RGW Admin Ops Socket.
> > > > >
> > > > > I am setting up the socket as follows:
> > > > >
> > > > > rgw_enable_ops_log = true
> > > > > rgw_ops_log_socket_path = /tmp/ops/rgw-ops.socket
> > > > > rgw_ops_log_data_backlog = 16Mi
> > > > >
> > > > > It seems like the socket fills up over time and doesn't get
> > > > > flushed; at some point the process runs out of file space.
> > > > >
> > > > > Do I need to configure something or send something for the socket
> > > > > to flush?
> > > > >
> > > > > See the log here:
> > > > >
> > > > > 0> 2024-01-25T13:10:13.908+0000 7f247b00eb00 -1 *** Caught signal (File size limit exceeded) **
> > > > > in thread 7f247b00eb00 thread_name:ops_log_file
> > > > >
> > > > > ceph version 18.2.0 (5dd24139a1eada541a3bc16b6941c5dde975e26d) reef (stable)
> > > > > NOTE: a copy of the executable, or `objdump -rdS <executable>` is
> > > > > needed to interpret this.
> > > > >
> > > > > --- logging levels ---
> > > > > 0/ 5 none
> > > > > 0/ 1 lockdep
> > > > > 0/ 1 context
> > > > > 1/ 1 crush
> > > > > 1/ 5 mds
> > > > > 1/ 5 mds_balancer
> > > > > 1/ 5 mds_locker
> > > > > 1/ 5 mds_log
> > > > > 1/ 5 mds_log_expire
> > > > > 1/ 5 mds_migrator
> > > > > 0/ 1 buffer
> > > > > 0/ 1 timer
> > > > > 0/ 1 filer
> > > > > 0/ 1 striper
> > > > > 0/ 1 objecter
> > > > > 0/ 5 rados
> > > > > 0/ 5 rbd
> > > > > 0/ 5 rbd_mirror
> > > > > 0/ 5 rbd_replay
> > > > > 0/ 5 rbd_pwl
> > > > > 0/ 5 journaler
> > > > > 0/ 5 objectcacher
> > > > > 0/ 5 immutable_obj_cache
> > > > > 0/ 5 client
> > > > > 1/ 5 osd
> > > > > 0/ 5 optracker
> > > > > 0/ 5 objclass
> > > > > 1/ 3 filestore
> > > > > 1/ 3 journal
> > > > > 0/ 0 ms
> > > > > 1/ 5 mon
> > > > > 0/10 monc
> > > > > 1/ 5 paxos
> > > > > 0/ 5 tp
> > > > > 1/ 5 auth
> > > > > 1/ 5 crypto
> > > > > 1/ 1 finisher
> > > > > 1/ 1 reserver
> > > > > 1/ 5 heartbeatmap
> > > > > 1/ 5 perfcounter
> > > > > 1/ 5 rgw
> > > > > 1/ 5 rgw_sync
> > > > > 1/ 5 rgw_datacache
> > > > > 1/ 5 rgw_access
> > > > > 1/ 5 rgw_dbstore
> > > > > 1/ 5 rgw_flight
> > > > > 1/ 5 javaclient
> > > > > 1/ 5 asok
> > > > > 1/ 1 throttle
> > > > > 0/ 0 refs
> > > > > 1/ 5 compressor
> > > > > 1/ 5 bluestore
> > > > > 1/ 5 bluefs
> > > > > 1/ 3 bdev
> > > > > 1/ 5 kstore
> > > > > 4/ 5 rocksdb
> > > > > 4/ 5 leveldb
> > > > > 1/ 5 fuse
> > > > > 2/ 5 mgr
> > > > > 1/ 5 mgrc
> > > > > 1/ 5 dpdk
> > > > > 1/ 5 eventtrace
> > > > > 1/ 5 prioritycache
> > > > > 0/ 5 test
> > > > > 0/ 5 cephfs_mirror
> > > > > 0/ 5 cephsqlite
> > > > > 0/ 5 seastore
> > > > > 0/ 5 seastore_onode
> > > > > 0/ 5 seastore_odata
> > > > > 0/ 5 seastore_omap
> > > > > 0/ 5 seastore_tm
> > > > > 0/ 5 seastore_t
> > > > > 0/ 5 seastore_cleaner
> > > > > 0/ 5 seastore_epm
> > > > > 0/ 5 seastore_lba
> > > > > 0/ 5 seastore_fixedkv_tree
> > > > > 0/ 5 seastore_cache
> > > > > 0/ 5 seastore_journal
> > > > > 0/ 5 seastore_device
> > > > > 0/ 5 seastore_backref
> > > > > 0/ 5 alienstore
> > > > > 1/ 5 mclock
> > > > > 0/ 5 cyanstore
> > > > > 1/ 5 ceph_exporter
> > > > > 1/ 5 memstore
> > > > > -2/-2 (syslog threshold)
> > > > > 99/99 (stderr threshold)
> > > > > --- pthread ID / name mapping for recent threads ---
> > > > > 7f2472a89b00 / safe_timer
> > > > > 7f2472cadb00 / radosgw
> > > > > ...
> > > > > log_file /var/lib/ceph/crash/2024-01-25T13:10:13.909546Z_01ee6e6a-e946-4006-9d32-e17ef2f9df74/log
> > > > > --- end dump of recent events ---
> > > > > reraise_fatal: default handler for signal 25 didn't terminate the process?
> > > > >
> > > > > Thank you for your help.
> > > > >
> > > > > Marc

--

Matt Benjamin
Red Hat, Inc.
315 West Huron Street, Suite 140A
Ann Arbor, Michigan 48103

http://www.redhat.com/en/technologies/storage

tel.  734-821-5101
fax.  734-769-8938
cel.  734-216-5309
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx