Hi
I am using a unix socket client to connect to the ops log socket and read
the data from it.
Do I need to do anything like signal the socket that the data has been
read? Or am I not reading fast enough, so that data is backing up?
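For reference, my reader is doing essentially the following (a trimmed-down
sketch, assuming a stream-type unix socket; the path is the
rgw_ops_log_socket_path from my original mail below):

import socket

SOCKET_PATH = "/tmp/ops/rgw-ops.socket"  # rgw_ops_log_socket_path

def consume(path: str) -> None:
    # Plain client: connect and keep reading. There is no acknowledgement
    # protocol on my side -- recv() just drains whatever radosgw has written.
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(path)
        while True:
            chunk = sock.recv(65536)
            if not chunk:  # radosgw closed its end
                break
            # The real client parses the records; discarding them here is
            # enough to test whether reading speed is the problem.

if __name__ == "__main__":
    consume(SOCKET_PATH)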
I am also noticing that at some point (probably after something happens
with the ops socket), the log level seems to increase for some reason.
I have not yet found anything in the logs that would explain this.
*Normal:*
2024-01-25T15:47:58.444+0000 7fe98a5c0b00 1 ====== starting new request
req=0x7fe98712c720 =====
2024-01-25T15:47:58.548+0000 7fe98b700b00 1 ====== req done
req=0x7fe98712c720 op status=0 http_status=200 latency=0.104001537s ======
2024-01-25T15:47:58.548+0000 7fe98b700b00 1 beast: 0x7fe98712c720:
redacted - redacted [25/Jan/2024:15:47:58.444 +0000] "PUT
/redacted/redacted/chunks/27/27242/27242514_10_4194304 HTTP/1.1" 200
4194304 - "redacted" - latency=0.104001537s
*Close before crashing:*
-509> 2024-01-25T14:54:31.588+0000 7f5186648b00 1 ====== starting
new request req=0x7f517ffca720 =====
-508> 2024-01-25T14:54:31.588+0000 7f5186648b00 2 req
2568229052387020224 0.000000000s initializing for trans_id =
tx0000023a42eb7515dcdc0-0065b27627-823feaa-central
-507> 2024-01-25T14:54:31.588+0000 7f5186648b00 2 req
2568229052387020224 0.000000000s getting op 1
-506> 2024-01-25T14:54:31.588+0000 7f5186648b00 2 req
2568229052387020224 0.000000000s s3:put_obj verifying requester
-505> 2024-01-25T14:54:31.588+0000 7f5186648b00 2 req
2568229052387020224 0.000000000s s3:put_obj normalizing buckets and tenants
-504> 2024-01-25T14:54:31.588+0000 7f5186648b00 2 req
2568229052387020224 0.000000000s s3:put_obj init permissions
-503> 2024-01-25T14:54:31.588+0000 7f5186648b00 2 req
2568229052387020224 0.000000000s s3:put_obj recalculating target
-502> 2024-01-25T14:54:31.588+0000 7f5186648b00 2 req
2568229052387020224 0.000000000s s3:put_obj reading permissions
-501> 2024-01-25T14:54:31.588+0000 7f5186648b00 2 req
2568229052387020224 0.000000000s s3:put_obj init op
-500> 2024-01-25T14:54:31.588+0000 7f5186648b00 2 req
2568229052387020224 0.000000000s s3:put_obj verifying op mask
-499> 2024-01-25T14:54:31.588+0000 7f5186648b00 2 req
2568229052387020224 0.000000000s s3:put_obj verifying op permissions
-498> 2024-01-25T14:54:31.588+0000 7f5186648b00 5 req
2568229052387020224 0.000000000s s3:put_obj Searching permissions for
identity=rgw::auth::SysReqApplier ->
rgw::auth::LocalApplier(acct_user=redacted, acct_name=redacted,
subuser=, perm_mask=15, is_admin=0) mask=50
-497> 2024-01-25T14:54:31.588+0000 7f5186648b00 5 req
2568229052387020224 0.000000000s s3:put_obj Searching permissions for
uid=redacted
-496> 2024-01-25T14:54:31.588+0000 7f5186648b00 5 req
2568229052387020224 0.000000000s s3:put_obj Found permission: 15
-495> 2024-01-25T14:54:31.588+0000 7f5186648b00 5 req
2568229052387020224 0.000000000s s3:put_obj Searching permissions for
group=1 mask=50
-494> 2024-01-25T14:54:31.588+0000 7f5186648b00 5 req
2568229052387020224 0.000000000s s3:put_obj Permissions for group not found
-493> 2024-01-25T14:54:31.588+0000 7f5186648b00 5 req
2568229052387020224 0.000000000s s3:put_obj Searching permissions for
group=2 mask=50
-492> 2024-01-25T14:54:31.588+0000 7f5186648b00 5 req
2568229052387020224 0.000000000s s3:put_obj Permissions for group not found
-491> 2024-01-25T14:54:31.588+0000 7f5186648b00 5 req
2568229052387020224 0.000000000s s3:put_obj -- Getting permissions done
for identity=rgw::auth::SysReqApplier ->
rgw::auth::LocalApplier(acct_user=redacted, acct_name=redacted,
subuser=, perm_mask=15, is_admin=0), owner=redacted, perm=2
-490> 2024-01-25T14:54:31.588+0000 7f5186648b00 2 req
2568229052387020224 0.000000000s s3:put_obj verifying op params
-489> 2024-01-25T14:54:31.588+0000 7f5186648b00 2 req
2568229052387020224 0.000000000s s3:put_obj pre-executing
-488> 2024-01-25T14:54:31.588+0000 7f5186648b00 2 req
2568229052387020224 0.000000000s s3:put_obj check rate limiting
-487> 2024-01-25T14:54:31.588+0000 7f5186648b00 2 req
2568229052387020224 0.000000000s s3:put_obj executing
-486> 2024-01-25T14:54:31.624+0000 7f5183898b00 5 req
2568229052387020224 0.036000550s s3:put_obj NOTICE: call to
do_aws4_auth_completion
-485> 2024-01-25T14:54:31.624+0000 7f5183898b00 5 req
2568229052387020224 0.036000550s s3:put_obj NOTICE: call to
do_aws4_auth_completion
-484> 2024-01-25T14:54:31.680+0000 7f5185bc8b00 2 req
2568229052387020224 0.092001401s s3:put_obj completing
-483> 2024-01-25T14:54:31.680+0000 7f5185bc8b00 2 req
2568229052387020224 0.092001401s s3:put_obj op status=0
-482> 2024-01-25T14:54:31.680+0000 7f5185bc8b00 2 req
2568229052387020224 0.092001401s s3:put_obj http status=200
-481> 2024-01-25T14:54:31.680+0000 7f5185bc8b00 1 ====== req done
req=0x7f517ffca720 op status=0 http_status=200 latency=0.092001401s ======
Thanks for your help.
Marc Singer
On 1/25/24 16:22, Matt Benjamin wrote:
Hi Marc,
The ops log code is designed to discard data if the socket is
flow-controlled, iirc. Maybe we just need to handle the signal.
Of course, you should have something consuming data on the socket, but it's
still a problem if radosgw exits unexpectedly.
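Just to illustrate what I mean by handling the signal (a generic sketch, not
the actual RGW code): per POSIX, if SIGXFSZ is ignored or caught, the
oversized write fails with EFBIG instead of killing the process.

import errno
import signal

# Generic illustration only: with SIGXFSZ ignored, an over-limit write
# returns EFBIG instead of terminating the process.
signal.signal(signal.SIGXFSZ, signal.SIG_IGN)

def append(path: str, data: bytes) -> bool:
    try:
        with open(path, "ab") as f:
            f.write(data)
        return True
    except OSError as e:
        if e.errno == errno.EFBIG:  # over RLIMIT_FSIZE: drop or rotate instead of dying
            return False
        raise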
Matt
On Thu, Jan 25, 2024 at 10:08 AM Marc Singer <marc@singer.services> wrote:
Hi Ceph Users
I am encountering a problem with the RGW Admin Ops Socket.
I am setting up the socket as follows:
rgw_enable_ops_log = true
rgw_ops_log_socket_path = /tmp/ops/rgw-ops.socket
rgw_ops_log_data_backlog = 16Mi
It seems like the socket's data backlog fills up over time and never gets
flushed; at some point the process dies with "File size limit exceeded".
Do I need to configure something, or send something over the socket, for it
to flush?
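For what it's worth, "File size limit exceeded" below is SIGXFSZ (signal 25
on Linux), which fires when a write would exceed RLIMIT_FSIZE. A quick
sketch to check which limit the radosgw process actually runs with (the PID
is a placeholder):

from pathlib import Path

def max_file_size(pid: int) -> str:
    # /proc/<pid>/limits has one line per resource limit; "Max file size"
    # is RLIMIT_FSIZE (soft and hard limit, in bytes or "unlimited").
    for line in Path(f"/proc/{pid}/limits").read_text().splitlines():
        if line.startswith("Max file size"):
            return line
    return "Max file size entry not found"

print(max_file_size(12345))  # 12345 is a placeholder for the radosgw PID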
See the log here:
0> 2024-01-25T13:10:13.908+0000 7f247b00eb00 -1 *** Caught signal (File
size limit exceeded) **
in thread 7f247b00eb00 thread_name:ops_log_file
ceph version 18.2.0 (5dd24139a1eada541a3bc16b6941c5dde975e26d) reef
(stable)
NOTE: a copy of the executable, or `objdump -rdS <executable>` is
needed to interpret this.
--- logging levels ---
0/ 5 none
0/ 1 lockdep
0/ 1 context
1/ 1 crush
1/ 5 mds
1/ 5 mds_balancer
1/ 5 mds_locker
1/ 5 mds_log
1/ 5 mds_log_expire
1/ 5 mds_migrator
0/ 1 buffer
0/ 1 timer
0/ 1 filer
0/ 1 striper
0/ 1 objecter
0/ 5 rados
0/ 5 rbd
0/ 5 rbd_mirror
0/ 5 rbd_replay
0/ 5 rbd_pwl
0/ 5 journaler
0/ 5 objectcacher
0/ 5 immutable_obj_cache
0/ 5 client
1/ 5 osd
0/ 5 optracker
0/ 5 objclass
1/ 3 filestore
1/ 3 journal
0/ 0 ms
1/ 5 mon
0/10 monc
1/ 5 paxos
0/ 5 tp
1/ 5 auth
1/ 5 crypto
1/ 1 finisher
1/ 1 reserver
1/ 5 heartbeatmap
1/ 5 perfcounter
1/ 5 rgw
1/ 5 rgw_sync
1/ 5 rgw_datacache
1/ 5 rgw_access
1/ 5 rgw_dbstore
1/ 5 rgw_flight
1/ 5 javaclient
1/ 5 asok
1/ 1 throttle
0/ 0 refs
1/ 5 compressor
1/ 5 bluestore
1/ 5 bluefs
1/ 3 bdev
1/ 5 kstore
4/ 5 rocksdb
4/ 5 leveldb
1/ 5 fuse
2/ 5 mgr
1/ 5 mgrc
1/ 5 dpdk
1/ 5 eventtrace
1/ 5 prioritycache
0/ 5 test
0/ 5 cephfs_mirror
0/ 5 cephsqlite
0/ 5 seastore
0/ 5 seastore_onode
0/ 5 seastore_odata
0/ 5 seastore_omap
0/ 5 seastore_tm
0/ 5 seastore_t
0/ 5 seastore_cleaner
0/ 5 seastore_epm
0/ 5 seastore_lba
0/ 5 seastore_fixedkv_tree
0/ 5 seastore_cache
0/ 5 seastore_journal
0/ 5 seastore_device
0/ 5 seastore_backref
0/ 5 alienstore
1/ 5 mclock
0/ 5 cyanstore
1/ 5 ceph_exporter
1/ 5 memstore
-2/-2 (syslog threshold)
99/99 (stderr threshold)
--- pthread ID / name mapping for recent threads ---
7f2472a89b00 / safe_timer
7f2472cadb00 / radosgw
...
log_file
/var/lib/ceph/crash/2024-01-25T13:10:13.909546Z_01ee6e6a-e946-4006-9d32-e17ef2f9df74/log
--- end dump of recent events ---
reraise_fatal: default handler for signal 25 didn't terminate the process?
Thank you for your help.
Marc
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx