On Wed, Mar 23, 2022 at 2:37 PM Matt Benjamin <mbenjami@xxxxxxxxxx> wrote:
>
> inline
>
> On Wed, Mar 23, 2022 at 2:12 PM Casey Bodley <cbodley@xxxxxxxxxx> wrote:
> >
> > On Wed, Mar 23, 2022 at 11:17 AM Matt Benjamin <mbenjami@xxxxxxxxxx> wrote:
> > >
> > > On Wed, Mar 23, 2022 at 10:38 AM Casey Bodley <cbodley@xxxxxxxxxx> wrote:
> > > >
> > > > thanks Yehuda,
> > > >
> > > > On Wed, Mar 23, 2022 at 9:46 AM Yehuda Sadeh-Weinraub <yehuda@xxxxxxxxxx> wrote:
> > > > >
> > > > > On Tue, Mar 22, 2022 at 2:14 PM Adam C. Emerson <aemerson@xxxxxxxxxx> wrote:
> > > >
> > > > under a consistent write workload, rgw will currently broadcast these
> > > > notifications every 200ms by default (rgw_data_notify_interval_msec),
> > > > which seems excessively spammy to me - especially if data sync is
> > > > behind and we don't need the wakeups. if responsiveness on the order
> > > > of 5-10 seconds is sufficient, isn't it better to just increase the
> > > > polling frequency to match?
> > > >
> > > As I noted earlier in the thread, continuous polling *is* inconsistent
> > > with constantly notifying. I also agree that broadcasting every 200ms
> > > is questionable tuning, and so is the hard-coded 20s polling cycle on
> > > the other side. Again, why doesn't polling activity tend toward
> > > quiescence when there is no data change?
> >
> > ok, so you're suggesting that we add some backoff to data sync's
> > polling as it keeps finding nothing to do? in that model, i agree that
> > notifications are required if we want to guarantee some degree of
> > responsiveness. but is this really a better model?
>
> you make some good points that may mean it isn't for us right now,
> though in general, polling avoidance is attractive--but, to be fair,
> it's most appropriate when activation is intermittent
> >
> > the zone sending notifications has very little information about the
> > target zone's sync processes. all it has is a list of endpoints that
> > it can send messages to. it doesn't know which of those endpoints are
> > actually running sync, let alone which one is running sync for a given
> > datalog shard's notification. it doesn't even know whether an endpoint
> > is really an rgw instance! it may just be a load balancer, in which
> > case we can neither broadcast a notification to every rgw, nor can we
> > send a notification to any specific rgw. because of this, any
> > push-based model is going to be problematic
>
> agree, we would want resolutions to these issues (appealing ones)
> >
> > even if all remote rgw endpoints are reachable from the source zone,
> > we'd still have to broadcast every notification to every remote
> > endpoint for this to work. in contrast, the polling only happens in
> > the single rgw instance that owns the cls_lock on a given datalog
> > shard, so only needs a single http request per polling interval
>
> whatever the algorithm in the end, I think it would be nice to make
> its parameters configurable--rather than #define INTERVAL 20 (sp) in
> the code

sure, i've opened https://tracker.ceph.com/issues/55026 to track that.
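to make the backoff idea concrete, here's a rough sketch of the shape it could take: the poll interval doubles while a datalog shard keeps coming back empty, resets as soon as entries show up, and a notification from the source zone can cut the wait short. this is just an illustration - the class and every name in it are hypothetical, not the actual rgw_data_sync.cc machinery:

// illustrative sketch only, not rgw's data sync code: all names here are
// hypothetical. models a poll interval that tends toward quiescence while
// a datalog shard keeps coming back empty, with a notification able to
// wake the poller early.
#include <algorithm>
#include <chrono>
#include <condition_variable>
#include <mutex>

class DatalogPollBackoff {
  std::mutex mtx;
  std::condition_variable cond;
  bool notified = false;

  // hypothetical bounds; today the interval is effectively a fixed 20s
  std::chrono::seconds min_interval{1};
  std::chrono::seconds max_interval{60};
  std::chrono::seconds cur_interval{1};

 public:
  // called after each poll of the shard's datalog
  void on_poll_result(bool found_entries) {
    std::lock_guard lock{mtx};
    if (found_entries) {
      cur_interval = min_interval;  // activity: poll aggressively again
    } else {
      cur_interval = std::min(cur_interval * 2, max_interval);  // back off
    }
  }

  // called when a data notification arrives from the source zone
  void notify() {
    std::lock_guard lock{mtx};
    notified = true;
    cond.notify_all();  // wake the poller before the interval expires
  }

  // sleep until the next poll is due, or until a notification arrives
  void wait_for_next_poll() {
    std::unique_lock lock{mtx};
    cond.wait_for(lock, cur_interval, [this] { return notified; });
    notified = false;
  }
};

note that in this sketch the responsiveness guarantee really does shift onto the notifications once the interval has backed off, which is exactly where the endpoint/load-balancer problem above starts to hurt.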
on that note, there are several other hard-coded values in data sync that deserve knobs - especially the ones that control concurrency windows. in rgw_data_sync.cc alone:

#define READ_DATALOG_MAX_CONCURRENT 10
#define READ_DATALOG_MAX_CONCURRENT 10
#define DATA_SYNC_UPDATE_MARKER_WINDOW 1
#define BUCKET_SHARD_SYNC_SPAWN_WINDOW 20
#define DATA_SYNC_MAX_ERR_ENTRIES 10
#define RETRY_BACKOFF_SECS_MIN 60
#define RETRY_BACKOFF_SECS_DEFAULT 60
#define RETRY_BACKOFF_SECS_MAX 600
#define OMAP_GET_MAX_ENTRIES 100
#define INCREMENTAL_INTERVAL 20
#define MAX_RACE_RETRIES_OBJ_FETCH 10
#define OMAP_READ_MAX_ENTRIES 10
#define BUCKET_SYNC_UPDATE_MARKER_WINDOW 10
#define BUCKET_SYNC_SPAWN_WINDOW 20
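for whoever picks up https://tracker.ceph.com/issues/55026, one plausible shape for the change is to collect these into a single set of tunables, keeping the current values as defaults and filling them from configuration at startup instead of baking them in. a rough sketch - the struct, its members, and the option name below are hypothetical and don't exist in rgw today:

// sketch only: hypothetical names, defaults copied from the #defines above
#include <cstdint>

struct DataSyncTunables {
  uint32_t read_datalog_max_concurrent = 10;     // READ_DATALOG_MAX_CONCURRENT
  uint32_t data_sync_max_err_entries = 10;       // DATA_SYNC_MAX_ERR_ENTRIES
  uint32_t bucket_shard_sync_spawn_window = 20;  // BUCKET_SHARD_SYNC_SPAWN_WINDOW
  uint32_t bucket_sync_spawn_window = 20;        // BUCKET_SYNC_SPAWN_WINDOW
  uint32_t retry_backoff_secs_min = 60;          // RETRY_BACKOFF_SECS_MIN
  uint32_t retry_backoff_secs_max = 600;         // RETRY_BACKOFF_SECS_MAX
  uint32_t incremental_interval_secs = 20;       // INCREMENTAL_INTERVAL
};

// at startup the sync machinery would populate this once from the config,
// e.g. (hypothetical option name, assuming the usual get_val<> accessor):
//   tunables.bucket_sync_spawn_window =
//       cct->_conf.get_val<uint64_t>("rgw_bucket_sync_spawn_window");

that would also need option declarations alongside the existing rgw_* options, but it keeps each knob next to its current default, so nothing changes behavior until someone actually tunes it.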