On Wed, Mar 23, 2022 at 11:17 AM Matt Benjamin <mbenjami@xxxxxxxxxx> wrote:
>
> On Wed, Mar 23, 2022 at 10:38 AM Casey Bodley <cbodley@xxxxxxxxxx> wrote:
> >
> > thanks Yehuda,
> >
> > On Wed, Mar 23, 2022 at 9:46 AM Yehuda Sadeh-Weinraub <yehuda@xxxxxxxxxx> wrote:
> > >
> > > On Tue, Mar 22, 2022 at 2:14 PM Adam C. Emerson <aemerson@xxxxxxxxxx> wrote:
> >
> > under a consistent write workload, rgw will currently broadcast these
> > notifications every 200ms by default (rgw_data_notify_interval_msec),
> > which seems excessively spammy to me - especially if data sync is
> > behind and we don't need the wakeups. if responsiveness on the order
> > of 5-10 seconds is sufficient, isn't it better to just increase the
> > polling frequency to match?
> >
>
> As I noted earlier in the thread, continuous polling *is* inconsistent
> with constantly notifying. I also agree that broadcasting every 200ms
> is questionable tuning, and so is the hard-coded 20s polling cycle on
> the other side. Again, why doesn't polling activity tend toward
> quiescence when there is no data change?

ok, so you're suggesting that we add some backoff to data sync's
polling as it keeps finding nothing to do? in that model, i agree that
notifications are required if we want to guarantee some degree of
responsiveness.

but is this really a better model? the zone sending notifications has
very little information about the target zone's sync processes. all it
has is a list of endpoints that it can send messages to. it doesn't
know which of those endpoints are actually running sync, let alone
which one is running sync for a given datalog shard's notification. it
doesn't even know whether an endpoint is really an rgw instance! it
may just be a load balancer, in which case we can neither broadcast a
notification to every rgw, nor can we send a notification to any
specific rgw.
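
for reference, here's a minimal sketch of the backoff model being discussed: the poll interval grows while polls keep finding nothing, and drops back to a floor when either a poll finds changes or a notification arrives. this is not rgw code. the class name, method names and interval values are all hypothetical, chosen only to illustrate the idea.

```python
class BackoffPoller:
    """Hypothetical poll-interval policy: back off while idle, reset on activity.

    Illustrative only; not taken from rgw. All names and defaults are assumptions.
    """

    def __init__(self, min_interval=5.0, max_interval=320.0, factor=2.0):
        self.min_interval = min_interval  # fastest polling rate (seconds)
        self.max_interval = max_interval  # slowest polling rate when quiescent
        self.factor = factor              # multiplicative backoff per empty poll
        self.interval = min_interval

    def on_poll_result(self, found_changes):
        """Record a poll outcome and return the delay before the next poll."""
        if found_changes:
            # Data is changing: go back to the fastest polling rate.
            self.interval = self.min_interval
        else:
            # Nothing to do: back off, capped at max_interval.
            self.interval = min(self.interval * self.factor, self.max_interval)
        return self.interval

    def on_notify(self):
        """A notification arrived: reset the backoff and poll immediately."""
        self.interval = self.min_interval
        return 0.0


poller = BackoffPoller(min_interval=5.0, max_interval=40.0)
idle_delays = [poller.on_poll_result(False) for _ in range(5)]
print(idle_delays)         # [10.0, 20.0, 40.0, 40.0, 40.0]
print(poller.on_notify())  # 0.0
```

note that once the interval has backed off, responsiveness depends entirely on on_notify() reaching the instance that is actually polling that shard.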

because of this, any push-based model is going to be problematic. even
if all remote rgw endpoints are reachable from the source zone, we'd
still have to broadcast every notification to every remote endpoint
for this to work. in contrast, the polling only happens in the single
rgw instance that owns the cls_lock on a given datalog shard, so it
only needs a single http request per polling interval.
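
as a rough back-of-envelope comparison of the two models for a single datalog shard under constant writes: the 200ms and 20s intervals are the ones quoted in this thread, while the endpoint count (4) and the function names are assumptions made up for the example. it also assumes each broadcast costs one http request per remote endpoint.

```python
def broadcast_requests(window_ms, notify_interval_ms, endpoints):
    """Requests sent in the window if every notification goes to every endpoint."""
    notifications = window_ms // notify_interval_ms
    return notifications * endpoints


def poll_requests(window_ms, poll_interval_ms):
    """Requests sent in the window by the one rgw that holds the shard's lock."""
    return window_ms // poll_interval_ms


ONE_MINUTE_MS = 60_000

# 200ms notify interval, broadcast to 4 (assumed) remote endpoints
print(broadcast_requests(ONE_MINUTE_MS, 200, 4))  # 1200

# 20s polling cycle, a single poller per shard
print(poll_requests(ONE_MINUTE_MS, 20_000))       # 3
```

the absolute numbers depend entirely on the assumed endpoint count and on how notifications are batched, so treat this only as an illustration of how the broadcast cost scales with the number of endpoints while the polling cost does not.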