On Sat 27-02-21 10:31:52, Amir Goldstein wrote:
> On Tue, Feb 2, 2021 at 6:20 PM Amir Goldstein <amir73il@xxxxxxxxx> wrote:
> >
> > Event merges are expensive when event queue size is large.
> > Limit the linear search to 128 merge tests.
> > In combination with 128 hash lists, there is a potential to
> > merge with up to 16K events in the hashed queue.
> >
> > Signed-off-by: Amir Goldstein <amir73il@xxxxxxxxx>
> > ---
> >  fs/notify/fanotify/fanotify.c | 6 ++++++
> >  1 file changed, 6 insertions(+)
> >
> > diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
> > index 12df6957e4d8..6d3807012851 100644
> > --- a/fs/notify/fanotify/fanotify.c
> > +++ b/fs/notify/fanotify/fanotify.c
> > @@ -129,11 +129,15 @@ static bool fanotify_should_merge(struct fsnotify_event *old_fsn,
> >  	return false;
> >  }
> >
> > +/* Limit event merges to limit CPU overhead per event */
> > +#define FANOTIFY_MAX_MERGE_EVENTS 128
> > +
> >  /* and the list better be locked by something too! */
> >  static int fanotify_merge(struct list_head *list, struct fsnotify_event *event)
> >  {
> >  	struct fsnotify_event *test_event;
> >  	struct fanotify_event *new;
> > +	int i = 0;
> >
> >  	pr_debug("%s: list=%p event=%p\n", __func__, list, event);
> >  	new = FANOTIFY_E(event);
> > @@ -147,6 +151,8 @@ static int fanotify_merge(struct list_head *list, struct fsnotify_event *event)
> >  		return 0;
> >
> >  	list_for_each_entry_reverse(test_event, list, list) {
> > +		if (++i > FANOTIFY_MAX_MERGE_EVENTS)
> > +			break;
> >  		if (fanotify_should_merge(test_event, event)) {
> >  			FANOTIFY_E(test_event)->mask |= new->mask;
> >  			return 1;
> > --
> > 2.25.1
> >
>
> Jan,
>
> I was thinking that this patch or a variant thereof should be applied to
> stable kernels, but not the entire series.
>
> OTOH, I am concerned about regressing existing workloads that depend on
> merging events on more than 128 inodes.

Honestly, I don't think pushing anything to stable for this is really
worth it.

1) fanotify() is limited to CAP_SYS_ADMIN (in the init namespace), so this
is hardly a security issue.

2) We have cond_resched() in the merge code now, so the kernel doesn't lock
up anymore. This is only about fanotify becoming slow if you have lots of
events.

3) I haven't heard any complaints since we added the cond_resched() patch,
so the performance issue seems to be really rare.

If I get complaints from real users about this, we can easily reconsider;
it is not a big deal. But I just don't think preemptive action is
warranted...

								Honza

> I thought of this compromise between performance and functional
> regressions:
>
> /*
>  * Limit event merges to limit CPU overhead per new event.
>  * For legacy mode, avoid unlimited CPU overhead, but do not regress the
>  * event merge ratio in heavy concurrent workloads with default queue size.
>  * For new FAN_REPORT_FID modes, make sure that CPU overhead is low.
>  */
> #define FANOTIFY_MAX_MERGE_OLD_EVENTS	16384
> #define FANOTIFY_MAX_MERGE_FID_EVENTS	128
>
> static inline int fanotify_max_merge_events(struct fsnotify_group *group)
> {
> 	if (FAN_GROUP_FLAG(group, FANOTIFY_FID_BITS))
> 		return FANOTIFY_MAX_MERGE_FID_EVENTS;
> 	else
> 		return FANOTIFY_MAX_MERGE_OLD_EVENTS;
> }
>
> I can start the series with this patch and change that to:
>
> #define FANOTIFY_MAX_MERGE_EVENTS	128
>
> static inline int fanotify_max_merge_events(struct fsnotify_group *group)
> {
> 	return FANOTIFY_MAX_MERGE_EVENTS;
> }
>
> at the end of the series.
>
> What do you think?
>
> Thanks,
> Amir.

--
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR
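
The cond_resched() mitigation Jan refers to in point 2 bounds scheduling
latency rather than CPU cost: a long queue scan stays O(n), but it
periodically offers to reschedule so it can no longer trigger soft lockups.
A minimal sketch of that pattern follows; the structure and identifiers
(example_event, scan_queue, object_id) are invented for illustration, this
is not the actual upstream fanotify change, and the pattern only applies
where the caller is allowed to sleep (e.g. not under a spinlock).

/*
 * Illustrative only: a periodic cond_resched() in a long linear scan.
 * Names are hypothetical; not taken from fs/notify/fanotify.
 */
#include <linux/list.h>
#include <linux/sched.h>

struct example_event {
	struct list_head list;
	unsigned long object_id;
};

static struct example_event *scan_queue(struct list_head *queue,
					unsigned long object_id)
{
	struct example_event *ev;

	list_for_each_entry(ev, queue, list) {
		/* Yield the CPU periodically: the scan remains O(n),
		 * but it cannot monopolize the CPU between checks. */
		cond_resched();
		if (ev->object_id == object_id)
			return ev;
	}
	return NULL;
}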
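
For reference, here is how Amir's proposed fanotify_max_merge_events()
helper could be wired into the merge loop from the quoted patch. This is
an assumed integration for illustration, not code from the posted series;
in particular, passing the group into fanotify_merge() is a hypothetical
signature change made so the helper has something to inspect.

/*
 * Hypothetical combination of the quoted patch and Amir's helper:
 * the per-call merge limit becomes group-dependent.
 */
static int fanotify_merge(struct fsnotify_group *group,
			  struct list_head *list,
			  struct fsnotify_event *event)
{
	struct fsnotify_event *test_event;
	struct fanotify_event *new = FANOTIFY_E(event);
	int max_merge = fanotify_max_merge_events(group);
	int i = 0;

	list_for_each_entry_reverse(test_event, list, list) {
		/* Legacy groups may scan up to 16384 events to preserve
		 * their historical merge ratio; FID-reporting groups
		 * stop after 128 merge tests. */
		if (++i > max_merge)
			break;
		if (fanotify_should_merge(test_event, event)) {
			FANOTIFY_E(test_event)->mask |= new->mask;
			return 1;
		}
	}
	return 0;
}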