On Thu, Feb 18, 2021 at 12:04:56PM -0600, Jonathon Jongsma wrote: > On Wed, 17 Feb 2021 14:35:51 +0100 > Erik Skultety <eskultet@xxxxxxxxxx> wrote: > > > > +static void > > > +mdevctlEventHandleCallback(GFileMonitor *monitor G_GNUC_UNUSED, > > > + GFile *file, > > > + GFile *other_file G_GNUC_UNUSED, > > > + GFileMonitorEvent event_type, > > > + gpointer user_data) > > > +{ > > > + udevEventDataPtr priv = user_data; > > > + /* if a new directory appears, monitor that directory for > > > changes */ > > > + if (event_type == G_FILE_MONITOR_EVENT_CREATED && > > > + g_file_query_file_type(file, G_FILE_QUERY_INFO_NONE, NULL) > > > == > > > + G_FILE_TYPE_DIRECTORY) { > > > + GList *newmonitors = monitorFileRecursively(priv, file); > > > + priv->mdevctlMonitors = > > > g_list_concat(priv->mdevctlMonitors, newmonitors); > > > > priv->mdevctlMonitors is shared among threads, but you only acquire > > the mutex just before exec'ing mdevctl in the mdevctlHandler thread, > > so ^this can yield unexpected results in terms of directories > > monitored. > > A) It doesn't really affect the directories being monitored, it only > affects our ability to free these monitors later in the destructor. > > B) Aside from the very first time this variable is set in > nodeStateInitialize(), the only place that accesses the mdevctlMonitors > variable is this glib signal callback (to add new monitors to the list) > and the destructor (to free the monitors). And I believe those should > always run in the main thread. So I don't think there's really much of > a thread-safety issue here. Perhaps there's a slight race in > nodeStateInitialize() which appears to be called from a non-main > thread. If a directory gets created immediately after connecting to the > file monitor signal but before the function returns and the GList gets > assigned to priv->mdevctlMonitors, I suppose the signal handler could > run and it could get corrupted. Would you like me to protect all > accesses with a Mutex? If for some reason we need to extend the code in the future, it's so easy to miss a single spot. So, I'd say - better safe than sorry - let's protect the variable the way it's supposed to be even though the race might not exist right now. Erik > > > > > > > + } > > > + > > > + /* When mdevctl creates a device, it can result in multiple > > > notify events > > > + * emitted for a single logical change (e.g. several CHANGED > > > events, or a > > > + * CREATED and CHANGED event followed by CHANGES_DONE_HINT). > > > To avoid > > > + * spawning a mdevctl thread multiple times for a single > > > logical > > > + * configuration change, try to coalesce these changes by > > > waiting for the > > > + * CHANGES_DONE_HINT event. As a fallback, add a timeout to > > > trigger the > > > + * signal if that event never comes */ > > > > So there's no guaranteed pattern to monitor, is that so? IOW Even the > > CHANGES_DONE_HINT may not actually arrive? That's a very poor design. > > I'd like to ditch the timer as much as possible. > > Maybe we can simply rely on the CHANGES_DONE_HINT. I tend to be > conservative and was trying to handle the case where the > CHANGES_DONE_HINT might not be delivered (it is called a "hint" after > all). Part of this was caused by my reading this comment in a test > case within the glib repository: > > /* Sometimes the emission of 'CHANGES_DONE_HINT' may be late because > * it depends on the ability of file monitor implementation to report > * 'CHANGES_DONE_HINT' itself. If the file monitor implementation > * doesn't report 'CHANGES_DONE_HINT' itself, it may be emitted by > * GLocalFileMonitor after a few seconds, which causes the event to > * mix with results from different steps. Since 'CHANGES_DONE_HINT' > * is just a hint, we don't require it to be reliable and we simply > * ignore unexpected 'CHANGES_DONE_HINT' events here. */ > > So it is not reliably communicated if your file monitor backend doesn't > implement it. But the above comment seems to suggest that if the file monitor > backend doesn't support it, the event will be generated by glib itself and will > simply be delayed by a couple of seconds rather than missing altogether. I'd > have to read the glib code to tell whether it's possible for it to be omitted > altogether. I guess I don't see the timer as a horrible workaround, but I can > ditch it if you want. > > Alternatively I suppose I could simply kick off a new thread for each file > change event and not bother trying to coalesce them at all. This will result in > perhaps a an extra thread or two being launched when mdevctl modifies > the mdev registry behind our backs, but that's unlikely to be a common > occurrence and so shouldn't really be a significant performance issue. If there may be multiple events connected to a single change, which one would trigger the actual device list update? I guess I'm starting to be okay with the timeout in the end. Erik