Sounds great. I'll send out that patch shortly. Thanks! On Thu, Apr 19, 2018 at 2:23 AM, Jan Kara <jack@xxxxxxx> wrote: > On Wed 18-04-18 17:05:25, Robert Kolchmeyer wrote: >> fsnotify() acquires a reference to a fsnotify_mark_connector through >> the SRCU-protected pointer to_tell->i_fsnotify_marks. However, it >> appears that no precautions are taken in fsnotify_put_mark() to >> ensure that fsnotify() drops its reference to this >> fsnotify_mark_connector before assigning a value to its 'destroy_next' >> field. This can result in fsnotify_put_mark() assigning a value >> to a connector's 'destroy_next' field right before fsnotify() tries to >> traverse the linked list referenced by the connector's 'list' field. >> Since these two fields are members of the same union, this behavior >> results in a kernel panic. > > Ahh, I was reading this code just two days ago trying to find some issues > (due to reports of softlockups in fsnotify()) but this didn't occur to me. > Great spotting! > >> This issue is resolved by calling synchronize_srcu() before updating >> the connector's 'destroy_next' field in fsnotify_put_mark(). Since the >> relevant section of fsnotify() occurs in a SRCU read-side critical >> section, this will force fsnotify_put_mark() to wait for fsnotify() >> to finish operating on the connector before updating its 'destroy_next' >> field. Since fsnotify_put_mark() removes references to the connector >> before assigning its 'destroy_next' field, it shouldn't be possible for >> another thread to acquire a reference to the connector while >> fsnotify_put_mark() waits for fsnotify() to finish its work. >> >> The offending behavior here is extremely unlikely; since >> fsnotify_put_mark() removes references to a connector (specifically, >> it ensures that the connector is unreachable from the inode it was >> formerly attached to) before updating its 'destroy_next' field, a >> sizeable chunk of code in fsnotify_put_mark() has to execute in the >> short window between when fsnotify() acquires the connector reference >> and saves the value of its 'list' field. On the HEAD kernel, I've only >> been able to reproduce this by inserting a udelay(1) in fsnotify(). >> However, I've been able to reproduce this issue without inserting a >> udelay(1) anywhere on older unmodified release kernels, so I believe >> it's worth fixing at HEAD. > > Oh, the bug definitely needs fixing. I somewhat dislike synchronize_srcu() > when destroying the connector though. The whole point of > connector_destroy_list is to batch synchronize_srcu() calls and move them > out of the relative fast path. So what I'd do instead is to move > 'destroy_next' from a union with 'list' to a union with 'inode' and 'mnt'. > Those are not safe to access under srcu anyway - you have to hold > conn->lock to get them and use conn->flags to determine the type under the > same lock. So in that union we are actually safe to use the space for > anything once conn->flags are cleared by > fsnotify_detach_connector_from_object(). Will you send a patch for that so > that you get a proper credit for all the debugging you did? :) Thanks! > > Honza > >> Fixes bug 199437: https://bugzilla.kernel.org/show_bug.cgi?id=199437 >> >> Fixes: 08991e83b7286635167bab40927665a90fb00d81 >> Signed-off-by: Robert Kolchmeyer <rkolchmeyer@xxxxxxxxxx> >> --- >> fs/notify/mark.c | 1 + >> 1 file changed, 1 insertion(+) >> >> diff --git a/fs/notify/mark.c b/fs/notify/mark.c >> index e9191b416434..358fc7de1e86 100644 >> --- a/fs/notify/mark.c >> +++ b/fs/notify/mark.c >> @@ -227,6 +227,7 @@ void fsnotify_put_mark(struct fsnotify_mark *mark) >> iput(inode); >> >> if (free_conn) { >> + synchronize_srcu(&fsnotify_mark_srcu); >> spin_lock(&destroy_lock); >> conn->destroy_next = connector_destroy_list; >> connector_destroy_list = conn; >> -- >> 2.17.0.484.g0c8726318c-goog >> > -- > Jan Kara <jack@xxxxxxxx> > SUSE Labs, CR