Re: [RFC][v3] documentation on filesystem exposure to RCU pathwalk from fs maintainers' POV

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hello Al,

I'm not a VFS/FS person, but just wanted to suggest proper formatting
of footnotes in ReST.

On Tue, 27 Feb 2024 03:04:59 -0500, Al Viro wrote:
...
> On Mon, Feb 26, 2024 at 01:35:12PM -0500, Al Viro wrote:
>> Updated variant follows.  Changes since the previous:
>> * beginning has been rewritten
>> * typos spotted by Randy should be fixed
>> * dumb braino in "opt out" part fixed (->d_automount is irrelevant, it's
>> ->d_manage one needs to watch out for)
>> * a stale bit in discussion of ->permission() ("currently (6.5) run afoul")
>> updated (to "did (prior to 6.8-rc6) run afoul").
>> * I gave up on ReST footnotes and did something similar manually.

But first, one of enumerated lists here:

> Opting out
> ==========
> 
> To large extent a filesystem can opt out of RCU pathwalk; that loses all
> scalability benefits whenever your filesystem gets involved in pathname
> resolution, though.  If that's the way you choose to go, just make sure
> that
> 
> 1. any non-default ->d_revalidate(), ->permission(), ->get_link() and
> ->get_inode_acl() instance bails out if called by RCU pathwalk (see below
> for details).  Costs a couple of lines of boilerplate in each.
> 
> 2. if some symlink inodes have ->i_link set to a dynamically allocated
> object, that object won't be freed without an RCU delay.  Anything
> coallocated with inode is fine, so's anything freed from ->free_inode().
> Usually comes for free, just remember to avoid freeing directly
> from ->destroy_inode().
> 
> 3. any ->d_hash() and ->d_compare() instances (if you have those) do
> not access any filesystem objects.
> 
> 4. there's no ->d_manage() instances in your filesystem.
>
> If your case does not fit the above, the easy opt-out is not for you.
> If so, you'll have to keep reading...

is not recognized by Sphinx due to missing indents.

You need to indent each item as shown bellow:

----------8-<---------------
Opting out
==========

To large extent a filesystem can opt out of RCU pathwalk; that loses all
scalability benefits whenever your filesystem gets involved in pathname
resolution, though.  If that's the way you choose to go, just make sure
that

1. any non-default ->d_revalidate(), ->permission(), ->get_link() and
   ->get_inode_acl() instance bails out if called by RCU pathwalk (see below
   for details).  Costs a couple of lines of boilerplate in each.

2. if some symlink inodes have ->i_link set to a dynamically allocated
   object, that object won't be freed without an RCU delay.  Anything
   coallocated with inode is fine, so's anything freed from ->free_inode().
   Usually comes for free, just remember to avoid freeing directly
   from ->destroy_inode().

3. any ->d_hash() and ->d_compare() instances (if you have those) do
   not access any filesystem objects.

4. there's no ->d_manage() instances in your filesystem.

If your case does not fit the above, the easy opt-out is not for you.
If so, you'll have to keep reading...
----------8-<---------------

Missing indents is also the reason why your footnotes didn't work as you
expected.

Diff below partially reverts your v2 changes WRT footnotes and does
proper indents in footnote contents area.

----------8-<---------------
--- a/Documentation/filesystems/rcu-exposure.rst
+++ b/Documentation/filesystems/rcu-exposure.rst
@@ -42,7 +42,7 @@ caller deliberately does *not* take steps that would normally protect
 the object from being torn down by another thread while the method is
 trying to work with it.  That applies not just to dentries - associated
 inodes and even the filesystem instance the object belongs to could be
-in process of getting torn down (yes, really - see [0] for details).
+in process of getting torn down (yes, really).\ [#f0]_
 Of course, from the filesystem POV every call like that is a potential
 source of headache.
 
@@ -188,8 +188,8 @@ when called for dying dentry - an incorrect return value won't harm the
 caller in such case.  False positives and false negatives alike - the
 callers take care of that.  To be pedantic, make that "false positives
 do not cause problems unless they have ->d_manage()", but ->d_manage()
-is present only on autofs and there's no autofs ->d_compare() instances.
-See [1] for details, if you are curious.
+is present only on autofs and there's no autofs ->d_compare()
+instances.\ [#f1]_
 
 There is no indication that ->d_compare() is called in RCU mode;
 the majority of callers are such, anyway, so we need to cope with that.
@@ -321,8 +321,7 @@ goes there [this is NOT a suggestion, folks].
 
 Again, this can be called in RCU mode.  Even if your ->d_revalidate()
 always returns -ECHILD in RCU mode and kicks the pathwalk out of it,
-you can't assume that ->get_link() won't be reached (see [2] for
-details).
+you can't assume that ->get_link() won't be reached.\ [#f2]_
 
 NULL dentry argument is an indicator of unsafe call; if you can't handle
 it, just return ERR_PTR(-ECHILD).  Any allocations you need to do (and
@@ -350,13 +349,14 @@ bailout is done by returning ERR_PTR(-CHILD) and the usual considerations
 apply for any access to data structures you might need to do.
 
 
-Footnotes
-=========
+.. rubric:: Footnotes
 
-[0]  The fast path of pathname resolution really can run into a dentry on
-a filesystem that is getting shut down.
+.. [#f0]
 
-Here's one of the scenarios for that to happen:
+  The fast path of pathname resolution really can run into a dentry on
+  a filesystem that is getting shut down.
+
+  Here's one of the scenarios for that to happen:
 
 	1. have two threads sharing fs_struct chdir'ed on that filesystem.
 	2. lazy-umount it, so that the only thing holding it alive is
@@ -366,54 +366,54 @@ Here's one of the scenarios for that to happen:
 	4. at the same time the second thread does fchdir(), moving to
 	   different directory.
 
-In fchdir(2) we get to set_fs_pwd(), which set the current directory
-to the new place and does mntput() on the old one.  No RCU delays here,
-we calculate the refcount of that mount and see that we are dropping
-the last reference.  We make sure that the pathwalk in progress in
-the first thread will fail when it comes to legitimize_mnt() and do this
-(in mntput_no_expire())::
+  In fchdir(2) we get to set_fs_pwd(), which set the current directory
+  to the new place and does mntput() on the old one.  No RCU delays here,
+  we calculate the refcount of that mount and see that we are dropping
+  the last reference.  We make sure that the pathwalk in progress in
+  the first thread will fail when it comes to legitimize_mnt() and do this
+  (in mntput_no_expire())::
 
 	init_task_work(&mnt->mnt_rcu, __cleanup_mnt);
 	if (!task_work_add(task, &mnt->mnt_rcu, TWA_RESUME))
 		return;
 
-As we leave the syscall, we have __cleanup_mnt() run; it calls cleanup_mnt()
-on our mount, which hits deactivate_super().  That was the last reference to
-superblock.
+  As we leave the syscall, we have __cleanup_mnt() run; it calls cleanup_mnt()
+  on our mount, which hits deactivate_super().  That was the last reference to
+  superblock.
 
-Voila - we have a filesystem shutdown right under the nose of a thread
-running in ->d_hash() of something on that filesystem.  Mutatis mutandis,
-one can arrange the same for other methods called by rcu pathwalk.
+  Voila - we have a filesystem shutdown right under the nose of a thread
+  running in ->d_hash() of something on that filesystem.  Mutatis mutandis,
+  one can arrange the same for other methods called by rcu pathwalk.
 
-It's not easy to hit (especially if you want to get through the
-entire ->kill_sb() before the first thread gets through ->d_hash()),
-and it's probably impossible on the real hardware; on KVM it might be
-borderline doable.  However, it is possible and I would not swear that
-other ways of arranging the same thing are equally hard to hit.
+  It's not easy to hit (especially if you want to get through the
+  entire ->kill_sb() before the first thread gets through ->d_hash()),
+  and it's probably impossible on the real hardware; on KVM it might be
+  borderline doable.  However, it is possible and I would not swear that
+  other ways of arranging the same thing are equally hard to hit.
 
-The bottom line: methods that can be called in RCU mode need to
-be careful about the per-superblock objects destruction.
+  The bottom line: methods that can be called in RCU mode need to
+  be careful about the per-superblock objects destruction.
 
-[1]
+.. [#f1]
 
-Some callers prevent being called for dying dentry (holding ->d_lock and
-having verified !d_unhashed() or finding it in the list of inode's aliases
-under ->i_lock).  For those the scenario in question simply cannot arise.
+  Some callers prevent being called for dying dentry (holding ->d_lock and
+  having verified !d_unhashed() or finding it in the list of inode's aliases
+  under ->i_lock).  For those the scenario in question simply cannot arise.
 
-Some follow the match with lockref_get_not_dead() and treat the failure
-as mismatch.  That takes care of false positives, and false negatives on
-dying dentry are still correct - we simply pretend to have lost the race.
+  Some follow the match with lockref_get_not_dead() and treat the failure
+  as mismatch.  That takes care of false positives, and false negatives on
+  dying dentry are still correct - we simply pretend to have lost the race.
 
-The only caller that does not fit into the classes above is
-__d_lookup_rcu_op_compare().  There we sample ->d_seq and verify
-!d_unhashed() before calling ->d_compare().  That is not enough to
-prevent dentry from starting to die right under us; however, the sampled
-value of ->d_seq will be rechecked when the caller gets to step_into(),
-so for a false positive we will end up with a mismatch.  The corner case
-around ->d_manage() is due to the handle_mounts() done before step_into()
-gets to ->d_seq validation...
+  The only caller that does not fit into the classes above is
+  __d_lookup_rcu_op_compare().  There we sample ->d_seq and verify
+  !d_unhashed() before calling ->d_compare().  That is not enough to
+  prevent dentry from starting to die right under us; however, the sampled
+  value of ->d_seq will be rechecked when the caller gets to step_into(),
+  so for a false positive we will end up with a mismatch.  The corner case
+  around ->d_manage() is due to the handle_mounts() done before step_into()
+  gets to ->d_seq validation...
 
-[2]
+.. [#f2]
 
-binding a symlink on top of a regular file on another filesystem is possible
-and that's all it takes for RCU pathwalk to get there.
+  binding a symlink on top of a regular file on another filesystem is possible
+  and that's all it takes for RCU pathwalk to get there.
----------8-<---------------

This should generate footnotes both in PDF and HTML outputs.

Note that the pattern "foo bar.\ [#fn]_" in the referencing sites is to
place the footnote markers just next to the punctuation. In PDF, it should
prevent line breaks in front of the markers.  HTML doesn't behave that
way, it seems.

HTH, Akira





[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux