Re: [PATCH v3 00/13] VFS hot tracking

Zhi Yong Wu <zwu.kernel@xxxxxxxxx> · Mon, 8 Jul 2013 20:44:35 +0800

HI, Al Viro

  Can you give us some suggestions when you are available? In order
that we can get agreements on how to design the following points.
thanks.

On Wed, Jul 3, 2013 at 11:16 PM, Zhi Yong Wu <zwu.kernel@xxxxxxxxx> wrote:
> HI, Al Viro,
>
>    Thanks for your comments and patiently explaining my questions. I
> still have some questions as below now:
> 1.)  Since you think the whole refcount + "DELETING" flag approach is
> a bad idea, do you have any other idea? or do you want me to get rid
> of ref count totally?
>       the ref count is introduced mainly to take hot relocation
> support, etc. into account.
>
> 2.) For debugfs, I had a lot of racy issue along the way, and had
> fixed a lot, since you think that it is completely unsuitable for
> dynamic objects, i want to remove debugfs, what do you think of it?
>
> 3.) For hot_comm_item struct, if it is killed totally, it will
> introduce a lot of duplicate codes for the hot_update_worker function,
> so i want to keep it there, but split those functions which take it.
> This will make those functions clear. What do you think of it?
>
> I will rework all your other comments and repost the next version soon.
>
> On Wed, Jul 3, 2013 at 9:30 PM, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
>> On Mon, Jul 01, 2013 at 09:19:08PM +0800, Zhi Yong Wu wrote:
>>
>>> >         * lookups (both for inode and range) leak on race with unlink.
>>> > You grab a reference, drop the lock, check if HOT_DELETING is set and
>>> > return an error if it is; how the hell could the caller possibly undo
>>> > that reference grab, when it has no idea what object had been grabbed
>>> I don't get what you mean, do you elaborate it with more details?
>>> which caller? which scenario will it take place in?
>>
>> Sigh...  Suppose that test has failed and you've returned ERR_PTR(-ENOENT).
>> If that ever happens (and I don't see what would prevent that, seeing
>> that e.g. unlink(2) can happen at any point, that you are not holding any
>> locks at that point and that unlink(2) wouldn't care for any of your locks
>> anyway) you are going to have a leak - you've grabbed a reference a few
>> lines above and once you return, there's no way to tell which object had
>> been grabbed, let alone do the matching kref_put().
> i got it now, and will fix it, thanks for what you explain.
>
> With all these additional issues i decided to not push the debugfs part for now.
>>
>>> > debugfs-related issues - debugfs is completely unsuitable for dynamic
>>> > objects and you step into that big way, in addition to races of your
>>> **** Is debugfs also completely unsuitable for dynamic objects even
>>> though we fix the following issues listed by you? Do you have any
>>> better way about this?
>>
>> The last issue in that list (IO hours after object removal) is just about
>> unfixable without debugfs overhaul.
> OK, i see. I had fixed a lot of issues, but it still has a lot raised by you.
>>
>>> >         * creation of hot_debugfs_root is racy - WTF prevents
>>> hot_debugfs_root is public for all disk volumes, and is debugfs root
>>> for hot tracking.
>>> Why do you think it is racy?
>>> > two hot_track_init() in parallel?
>>> I think that mount() will make sure hot_track_init() will be done once
>>> for the same super block, right? hot_track_init will be done *only*
>>> when mount is issued.
>>> That is, mount() can not make sure hot_track_init() is done serially?
>>
>> What the devil would serialize mount on completely unrelated devices?
>> And what for?
>>
>>> >         * what guarantees that sb->s_id is unique?
>>> sorry, can you let me know where sb->s_id will be not unique?
>>
>> On any number of filesystem types it isn't; are you making that a restriction
>> on fs types that can use your stuff?
>>
>>> >         * removal on failure - screwed; first of all, list_empty()
>>> On failure, it must be removed? no.
>>
>> Then what would eventually remove it?  And why bother with lazy creation,
>> if so?
>>
>>> >         * hot_debugfs_exit() - screwed in the same way.
>>> Ditto.
>>
>> Again, if you don't mind that thing sticking around indefinitely, just
>> because somebody happened to do ls on it at the moment it would've been
>> removed otherwise, WTF remove it at all?
>>
>>> >         * debugfs files get ->i_private set to corresponding sb->s_hot_root.
>>> > It's copied to seq->private on open() and used by iterators.  WTF prevents
>>> > open on debugfs, followed by umount of corresponding btrfs volume, freeing of
>>> > sb->s_hot_root and then read() on our file stepping into kfree'd hot_root
>>> > in ->start()?
>>> Good catch, will fix it.
>>
>> How?
>>
>> As for the lifetime rules suggestions - depends on what you want to achieve.
>> How long should those objects live?  In which cases can we get an attempt
>> of ...unlink... more than once on the same object?
>
> --
> Regards,
>
> Zhi Yong Wu

--
Regards,

Zhi Yong Wu
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html