Hi Jaegeuk,

Agreed. If Chao agrees with this policy, I will implement it.

Thanks for the comment.

On 12/15/2017 11:06 AM, Jaegeuk Kim wrote:
> On 12/14, Hyunchul Lee wrote:
>> Hi Jaegeuk,
>>
>> I need your comment about the fs_iohint mount option.
>>
>> a) w/o fs_iohint, propagate user hints to the lower layer.
>> b) w/ fs_iohint, ignore user hints, and use hints which are generated
>> by F2FS.
>>
>> Chao suggested this option because user hints can be more accurate than
>> the file system's.
>>
>> This is reasonable, but I have some concerns about this option.
>> The first is that the blocks of a segment could have different hints. This
>> could make GC less effective.
>> The second is whether the separation between LIFE_MEDIUM and LIFE_LONG is
>> really needed. I think the difference between them is a little ambiguous
>> for users, and LIFE_SHORT and LIFE_EXTREME are converted to different
>> hints by F2FS.
>
> I think what we can really do is assign the many user hints to our 3 DATA
> logs, as rw_hint_to_seg_type() does, since they are just hints for user data.
> Then, we can decide how to keep them as much as possible, since we have
> other filesystem metadata such as meta and nodes. In addition, I don't
> think we have to keep the original user hints if that would mess up the
> F2FS logs.
>
> With that in mind, I can think of the cases below. In particular, if users want
> to keep their io_hints, we'd better recommend using direct_io w/o fs_iohints.
> In order to keep this policy, I think fs_iohints would be better as a
> feature set by mkfs.f2fs and detected by users through sysfs entries.
>
> 1) w/ fs_iohints
>
> User                  F2FS                  Block
> -------------------------------------------------------------------
>                       Meta                  WRITE_LIFE_MEDIUM
>                       HOT_NODE              WRITE_LIFE_NOT_SET
>                       WARM_NODE             -'
>                       COLD_NODE             WRITE_LIFE_NONE
> ioctl(cold)           COLD_DATA             WRITE_LIFE_EXTREME
> extension list        -'                    -'
> WRITE_LIFE_EXTREME    -'                    -'
> WRITE_LIFE_SHORT      HOT_DATA              WRITE_LIFE_SHORT
>
> -- buffered_io
> WRITE_LIFE_NOT_SET    WARM_DATA             WRITE_LIFE_LONG
> WRITE_LIFE_NONE       -'                    -'
> WRITE_LIFE_MEDIUM     -'                    -'
> WRITE_LIFE_LONG       -'                    -'
>
> -- direct_io (Not recommendable)
> WRITE_LIFE_NOT_SET    WARM_DATA             WRITE_LIFE_NOT_SET
> WRITE_LIFE_NONE       -'                    WRITE_LIFE_NONE
> WRITE_LIFE_MEDIUM     -'                    WRITE_LIFE_MEDIUM
> WRITE_LIFE_LONG       -'                    WRITE_LIFE_LONG
>
> 2) w/o fs_iohints
>
> User                  F2FS                  Block
> -------------------------------------------------------------------
>                       Meta                  -
>                       HOT_NODE              -
>                       WARM_NODE             -
>                       COLD_NODE             -
> ioctl(cold)           COLD_DATA             -
> extension list        -'                    -
>
> -- buffered_io
> WRITE_LIFE_EXTREME    COLD_DATA             -
> WRITE_LIFE_SHORT      HOT_DATA              -
> WRITE_LIFE_NOT_SET    WARM_DATA             -
> WRITE_LIFE_NONE       -'                    -
> WRITE_LIFE_MEDIUM     -'                    -
> WRITE_LIFE_LONG       -'                    -
>
> -- direct_io
> WRITE_LIFE_EXTREME    COLD_DATA             WRITE_LIFE_EXTREME
> WRITE_LIFE_SHORT      HOT_DATA              WRITE_LIFE_SHORT
> WRITE_LIFE_NOT_SET    WARM_DATA             WRITE_LIFE_NOT_SET
> WRITE_LIFE_NONE       -'                    WRITE_LIFE_NONE
> WRITE_LIFE_MEDIUM     -'                    WRITE_LIFE_MEDIUM
> WRITE_LIFE_LONG       -'                    WRITE_LIFE_LONG
>
>
> Note that I don't much care about how the nvme driver manipulates the stream id
> for LIFE_NONE or LIFE_NOT_SET, since other drivers can handle them
> in different ways. Looking at the definition, at least, we don't need
> to assume that those are the same at all. For example, if we can exploit this in
> the UFS driver, we can pass all the stream ids to the device as context ids.
>
> Thanks,
>
>>
>> Thanks.
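For reference, the two tables in Jaegeuk's message can be read as a single lookup. Below is a minimal user-space sketch of that reading, not the eventual patch. The enum and function names (write_hint, log_type, temp, user_hint_to_temp, block_hint) are made up for illustration, and the "-" entries of table 2 are modeled as LIFE_NOT_SET, which is an assumption on my side.

```c
#include <assert.h>
#include <stdbool.h>

/* Same ordering as the kernel's write-life hints; redefined here
 * (hypothetical names) so the sketch builds outside the kernel. */
enum write_hint {
	LIFE_NOT_SET, LIFE_NONE, LIFE_SHORT,
	LIFE_MEDIUM, LIFE_LONG, LIFE_EXTREME,
};

enum log_type { LOG_META, LOG_NODE, LOG_DATA };	/* hypothetical */
enum temp { TEMP_HOT, TEMP_WARM, TEMP_COLD };	/* hypothetical */

/* "User" -> "F2FS" column for DATA logs; identical in both tables.
 * marked_cold stands for ioctl(cold) or an extension-list match. */
static enum temp user_hint_to_temp(enum write_hint user, bool marked_cold)
{
	if (marked_cold || user == LIFE_EXTREME)
		return TEMP_COLD;
	if (user == LIFE_SHORT)
		return TEMP_HOT;
	return TEMP_WARM;
}

/* "F2FS" -> "Block" column: the hint handed to the block layer. */
static enum write_hint block_hint(bool fs_iohints, enum log_type log,
				  enum temp temp, bool direct_io,
				  enum write_hint user)
{
	assert(user <= LIFE_EXTREME);

	if (!fs_iohints) {
		/* Table 2: only direct I/O forwards the user hint;
		 * "-" is modeled as LIFE_NOT_SET (assumption). */
		return (log == LOG_DATA && direct_io) ? user : LIFE_NOT_SET;
	}

	/* Table 1: the filesystem picks the hint per log. */
	switch (log) {
	case LOG_META:
		return LIFE_MEDIUM;
	case LOG_NODE:
		return temp == TEMP_COLD ? LIFE_NONE : LIFE_NOT_SET;
	case LOG_DATA:
		if (temp == TEMP_COLD)
			return LIFE_EXTREME;
		if (temp == TEMP_HOT)
			return LIFE_SHORT;
		/* WARM_DATA: buffered I/O collapses to LONG,
		 * direct I/O keeps the user hint. */
		return direct_io ? user : LIFE_LONG;
	}
	return LIFE_NOT_SET;
}
```

With this reading, a buffered write carrying LIFE_MEDIUM lands in WARM_DATA and goes down as LIFE_LONG when fs_iohints is on, while the same write via direct_io keeps LIFE_MEDIUM. How a file marked cold via ioctl should behave under direct_io w/o fs_iohints is not spelled out in the table; the sketch simply forwards whatever user hint is set.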
>>
>> On 12/12/2017 11:45 AM, Chao Yu wrote:
>>> Hi Hyunchul,
>>>
>>> On 2017/12/12 10:15, Hyunchul Lee wrote:
>>>> Hi Chao,
>>>>
>>>> On 12/11/2017 10:15 PM, Chao Yu wrote:
>>>>> Hi Hyunchul,
>>>>>
>>>>> On 2017/12/1 16:28, Hyunchul Lee wrote:
>>>>>> Hi Chao,
>>>>>>
>>>>>> On 11/30/2017 04:06 PM, Chao Yu wrote:
>>>>>>> Hi Hyunchul,
>>>>>>>
>>>>>>> On 2017/11/28 8:23, Hyunchul Lee wrote:
>>>>>>>> From: Hyunchul Lee <cheol.lee@xxxxxxx>
>>>>>>>>
>>>>>>>> This implements which hint is passed down to the block layer
>>>>>>>> for data from a specific segment type.
>>>>>>>>
>>>>>>>> segment type                hints
>>>>>>>> ------------                -----
>>>>>>>> COLD_NODE & COLD_DATA       WRITE_LIFE_EXTREME
>>>>>>>> WARM_DATA                   WRITE_LIFE_NONE
>>>>>>>> HOT_NODE & WARM_NODE        WRITE_LIFE_LONG
>>>>>>>> HOT_DATA                    WRITE_LIFE_MEDIUM
>>>>>>>> META_DATA                   WRITE_LIFE_SHORT
>>>>>>>
>>>>>>> Just noticed: if our user does not give a hint via ioctl, f2fs can
>>>>>>> provide a hint to the lower layer according to its hot/cold separation
>>>>>>> ability, and that will be okay. But once the user gives a hint, which may
>>>>>>> be more accurate than the filesystem's, the hint converted by f2fs may be wrong.
>>>>>>>
>>>>>>> So what do you think of adding an option to control whether the filesystem
>>>>>>> can convert the hint the user has given?
>>>>>>>
>>>>>>
>>>>>> I think it is okay for LIFE_SHORT and LIFE_EXTREME, because they are
>>>>>> converted to different hints.
>>>>>
>>>>> What I mean is introducing a mount option, e.g. fs_iohint,
>>>>> a) w/o fs_iohint, propagate file/inode io_hint to the lower layer.
>>>>> b) w/ fs_iohint, ignore file/inode io_hint, and use the io_hint which is
>>>>> generated by the filesystem's private rule.
>>>>>
>>>>
>>>> Okay, I will implement this option and send this patch again.
>>>
>>> Let's wait for Jaegeuk's comments first?
>>>
>>>>
>>>> Without fs_iohint, even if data blocks are moved due to GC,
>>>> we should keep user hints. And if user hints are not given,
>>>> no hints are passed down to the block layer, right?
>>>
>>> Hmm.. that will be a problem. IMO, we can store the user's last io_hint in the
>>> inode layout, so later when we trigger GC, we can use the last io_hint in the
>>> inode rather than giving no hint or the fs' hint.
>>>
>>> I think we need to discuss with the original author of the IO hint what the
>>> IO hint policy should be when the filesystem moves blocks by itself after the
>>> inode has been released in the system.
>>>
>>> Thanks,
>>>
>>>>
>>>> Thank you for the comments.
>>>>
>>>>> Thanks,
>>>>>
>>>>>>
>>>>>> file hint        segment type        io hint
>>>>>> ---------        ------------        -------
>>>>>> LIFE_SHORT       HOT_DATA            LIFE_MEDIUM
>>>>>> LIFE_MEDIUM      WARM_DATA           LIFE_NONE
>>>>>> LIFE_LONG        WARM_DATA           LIFE_NONE
>>>>>> LIFE_EXTREME     COLD_DATA           LIFE_EXTREME
>>>>>>
>>>>>> The problem is that LIFE_MEDIUM and LIFE_LONG are converted to
>>>>>> the same hint, LIFE_NONE. I am not sure that the separation between
>>>>>> LIFE_MEDIUM and LIFE_LONG is really needed, because I guess that the
>>>>>> difference between them is a little ambiguous for users, and if a WARM_DATA
>>>>>> segment has two different hints, it can make GC inefficient.
>>>>>>
>>>>>> I would like to hear your thoughts on this.
>>>>>>
>>>>>> Thanks.
>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> ------------------------------------------------------------------------------
>>>>>> Check out the vibrant tech community on one of the world's most
>>>>>> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
>>>>>> _______________________________________________
>>>>>> Linux-f2fs-devel mailing list
>>>>>> Linux-f2fs-devel@xxxxxxxxxxxxxxxxxxxxx
>>>>>> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
>>>>>>
>>>>>
>>>>
>>>
>