Hi Vyacheslav,

> I think that this task hides many difficult questions. How does it
> define which files are fragmented and which are not? How does it
> measure the degree of fragmentation? What degree of fragmentation
> should trigger defragmentation activity? When does it need to detect
> fragmentation, and how should it keep this knowledge? How does it
> defragment without degrading performance?
>
> As I understand it, when we talk about defragmentation we expect a
> performance enhancement as a result. But the defragmenter's activity
> can itself become a background source of performance degradation. Not
> every workload or I/O pattern causes significant fragmentation.
>
> It is also very important to choose the point of defragmentation. I
> mean that it is possible to try to prevent fragmentation, or to
> correct fragmentation after flushing to the volume. Some hybrid
> technique is also possible, I think. The I/O pattern or the file type
> could be the basis for such a decision.

Yes, I agree. It is of course a good idea to reorder the data before
flushing, and probably also to reorder it with the cleaner, but I
thought that was already implemented and optimized. Is it?

Instead I imagined a tool like xfs_fsr for XFS, so the user can decide
when to defragment the file system, by running it manually or with a
cron job. Maybe this is a bit naive, since I probably don't know
enough about NILFS.

Couldn't we just calculate the number of segments a file would use if
it were stored optimally and compare that to the actual number of
segments the file is spread out over? For example, file A is 16 MB and
segments are 8 MB in size, so (ignoring the metadata) file A should
use 2 segments. Now we count the distinct segments that actually hold
blocks of file A, let's say 10, and calculate 1 - (2/10) = 0.8, so the
file is 80% fragmented (see the sketches at the end of this mail).

I wouldn't do that in the cleaner or in the background. Just a tool
like xfs_fsr that the user can run once a month in the middle of the
night with a cron job. The tool would go through every file, calculate
its fragmentation, collect other statistics, and decide whether it is
worth defragmenting or not. If the user has an SSD, he/she can decide
not to defragment at all.

> As I understand, F2FS [1] has some defragmenting approaches. I think
> that the technique for detecting fragmented files and the degree of
> fragmentation needs to be discussed more deeply. But maybe the hot
> data tracking patches [2,3] can be a basis for such a discussion.

I did a quick search for F2FS defragmentation, but I couldn't find
anything. Did you mean this section of the article?

"...it provides large-scale write gathering so that when lots of
blocks need to be written at the same time they are collected into
large sequential writes..."

Maybe I missed something, but isn't this just an inherent property of
a log-structured file system rather than defragmentation?

Hot data tracking could be extremely useful for the cleaner. This
paper [1] suggests that the best cleaner performance can be achieved
by distinguishing between hot and cold data (its cost-benefit formula
is sketched at the end of this mail). Is something like that already
implemented? Maybe I could do that for my master's thesis instead of
the defragmentation task... ;)

Thanks for the links.

best regards,
Andreas Rohner

[1] http://www.cs.berkeley.edu/~brewer/cs262/LFS.pdf
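
P.S. Since I am new to NILFS, here is only a rough sketch of what I
have in mind, not a finished design. The metric itself is trivial:

#include <stdio.h>

/*
 * Fragmentation degree as described above:
 *   1 - (ideal number of segments / actual number of segments)
 *
 * ideal_segs  = ceil(file size / segment size), ignoring metadata
 * actual_segs = number of distinct segments holding the file's blocks
 */
static double fragmentation_degree(unsigned long ideal_segs,
				   unsigned long actual_segs)
{
	if (actual_segs == 0)
		return 0.0;
	return 1.0 - (double)ideal_segs / (double)actual_segs;
}

int main(void)
{
	/* the example from above: 2 ideal segments, spread over 10 */
	printf("%.2f\n", fragmentation_degree(2, 10));	/* prints 0.80 */
	return 0;
}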
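
Counting the "actual" segments is the part I am unsure about, because
I don't know yet whether NILFS exposes a suitable interface for it. As
a placeholder, the following sketch uses the generic FIBMAP ioctl
(which needs root) and simply assumes that a physical block number
divided by a caller-supplied blocks-per-segment value gives the
segment number; whether that matches the real on-disk layout is an
open question:

#include <stdlib.h>
#include <sys/ioctl.h>
#include <sys/stat.h>
#include <linux/fs.h>		/* FIBMAP, FIGETBSZ */

static int cmp_ulong(const void *a, const void *b)
{
	unsigned long x = *(const unsigned long *)a;
	unsigned long y = *(const unsigned long *)b;
	return (x > y) - (x < y);
}

/*
 * Count the distinct segments that hold the data blocks of the file
 * behind fd. blocks_per_seg (= segment size / block size) has to be
 * supplied by the caller. Returns -1 on error.
 */
static long count_segments(int fd, unsigned long blocks_per_seg)
{
	struct stat st;
	int blksize;
	unsigned long nblocks, i, n = 0, nsegs = 0;
	unsigned long *segs;

	if (fstat(fd, &st) < 0 || ioctl(fd, FIGETBSZ, &blksize) < 0)
		return -1;
	nblocks = (st.st_size + blksize - 1) / blksize;
	if (nblocks == 0)
		return 0;
	segs = malloc(nblocks * sizeof(*segs));
	if (!segs)
		return -1;

	for (i = 0; i < nblocks; i++) {
		int blk = i;	/* in: logical block, out: physical block */

		if (ioctl(fd, FIBMAP, &blk) < 0) {
			free(segs);
			return -1;
		}
		if (blk == 0)	/* hole */
			continue;
		segs[n++] = (unsigned long)blk / blocks_per_seg;
	}

	/* sort and count the unique segment numbers */
	qsort(segs, n, sizeof(*segs), cmp_ulong);
	for (i = 0; i < n; i++)
		if (i == 0 || segs[i] != segs[i - 1])
			nsegs++;
	free(segs);
	return nsegs;
}

The tool would then walk the file system (with nftw() or similar),
feed count_segments() and the ideal count derived from the file size
into fragmentation_degree(), and only rewrite files whose degree
exceeds some threshold.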
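
The cron part is then trivial. Assuming the tool were called
nilfs-defrag (a made-up name), an entry like this would run it once a
month at 3 a.m.:

0 3 1 * * /usr/local/sbin/nilfs-defrag /mnt/nilfs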
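
And on the hot/cold question: what [1] concretely proposes is the
cost-benefit cleaning policy, under which cold segments are cleaned at
a higher utilization than hot ones. Expressed as code:

/*
 * Cost-benefit score from the LFS paper [1]: u is the segment
 * utilization (0.0..1.0, fraction of live data) and age is the age of
 * the youngest block in the segment. The cleaner picks the segments
 * with the highest score.
 */
static double cost_benefit(double u, double age)
{
	return (1.0 - u) * age / (1.0 + u);
}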