Hi Artem and Andreas,

I have been working on this feature for a few days and have now run
into some problems that I would like to discuss with you. I would
appreciate your advice.

On 2015/2/3 13:45, Andreas Dilger wrote:
> On Feb 2, 2015, at 2:33 AM, Artem Bityutskiy <dedekind1@xxxxxxxxx> wrote:
>>
>> Yes, no easy way, but I think implementing what you need is possible. I
>> do not have plans and time to work on this, but I can help by giving
>> advises and review.
>>
>> The question has 2 major parts.
>>
>> 1. The interface
>> 2. The implementation
>>
>> For the former, one need to carefully investigate if there is something
>> like this already implemented for other file-systems. I think btrfs may
>> have it. If it is, then UBIFS should use similar interface, probably.
>>
>> And whatever is the interface choice, it should be discussed in the
>> linux-fsdevel@xxxxxxxxxxxxxxx mailing list, which I am CCing.
>
> First, talking about the interface.
>
> One option that was discussed for btrfs was to use the first fe_reserved
> field for the FIEMAP ioctl struct fiemap_extent to fe_phys_length to
> hold the compressed size of each extent in the file.

I don't think fiemap is a good interface for UBIFS or for the
compressed size reporting feature. File contents in UBIFS are stored
in *ubifs_data_node* structures, each holding at most
UBIFS_BLOCK_SIZE (4096) bytes of data. A single file may contain many
data nodes, and because of out-of-place updates these nodes may be
located on flash in order or out of order. A fiemap ioctl from
userspace would need a lot of memory to hold such a discontiguous
mapping, and copying all those fiemap_extent structures could take a
long time.

> http://lwn.net/Articles/607552/
> http://thread.gmane.org/gmane.comp.file-systems.btrfs/37312
>
> I'm not sure what happened to that patch series - I was looking forward
> to it landing, and it was in very good shape I think.
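To put a rough number on the fiemap memory-cost concern above, here is
a back-of-the-envelope sketch (plain Python, not UBIFS code). It only
assumes the worst case of one extent per 4 KiB data node, and that
struct fiemap_extent is 56 bytes on Linux:

```python
# Worst-case fiemap cost for a fully fragmented UBIFS file: one
# fiemap_extent per 4 KiB data node (back-of-the-envelope, not UBIFS code).

UBIFS_BLOCK_SIZE = 4096       # max data bytes per ubifs_data_node
FIEMAP_EXTENT_SIZE = 56       # sizeof(struct fiemap_extent) on Linux

def fiemap_worst_case(file_size, block_size=UBIFS_BLOCK_SIZE):
    """Return (extent count, bytes of fiemap_extent structs) when every
    data node ends up in its own extent."""
    extents = -(-file_size // block_size)   # ceiling division
    return extents, extents * FIEMAP_EXTENT_SIZE

extents, mem = fiemap_worst_case(1 << 30)   # a 1 GiB file
print(extents, mem)   # 262144 extents, 14680064 bytes (~14 MiB)
```

For a 1 GiB file that is roughly 14 MiB of extent records the kernel
would have to fill in and copy to userspace, which is the cost I am
worried about.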
Since the *fe_phys_length* field of fiemap_extent was never merged
into mainline, the current fiemap interface can only report the
logical length of an extent. I regret to say it is of no use for
getting the compressed size of a file in UBIFS.

> Then, looking at the implementation.
>>
>> a. 'struct ubifs_ino_node' has unused space, use it to add the
>> compressed size field.
>> b. maintain this field
>> c. this field will only be correct for the part of the file which are on
>> the media. The dirty data in the page cache has not yet been compressed,
>> so we do not know its compressed size yet.
>> e. when user asks for the compressed size, you have to sync the inode
>> first, in order to make sure the compressed size is correct.
>>
>> And the implementation should be backward-compatible. That is, if new
>> driver mounts the old media, you return something predicatable. I guess
>> uncompressed size could be it.
>>

I'm worried about power-cut recovery of the compressed size if we
introduce it into 'struct ubifs_ino_node'. We cannot write a new
metadata node after every change to a data node, and while dirty data
may not change the logical size of a file, it always changes the
compressed size. So how do we keep the record in the metadata node
consistent with the real compressed size (the sum over all data
nodes)?

For the logical size we can solve this with block numbers: every
block is UBIFS_BLOCK_SIZE bytes, so any existing data node tells us
the logical size of the file, and we actually use this method to fix
up the logical size of a file in the recovery path. But the physical
sizes of data nodes are not equal, so a single data node in the
journal cannot tell us the physical size of the file, and a data node
does not have enough reserved space to record the file's total
compressed size. Hence the same technique cannot be used for the
compressed size.
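As a toy illustration of this asymmetry (hypothetical numbers, plain
Python, not UBIFS code): each data node carries its block number and
its own compressed length, so any single node bounds the logical
size, but only the sum over all nodes yields the compressed size:

```python
# Toy model of UBIFS data nodes found in the journal after a power cut.
# Each entry is (block_no, compressed_len); the numbers are made up.
UBIFS_BLOCK_SIZE = 4096

data_nodes = [(0, 1200), (1, 3100), (2, 800)]

# Logical size: the node with the highest block number alone bounds the
# file size, because every block holds UBIFS_BLOCK_SIZE bytes of data
# (only the last block may be short). This is what recovery exploits.
logical_upper_bound = (max(blk for blk, _ in data_nodes) + 1) * UBIFS_BLOCK_SIZE

# Compressed size: the per-node lengths vary, so no single node can
# reconstruct the total; we must sum over *all* data nodes.
compressed_size = sum(clen for _, clen in data_nodes)

print(logical_upper_bound, compressed_size)   # 12288 5100
```

Losing even one data node from the sum gives a wrong compressed size,
which is exactly the consistency problem after a power cut.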
Since metadata nodes (ubifs_ino_node) and data nodes
(ubifs_data_node) are stored in different journal heads, I have not
found an easy way to keep them consistent when a power cut happens.
It seems a full scan to rebuild the value cannot be avoided in this
case.

So could we use a simpler method: just a private ioctl which scans
the TNC tree and reports the compressed size of a UBIFS file? I found
an old patch along these lines for Btrfs, but it was never merged
into mainline:

https://patchwork.kernel.org/patch/117782/

Since files in UBIFS are usually not too large, maybe we could test
whether the time cost is acceptable for ordinary use cases.

Further, for both the fiemap and the private-ioctl methods, the
current TNC lookup mechanism seems to always copy the whole UBIFS
node, while only the node header is needed in this case. Do we have a
way to read only part of a node through the TNC?

Thanks,
Hu