Hi Ryusuke, This is my improved vision of possible approach to change in-place update of superblock on COW policy. I suppose that current description includes all that we discussed previously. And we can continue to deepen this discussion. Approach is based on necessity to have two areas at the begin and at the end of a NILFS2 volume. Every such area should have capacity is equal to segment size. The goal of these two areas is to provide a FTL-friendly way of storing information about latest log and modified superblock's fields by means of COW (Copy-On-Write) policy. At the begin of a NILFS2 volume is located primary superblock area. Primary superblock area begins from static superblock is created during NILFS2 volume creation by means of mkfs tool. This superblock (primary superblock) is located on 1024 bytes from the volume begin (as it placed currently). The primary superblock leaves untouched during filling primary superblock area by modified information. Initial state of superblock can be rewritten at the moment of beginning next iteration of filling of primary superblock area (because this area lives likewise circular buffer). ------------------------------------------------------------ | Primary superblock | Modifiable area | ------------------------------------------------------------ |<---- 4 KB ------>| |<-------------------- segment size ---------------------->| On the opposite side of the volume (at the volume's end) is located secondary (or back) superblock area. This area begins from modifiable area and it ends with secondary superblock (as it is located currently). Modifiable area of secondary superblock area lives likewise of modifiable area in first superblock area. ------------------------------------------------------------ | Modifiable area | Secondary superblock | ------------------------------------------------------------ |<------ 4 KB ------>| |<-------------------- segment size ---------------------->| Primary and secondary superblock areas have goal to keep copies of super roots. And, firstly, namely these areas are used for searching a latest log. These areas should keep as super root as physical block of this super root's placement. Moreover, primary and secondary superblock areas have different frequency of updating. Secondary superblock area is updated during every umount or once at several hours (if we have significant system uptime). Primary superblock area is updated more frequently. The frequency of primary superblock area's update can be based on timeout or count of constructed segments. But, anyway, it makes sense to take into account only full segments instead of partial segments. Maybe, it makes sense to keep more complex combination in modifiable area: super root + diff to superblock state + physical block of super root's placement. Modifiable area should have special filling policy. This policy doesn't contradict with COW policy but it implements not in sequential manner. Namely, modifiable area should be divided on several groups (the count of groups can be configurable option). Moreover, primary and secondary superblock areas would have different values of groups count. Thereby, every group will contain some blocks count. ------------------------------------------------------------- | Group1 | Group2 | Group3 | **** | GroupN | ------------------------------------------------------------- |<-------------------- Modifiable area -------------------->| Saving blocks are distributed between groups by means of policy that every next block should be saved in next group on every iteration. If all groups in modifiable area have equal count of saved blocks then it begins the next iteration which starts from the first group. FIRST ITERATION [A phase]: (1) first block ------------------------------------------------------------- |A1| | | | | | | | | | | | | | | | | | | | ------------------------------------------------------------- |<-- Group1 -->|<-- Group2 -->|<-- ****** -->|<-- GroupN -->| (2) second block ------------------------------------------------------------- |A1| | | | |A2| | | | | | | | | | | | | | | ------------------------------------------------------------- |<-- Group1 -->|<-- Group2 -->|<-- ****** -->|<-- GroupN -->| (N) Nth block ------------------------------------------------------------- |A1| | | | |A2| | | | |A3| | | | |An| | | | | ------------------------------------------------------------- |<-- Group1 -->|<-- Group2 -->|<-- ****** -->|<-- GroupN -->| SECOND ITERATION [B phase]: ------------------------------------------------------------- |A1|B1| | | |A2|B2| | | |A3|B3| | | |An|Bn| | | | ------------------------------------------------------------- |<-- Group1 -->|<-- Group2 -->|<-- ****** -->|<-- GroupN -->| Nth ITERATION [E phase]: ------------------------------------------------------------- |A1|B1|C1|D1|E1|A2|B2|C2|D2|E2|A3|B3|C3|D3|E3|An|Bn|Cn|Dn|En| ------------------------------------------------------------- |<-- Group1 -->|<-- Group2 -->|<-- ****** -->|<-- GroupN -->| Finally, when modifiable area is completely filled then it is possible to discard area's content and to begin filling iterations again. We will have two modifiable areas are filling with different frequencies and some state of replication of information. Thereby, it provides basis for safe and independent discarding of modifiable areas. The special filling policy has goal to provide a basis for efficient search. Namely, first group contains blocks differ by some period from each other. We have such sequence during saving: [A1,A2,A3,..,An], [B1,B2,B3,..,Bn], ..., [E1,E2,E3,..,En]. But first group will contain (A1,B1,C1,D1,E1). Thereby, passing item-by-item through first group means jumping with some period. Moreover, in the case of some failure it is possible to start the searching from any group (with decreasing search efficiency). It needs to take into account magic signature, header checksum and timestamps during comparison of items in group. It provides opportunity to distinguish valid blocks from empty and invalid ones and to distinguish older blocks from latest ones. Searching in dedicated area gives opportunity to use read-ahead technique. Moreover, if group contains many items then it is possible to increase step between current and next items during search. For example, it is possible to use such sequence of steps during searching: 0, 1, 3, 5, 7, and so on. If we have found latest item in first group, for example, then it is possible to find a latest item in he whole sequence by means of jumping on group period (count of blocks in a group). Two modifiable areas are filled with different frequencies and it gives opportunity to use special searching algorithm. Such algorithm can use, for example, secondary superblock area for rough, preliminary search (because this modifiable area is changed rarely). Then, further, algorithm can continue search in first superblock area (because this modifiable area is changed more frequently). Moreover, segctor thread has knowledge about all dirty files and it can predict, theoretically, how many segments will be constructed. Thereby, it is possible to save in items of modifiable area's groups such prediction in the form of hint that it can be used during search for improving search algorithm efficiency. With the best regards, Vyacheslav Dubeyko. -- To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html