On Wed, Aug 05, 2015 at 11:40:06PM -0700, Ming Lin wrote:
> On Tue, 2015-07-28 at 11:45 -0700, Ming Lin wrote:
> > On Tue, Jul 28, 2015 at 11:41 AM, Ming Lin <mlin@xxxxxxxxxx> wrote:
> > > On Fri, Jul 24, 2015 at 1:47 PM, Ming Lin <mlin@xxxxxxxxxx> wrote:
> > >>
> > >> And I want to learn how btree node insert/delete/update happens on
> > >> disk. This may be too much detail. I'm going to write a small tool
> > >> to dump the file system. Then I could better understand the on-disk
> > >> btree format.
> > >
> > > Here is my simple tool to dump parts of the on-disk format.
> > > http://www.minggr.net/cgit/cgit.cgi/bcache-tools/commit/?id=deb258e2
> >
> > Actually: http://www.minggr.net/cgit/cgit.cgi/bcache-tools/commit/?id=3121eec
> >
> > >
> > > It's not in good shape, but simple enough to learn the on-disk format.
>
> Hi Kent,
>
> I'm trying to understand how the root inode is stored in the inode
> btree.
>
> dd if=/dev/zero of=fs.img bs=10M count=1
> bcacheadm format -C fs.img
> mount -t bcache -o loop fs.img /mnt
> umount /mnt
> hexdump -C fs.img > fs.hex
>
> From my simple tool, I know that the inode btree starts from offset
> 0xec000

The root node of the inode btree? Are you handling trees with multiple
nodes yet?

>
> 000ec000  43 ef f3 df ff ff ff ff  86 c1 47 1e 99 25 51 35  |C.........G..%Q5|
> 000ec010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> 000ec020  00 00 00 00 00 00 00 00  ff ff ff ff ff ff ff ff  |................|
> 000ec030  ff ff ff ff ff ff ff ff  01 05 00 00 00 00 00 00  |................|
> 000ec040  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> *
> 000ec070  88 b5 38 e2 45 36 eb f6  00 00 00 00 00 00 00 00  |..8.E6..........|
> 000ec080  01 00 00 00 03 00 00 00  00 00 00 00 00 00 00 00  |................|
> 000ec090  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> *
> 000ed000  31 66 fd 31 ff ff ff ff  88 b5 38 e2 45 36 eb f6  |1f.1......8.E6..|
> 000ed010  02 00 00 00 00 00 00 00  01 00 00 00 03 00 0b 00  |................|
> 000ed020  0b 01 80 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> 000ed030  00 00 00 00 00 00 00 00  00 10 00 00 00 00 00 00  |................|
> 000ed040  ed 41 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |.A..............|
> 000ed050  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> *
> 000ed070  02 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> 000ed080  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> *
>
> btree_node (0xec000)
> bset (0xed008) ---> bset->u64s = 0x0b = 11
> bkey_packed (0xed020)
> bkey (0xed020)
> bch_inode (0xed040 to 0xed077) ---> root inode
>
> Is the decode above correct?

I think so. The code that deals with reading in a btree node from disk
and interpreting the contents is mainly in bch_btree_node_read_done(),
btree_io.c - it looks like you found that?

> I found the root inode manually. But how is it actually found by code?

The root inode is the inode with inode number BCACHE_ROOT_INO (4096) -
http://evilpiepirate.org/git/linux-bcache.git/tree/drivers/md/bcache/fs.c?h=bcache-dev&id=5cf7fb11d124839eea2191fd7e8eddecb296d67d#n2285

So to do it correctly, you'll need the bkey packing code in order to
unpack the key (if it was packed) so that you can get the actual inode
number of the key.

You'll also need to do something like the mergesort algorithm (or
something equivalent; you don't need to do the actual mergesort if
you're just doing a linear search for one key). That is - if there are
multiple bsets, they will likely contain duplicates, and keys in newer
bsets overwrite keys in older bsets.

> Could you help to explain what it is from 0xec070 to 0xed007?
> Are they also bsets?

Without knowing your block size and spending a fair amount of time
staring at the hexdump, I don't know what starts there - but quite
possibly yes; bsets that aren't at the start of the btree node are
embedded in a struct btree_node_entry, not a struct btree_node.

To tell if it's a valid bset, you compare bset->seq against the seq in
the first bset - it's a random number generated for each new btree
node; if they match then the bset there goes with that btree node.
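
The two rules above (match each bset's seq against the first bset, then
let keys in newer bsets win) can be sketched in a few lines of Python.
This is only an illustration, not the kernel code: the Bset class and
both helper functions are hypothetical names, keys are assumed to be
already unpacked into (inode number, value) pairs, and keeping every
bset with a matching seq is a simplification. The byte strings are
copied from the hexdump in this thread; treating the eight bytes at
0xec070 as a seq field and the eight bytes at 0xed038 as a
little-endian inode number is an assumption that the thread does not
confirm.

```python
import struct
from dataclasses import dataclass

BCACHE_ROOT_INO = 4096  # value given in the message above


@dataclass
class Bset:
    """Hypothetical in-memory view of one bset (not the on-disk struct)."""
    seq: int    # random number generated for each new btree node
    keys: list  # (inode_number, value) pairs, assumed already unpacked


def bsets_for_node(candidates):
    """Keep only the bsets whose seq matches the seq of the first bset."""
    if not candidates:
        return []
    node_seq = candidates[0].seq
    return [b for b in candidates if b.seq == node_seq]


def lookup(bsets, inode_number):
    """Linear search over bsets ordered oldest to newest.

    A later match replaces an earlier one, so keys in newer bsets
    overwrite keys in older bsets without doing a full mergesort.
    """
    found = None
    for bset in bsets:
        for ino, value in bset.keys:
            if ino == inode_number:
                found = value
    return found


def le64(hex_bytes):
    """Decode eight bytes, written as hex, as a little-endian u64."""
    return struct.unpack("<Q", bytes.fromhex(hex_bytes))[0]


# Bytes copied from the hexdump (interpretation is an assumption).
seq_first = le64("88b538e24536ebf6")   # eight bytes at 0xec070
seq_second = le64("88b538e24536ebf6")  # eight bytes at 0xed008
ino = le64("0010000000000000")         # eight bytes at 0xed038

candidates = [
    Bset(seq_first, []),                                   # no keys assumed
    Bset(seq_second, [(ino, "inode value at 0xed040")]),
    Bset(0x1234, [(ino, "data from some other node")]),    # made-up stale bset
]

valid = bsets_for_node(candidates)
root = lookup(valid, BCACHE_ROOT_INO)
print(root)
```

Under those assumptions the value at 0xed038 decodes to 4096, which is
at least consistent with the key at 0xed020 being the root inode, and
the seq at 0xec070 equals the seq at 0xed008, which is what the seq
check would expect if both belong to the same btree node.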