Re: [ANNOUNCE] bcachefs!

On Thu, 2015-08-06 at 16:11 -0700, Kent Overstreet wrote:
> On Wed, Aug 05, 2015 at 11:40:06PM -0700, Ming Lin wrote:
> > On Tue, 2015-07-28 at 11:45 -0700, Ming Lin wrote:
> > > On Tue, Jul 28, 2015 at 11:41 AM, Ming Lin <mlin@xxxxxxxxxx> wrote:
> > > > On Fri, Jul 24, 2015 at 1:47 PM, Ming Lin <mlin@xxxxxxxxxx> wrote:
> > > >>
> > > >> And I want to learn how the btree node insert/delete/update happens on
> > > >> disk. This may be too much detail. I'm going to write a small tool to dump
> > > >> the file system. Then I could understand better the on disk btree
> > > >> format.
> > > >
> > > > Here is my simple tool to dump parts of the on-disk format.
> > > > http://www.minggr.net/cgit/cgit.cgi/bcache-tools/commit/?id=deb258e2
> > > 
> > > Actually: http://www.minggr.net/cgit/cgit.cgi/bcache-tools/commit/?id=3121eec
> > > 
> > > >
> > > > It's not in good shape, but simple enough to learn the on-disk format.
> > 
> > Hi Kent,
> > 
> > I'm trying to understand how the root inode is stored in the inode
> > btree.
> > 
> > dd if=/dev/zero of=fs.img bs=10M count=1
> > bcacheadm format -C fs.img
> > mount -t bcache -o loop fs.img /mnt
> > umount /mnt
> > hexdump -C fs.img > fs.hex
> > 
> > From my simple tool, I know that the inode btree starts from offset
> > 0xec000
> 
> The root node of the inode btree? Are you handling trees with multiple nodes
> yet?

Yes and no.

> 
> > 
> > 000ec000  43 ef f3 df ff ff ff ff  86 c1 47 1e 99 25 51 35  |C.........G..%Q5|
> > 000ec010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> > 000ec020  00 00 00 00 00 00 00 00  ff ff ff ff ff ff ff ff  |................|
> > 000ec030  ff ff ff ff ff ff ff ff  01 05 00 00 00 00 00 00  |................|
> > 000ec040  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> > *
> > 000ec070  88 b5 38 e2 45 36 eb f6  00 00 00 00 00 00 00 00  |..8.E6..........|
> > 000ec080  01 00 00 00 03 00 00 00  00 00 00 00 00 00 00 00  |................|
> > 000ec090  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> > *
> > 000ed000  31 66 fd 31 ff ff ff ff  88 b5 38 e2 45 36 eb f6  |1f.1......8.E6..|
> > 000ed010  02 00 00 00 00 00 00 00  01 00 00 00 03 00 0b 00  |................|
> > 000ed020  0b 01 80 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> > 000ed030  00 00 00 00 00 00 00 00  00 10 00 00 00 00 00 00  |................|
> > 000ed040  ed 41 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |.A..............|
> > 000ed050  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> > *
> > 000ed070  02 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> > 000ed080  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> > *
> > 
> > btree_node (0xec000)
> >     bset (0xed008)  ---> bset->u64s = 0x0b = 11
> >         bkey_packed (0xed020)
> >             bkey (0xed020)
> >             bch_inode (0xed040 to 0xed077)  ---> root inode
> > 
> > Is the decode above correct?
> 
> I think so. The code that deals with reading in a btree node from disk and
> interpreting the contents is mainly in bch_btree_node_read_done(), btree_io.c -
> it looks like you found that?

I haven't dug into the code yet.
First I want to understand the on-disk structure from the hexdump.

> 
> > I found the root inode manually. But how is it actually found by code?
> 
> The root inode is the inode with inode number BCACHE_ROOT_INO (4096) -
> http://evilpiepirate.org/git/linux-bcache.git/tree/drivers/md/bcache/fs.c?h=bcache-dev&id=5cf7fb11d124839eea2191fd7e8eddecb296d67d#n2285
> 
> So to do it correctly, you'll need the bkey packing code in order to unpack the
> key (if it was packed) so that you can get the actual inode number of the key.
> 
> You'll also need to do something like the mergesort algorithm (or something
> equivalent; you don't need to do the actual mergesort if you're just doing a
> linear search for one key). That is - if there's multiple bsets, they will
> likely contain duplicates and keys in newer bsets overwrite keys in older bsets.

I don't understand this part yet. I'll study it.

> 
> > Could you help to explain what it is from 0xec070 to 0xed007?
> > Are they also bsets?
> 
> Without knowing your block size and spending a fair amount of time staring at
> the hexdump, I don't know what starts there - but quite possibly yes; bsets that
> aren't at the start of the btree node are embedded in a struct
> btree_node_entry, not a struct btree_node.
> 
> To tell if it's a valid bset, you compare bset->seq against the seq in the first
> bset - it's a random number generated for each new btree node; if they match
> then the bset there goes with that btree node.

The block size is 4K.

OK, now I can interpret the hexdump.

000ec000  43 ef f3 df ff ff ff ff  86 c1 47 1e 99 25 51 35  |C.........G..%Q5|
000ec010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000ec020  00 00 00 00 00 00 00 00  ff ff ff ff ff ff ff ff  |................|
000ec030  ff ff ff ff ff ff ff ff  01 05 00 00 00 00 00 00  |................|
000ec040  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
000ec070  88 b5 38 e2 45 36 eb f6  00 00 00 00 00 00 00 00  |..8.E6..........|
000ec080  01 00 00 00 03 00 00 00  00 00 00 00 00 00 00 00  |................|
000ec090  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
000ed000  31 66 fd 31 ff ff ff ff  88 b5 38 e2 45 36 eb f6  |1f.1......8.E6..|
000ed010  02 00 00 00 00 00 00 00  01 00 00 00 03 00 0b 00  |................|
000ed020  0b 01 80 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000ed030  00 00 00 00 00 00 00 00  00 10 00 00 00 00 00 00  |................|
000ed040  ed 41 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |.A..............|
000ed050  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
000ed070  02 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000ed080  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
000ee000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|

There are 2 bsets, both with bset->seq "88 b5 38 e2 45 36 eb f6":

btree_node (0xec000)
    bset_1 (0xec070)  ---> bset->u64s = 0 (an empty bset?)

btree_node_entry (0xed000)
    bset_2 (0xed008)  ---> bset->u64s = 0x0b = 11
        bkey_packed (0xed020)
            bkey (0xed020)
            bch_inode (0xed040 to 0xed077)  ---> root inode

Why is there an empty bset at the start of the btree node?
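
For reference, here is a minimal Python sketch of the seq check described
above, run against the two bset headers from the dump. The field offsets
(seq at +0, u64s at +22, little-endian) are only my inference from the
hexdump, not taken from the bcache headers, and read_bset() and same_node()
are hypothetical helper names, not functions from bcache-tools.

```python
import struct

# Offsets relative to the start of a bset. These are inferred from the
# hexdump in this message (an assumption), not from the kernel headers.
SEQ_OFF = 0     # bset->seq, 8 bytes
U64S_OFF = 22   # bset->u64s, 2 bytes, little-endian


def read_bset(img, off):
    """Return (seq, u64s) for a bset assumed to start at byte offset off."""
    seq = bytes(img[off + SEQ_OFF:off + SEQ_OFF + 8])
    (u64s,) = struct.unpack_from("<H", img, off + U64S_OFF)
    return seq, u64s


def same_node(img, first_bset_off, other_bset_off):
    """A later bset belongs to the node if its seq matches the first bset's."""
    return read_bset(img, first_bset_off)[0] == read_bset(img, other_bset_off)[0]


# Rebuild just the two 24-byte bset headers from the dump in a zeroed buffer.
BASE = 0xec000
img = bytearray(0x2000)
img[0xec070 - BASE:0xec070 - BASE + 24] = bytes.fromhex(
    "88b538e24536ebf6" "0000000000000000" "0100000003000000")
img[0xed008 - BASE:0xed008 - BASE + 24] = bytes.fromhex(
    "88b538e24536ebf6" "0200000000000000" "0100000003000b00")

first = 0xec070 - BASE    # bset inside struct btree_node
second = 0xed008 - BASE   # bset inside struct btree_node_entry

print(read_bset(img, first)[1])        # u64s of bset_1
print(read_bset(img, second)[1])       # u64s of bset_2
print(same_node(img, first, second))   # do the seq values match?
```

To run it on the real image instead of the rebuilt headers, read fs.img
into img and use the absolute offsets 0xec070 and 0xed008.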
