On Tue, 2019-03-05 at 09:49 +0800, tchou wrote:
> On 2019-03-05 01:57, Viacheslav Dubeyko wrote:
> > On Mon, 2019-03-04 at 15:45 +0800, tchou wrote:
> > >
> > > [skipped]
> > >
> > > > > > > Could you please share more details about the environment
> > > > > > > of the bug? Do you know what operation triggers the bug?
> > > > > > > How had the volume been created? Can you reproduce the
> > > > > > > issue?
> > > > > > >
> > > > > > > It looks like a file deletion operation took place. Do
> > > > > > > you have any idea what file is under deletion and what
> > > > > > > features it has? Does this file contain any xattr?
> > > > > >
> > > > > > OK, the following description is my situation. The Linux
> > > > > > kernel versions of our products are 3.10 and 4.4.
> > > > > >
> > > > > > Users may plug an external USB drive, whose HFS+ volume was
> > > > > > formatted on their macOS device, into our device. They can
> > > > > > do all file system operations (e.g. create, remove, rename
> > > > > > files, and so on) on both the macOS side and the Linux side.
> > > > > >
> > > > > > The files created on macOS have the default xattr:
> > > > > > com.apple.FinderInfo=0sAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAKrmU=
> > > > > > The files created on Linux have no xattr.
> > > > > >
> > > > > > Some users seem to encounter the call trace when removing a
> > > > > > file on our device. The volume then gets stuck when we
> > > > > > unmount it, which causes the unmount to fail.
> > > > > >
> > > > > > We cannot reproduce it ourselves. The following link is the
> > > > > > only one I can find that describes the same situation as
> > > > > > mine:
> > > > > > https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1646565/comments/5
> > > > > >
> > > > > > I tried some ways to reproduce it:
> > > > > > 1. Format the USB drive on Linux and on macOS.
> > > > > > 2. Use fsstress to stress create and unlink operations on
> > > > > >    Linux.
> > > > > > 3. Create and remove 100,000 files on Linux.
> > > > > > 4. Create 10,000 ~ 500,000 files on macOS and remove them
> > > > > >    all on Linux.
> > > > > > All of these ways failed to reproduce it.
> > > > > >
> > > > > > About 10+ users have encountered this situation, so I am
> > > > > > trying to fix it. Any idea about it?
> > > > >
> > > > > OK. I see the point. Let's achieve a stable reproduction of
> > > > > the issue at first. The issue is triggered by operations in
> > > > > the Attributes Tree but not in the Catalog Tree. So, it will
> > > > > be enough to create several files. The key trick is to create
> > > > > many xattrs for one file. It would be better to create the
> > > > > xattrs in the native way under Mac OS X. I believe that the
> > > > > Attributes Tree's node size could be about 8 KB by default
> > > > > (but maybe only 4 KB). It is better to check the size in the
> > > > > superblock's dump, for example. So, it needs to create a lot
> > > > > of xattrs for one file (or several files) with the goal of
> > > > > building an Attributes Tree with a large enough number of
> > > > > nodes. The best case would be an Attributes Tree with a
> > > > > height of 2 or 3, so that there are index nodes too.
> > > > > As far as I can judge, the issue can be reproduced during
> > > > > the deletion of the xattrs, or of a file with xattrs, under
> > > > > Linux. And the Attributes Tree needs to have many nodes,
> > > > > because the issue should be triggered during the b-tree node
> > > > > deletion operation.
> > > > >
> > > > > So, I hope my vision could help. Could you please try to
> > > > > reproduce the issue and share the results?
> > > >
> > > > Thanks for your advice! I will try to reproduce it. And we
> > > > have a four-day vacation in our country from tomorrow on, so I
> > > > will try it at 3/4 ~ 3/5. Please forgive the delay.
> > >
> > > Sorry for the delay; I have finished the reproduction steps. And
> > > it works! I tried it on our product with kernel 3.10 and on
> > > Ubuntu with kernel 4.19. Both environments can reproduce the bug.
> > >
> > > I use two ways to reproduce it:
> > > =============================================================
> > > 1) mkfs an HFS+ image on Linux
> > > 1. touch a file on it.
> > > 2. Add enough xattrs to the file:
> > >    for x in $(seq 1 1000)
> > >    do setfattr -n user.$x -v "gggg${x}gggg${x}qqqqq${x}pleaseggggg" /mnt/1
> > >    done
> > > 3. rm the file.
> > > 4. Segmentation fault, with the same call trace.
> > > 5. The fsck.hfsplus result:
> > > ** img2 (NO WRITE)
> > > ** Checking HFS Plus volume.
> > > ** Checking Extents Overflow file.
> > > ** Checking Catalog file.
> > >    Invalid leaf record count
> > >    (It should be 4 instead of 6)
> > > ** Checking Catalog hierarchy.
> > >    Invalid directory item count
> > >    (It should be 1 instead of 2)
> > > ** Checking Extended Attributes file.
> > >    Invalid index key
> > >    (8, 1)
> > > ** The volume untitled needs to be repaired.
> > > =============================================================
> > > 2) Format HFS+ on a Mac
> > > 1. touch a file on it.
> > > 2. Add enough xattrs to the file:
> > >    for x in $(seq 1 1000)
> > >    do xattr -w user.$x "gggg${x}gggg${x}qqqqq${x}pleaseggggg" /Volumes/test/1
> > >    done
> > > 3. Move the USB disk to Linux.
> > > 4. rm the file.
> > > 5. Segmentation fault, with the same call trace.
> > > 6. The fsck.hfsplus result:
> > > ** /dev/sdq1 (NO WRITE)
> > > ** Checking HFS Plus volume.
> > > ** Checking Extents Overflow file.
> > > ** Checking Catalog file.
> > > ** Checking Catalog hierarchy.
> > > ** Checking Extended Attributes file.
> > > ** Checking volume bitmap.
> > > ** Checking volume information.
> > >    Volume Header needs minor repair
> > >    (2, 0)
> > > ** The volume test needs to be repaired.
> > > =============================================================
> > >
> > > It seems that the guess is correct: an Attributes Tree with a
> > > large enough number of nodes can trigger the bug.
> >
> > Do you see the same call trace? Could you share the call trace in
> > your case? Could you identify the code line in the source code
> > that triggers the bug?
>
> Here is my call trace:
>
> general protection fault: 0000 [#1] SMP
> CPU: 1 PID: 26527 Comm: rm Tainted: PF C O 3.10.108 #40283
> Hardware name: Synology Inc. DS916+/Type2 - Board Product Name, BIOS M.215 3/2/2016
> task: ffff880078b05040 ti: ffff880072b7c000 task.ti: ffff880072b7c000
> RIP: 0010:[<ffffffffa025764f>] [<ffffffffa025764f>] hfsplus_bnode_write+0xaf/0x230 [hfsplus]
> RSP: 0018:ffff880072b7fbf0 EFLAGS: 00010202
> RAX: 0000000000000000 RBX: 0000000000000000 RCX: 000000000000f0ff
> RDX: ffff880000000000 RSI: ffff880072b7fc2e RDI: 27c54210957d7000
> RBP: ffff88006d94b4a0 R08: 0000000000000002 R09: 0000000000000002
> R10: 0000000000000002 R11: ffff88006d94b498 R12: 0000000000000002
> R13: ffff880072b7fc2e R14: 0000000000000002 R15: 0000000000000002
> FS:  00007fef30a9c500(0000) GS:ffff880079e80000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 00000000006fdb94 CR3: 0000000066794000 CR4: 00000000001007e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Stack:
>  ffff880072b7fca8 0000000000001ffc 0000000000001fe8 000000000000000e
>  000000000000001e ffff88006d94b440 ffffffffa02577ef f0ff00000000000b
>  ffffffffa0259c44 0000001e0000000c ffff880072b7fca8 00000000fffffffe
> Call Trace:
>  [<ffffffffa02577ef>] ? hfsplus_bnode_write_u16+0x1f/0x30 [hfsplus]
>  [<ffffffffa0259c44>] ? hfsplus_brec_remove+0x124/0x180 [hfsplus]
>  [<ffffffffa025c1f0>] ? __hfsplus_delete_attr+0x70/0xc0 [hfsplus]
>  [<ffffffffa025c9b9>] ? hfsplus_delete_all_attrs+0x49/0xb0 [hfsplus]
>  [<ffffffffa02555f0>] ? hfsplus_delete_cat+0x260/0x2b0 [hfsplus]
>  [<ffffffffa0255d0a>] ? hfsplus_unlink+0x7a/0x1d0 [hfsplus]
>  [<ffffffff8113da6d>] ? __inode_permission+0x1d/0xb0
>  [<ffffffff8114158b>] ? may_delete+0x4b/0x240
>  [<ffffffff81141b67>] ? vfs_unlink+0x87/0x110
>  [<ffffffff81141e9a>] ? do_unlinkat+0x2aa/0x2c0
>  [<ffffffff81490b48>] ? __do_page_fault+0x228/0x510
>  [<ffffffff81135d11>] ? SYSC_newfstatat+0x21/0x30
>  [<ffffffff8149513e>] ? system_call_fastpath+0x1c/0x21
> Code: 48 89 c7 48 01 df 49 83 fc 08 0f 83 f4 00 00 00 31 c0 41 f6 c2 04 74 09 8b 06 89 07 b8 04 00 00 00 41 f6 c2 02 74 0c 0f b7 0c 06 <66> 89 0c 07 48 8d 40 02 41 83 e2 01 74 07 0f b6 0c 06 88 0c 07
> RIP  [<ffffffffa025764f>] hfsplus_bnode_write+0xaf/0x230 [hfsplus]
>  RSP <ffff880072b7fbf0>
> ---[ end trace 459946076ce91423 ]---
>
> And gdb says the code line that triggers the bug is the memcpy:
>
> void hfs_bnode_write(struct hfs_bnode *node, void *buf, int off, int len)
> {
> 	struct page **pagep;
> 	int l;
>
> 	off += node->page_offset;
> 	pagep = node->page + (off >> PAGE_CACHE_SHIFT);
> 	off &= ~PAGE_CACHE_MASK;
>
> 	l = min(len, (int)PAGE_CACHE_SIZE - off);
> 	memcpy(kmap(*pagep) + off, buf, l);
> 	set_page_dirty(*pagep);
> 	kunmap(*pagep);
>
> 	while ((len -= l) != 0) {
> 		buf += l;
> 		l = min(len, (int)PAGE_CACHE_SIZE);
> 		memcpy(kmap(*++pagep), buf, l);    <<<<<<<<<<<<<<
> 		set_page_dirty(*pagep);
> 		kunmap(*pagep);
> 	}
> }

Yes, I can reproduce the issue too.
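A note on why a bad offset surfaces as a general protection fault in
the memcpy() above. This is only my reading of hfs_bnode_write(), so
treat it as an assumption rather than a verified analysis:

	/*
	 * node->page[] holds only node_size / PAGE_CACHE_SIZE page
	 * pointers (e.g. two entries for an 8 KB b-tree node with
	 * 4 KB pages).  An offset at or beyond node_size therefore
	 * computes a page index past the end of the array:
	 */
	pagep = node->page + (off >> PAGE_CACHE_SHIFT);

	/*
	 * ...so *pagep is a garbage page pointer, kmap(*pagep) turns
	 * it into a wild destination address, and the memcpy() write
	 * faults -- which would match the bogus RDI value in the
	 * register dump above.
	 */
	memcpy(kmap(*pagep) + off, buf, l);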
As far as I can see, there is some trouble with the value of fd->record:

(1) hfs_bnode_dump(node) [1] showed such a dump:

[  373.529906] hfsplus: bnode: 3
[  373.529907] hfsplus:  0, 0, 0, 2, 20
[  373.529907] hfsplus:  14 (26,1)
[  373.529908] hfsplus:  44 (30,4)
[  373.529909] hfsplus:  78 (30,5)
[  373.529910] hfsplus:  112 (30,6)
[  373.529911] hfsplus:  146 (30,7)
[  373.529912] hfsplus:  180 (28,8)
[  373.529913] hfsplus:  212 (30,9)
[  373.529914] hfsplus:  246 (30,10)
[  373.529915] hfsplus:  280 (30,11)
[  373.529916] hfsplus:  314 (26,2)
[  373.529917] hfsplus:  344 (30,12)
[  373.529918] hfsplus:  378 (30,13)
[  373.529919] hfsplus:  412 (30,14)
[  373.529925] hfsplus:  446 (30,15)
[  373.529930] hfsplus:  480 (28,16)
[  373.529937] hfsplus:  512 (30,17)
[  373.529943] hfsplus:  546 (30,18)
[  373.529949] hfsplus:  580 (30,19)
[  373.529955] hfsplus:  614 (30,20)
[  373.529961] hfsplus:  648 (30,21)
[  373.529968] hfsplus:  682 (2)

(2) But hfs_dbg(BNODE_MOD, "remove_rec: %d, %d\n", fd->record,
fd->keylength + fd->entrylength) [2] showed the value:

[  373.529973] hfsplus: remove_rec: -1, 30

It means that fd->record has the value -1. This value is incorrect,
and I believe that it is the reason for the issue, because -1 produces
an incorrect value of rec_off [3]:

rec_off = tree->node_size - (fd->record + 2) * 2;

I believe that it makes sense to add a check of the fd->record value
in hfs_brec_remove() (a rough sketch follows below the links). But it
is not the real fix of the issue. Currently, it's not completely clear
to me why fd->record has an incorrect value after the search. I am
going to check the search algorithm in hfs_brec_find() [4].

Thanks,
Vyacheslav Dubeyko.

[1] https://elixir.bootlin.com/linux/latest/source/fs/hfsplus/brec.c#L195
[2] https://elixir.bootlin.com/linux/latest/source/fs/hfsplus/brec.c#L196
[3] https://elixir.bootlin.com/linux/latest/source/fs/hfsplus/brec.c#L188
[4] https://elixir.bootlin.com/linux/latest/source/fs/hfsplus/bfind.c#L164
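P.S. For illustration only, here is a rough, untested sketch of the
sanity check mentioned above. It would sit at the top of
hfs_brec_remove() in fs/hfsplus/brec.c, after node has been set to
fd->bnode and before rec_off is computed; the exact placement, the
error message, and the choice of -EIO are my assumptions, not a
tested fix:

	/*
	 * Hypothetical guard: a failed or inconsistent search can
	 * leave fd->record at -1, and
	 *     rec_off = tree->node_size - (fd->record + 2) * 2;
	 * would then address a slot outside the record-offset table
	 * at the end of the node, corrupting the node (or faulting,
	 * as in the trace above).  Bail out instead.
	 */
	if (fd->record < 0 || fd->record >= node->num_recs) {
		pr_err("hfsplus: invalid record %d in b-tree node %u\n",
		       fd->record, node->this);
		return -EIO;
	}

As noted above, this only papers over the symptom; the real question
is why hfs_brec_find() leaves fd->record at -1 in the first place.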