On 2019-03-05 01:57, Viacheslav Dubeyko wrote:
On Mon, 2019-03-04 at 15:45 +0800, tchou wrote:
On 2019-02-27 11:14, tchou wrote:
>
> On 2019-02-27 10:56, Viacheslav Dubeyko wrote:
> >
> > On Wed, 2019-02-27 at 09:46 +0800, tchou wrote:
> > >
> > > On 2019-02-27 02:01, Viacheslav Dubeyko wrote:
> > > >
> > > > On Tue, 2019-02-26 at 11:32 +0800, tchou wrote:
> > > > >
> > > > > On 2019-02-24 08:44, Ernesto A. Fernández wrote:
> > > > > >
> > > > > >
> > > > > >
> > > > [skipped]
> > > >
> > > > >
> > > > > >
> > > > > > >
> > > > > > >
> > > > > > > [1]
> > > > > > > =======================================================
> > > > > > > ==========
> > > > > > > =================================
> > > > > > >
> > > > > > > 2017-08-30T10:32:30-04:00 BS-SAN kernel: [ 5471.504049]
> > > > > > > hfsplus:
> > > > > > > trying to free free bnode 294912(2)
> > > > > > >
> > > > > > > 2017-08-30T10:32:30-04:00 BS-SAN kernel: [ 5471.510017]
> > > > > > > hfsplus:
> > > > > > > trying to free free bnode 294912(2)
> > > > > > > 2017-08-30T10:32:30-04:00 BS-SAN kernel: [ 5471.515983]
> > > > > > > hfsplus:
> > > > > > > trying to free free bnode 294912(2)
> > > > > > > 2017-08-30T10:32:30-04:00 BS-SAN kernel: [ 5471.521949]
> > > > > > > general
> > > > > > > protection fault: 0000 [#1] SMP
> > > > > > > 2017-08-30T10:32:30-04:00 BS-SAN kernel: [ 5471.621069]
> > > > > > > CPU: 1
> > > > > > > PID:
> > > > > > > 18715 Comm: SYNO.FileStatio Tainted: P C O 3.10.102
> > > > > > > #15152
> > > > > > > 2017-08-30T10:32:30-04:00 BS-SAN kernel: [ 5471.630308]
> > > > > > > Hardware
> > > > > > > name:
> > > > > > > Synology Inc. DS1517+/Type2 - Board Product Name1, BIOS
> > > > > > > M.405
> > > > > > > 2017/05/09
> > > > > > > 2017-08-30T10:32:30-04:00 BS-SAN kernel: [ 5471.640423]
> > > > > > > task:
> > > > > > > ffff8802753fa040 ti: ffff880270880000 task.ti:
> > > > > > > ffff880270880000
> > > > > > > 2017-08-30T10:32:30-04:00 BS-SAN kernel: [ 5471.648779]
> > > > > > > RIP:
> > > > > > > 0010:[<ffffffffa051459e>] [<ffffffffa051459e>]
> > > > > > > hfsplus_bnode_write+0x9e/0x1e0 [hfsplus]
> > > > > > > 2017-08-30T10:32:30-04:00 BS-SAN kernel: [ 5471.659489]
> > > > > > > RSP:
> > > > > > > 0018:ffff880270883c18 EFLAGS: 00010202
> > > > > > > 2017-08-30T10:32:30-04:00 BS-SAN kernel: [ 5471.665415]
> > > > > > > RAX:
> > > > > > > 0000000000000000 RBX: 0000000000000002 RCX:
> > > > > > > 000000000000aeff
> > > > > > > 2017-08-30T10:32:30-04:00 BS-SAN kernel: [ 5471.673391]
> > > > > > > RDX:
> > > > > > > 0000000000000000 RSI: ffff880270883c56 RDI:
> > > > > > > db73880000000000
> > > > > > > 2017-08-30T10:32:30-04:00 BS-SAN kernel: [ 5471.681366]
> > > > > > > RBP:
> > > > > > > ffff88005f7b1920 R08: 0000000000000002 R09:
> > > > > > > 0000000000000002
> > > > > > > 2017-08-30T10:32:30-04:00 BS-SAN kernel: [ 5471.689343]
> > > > > > > R10:
> > > > > > > ffff88005f7b18d0 R11: 0000000000000002 R12:
> > > > > > > 0000000000001ffc
> > > > > > > 2017-08-30T10:32:30-04:00 BS-SAN kernel: [ 5471.697310]
> > > > > > > R13:
> > > > > > > ffff880270883c56 R14: 0000000000000002 R15:
> > > > > > > 0000000000000002
> > > > > > > 2017-08-30T10:32:30-04:00 BS-SAN kernel: [ 5471.705286]
> > > > > > > FS:
> > > > > > > 00007f4fee0607c0(0000) GS:ffff88027fc40000(0000)
> > > > > > > knlGS:0000000000000000
> > > > > > > 2017-08-30T10:32:30-04:00 BS-SAN kernel: [ 5471.714322]
> > > > > > > CS: 0010
> > > > > > > DS:
> > > > > > > 0000 ES: 0000 CR0: 000000008005003b
> > > > > > > 2017-08-30T10:32:30-04:00 BS-SAN kernel: [ 5471.720744]
> > > > > > > CR2:
> > > > > > > 00007f4fee05d000 CR3: 0000000247210000 CR4:
> > > > > > > 00000000001007e0
> > > > > > > 2017-08-30T10:32:30-04:00 BS-SAN kernel: [ 5471.728711]
> > > > > > > DR0:
> > > > > > > 0000000000000000 DR1: 0000000000000000 DR2:
> > > > > > > 0000000000000000
> > > > > > > 2017-08-30T10:32:30-04:00 BS-SAN kernel: [ 5471.736687]
> > > > > > > DR3:
> > > > > > > 0000000000000000 DR6: 00000000ffff0ff0 DR7:
> > > > > > > 0000000000000400
> > > > > > > 2017-08-30T10:32:30-04:00 BS-SAN kernel: [ 5471.744654]
> > > > > > > Stack:
> > > > > > > 2017-08-30T10:32:30-04:00 BS-SAN kernel: [ 5471.746896]
> > > > > > > ffff88005f7b18c0 ffff880270883cd0 0000000000001ffc
> > > > > > > 0000000000001f9c
> > > > > > > 2017-08-30T10:32:30-04:00 BS-SAN kernel: [ 5471.755181]
> > > > > > > 0000000000000060 000000000000000e ffffffffa05146ff
> > > > > > > aeff000000000031
> > > > > > > 2017-08-30T10:32:30-04:00 BS-SAN kernel: [ 5471.763468]
> > > > > > > ffffffffa0516bf9 000000606228c340 ffff880270883cd0
> > > > > > > 00000000fffffffe
> > > > > > > 2017-08-30T10:32:30-04:00 BS-SAN kernel: [ 5471.771763]
> > > > > > > Call
> > > > > > > Trace:
> > > > > > > 2017-08-30T10:32:30-04:00 BS-SAN kernel: [ 5471.774497]
> > > > > > > [<ffffffffa05146ff>] ?
> > > > > > > hfsplus_bnode_write_u16+0x1f/0x30
> > > > > > > [hfsplus]
> > > > > > > 2017-08-30T10:32:30-04:00 BS-SAN kernel: [ 5471.782671]
> > > > > > > [<ffffffffa0516bf9>] ? hfsplus_brec_remove+0x129/0x190
> > > > > > > [hfsplus]
> > > > > > > 2017-08-30T10:32:30-04:00 BS-SAN kernel: [ 5471.790650]
> > > > > > > [<ffffffffa05191d0>] ? __hfsplus_delete_attr+0x90/0xf0
> > > > > > > [hfsplus]
> > > > > > > 2017-08-30T10:32:30-04:00 BS-SAN kernel: [ 5471.798629]
> > > > > > > [<ffffffffa0519979>] ?
> > > > > > > hfsplus_delete_all_attrs+0x49/0xb0
> > > > > > > [hfsplus]
> > > > > > > 2017-08-30T10:32:30-04:00 BS-SAN kernel: [ 5471.806900]
> > > > > > > [<ffffffffa0512482>] ? hfsplus_delete_cat+0x1c2/0x2b0
> > > > > > > [hfsplus]
> > > > > > > 2017-08-30T10:32:30-04:00 BS-SAN kernel: [ 5471.814782]
> > > > > > > [<ffffffffa0512d90>] ? hfsplus_unlink+0x1d0/0x1e0
> > > > > > > [hfsplus]
> > > > > > > 2017-08-30T10:32:30-04:00 BS-SAN kernel: [ 5471.822277]
> > > > > > > [<ffffffff811066bd>] ? __inode_permission+0x1d/0xb0
> > > > > > > 2017-08-30T10:32:30-04:00 BS-SAN kernel: [ 5471.828992]
> > > > > > > [<ffffffff8110a72a>] ? vfs_unlink+0x8a/0x100
> > > > > > > 2017-08-30T10:32:30-04:00 BS-SAN kernel: [ 5471.835025]
> > > > > > > [<ffffffff8110a9c3>] ? do_unlinkat+0x223/0x230
> > > > > > > 2017-08-30T10:32:30-04:00 BS-SAN kernel: [ 5471.841255]
> > > > > > > [<ffffffff8111d853>] ? mntput_no_expire+0x13/0x130
> > > > > > > 2017-08-30T10:32:30-04:00 BS-SAN kernel: [ 5471.847873]
> > > > > > > [<ffffffff8104d1bc>] ? task_work_run+0x9c/0xe0
> > > > > > > 2017-08-30T10:32:30-04:00 BS-SAN kernel: [ 5471.854102]
> > > > > > > [<ffffffff81002901>] ? do_notify_resume+0x61/0x90
> > > > > > > 2017-08-30T10:32:30-04:00 BS-SAN kernel: [ 5471.860624]
> > > > > > > [<ffffffff810fb827>] ? fput+0x57/0xb0
> > > > > > > 2017-08-30T10:32:30-04:00 BS-SAN kernel: [ 5471.865978]
> > > > > > > [<ffffffff8149dd32>] ? system_call_fastpath+0x16/0x1b
> > > > > > > 2017-08-30T10:32:30-04:00 BS-SAN kernel: [ 5471.872884]
> > > > > > > Code: 48
> > > > > > > 63 ca
> > > > > > > 48 01 cf 48 83 fb 08 0f 83 fd 00 00 00 31 c0 41 f6 c3
> > > > > > > 04 74 09 8b
> > > > > > > 06
> > > > > > > 89 07 b8 04 00 00 00 41 f6 c3 02 74 0c 0f b7 0c 06 <66>
> > > > > > > 89 0c 07
> > > > > > > 48 8d
> > > > > > > 40 02 41 83 e3 01 74 07 0f b6 0c 06 88 0c 07
> > > > > > > 2017-08-30T10:32:30-04:00 BS-SAN kernel: [ 5471.894293]
> > > > > > > RIP
> > > > > > > [<ffffffffa051459e>] hfsplus_bnode_write+0x9e/0x1e0
> > > > > > > [hfsplus]
> > > > > > > 2017-08-30T10:32:30-04:00 BS-SAN kernel: [ 5471.902375]
> > > > > > > RSP
> > > > > > > <ffff880270883c18>
> > > > > > >
> > > > > > > 2017-08-30T10:32:30-04:00 BS-SAN kernel: [ 5471.906350]
> > > > > > > ---[ end
> > > > > > > trace
> > > > > > > 0e65d1ee34a1e12e ]---
> > > > > > >
> > > > > > >
> > > > > > > =======================================================
> > > > > > > ==========
> > > > > > > =================================
> > > > > > >
> > > >
> > > > Could you please share more details about the environment of the
> > > > bug? Do you know which operation triggers the bug? How was the
> > > > volume created? Can you reproduce the issue?
> > > >
> > > > It looks like a file deletion was in progress. Do you have any idea
> > > > which file was being deleted and what features it has? Does this
> > > > file contain any xattrs?
> > > OK, here is my situation. The Linux versions of our products are
> > > 3.10 and 4.4.
> > >
> > > Users may plug an external USB drive, formatted as HFS+ on their
> > > macOS device, into our device. They can perform all file system
> > > operations (e.g. create, remove, and rename files) on both the
> > > macOS side and the Linux side.
> > >
> > > Files created on macOS have the default xattr:
> > > com.apple.FinderInfo=0sAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAKrmU=
> > > Files created on Linux have no xattr.
> > >
> > > Some users seem to encounter the call trace when removing a file on
> > > our device. The volume then gets stuck when we unmount it, and the
> > > unmount fails.
> > >
> > > We cannot reproduce it ourselves. The following link is the only
> > > report I can find that matches my situation:
> > > https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1646565/comments/5
> > >
> > > I tried several ways to reproduce it:
> > > 1. Format the USB drive on Linux and macOS.
> > > 2. Use fsstress to stress create and unlink operations on Linux.
> > > 3. Create and remove 100,000 files on Linux.
> > > 4. Create 10,000 ~ 500,000 files on macOS and remove them all on
> > > Linux.
> > > None of these triggered the bug.
> > >
> > > About 10+ users have encountered this situation, so I am trying to
> > > fix it. Any ideas?
> > OK. I see the point. Let's achieve a stable reproduction of the
> > issue first. The issue is triggered by operations in the Attributes
> > Tree, not in the Catalog Tree. So it will be enough to create
> > several files. The key trick is to create many xattrs for one file.
> > It will be better to create the xattrs the native way under Mac OS X.
> > I believe that the Attributes Tree's node size is about 8 KB by
> > default (but maybe only 4 KB). It is better to check the size in a
> > superblock dump, for example. So you need to create a lot of xattrs
> > for one file (or several files) in order to build an Attributes Tree
> > with enough nodes. The best case would be an Attributes Tree of
> > height 2 or 3, so that it has index nodes too. As far as I can
> > judge, the issue can be reproduced during the deletion of xattrs, or
> > of a file with xattrs, under Linux. The Attributes Tree needs many
> > nodes because the issue should be triggered during b-tree node
> > deletion.
> >
> > So, I hope my vision helps. Could you please try to reproduce the
> > issue and share the results?
> Thanks for your advice! I will try to reproduce it. We have a
> four-day holiday in our country starting tomorrow, so I will try it
> on 3/4 ~ 3/5. Please forgive the delay.
>
>
Sorry for the delay. I finished the reproduction steps, and it works!
I tried it on our product with kernel 3.10 and on Ubuntu with kernel 4.19.
Both environments can reproduce the bug.
I used two ways to reproduce it:
==============================================================================
1). mkfs the hfs+ image on Linux
1. touch a file on it.
2. add enough xattrs to the file:
   for x in $(seq 1 1000)
   do
       setfattr -n user.$x -v "gggg${x}gggg${x}qqqqq${x}pleaseggggg" /mnt/1
   done
3. rm the file
4. segmentation fault, with the same call trace
5. the fsck.hfsplus result:
** img2 (NO WRITE)
** Checking HFS Plus volume.
** Checking Extents Overflow file.
** Checking Catalog file.
Invalid leaf record count
(It should be 4 instead of 6)
** Checking Catalog hierarchy.
Invalid directory item count
(It should be 1 instead of 2)
** Checking Extended Attributes file.
Invalid index key
(8, 1)
** The volume untitled needs to be repaired.
==============================================================================
2). format hfs+ on Mac
1. touch a file on it.
2. add enough xattrs to the file:
   for x in $(seq 1 1000)
   do
       xattr -w user.$x "gggg${x}gggg${x}qqqqq${x}pleaseggggg" /Volumes/test/1
   done
3. move the USB disk to Linux
4. rm the file
5. segmentation fault, with the same call trace
6. the fsck.hfsplus result:
** /dev/sdq1 (NO WRITE)
** Checking HFS Plus volume.
** Checking Extents Overflow file.
** Checking Catalog file.
** Checking Catalog hierarchy.
** Checking Extended Attributes file.
** Checking volume bitmap.
** Checking volume information.
Volume Header needs minor repair
(2, 0)
** The volume test needs to be repaired.
==============================================================================
It seems that the guess is correct. An Attributes Tree with enough
nodes can trigger the bug.
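
A minimal sketch of how the node size and height of the Attributes Tree could be checked on an image before deleting the file. It reads the volume header and the B-tree header record directly, using the field offsets from Apple's TN1150 layout. The function names read_be and attr_tree_info are made up for illustration, and the sketch assumes the image (or partition device) starts at the HFS+ volume itself and that the first extent of the Attributes file holds the header node.

```shell
#!/bin/sh
# Sketch: report node size, depth and node count of the HFS+ Attributes B-tree.
# Assumptions: the file starts at the HFS+ volume itself (no partition table
# in front), fields are big-endian, and offsets follow Apple's TN1150 layout.

# read_be FILE OFFSET NBYTES: print a big-endian unsigned field as decimal.
read_be() {
    hex=$(dd if="$1" bs=1 skip="$2" count="$3" 2>/dev/null | od -An -tx1 | tr -d ' \n')
    [ -n "$hex" ] || hex=0
    echo $((0x$hex))
}

# attr_tree_info IMAGE: print "node_size=N depth=D total_nodes=T".
attr_tree_info() {
    img=$1
    sig=$(read_be "$img" 1024 2)              # volume header starts at byte 1024
    case "$sig" in
        18475|18520) ;;                       # 0x482B "H+" or 0x4858 "HX"
        *) echo "no HFS+ volume header found" >&2; return 1 ;;
    esac
    block_size=$(read_be "$img" 1064 4)       # header + 40: blockSize
    start_block=$(read_be "$img" 1392 4)      # header + 368: attributesFile.extents[0].startBlock
    block_count=$(read_be "$img" 1396 4)      # header + 372: attributesFile.extents[0].blockCount
    if [ "$block_count" -eq 0 ]; then
        echo "volume has no Attributes file" >&2
        return 1
    fi
    node0=$((start_block * block_size))       # byte offset of the B-tree header node
    depth=$(read_be "$img" $((node0 + 14)) 2)        # BTHeaderRec.treeDepth
    node_size=$(read_be "$img" $((node0 + 32)) 2)    # BTHeaderRec.nodeSize
    total_nodes=$(read_be "$img" $((node0 + 36)) 4)  # BTHeaderRec.totalNodes
    echo "node_size=$node_size depth=$depth total_nodes=$total_nodes"
}
```

Called as attr_tree_info img2 (or on the partition device), a depth of 2 or more would indicate that the tree has index nodes, which is the state the reproduction is aiming for.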
Do you see the same call trace? Could you share the call trace from your
case? Could you identify the line in the source code that triggers the
bug?
Here is my call trace:
general protection fault: 0000 [#1] SMP
CPU: 1 PID: 26527 Comm: rm Tainted: PF C O 3.10.108 #40283
Hardware name: Synology Inc. DS916+/Type2 - Board Product Name, BIOS M.215 3/2/2016
task: ffff880078b05040 ti: ffff880072b7c000 task.ti: ffff880072b7c000
RIP: 0010:[<ffffffffa025764f>] [<ffffffffa025764f>] hfsplus_bnode_write+0xaf/0x230 [hfsplus]
RSP: 0018:ffff880072b7fbf0 EFLAGS: 00010202
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 000000000000f0ff
RDX: ffff880000000000 RSI: ffff880072b7fc2e RDI: 27c54210957d7000
RBP: ffff88006d94b4a0 R08: 0000000000000002 R09: 0000000000000002
R10: 0000000000000002 R11: ffff88006d94b498 R12: 0000000000000002
R13: ffff880072b7fc2e R14: 0000000000000002 R15: 0000000000000002
FS: 00007fef30a9c500(0000) GS:ffff880079e80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000000006fdb94 CR3: 0000000066794000 CR4: 00000000001007e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Stack:
ffff880072b7fca8 0000000000001ffc 0000000000001fe8 000000000000000e
000000000000001e ffff88006d94b440 ffffffffa02577ef f0ff00000000000b
ffffffffa0259c44 0000001e0000000c ffff880072b7fca8 00000000fffffffe
Call Trace:
[<ffffffffa02577ef>] ? hfsplus_bnode_write_u16+0x1f/0x30 [hfsplus]
[<ffffffffa0259c44>] ? hfsplus_brec_remove+0x124/0x180 [hfsplus]
[<ffffffffa025c1f0>] ? __hfsplus_delete_attr+0x70/0xc0 [hfsplus]
[<ffffffffa025c9b9>] ? hfsplus_delete_all_attrs+0x49/0xb0 [hfsplus]
[<ffffffffa02555f0>] ? hfsplus_delete_cat+0x260/0x2b0 [hfsplus]
[<ffffffffa0255d0a>] ? hfsplus_unlink+0x7a/0x1d0 [hfsplus]
[<ffffffff8113da6d>] ? __inode_permission+0x1d/0xb0
[<ffffffff8114158b>] ? may_delete+0x4b/0x240
[<ffffffff81141b67>] ? vfs_unlink+0x87/0x110
[<ffffffff81141e9a>] ? do_unlinkat+0x2aa/0x2c0
[<ffffffff81490b48>] ? __do_page_fault+0x228/0x510
[<ffffffff81135d11>] ? SYSC_newfstatat+0x21/0x30
[<ffffffff8149513e>] ? system_call_fastpath+0x1c/0x21
Code: 48 89 c7 48 01 df 49 83 fc 08 0f 83 f4 00 00 00 31 c0 41 f6 c2 04 74 09 8b 06 89 07 b8 04 00 00 00 41 f6 c2 02 74 0c 0f b7 0c 06 <66> 89 0c 07 48 8d 40 02 41 83 e2 01 74 07 0f b6 0c 06 88 0c 07
RIP [<ffffffffa025764f>] hfsplus_bnode_write+0xaf/0x230 [hfsplus]
RSP <ffff880072b7fbf0>
---[ end trace 459946076ce91423 ]---
And gdb says the line that triggers the bug is the marked memcpy:
void hfs_bnode_write(struct hfs_bnode *node, void *buf, int off, int len)
{
    struct page **pagep;
    int l;

    off += node->page_offset;
    pagep = node->page + (off >> PAGE_CACHE_SHIFT);
    off &= ~PAGE_CACHE_MASK;

    l = min(len, (int)PAGE_CACHE_SIZE - off);
    memcpy(kmap(*pagep) + off, buf, l);
    set_page_dirty(*pagep);
    kunmap(*pagep);

    while ((len -= l) != 0) {
        buf += l;
        l = min(len, (int)PAGE_CACHE_SIZE);
        memcpy(kmap(*++pagep), buf, l); <<<<<<<<<<<<<<
        set_page_dirty(*pagep);
        kunmap(*pagep);
    }
}
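
To make the offset arithmetic in this function easier to follow, here is a minimal sketch that replays it in plain shell, assuming 4096-byte pages. The name bnode_write_chunks is made up, and the sketch models only the index and length calculations, not kmap or the page array. It prints one line per memcpy the function would issue. The function as quoted does not compare off with the node size, so the page index is whatever the offset yields. Whether an out-of-range offset is what actually happens in this oops is not established here.

```shell
#!/bin/sh
# Model of the offset arithmetic in hfs_bnode_write().
# Assumption: PAGE_CACHE_SIZE is 4096 (PAGE_CACHE_SHIFT 12).
PAGE_SHIFT=12
PAGE_SIZE=$((1 << PAGE_SHIFT))

# bnode_write_chunks PAGE_OFFSET OFF LEN
# Prints "page=<index> off=<in-page offset> len=<bytes>" for each memcpy.
bnode_write_chunks() {
    off=$(( $1 + $2 ))                   # off += node->page_offset
    len=$3
    page=$(( off >> PAGE_SHIFT ))        # pagep = node->page + (off >> PAGE_CACHE_SHIFT)
    off=$(( off & (PAGE_SIZE - 1) ))     # off &= ~PAGE_CACHE_MASK
    l=$(( PAGE_SIZE - off ))             # l = min(len, PAGE_CACHE_SIZE - off)
    if [ "$len" -lt "$l" ]; then l=$len; fi
    echo "page=$page off=$off len=$l"    # first memcpy
    len=$(( len - l ))
    while [ "$len" -ne 0 ]; do
        page=$(( page + 1 ))             # *++pagep
        l=$PAGE_SIZE                     # l = min(len, PAGE_CACHE_SIZE)
        if [ "$len" -lt "$l" ]; then l=$len; fi
        echo "page=$page off=0 len=$l"   # memcpy inside the loop
        len=$(( len - l ))
    done
}
```

For a 2-byte write like the one issued by hfs_bnode_write_u16, bnode_write_chunks 0 8188 2 gives a single copy at page 1, offset 4092 (0x1ffc = 8188 appears on the stack in both traces, but using it as the offset here is only an illustration). Under the same page-size assumption, the memcpy inside the loop is reached for a 2-byte write only when it starts on the last byte of a page, for example bnode_write_chunks 0 4095 2.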
Thanks,
Vyacheslav Dubeyko.