bcache deadlock with /dev/ramX + partition

Hi: We've been using bcache to fashion an "overflow" ramdisc: We use
/dev/ramX as the cache, and an actual block device on spinning rust
as the backing device. This works pretty well, but we've bumped into
what appears to be a locking problem in the kernel:

When the backing device is a top-level block device (e.g. /dev/sdb)
everything seems to work just fine. When the backing device is a
partition (e.g. /dev/sda11), processes seem to end up in the D state
when they fsync.

Under kernel 3.11.2 this can be triggered with a simple fsync. Under
kernels 3.12.9, 3.13, and 3.13 booted with the nosmp command line
argument it is slightly harder to provoke (a simple fsync won't always
trigger it, but our chroot install script triggers it every time).
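
For anyone who wants to try the "simple fsync" case, here is a minimal
sketch of what such a test could look like. It is not the script we ran;
the function name write_and_fsync and the 4096-byte payload are made up
for illustration, and the directory you pass in is assumed to live on
the bcache-backed filesystem.

```python
import os
import tempfile


def write_and_fsync(directory, payload=b"x" * 4096):
    """Write payload to a temporary file in directory, fsync it, and
    return the number of bytes written.

    Hypothetical helper for illustration only. On an affected setup the
    os.fsync() call is where the process would be expected to hang in
    the D state, so the function would never return.
    """
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        written = os.write(fd, payload)
        os.fsync(fd)  # blocks until the data has reached the device
    finally:
        os.close(fd)
        os.unlink(path)
    return written
```

On our setup the call would be write_and_fsync("/srv"). If it hangs,
the hung-task message should show up in dmesg after 120 seconds.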

Further investigation hinted at this being a memory alignment problem.
We haven't confirmed it yet, but certain offsets for partitions don't
trigger the deadlock:

[ In all cases it's an msdos partition table, the device is bcache0 and is
  mounted under /srv which happens to be on sda10. The cache device is
  always /dev/ram0 ]:

Deadlocking configurations:

   backing: sda11;  default offset from sda (NOT a multiple of 4 KiB)
   backing: sdb1;   default offset from sdb (!= 4 KiB)
   backing: sda11;  offset from sda is:
                    - a multiple of 4  KiB
                    - a multiple of 4  MiB
                    - a multiple of 16 MiB

Non-deadlocking configurations:

   backing: sdb
   backing: sdb1;   offset from start of sdb is 4096 B

So, to sum up: a top-level block device as the backing device never
seems to deadlock, and some offsets for some partitions (so far only a
PAGE_SIZE, i.e. 4096 B, offset for sdb1) _also_ do not deadlock. All
other uses of a partition as the backing device have deadlocked so far.
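
To compare partition offsets against the alignments listed above, a
small sketch along these lines may help. It assumes the usual sysfs
layout, where /sys/class/block/<partition>/start holds the partition's
start in 512-byte sectors. The helper names are made up for
illustration, and the sector values in the usage note are examples, not
the offsets from our machines.

```python
SECTOR_BYTES = 512  # sysfs reports the partition start in 512-byte sectors


def start_offset_bytes(start_sectors):
    """Convert a start sector number to a byte offset from the disk start."""
    return start_sectors * SECTOR_BYTES


def is_aligned(start_sectors, alignment_bytes=4096):
    """Return True if the partition's byte offset is a multiple of
    alignment_bytes (4 KiB by default)."""
    return start_offset_bytes(start_sectors) % alignment_bytes == 0


def read_start_sectors(partition, sysfs_root="/sys/class/block"):
    """Read the start sector of a partition such as 'sda11' from sysfs.

    Assumes the standard sysfs layout; raises OSError if the partition
    does not exist on this machine.
    """
    with open(f"{sysfs_root}/{partition}/start") as f:
        return int(f.read().strip())
```

For example, is_aligned(read_start_sectors("sda11")) checks 4 KiB
alignment, and is_aligned(8192, 4 * 1024 * 1024) checks whether an
illustrative start sector of 8192 falls on a 4 MiB boundary.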

I'm currently testing under 3.12.9 and get the following trace from the
deadlocked process:

[82440.244111] INFO: task dpkg:14059 blocked for more than 120 seconds.
[82440.244141]       Not tainted 3.12-0.bpo.1-amd64 #1
[82440.244153] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[82440.244167] dpkg            D ffff88021fc14300     0 14059  14056 0x00000000
[82440.244173]  ffff8800c0edd800 0000000000000086 0000000000000088 ffffffff81813480
[82440.244177]  ffff8801d5d5bfd8 ffff8801d5d5bfd8 ffff8801d5d5bfd8 ffff8800c0edd800
[82440.244181]  0000000000000000 ffff88021fc14b40 ffff8800c0edd800 ffffffff8111f800
[82440.244185] Call Trace:
[82440.244195]  [<ffffffff8111f800>] ? __lock_page+0x70/0x70
[82440.244201]  [<ffffffff814c2797>] ? io_schedule+0x87/0xd0
[82440.244204]  [<ffffffff8111f809>] ? sleep_on_page+0x9/0x10
[82440.244208]  [<ffffffff814c00c2>] ? __wait_on_bit+0x52/0x80
[82440.244212]  [<ffffffff8111f943>] ? wait_on_page_bit+0x73/0x80
[82440.244217]  [<ffffffff81082d80>] ? wake_atomic_t_function+0x30/0x30
[82440.244220]  [<ffffffff8111fa46>] ? filemap_fdatawait_range+0xf6/0x170
[82440.244225]  [<ffffffff81121058>] ? filemap_write_and_wait_range+0x48/0x90
[82440.244230]  [<ffffffff811a948d>] ? generic_file_fsync+0x2d/0xa0
[82440.244247]  [<ffffffffa0159543>] ? ext4_sync_file+0x203/0x320 [ext4]
[82440.244251]  [<ffffffff811b3298>] ? do_fsync+0x58/0x90
[82440.244255]  [<ffffffff811b362b>] ? SyS_fsync+0xb/0x20
[82440.244259]  [<ffffffff814cb7b9>] ? system_call_fastpath+0x16/0x1b

Other traces from previous tests:

3.11.2:
[501840.292105] Call Trace:
[501840.292141]  [<ffffffff8110b270>] ? wait_on_page_read+0x60/0x60
[501840.292159]  [<ffffffff814787e4>] ? io_schedule+0x94/0x120
[501840.292167]  [<ffffffff8110b275>] ? sleep_on_page+0x5/0x10
[501840.292171]  [<ffffffff81476824>] ? __wait_on_bit+0x54/0x80
[501840.292178]  [<ffffffff8110b08f>] ? wait_on_page_bit+0x7f/0x90
[501840.292194]  [<ffffffff81078c40>] ? wake_atomic_t_function+0x30/0x30
[501840.292207]  [<ffffffff811175e8>] ? pagevec_lookup_tag+0x18/0x20
[501840.292211]  [<ffffffff8110b178>] ? filemap_fdatawait_range+0xd8/0x150
[501840.292217]  [<ffffffff8110c765>] ? filemap_write_and_wait_range+0x35/0x60
[501840.292229]  [<ffffffff8118cd8b>] ? generic_file_fsync+0x1b/0x90
[501840.292259]  [<ffffffffa01750ea>] ? ext4_sync_file+0x10a/0x2e0 [ext4]
[501840.292264]  [<ffffffff8119515c>] ? do_fsync+0x4c/0x80
[501840.292267]  [<ffffffff811953d7>] ? SyS_fsync+0x7/0x10
[501840.292275]  [<ffffffff81481de9>] ? system_call_fastpath+0x16/0x1b
[501960.292133] INFO: task dpkg:17778 blocked for more than 120 seconds.

3.13:
[ 1560.256491] Call Trace:
[ 1560.256502]  [<ffffffff81120410>] ? __lock_page+0x70/0x70
[ 1560.256509]  [<ffffffff814c96d8>] ? io_schedule+0x88/0xd0
[ 1560.256513]  [<ffffffff81120419>] ? sleep_on_page+0x9/0x10
[ 1560.256517]  [<ffffffff814c9c52>] ? __wait_on_bit+0x52/0x80
[ 1560.256521]  [<ffffffff81120adb>] ? find_get_pages_tag+0xcb/0x180
[ 1560.256526]  [<ffffffff81120533>] ? wait_on_page_bit+0x73/0x80
[ 1560.256531]  [<ffffffff8109c230>] ? wake_atomic_t_function+0x30/0x30
[ 1560.256535]  [<ffffffff81120610>] ? filemap_fdatawait_range+0xd0/0x150
[ 1560.256540]  [<ffffffff8112193c>] ? __filemap_fdatawrite_range+0x4c/0x60
[ 1560.256544]  [<ffffffff81121997>] ? filemap_write_and_wait_range+0x47/0x90
[ 1560.256549]  [<ffffffff811abf8d>] ? generic_file_fsync+0x2d/0xa0
[ 1560.256570]  [<ffffffffa018d3e3>] ? ext4_sync_file+0x153/0x300 [ext4]
[ 1560.256576]  [<ffffffff811b5be3>] ? do_fsync+0x53/0x90
[ 1560.256580]  [<ffffffff811b5e9b>] ? SyS_fsync+0xb/0x20
[ 1560.256586]  [<ffffffff814d4279>] ? system_call_fastpath+0x16/0x1b

3.13 + nosmp:
[  240.236653]  [<ffffffff81120410>] ? __lock_page+0x70/0x70
[  240.236660]  [<ffffffff814c96d8>] ? io_schedule+0x88/0xd0
[  240.236664]  [<ffffffff81120419>] ? sleep_on_page+0x9/0x10
[  240.236668]  [<ffffffff814c9c52>] ? __wait_on_bit+0x52/0x80
[  240.236672]  [<ffffffff81120adb>] ? find_get_pages_tag+0xcb/0x180
[  240.236676]  [<ffffffff81120533>] ? wait_on_page_bit+0x73/0x80
[  240.236681]  [<ffffffff8109c230>] ? wake_atomic_t_function+0x30/0x30
[  240.236685]  [<ffffffff81120610>] ? filemap_fdatawait_range+0xd0/0x150
[  240.236691]  [<ffffffff811b602b>] ? SyS_sync_file_range+0x15b/0x1a0
[  240.236696]  [<ffffffff814d4279>] ? system_call_fastpath+0x16/0x1b



