Hi people,

Has anyone experienced problems with volumes in replica mode running across IPoIB links, where libvirt stores qcow images on such a volume? It would help if the developers could confirm that this setup should just work; then I would know to blame the hardware/InfiniBand side instead.
My setup: a direct IPoIB link between two hosts, a Gluster replica volume across that link, and libvirt storing its disk images on that volume (roughly as in the sketch below).
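For reference, the volume was created along these lines. This is only a sketch: the hostnames, brick paths, volume name, mount point and image size are placeholders, not my actual values.

  # replica 2 volume over the IPoIB addresses of the two hosts (names are placeholders)
  gluster volume create gvol0 replica 2 \
      hostA-ib:/bricks/brick1/gvol0 hostB-ib:/bricks/brick1/gvol0
  gluster volume start gvol0

  # mount the volume on the virtualisation host and keep the guest images there
  mount -t glusterfs hostA-ib:/gvol0 /var/lib/libvirt/images/gvol0
  qemu-img create -f qcow2 /var/lib/libvirt/images/gvol0/guest1.qcow2 50G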
When I start a guest on hostA, I get the trace below on hostB (which is the IB subnet manager):
[Mon Oct 23 16:43:32 2017] Workqueue: ipoib_wq ipoib_cm_tx_start [ib_ipoib]
[Mon Oct 23 16:43:32 2017] 0000000000008010 00000000553c90b1 ffff880c1c6eb818 ffffffff816a3db1
[Mon Oct 23 16:43:32 2017] ffff880c1c6eb8a8 ffffffff81188810 0000000000000000 ffff88042ffdb000
[Mon Oct 23 16:43:32 2017] 0000000000000004 0000000000008010 ffff880c1c6eb8a8 00000000553c90b1
[Mon Oct 23 16:43:32 2017] Call Trace:
[Mon Oct 23 16:43:32 2017] [<ffffffff816a3db1>] dump_stack+0x19/0x1b
[Mon Oct 23 16:43:32 2017] [<ffffffff81188810>] warn_alloc_failed+0x110/0x180
[Mon Oct 23 16:43:32 2017] [<ffffffff8169fd8a>] __alloc_pages_slowpath+0x6b6/0x724
[Mon Oct 23 16:43:32 2017] [<ffffffff8118cd85>] __alloc_pages_nodemask+0x405/0x420
[Mon Oct 23 16:43:32 2017] [<ffffffff81030f8f>] dma_generic_alloc_coherent+0x8f/0x140
[Mon Oct 23 16:43:32 2017] [<ffffffff81065c0d>] gart_alloc_coherent+0x2d/0x40
[Mon Oct 23 16:43:32 2017] [<ffffffffc012e4d3>] mlx4_buf_direct_alloc.isra.6+0xd3/0x1a0 [mlx4_core]
[Mon Oct 23 16:43:32 2017] [<ffffffffc012e76b>] mlx4_buf_alloc+0x1cb/0x240 [mlx4_core]
[Mon Oct 23 16:43:32 2017] [<ffffffffc04dd85e>] create_qp_common.isra.31+0x62e/0x10d0 [mlx4_ib]
[Mon Oct 23 16:43:32 2017] [<ffffffffc04de44e>] mlx4_ib_create_qp+0x14e/0x480 [mlx4_ib]
[Mon Oct 23 16:43:32 2017] [<ffffffffc06df20c>] ? ipoib_cm_tx_init+0x5c/0x400 [ib_ipoib]
[Mon Oct 23 16:43:32 2017] [<ffffffffc0639c3a>] ib_create_qp+0x7a/0x2f0 [ib_core]
[Mon Oct 23 16:43:32 2017] [<ffffffffc06df2b3>] ipoib_cm_tx_init+0x103/0x400 [ib_ipoib]
[Mon Oct 23 16:43:32 2017] [<ffffffffc06e1608>] ipoib_cm_tx_start+0x268/0x3f0 [ib_ipoib]
[Mon Oct 23 16:43:32 2017] [<ffffffff810a881a>] process_one_work+0x17a/0x440
[Mon Oct 23 16:43:32 2017] [<ffffffff810a94e6>] worker_thread+0x126/0x3c0
[Mon Oct 23 16:43:32 2017] [<ffffffff810a93c0>] ? manage_workers.isra.24+0x2a0/0x2a0
[Mon Oct 23 16:43:32 2017] [<ffffffff810b098f>] kthread+0xcf/0xe0
[Mon Oct 23 16:43:32 2017] [<ffffffff810b08c0>] ? insert_kthread_work+0x40/0x40
[Mon Oct 23 16:43:32 2017] [<ffffffff816b4f58>] ret_from_fork+0x58/0x90
[Mon Oct 23 16:43:32 2017] [<ffffffff810b08c0>] ? insert_kthread_work+0x40/0x40
[Mon Oct 23 16:43:32 2017] Mem-Info:
[Mon Oct 23 16:43:32 2017] active_anon:2389656 inactive_anon:17792 isolated_anon:0
 active_file:14294829 inactive_file:14609973 isolated_file:0
 unevictable:24185 dirty:11846 writeback:9907 unstable:0
 slab_reclaimable:1024309 slab_unreclaimable:127961
 mapped:74895 shmem:28096 pagetables:30088 bounce:0
 free:142329 free_pcp:249 free_cma:0
[Mon Oct 23 16:43:32 2017] Node 0 DMA free:15320kB min:24kB low:28kB high:36kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15984kB managed:15900kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:64kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
To clarify: other volumes that use the same IPoIB link do not seem to cause this, or any other problem.