Hi,

I did another test, and this time FIO fails with:

fio: io_u error on file /mnt/testSF-HDD1/test: Invalid argument: write offset=114423242752, buflen=8192
fio: pid=10052, err=22/file:io_u.c:1582, func=io_u error, error=Invalid argument
test: (groupid=0, jobs=1): err=22 (file:io_u.c:1582, func=io_u error, error=Invalid argument): pid=10052: Tue Dec 6 15:18:47 2016

Below is the client log:

[2016-12-06 05:19:31.261289] I [fuse-bridge.c:5171:fuse_graph_setup] 0-fuse: switched to graph 0
[2016-12-06 05:19:31.261355] I [MSGID: 114035] [client-handshake.c:193:client_set_lk_version_cbk] 0-testSF-HDD-client-5: Server lk version = 1
[2016-12-06 05:19:31.261404] I [fuse-bridge.c:4083:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.22 kernel 7.22
[2016-12-06 05:19:31.262901] I [MSGID: 108031] [afr-common.c:2071:afr_local_discovery_cbk] 0-testSF-HDD-replicate-0: selecting local read_child testSF-HDD-client-1
[2016-12-06 05:19:31.262930] I [MSGID: 108031] [afr-common.c:2071:afr_local_discovery_cbk] 0-testSF-HDD-replicate-0: selecting local read_child testSF-HDD-client-0
[2016-12-06 05:19:31.262948] I [MSGID: 108031] [afr-common.c:2071:afr_local_discovery_cbk] 0-testSF-HDD-replicate-0: selecting local read_child testSF-HDD-client-2
[2016-12-06 05:19:31.269592] I [MSGID: 108031] [afr-common.c:2071:afr_local_discovery_cbk] 0-testSF-HDD-replicate-1: selecting local read_child testSF-HDD-client-3
[2016-12-06 05:19:31.269795] I [MSGID: 108031] [afr-common.c:2071:afr_local_discovery_cbk] 0-testSF-HDD-replicate-1: selecting local read_child testSF-HDD-client-4
[2016-12-06 05:19:31.277763] I [MSGID: 108031] [afr-common.c:2071:afr_local_discovery_cbk] 0-testSF-HDD-replicate-1: selecting local read_child testSF-HDD-client-5
[2016-12-06 06:58:05.399244] W [MSGID: 101159] [inode.c:1219:__inode_unlink] 0-inode: be318638-e8a0-4c6d-977d-7a937aa84806/864c9ea1-3a7e-4d41-899b-f30604a7584e.16284: dentry not found in 63af10b7-9dac-4a53-aab1-3cc17fff3255
[2016-12-06 15:17:43.311400] E [shard.c:460:__shard_update_shards_inode_list] (-->/usr/lib64/glusterfs/3.7.17/xlator/features/shard.so(shard_common_lookup_shards_cbk+0x2d) [0x7f5575680fdd] -->/usr/lib64/glusterfs/3.7.17/xlator/features/shard.so(shard_link_block_inode+0xdf) [0x7f5575680f6f] -->/usr/lib64/glusterfs/3.7.17/xlator/features/shard.so(__shard_update_shards_inode_list+0x22e) [0x7f557567c1ce] ) 0-: Assertion failed: lru_inode_ctx->block_num > 0
[2016-12-06 15:17:43.311472] W [inode.c:1232:inode_unlink] (-->/usr/lib64/glusterfs/3.7.17/xlator/features/shard.so(shard_link_block_inode+0xdf) [0x7f5575680f6f] -->/usr/lib64/glusterfs/3.7.17/xlator/features/shard.so(__shard_update_shards_inode_list+0x14a) [0x7f557567c0ea] -->/lib64/libglusterfs.so.0(inode_unlink+0x9c) [0x7f558386ba0c] ) 0-testSF-HDD-shard: inode not found
[2016-12-06 15:17:43.333456] W [inode.c:1133:inode_forget] (-->/usr/lib64/glusterfs/3.7.17/xlator/features/shard.so(shard_link_block_inode+0xdf) [0x7f5575680f6f] -->/usr/lib64/glusterfs/3.7.17/xlator/features/shard.so(__shard_update_shards_inode_list+0x154) [0x7f557567c0f4] -->/lib64/libglusterfs.so.0(inode_forget+0x90) [0x7f558386b800] ) 0-testSF-HDD-shard: inode not found
[2016-12-06 15:18:47.129794] W [fuse-bridge.c:2311:fuse_writev_cbk] 0-glusterfs-fuse: 12555429: WRITE => -1 gfid=864c9ea1-3a7e-4d41-899b-f30604a7584e fd=0x7f557016ae6c (Invalid argument)

The code in question, from shard.c, is shown further below; it takes the else branch once priv->inode_count reaches SHARD_MAX_INODES, which is 16384.
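To make that branch easier to follow, here is a minimal, self-contained sketch of the bounded LRU bookkeeping it implements: new shard inode contexts are appended at the tail of a list, and once the list is full the entry at the head (the least recently used one) is evicted to make room. The types, helpers and constants in this sketch (shard_ctx, shard_priv, MAX_INODES, link_block) are simplified stand-ins of my own for illustration, not the actual GlusterFS structures; the real excerpt follows after it. The failed assertion in the log simply means that, at eviction time, the entry sitting at the head of the list had block_num <= 0, i.e. it did not look like a regular shard entry.

    /* lru_sketch.c - simplified illustration only; these types and helpers
     * are stand-ins, not the real GlusterFS shard translator structures. */
    #include <assert.h>
    #include <stdio.h>
    #include <stdlib.h>

    #define MAX_INODES 4                /* stands in for SHARD_MAX_INODES (16384) */

    struct shard_ctx {
            int               block_num;   /* shard index; expected to be > 0 */
            struct shard_ctx *prev, *next; /* circular doubly linked list     */
    };

    struct shard_priv {
            int              inode_count;
            struct shard_ctx head;         /* sentinel: head.next is the LRU entry,
                                            * head.prev the most recently added */
    };

    static void priv_init (struct shard_priv *priv)
    {
            priv->inode_count = 0;
            priv->head.prev = priv->head.next = &priv->head;
    }

    static void list_add_tail (struct shard_priv *priv, struct shard_ctx *ctx)
    {
            ctx->prev = priv->head.prev;
            ctx->next = &priv->head;
            priv->head.prev->next = ctx;
            priv->head.prev = ctx;
    }

    static struct shard_ctx *list_del_head (struct shard_priv *priv)
    {
            struct shard_ctx *lru = priv->head.next;   /* least recently used */
            lru->prev->next = lru->next;
            lru->next->prev = lru->prev;
            return lru;
    }

    /* Same shape as the real else branch: when the list is full, the LRU entry
     * is evicted to make room, and the code asserts that the evicted entry
     * refers to a real shard (block_num > 0). */
    static void link_block (struct shard_priv *priv, struct shard_ctx *ctx,
                            int block_num)
    {
            if (priv->inode_count + 1 <= MAX_INODES) {
                    ctx->block_num = block_num;
                    list_add_tail (priv, ctx);
                    priv->inode_count++;
            } else {
                    struct shard_ctx *lru = list_del_head (priv);
                    printf ("evicting block %d to admit block %d\n",
                            lru->block_num, block_num);
                    assert (lru->block_num > 0);  /* the GF_ASSERT that fires */
                    free (lru);                   /* the real code unlinks/forgets
                                                   * the evicted shard inode here */
                    ctx->block_num = block_num;
                    list_add_tail (priv, ctx);
            }
    }

    int main (void)
    {
            struct shard_priv priv;
            priv_init (&priv);

            /* Link more shards than the list can hold, forcing evictions. */
            for (int i = 1; i <= 10; i++) {
                    struct shard_ctx *ctx = calloc (1, sizeof (*ctx));
                    link_block (&priv, ctx, i);
            }
            return 0;
    }

In the sketch the assertion never fires, because every entry on the list is a real shard; in my failing run it did fire, which is what the gf_msg added for debugging in the excerpt below is meant to show by printing the offending block number.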
My dataset of 400GB with a 16MB shard size produces enough shard files (400GB / 16MB = 25,600, well over 16384) to reach that limit. When I run the test with a smaller dataset, there is no such error.

shard.c:

        if (priv->inode_count + 1 <= SHARD_MAX_INODES) {
                /* If this inode was linked here for the first time (indicated
                 * by empty list), and if there is still space in the priv list,
                 * add this ctx to the tail of the list.
                 */
                gf_uuid_copy (ctx->base_gfid, base_inode->gfid);
                ctx->block_num = block_num;
                list_add_tail (&ctx->ilist, &priv->ilist_head);
                priv->inode_count++;
        } else {
                /* If on the other hand there is no available slot for this
                 * inode in the list, delete the lru inode from the head of
                 * the list, unlink it. And in its place add this new inode
                 * into the list.
                 */
                lru_inode_ctx = list_first_entry (&priv->ilist_head,
                                                  shard_inode_ctx_t, ilist);
                /* message added for debugging */
                gf_msg (THIS->name, GF_LOG_WARNING, 0, SHARD_MSG_INVALID_FOP,
                        "block number = %d", lru_inode_ctx->block_num);
                GF_ASSERT (lru_inode_ctx->block_num > 0);

Hopefully I can get some advice from you on this. Thanks.

Cw

On Tue, Dec 6, 2016 at 9:07 AM, qingwei wei <tchengwee@xxxxxxxxx> wrote:
> Hi,
>
> This is a repost of my email to the gluster-users mailing list.
> I'd appreciate it if anyone has any idea about the issue I am seeing. Thanks.
>
> I encountered this while running FIO random writes on the FUSE-mounted
> gluster volume. After this assertion fails, the client log fills with
> pending-frames messages and FIO just shows zero IO in its progress
> status. Leaving the test to run overnight, the client log file filled
> up with those pending-frames messages and reached 28GB in around 12
> hours.
>
> The client log:
>
> [2016-12-04 15:48:35.274208] W [MSGID: 109072] [dht-linkfile.c:50:dht_linkfile_lookup_cbk] 0-testSF-dht: got non-linkfile testSF-replicate-0:/.shard/21da7b64-45e5-4c6a-9244-53d0284bf7ed.7038, gfid = 00000000-0000-0000-0000-000000000000
> [2016-12-04 15:48:35.277208] W [MSGID: 109072] [dht-linkfile.c:50:dht_linkfile_lookup_cbk] 0-testSF-dht: got non-linkfile testSF-replicate-0:/.shard/21da7b64-45e5-4c6a-9244-53d0284bf7ed.8957, gfid = 00000000-0000-0000-0000-000000000000
> [2016-12-04 15:48:35.277588] W [MSGID: 109072] [dht-linkfile.c:50:dht_linkfile_lookup_cbk] 0-testSF-dht: got non-linkfile testSF-replicate-0:/.shard/21da7b64-45e5-4c6a-9244-53d0284bf7ed.11912, gfid = 00000000-0000-0000-0000-000000000000
> [2016-12-04 15:48:35.312751] E [shard.c:460:__shard_update_shards_inode_list] (-->/usr/lib64/glusterfs/3.7.17/xlator/features/shard.so(shard_common_lookup_shards_cbk+0x2d) [0x7f86cc42efdd] -->/usr/lib64/glusterfs/3.7.17/xlator/features/shard.so(shard_link_block_inode+0xdf) [0x7f86cc42ef6f] -->/usr/lib64/glusterfs/3.7.17/xlator/features/shard.so(__shard_update_shards_inode_list+0x22e) [0x7f86cc42a1ce] ) 0-: Assertion failed: lru_inode_ctx->block_num > 0
> pending frames:
> frame : type(0) op(0)
> frame : type(0) op(0)
> frame : type(0) op(0)
> frame : type(0) op(0)
> frame : type(0) op(0)
> frame : type(0) op(0)
>
> Gluster volume info (I am testing this on one server, with each disk
> representing one brick; the volume is then mounted locally via FUSE):
>
> Volume Name: testSF
> Type: Distributed-Replicate
> Volume ID: 3f205363-5029-40d7-b1b5-216f9639b454
> Status: Started
> Number of Bricks: 2 x 3 = 6
> Transport-type: tcp
> Bricks:
> Brick1: 192.168.123.4:/mnt/sdb_mssd/testSF
> Brick2: 192.168.123.4:/mnt/sdc_mssd/testSF
> Brick3: 192.168.123.4:/mnt/sdd_mssd/testSF
> Brick4: 192.168.123.4:/mnt/sde_mssd/testSF
> Brick5: 192.168.123.4:/mnt/sdf_mssd/testSF
> Brick6: 192.168.123.4:/mnt/sdg_mssd/testSF
> Options Reconfigured:
> features.shard-block-size: 16MB
> features.shard: on
> performance.readdir-ahead: on
>
> Gluster version: 3.7.17
>
> The actual disk usage (about 91% full):
>
> /dev/sdb1   235G  202G   22G  91%  /mnt/sdb_mssd
> /dev/sdc1   235G  202G   22G  91%  /mnt/sdc_mssd
> /dev/sdd1   235G  202G   22G  91%  /mnt/sdd_mssd
> /dev/sde1   235G  200G   23G  90%  /mnt/sde_mssd
> /dev/sdf1   235G  200G   23G  90%  /mnt/sdf_mssd
> /dev/sdg1   235G  200G   23G  90%  /mnt/sdg_mssd
>
> Has anyone encountered this issue before?
>
> Cw

_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-devel