Creating a large pre-allocated qemu-img raw image takes too long and fails on fuse

Thanks for the work on gluster.

 

We have a situation where we need a very large virtual machine image. We use a simple raw image but it can be up to 40T in size in some cases. For this experiment we’ll call it 24T.

 

When creating the image on the FUSE mount with qemu-img using falloc preallocation, the qemu-img create fails and a FUSE error results. This happens after around 3 hours.
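
For reference, the qemu-img create in question is roughly of this form (the mount path and file name here are placeholders, not our exact setup):

    qemu-img create -f raw -o preallocation=falloc /mnt/adminvm/images/adminvm.img 24T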

 

I created a simple C program using gfapi that does a 10T fallocate, and it took 1.25 hours. I didn't run tests larger than that, since 1.25 hours is already too long.
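
The gfapi program just opens the file and issues one big fallocate. A roughly equivalent single-fallocate timing test can also be run on the FUSE mount with the fallocate(1) utility (path is an example only):

    fallocate -l 10T /mnt/adminvm/images/falloc-test.img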

 

Using qemu-img with preallocation=falloc in gfapi mode takes a long time as well, in the same ballpark as the cases above.
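
In case it helps anyone reproduce this, the gfapi invocation is along these lines; the server address is one of the volume's nodes and the path inside the volume is a placeholder:

    qemu-img create -f raw -o preallocation=falloc gluster://172.23.254.181/adminvm/images/adminvm.img 24T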

 

However, I found that if I create a 2.4T image file and then do 9 more resizes to bring it up to the full desired size (24T in this case), it only takes about 16 minutes total. I did this on the FUSE mount, and the 16 minutes includes the initial 2.4T qemu-img create (preallocation=falloc) followed by the nine +2.4T resize runs, roughly the sequence sketched below.
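
Treat this as a sketch rather than our exact commands: the file name is a placeholder, the step size is approximate, and the --preallocation flag on resize is my assumption about how to keep the grown range fallocated (it needs a reasonably recent qemu-img).

    # initial ~2.4T image, preallocated
    qemu-img create -f raw -o preallocation=falloc adminvm.img 2400G

    # grow toward 24T in nine more ~2.4T steps
    for i in $(seq 1 9); do
        qemu-img resize -f raw --preallocation=falloc adminvm.img +2400G
    done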

 

We are avoiding a non-preallocated image because we have had trouble with people assuming available disk space “is available” and accidentally running bricks out of space.

 

We would like to avoid the kludge of calling qemu-img 10 times (or more) to make a larger fallocated image. If there are suggested methods or tunings, please let me know!

 

We are currently on Gluster 9.3.

 

Volume setup:

[root@nano-1 images]# gluster volume info adminvm

Volume Name: adminvm
Type: Replicate
Volume ID: e09122b9-8bc4-409b-a423-7596feebf941
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 172.23.254.181:/data/brick_adminvm
Brick2: 172.23.254.182:/data/brick_adminvm
Brick3: 172.23.254.183:/data/brick_adminvm
Options Reconfigured:
performance.client-io-threads: on
nfs.disable: on
transport.address-family: inet
storage.fips-mode-rchecksum: on
cluster.granular-entry-heal: enable
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.low-prio-threads: 32
network.remote-dio: disable
performance.strict-o-direct: on
cluster.eager-lock: enable
cluster.quorum-type: auto
cluster.server-quorum-type: server
cluster.data-self-heal-algorithm: full
cluster.locking-scheme: granular
cluster.shd-max-threads: 8
cluster.shd-wait-qlength: 10000
features.shard: on
user.cifs: off
cluster.choose-local: off
client.event-threads: 4
server.event-threads: 4
network.ping-timeout: 20
server.tcp-user-timeout: 20
server.keepalive-time: 10
server.keepalive-interval: 2
server.keepalive-count: 5
cluster.lookup-optimize: off
network.frame-timeout: 10800
performance.io-thread-count: 32
storage.owner-uid: 107
storage.owner-gid: 107

