Hi, I am Moshe Lazarov, a researcher at Axxana (https://www.axxana.com). We have developed a Black Box for zero data loss in asynchronous replication systems, and we are considering integrating LVM2, with its thin provisioning and snapshot mechanisms, into our product.
Lately I have been running several experiments with LVs and snapshots, using LVM2 2.02.98 and Linux kernel 4.9.11. The tests run on two platforms: a VM (on a Dell server with Intel Xeon E5-2620 CPUs and 2 SAS HDDs) and Axxana's Black Box ("BBX"), a custom motherboard with an Intel i7-3517UE CPU and 2 SAS SSDs. Both platforms run the same scripts and tester application for creating the destinations (LVs and snapshots) and writing to them. The VG (which contains all the LVs and snapshots) is striped (64 KB) over the two drives.

In the graph below you can see the throughput of writing 1 GB (a different, random 1 GB each time) to each of the following destinations, on the VM and on the BBX:

  SlowLV   - a linear LV.
  T0 1st   - the first write to a thinly-provisioned LV; since chunk allocation is
             triggered by this access, the degraded throughput is understandable.
  T0       - consecutive writes to T0 (the thinly-provisioned LV).
  T0S0 1st - the first write to a thinly-provisioned snapshot of T0; again, chunk
             allocation is triggered by this access, so the degraded throughput is
             understandable.
  T0S0     - consecutive writes to T0S0 (the snapshot of the thinly-provisioned LV).
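The destinations above are created by our scripts roughly as follows (a minimal sketch only; the device paths, sizes and exact options are illustrative placeholders, not the values we actually use):

  # Illustrative only: device paths, sizes and options are placeholders.
  pvcreate /dev/sdX /dev/sdY
  vgcreate vg /dev/sdX /dev/sdY

  # Fully-provisioned LV, striped 64 KB across both PVs.
  lvcreate -i 2 -I 64 -L 10G -n slowlv vg

  # Thin pool (data LV striped the same way), a thin LV on it, and a thin snapshot.
  lvcreate -i 2 -I 64 -L 20G -T vg/poolThinDataLV
  lvcreate -V 10G -T vg/poolThinDataLV -n t0
  lvcreate -s vg/t0 -n t0s0
  # (Depending on the LVM2 version, the snapshot may need
  #  'lvchange -ay -K vg/t0s0' before it can be written to.)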
In my opinion, the throughput degradation between writes to SlowLV and writes to T0 or T0S0 is extremely high: the BBX writes to the linear LV at ~740 MB/s, while writes to the other destinations are limited to ~400 MB/s (almost 50% degradation). On the VM the degradation is at least ~17%. Since both platforms run the same code and reach roughly the same write throughput, it seems that the software/kernel is the bottleneck for achieving higher throughput. What is your opinion on that?

In addition, I have included iostat output captured while writing to T0 (the thin-provisioned LV). The following legend applies:

  vg-slowlv                 252:0  (dm-0)
  vg-ramlv1                 252:1  (dm-1)
  vg-ramlv2                 252:2  (dm-2)
  vg-poolThinDataLV_tmeta   252:3  (dm-3)
  vg-poolThinDataLV_tdata   252:4  (dm-4)
  vg-poolThinDataLV-tpool   252:5  (dm-5)
  vg-poolThinDataLV         252:6  (dm-6)
  vg-t0                     252:7  (dm-7)
  vg-t0s0                   252:8  (dm-8)
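For completeness, the name-to-device mapping and the samples come from standard tools, roughly as follows (the flags shown are illustrative; the only thing fixed by the test is the 1-second iostat interval):

  # Map LV names to major:minor / dm-N (this is how the legend above reads)
  dmsetup ls
  ls -l /dev/mapper

  # Per-device throughput in MB/s, sampled every 1 second
  iostat -dm 1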
1 GB of data was written during three iostat sampling windows (samples every 1 second). Note the second 1-second window (highlighted in yellow): data was written to dm-4 (poolThinDataLV_tdata) and dm-5 (poolThinDataLV-tpool) at 557 MB/s, while it was written to dm-7 (t0) at 820 MB/s. During the first 1-second window the behavior was different: data was written to dm-4, dm-5 and dm-7 at the same throughput (an average of 204 MB/s over that window). In other runs, data is written to all three devices at the same speed during all three windows.

What could be the reason for the behavior in the second window? Is data really written twice, once to dm-4 and once to dm-5, or is it the same write observed at two layers of the device stack? And can throughput be improved by increasing the request size (i.e. issuing larger writes, and if so, how)?
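To make the last question concrete, by "larger packets" I mean something along these lines (illustrative only; our tester writes random data rather than zeros, and the LV path is a placeholder):

  # 1 GB written with 64 KB requests vs. 1 MB requests, bypassing the page cache
  dd if=/dev/zero of=/dev/vg/t0 bs=64K count=16384 oflag=direct
  dd if=/dev/zero of=/dev/vg/t0 bs=1M  count=1024  oflag=direct

Is that the kind of change that could help?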
I hope the information is clear, and I would appreciate your response to the questions raised above.

Thanks a lot,
-Moshe

----------------------------------------
Moshe Lazarov
Axxana
C: +1-669-213-9752
F: +972-74-7887878