Hello Vijay,
I’m really sorry to bother you, but the situation is critical for our research jobs. Since this morning, because of the previously described situation, we have decided to stop all production and data access until your script fixes the problem.
After rebooting our storage cluster this morning (French time), there are no more runaway processes, CPU usage is back to normal, and the quotas no longer seem to grow (but they still contain big errors: > 1TB, maybe much more). However, several quotas are no longer computed at all (for the past couple of hours), as you can read below:

[root@lucifer ~]# gluster volume quota vol_home list
                  Path Hard-limit Soft-limit     Used  Available
--------------------------------------------------------------------------------
/derreumaux_team           11.0TB        80%   0Bytes     11.0TB
/baaden_team               20.0TB        80%   15.1TB      4.9TB
/sterpone_team             14.0TB        80%   0Bytes     14.0TB
/amyloid_team               7.0TB        80%    6.4TB    577.5GB
/amyloid_team/nguyen        4.0TB        80%    3.7TB    312.7GB
/sacquin_team              10.0TB        80%   0Bytes     10.0TB
/simlab_team                5.0TB        80%    1.3TB      3.7TB
I don’t know your operational hours in India, but I think your end of day has passed, right? I’m really sorry to press you, but we are under a lot of pressure because this is not a good time to stop scientific computation and production.
Thanks in advance for your script and for your help. Is there anything I can do to accelerate the script’s development (coding it myself, or something like that)?
Have a nice evening (or night).
Geoffrey
------------------------------------------------------
Geoffrey Letessier
IT manager & systems engineer
UPR 9080 - CNRS - Laboratoire de Biochimie Théorique
Institut de Biologie Physico-Chimique
13, rue Pierre et Marie Curie - 75005 Paris
Tel: 01 58 41 50 93 - eMail: geoffrey.letessier@xxxxxxx
On Sunday 28 June 2015 01:34 PM, Geoffrey Letessier wrote:
Hello,
@Krutika: Thanks for forwarding my issue.
Everything is going completely crazy; other quotas are exploding. After removing my previous failing quota, some other quotas have grown, as you can read below:
[root@lucifer ~]# gluster volume quota vol_home list
                  Path Hard-limit Soft-limit     Used  Available
--------------------------------------------------------------------------------
/baaden_team               20.0TB        90%   15.1TB      4.9TB
/sterpone_team             14.0TB        90%   25.5TB     0Bytes
/simlab_team                5.0TB        90%    1.3TB      3.7TB
/sacquin_team              10.0TB        90%    8.3TB      1.7TB
/admin_team                 1.0TB        90%   17.0GB   1007.0GB
/amyloid_team               7.0TB        90%    6.4TB    577.5GB
/amyloid_team/nguyen        4.0TB        90%    3.7TB    312.7GB
[root@lucifer ~]# pdsh -w cl-storage[1,3] du -sh /export/brick_home/brick*/sterpone_team
cl-storage1: 3.1T /export/brick_home/brick1/sterpone_team
cl-storage1: 2.3T /export/brick_home/brick2/sterpone_team
cl-storage3: 2.7T /export/brick_home/brick1/sterpone_team
cl-storage3: 2.9T /export/brick_home/brick2/sterpone_team
=> ~11TB (not 25.5TB!!!)
[root@lucifer ~]# pdsh -w cl-storage[1,3] du -sh /export/brick_home/brick*/baaden_team
cl-storage1: 4.2T /export/brick_home/brick1/baaden_team
cl-storage3: 3.7T /export/brick_home/brick1/baaden_team
cl-storage1: 3.6T /export/brick_home/brick2/baaden_team
cl-storage3: 3.5T /export/brick_home/brick2/baaden_team
=> ~15TB (not 14TB).
Etc.
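To check every directory the same way, something like the following loop could be run (a rough sketch only, assuming the brick layout shown above; du -sb gives byte counts that can be summed), printing quota’s view next to the real summed brick usage:

for team in baaden_team sterpone_team simlab_team sacquin_team admin_team amyloid_team; do
    echo "== /$team"
    gluster volume quota vol_home list /$team          # quota's view
    pdsh -w cl-storage[1,3] du -sb /export/brick_home/brick*/$team \
        | awk '{sum += $2} END {printf "bricks total: %.2f TB\n", sum / 1e12}'   # real usage
done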
Could you please help me solve this issue urgently? The situation is blocking, and I must stop production until it is fixed. Do you think upgrading the storage cluster to GlusterFS 3.7.1 (the latest version) could fix the problem?
We need to fix this issue manually: find the directories whose quota size is miscalculated and repair the metadata on the bricks. We are writing an automated script to fix this and will provide it by end of day IST.
Thanks,
Vijay
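Until the script is ready, a manual spot-check along these lines can show which directories carry a bad on-disk value (a sketch only, not the script itself; trusted.glusterfs.quota.size is the xattr where quota keeps its per-directory accounting on the bricks, but please verify the key name against your build):

# Run on each storage node; compares quota's stored size xattr with du.
for dir in /export/brick_home/brick*/sterpone_team; do
    echo "== $dir"
    getfattr -h -d -m . -e hex $dir | grep quota.size   # quota's accounting (hex byte count)
    du -sb $dir                                         # actual on-disk usage
done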
Thanks in advance,
Geoffrey
------------------------------------------------------
Geoffrey Letessier
IT manager & systems engineer
UPR 9080 - CNRS - Laboratoire de Biochimie Théorique
Institut de Biologie Physico-Chimique
13, rue Pierre et Marie Curie - 75005 Paris
Tel: 01 58 41 50 93 - eMail: geoffrey.letessier@xxxxxxx
Copying Vijai and Raghavendra for help...
-Krutika
From: "Geoffrey Letessier" <geoffrey.letessier@xxxxxxx>
To: "Krutika Dhananjay" <kdhananj@xxxxxxxxxx>
Sent: Saturday, June 27, 2015 2:13:52 AM
Subject: Re: GlusterFS 3.5.3 - untar: very poor performance
Hi Krutika,
Since I re-enabled the quota feature on my volume vol_home, one of the defined quotas has gone crazy, and it is a very big problem for us. All day after re-enabling it, I watched the reported used space grow (without any user I/O on the volume):
[root@lucifer ~]# gluster volume quota vol_home list|grep derreumaux_team
/derreumaux_team           14.0TB        80%   13.7TB    357.2GB
[root@lucifer ~]# gluster volume quota vol_home list /derreumaux_team
                  Path Hard-limit Soft-limit     Used  Available
--------------------------------------------------------------------------------
/derreumaux_team           14.0TB        80%   13.1TB    874.1GB
[root@lucifer ~]# pdsh -w cl-storage[1,3] du -sh /export/brick_home/brick*/derreumaux_team
cl-storage3: 590G /export/brick_home/brick1/derreumaux_team
cl-storage3: 611G /export/brick_home/brick2/derreumaux_team
cl-storage1: 567G /export/brick_home/brick1/derreumaux_team
cl-storage1: 564G /export/brick_home/brick2/derreumaux_team
As you can see from these three commands, I get three different results; worse, the quota system is very far from the real disk usage (13.7TB vs. 13.1TB, versus about 2.3TB actually on disk).
Can you please help fix this very quickly? The whole group is completely blocked by its exceeded quota.
Thank you so much in advance,
Have a nice weekend,
Geoffrey
------------------------------------------------------
Geoffrey Letessier
IT manager & systems engineer
UPR 9080 - CNRS - Laboratoire de Biochimie Théorique
Institut de Biologie Physico-Chimique
13, rue Pierre et Marie Curie - 75005 Paris
Tel: 01 58 41 50 93 - eMail: geoffrey.letessier@xxxxxxx
No, but if you are saying it is the 3.5.3 rpm version, then that bug does not exist there. And still it is weird that you are seeing such bad performance. :-/
Anything suspicious in the logs?
-Krutika
From: "Geoffrey Letessier"
<geoffrey.letessier@xxxxxxx>
To: "Krutika Dhananjay" <kdhananj@xxxxxxxxxx>
Sent: Friday, June 26, 2015 1:27:16
PM
Subject: Re:
GlusterFS 3.5.3 - untar: very poor
performance
No, it’s the 3.5.3 RPM version I found in your repository (published in November 2014). So, you suggest I simply upgrade all servers and clients to the new 3.5.4 version? Wouldn’t it be better to upgrade the whole system (servers and clients) to 3.7.1?
Geoffrey
------------------------------------------------------
Geoffrey Letessier
IT manager & systems engineer
UPR 9080 - CNRS - Laboratoire de Biochimie Théorique
Institut de Biologie Physico-Chimique
13, rue Pierre et Marie Curie - 75005 Paris
Tel: 01 58 41 50 93 - eMail: geoffrey.letessier@xxxxxxx
Also, are you running the 3.5.3 rpms on the clients? Or is it a patched version with more fixes on top of 3.5.3?

The reason I ask is that there was a performance issue in the replication module introduced after 3.5.3 and fixed in 3.5.4. I’m wondering if that could be causing the issue you are experiencing.
-Krutika
From: "Geoffrey
Letessier" <geoffrey.letessier@xxxxxxx>
To: "Krutika
Dhananjay" <kdhananj@xxxxxxxxxx>
Sent: Friday,
June 26, 2015 10:05:26 AM
Subject: Re:
GlusterFS
3.5.3 - untar: very poor
performance
Hi Krutika,
Oops, I disabled the quota manager without saving the configuration. Could you tell me how to retrieve the quota list information?

I’m going to test the untar in the meantime.
Geoffrey
------------------------------------------------------
Geoffrey Letessier
IT manager & systems engineer
UPR 9080 - CNRS - Laboratoire de Biochimie Théorique
Institut de Biologie Physico-Chimique
13, rue Pierre et Marie Curie - 75005 Paris
Tel: 01 58 41 50 93 - eMail: geoffrey.letessier@xxxxxxx
Hi,
So I tried out a kernel src tree untar locally on a plain replicate (1x2) volume, and it took 7m30s on average. This was on VMs, with no RDMA and no quota enabled.

Could you try the same thing on a volume without quota, to see if it makes a difference to the perf?
-Krutika
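If you would rather test on vol_home itself, quota can be toggled per volume; if I remember 3.5 correctly, disabling quota drops the configured limits, so save them first and re-set them afterwards:

gluster volume quota vol_home list > /root/quota_limits.txt   # keep a copy of the limits
gluster volume quota vol_home disable                         # run the untar bench now
gluster volume quota vol_home enable
gluster volume quota vol_home limit-usage /baaden_team 20TB   # re-set each saved limit this way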
From: "Geoffrey
Letessier" <geoffrey.letessier@xxxxxxx>
To: "Krutika
Dhananjay" <kdhananj@xxxxxxxxxx>
Sent: Wednesday,
June 24, 2015
10:21:13 AM
Subject:
Re:
GlusterFS 3.5.3 -
untar: very poor
performance
Hi Krutika,
OK, thank you very much in advance.

Concerning the quota system, are you in touch with Vijaykumar? I have been waiting for an answer for a couple of days now, maybe more.

One more time, thank you. Have a nice day (in France it’s 6:50 AM).
Geoffrey
-----------------------------------------------
Geoffrey Letessier
IT manager & systems engineer
CNRS - UPR 9080 - Laboratoire de Biochimie Théorique
Institut de Biologie Physico-Chimique
13, rue Pierre et Marie Curie - 75005 Paris
Tel: 01 58 41 50 93 - eMail: geoffrey.letessier@xxxxxxx
Ok, so for anything related to replication, I can help you out. But for quota, it would be better to ask Vijaikumar Mallikarjuna or Raghavendra G on the mailing list. I used to work on quota a long time back, but I am not in touch with the component anymore and do not know the latest changes to it.

For the performance issue, I will try a linux kernel src untar on my machines and let you know what I find.
-Krutika
From: "Geoffrey Letessier" <geoffrey.letessier@xxxxxxx>
To: "Krutika Dhananjay" <kdhananj@xxxxxxxxxx>
Sent: Monday, June 22, 2015 9:00:52 PM
Subject: Re: GlusterFS 3.5.3 - untar: very poor performance
Hi Krutika,
Sorry for the delay, but I was in meetings all day.

Good to hear from you as well. :)
;-)
So you are seeing this bad performance only in 3.5.3? Any other releases you tried this test on, where the results were much better with replication?
Yes, but I’m not sure my issue concerns only this specific release. A few days ago, the untar process (with the same version of GlusterFS) took around 8 minutes; now it takes around 32. 8 minutes was already too much, but what about 32? :)
That said, my problem only concerns small files, because if I play with dd (or similar) on a single big file, all is OK (client write throughput: ~1GB/s => ~500MB/s into each replica).
If I run my bench on my distributed-only volume, I get good performance (untar: ~1m44s, etc.).
In addition, I don’t know if it is important, but I have some trouble with the GlusterFS group quota: there are a lot of conflicts between the quota size and the actual file size, which don’t match, and a lot of "quota xattrs not found" messages from the quota-verify glusterfs app. You can find an extract of the quota-verify output in the attachment.
If so, could you please let me know? Meanwhile, let me try the untar myself on my vms to see what could be causing the perf issue.
OK, thanks.
See you,
Geoffrey
------------------------------------------------------
Geoffrey Letessier
IT manager & systems engineer
UPR 9080 - CNRS - Laboratoire de Biochimie Théorique
Institut de Biologie Physico-Chimique
13, rue Pierre et Marie Curie - 75005 Paris
Tel: 01 58 41 50 93 - eMail: geoffrey.letessier@xxxxxxx
Hi Geoffrey,
Good to hear from you as well. :)

Ok, so you say disabling write-behind does not help. Makes me wonder what the problem could be.
So you are seeing this bad performance only in 3.5.3? Any other releases you tried this test on, where the results were much better with replication?

If so, could you please let me know? Meanwhile, let me try the untar myself on my vms to see what could be causing the perf issue.
-Krutika
From: "Geoffrey Letessier" <geoffrey.letessier@xxxxxxx>
To: "Krutika Dhananjay" <kdhananj@xxxxxxxxxx>
Sent: Monday, June 22, 2015 10:14:26 AM
Subject: Re: GlusterFS 3.5.3 - untar: very poor performance
Hi Krutika,
It’s good to read you again :) Here are my answers:
1- Could you remind me how to know whether self-heal is currently in progress? I don’t notice anything special at the mount point (except the /var/run/gluster/vol_home one) or any dedicated process, but maybe I’m looking in the wrong place (see my guess after point 2 below).
2- OK, I just disabled the write-behind parameter and reran the bench. I’ll let you know more when I get to my office (I’m still at home at the moment).
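About point 1, the only candidate I know of is the heal-info command; is that the right place to look? (a guess on my part, not something I’m sure of):

[root@lucifer ~]# gluster volume heal vol_home info               # entries still needing heal
[root@lucifer ~]# gluster volume heal vol_home info heal-failed   # heals that failed recently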
See you, and thank you for helping.
Geoffrey
-----------------------------------------------
Geoffrey Letessier
IT manager & systems engineer
CNRS - UPR 9080 - Laboratoire de Biochimie Théorique
Institut de Biologie Physico-Chimique
13, rue Pierre et Marie Curie - 75005 Paris
Tel: 01 58 41 50 93 - eMail: geoffrey.letessier@xxxxxxx
Hi Geoffrey,
1. Was self-heal also in progress while I/O was happening on the volume?
2. Also, there seem to be quite a few fsyncs, which could possibly have slowed things down a bit. Could you disable write-behind and get the time stats one more time, to rule out the possibility that write-behind’s out-of-order writes are increasing the number of fsyncs issued by the replication module?
-Krutika
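For reference, write-behind is an ordinary volume option, so the test just needs a set/reset around the bench (remember to turn it back on afterwards):

gluster volume set vol_home performance.write-behind off   # disable for the bench run
# ... rerun the untar bench and collect the time stats ...
gluster volume set vol_home performance.write-behind on    # restore the default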
From: "Geoffrey Letessier" <geoffrey.letessier@xxxxxxx>
To: gluster-users@xxxxxxxxxxx
Sent: Saturday, June 20, 2015 6:04:40 AM
Subject: Re: GlusterFS 3.5.3 - untar: very poor performance
Re,
For comparison, here is the output of the same script run on a distributed-only volume (2 of the 4 servers described previously, with 2 bricks each):
#######################################################
################ UNTAR time consumed ################
#######################################################
real    1m44.698s
user    0m8.891s
sys     0m8.353s

#######################################################
################# DU time consumed ##################
#######################################################
554M    linux-4.1-rc6
real    0m21.062s
user    0m0.100s
sys     0m1.040s

#######################################################
################# FIND time consumed ################
#######################################################
52663
real    0m21.325s
user    0m0.104s
sys     0m1.054s

#######################################################
################# GREP time consumed ################
#######################################################
7952
real    0m43.618s
user    0m0.922s
sys     0m3.626s

#######################################################
################# TAR time consumed #################
#######################################################
real    0m50.577s
user    0m29.745s
sys     0m4.086s

#######################################################
################# RM time consumed ##################
#######################################################
real    0m41.133s
user    0m0.171s
sys     0m2.522s
The performance is amazingly different!
Geoffrey
-----------------------------------------------
Geoffrey Letessier
IT manager & systems engineer
CNRS - UPR 9080 - Laboratoire de Biochimie Théorique
Institut de Biologie Physico-Chimique
13, rue Pierre et Marie Curie - 75005 Paris
Tel: 01 58 41 50 93 - eMail: geoffrey.letessier@xxxxxxx
Dear all,
I just noticed that on the main volume of my HPC cluster, my IO operations have become impressively poor. Doing some file operations on a compressed Linux kernel source archive (roughly 80MB, with 52,000 files inside), the untar operation alone can take more than half an hour, as you can read below:
#######################################################
################ UNTAR time consumed ################
#######################################################
real    32m42.967s
user    0m11.783s
sys     0m15.050s

#######################################################
################# DU time consumed ##################
#######################################################
557M    linux-4.1-rc6
real    0m25.060s
user    0m0.068s
sys     0m0.344s

#######################################################
################# FIND time consumed ################
#######################################################
52663
real    0m25.687s
user    0m0.084s
sys     0m0.387s

#######################################################
################# GREP time consumed ################
#######################################################
7952
real    2m15.890s
user    0m0.887s
sys     0m2.777s

#######################################################
################# TAR time consumed #################
#######################################################
real    1m5.551s
user    0m26.536s
sys     0m2.609s

#######################################################
################# RM time consumed ##################
#######################################################
real    2m51.485s
user    0m0.167s
sys     0m1.663s
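In shape, the attached mybench.sh is just a sequence of timed operations like the following simplified sketch (the working directory, tarball name and grep pattern here are illustrative, not the exact script):

#!/bin/bash
# Shape of the bench: time common small-file operations on the mounted volume.
cd /home/bench || exit 1                           # assumed test directory on the volume
time tar xJf linux-4.1-rc6.tar.xz                  # UNTAR
time du -sh linux-4.1-rc6                          # DU
time find linux-4.1-rc6 | wc -l                    # FIND (prints the file count)
time grep -r MODULE linux-4.1-rc6 | wc -l          # GREP (placeholder pattern)
time tar cf linux-4.1-rc6.tar linux-4.1-rc6        # TAR
time rm -rf linux-4.1-rc6 linux-4.1-rc6.tar        # RM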
For information, this volume is a distributed-replicated one, composed of 4 servers with 2 bricks each. Each brick is a 12-drive RAID6 vdisk with good native performance (around 1.2GB/s).
In comparison, when I use dd to generate a 100GB file on the same volume, my write throughput is around 1GB/s (client side) and 500MB/s (server side), because of replication:
Client side:
[root@node056 ~]# ifstat -i ib0
        ib0
 KB/s in  KB/s out
 3251.45  1.09e+06
 3139.80  1.05e+06
 3185.29  1.06e+06
 3293.84  1.09e+06
...
Server side:
[root@lucifer ~]# ifstat -i ib0
        ib0
 KB/s in  KB/s out
561818.1   1746.42
560020.3   1737.92
526337.1   1648.20
513972.7   1613.69
...
DD command:
[root@node056 ~]# dd if=/dev/zero of=/home/root/test.dd bs=1M count=100000
100000+0 records in
100000+0 records out
104857600000 bytes (105 GB) copied, 202.99 s, 517 MB/s
So this issue doesn’t seem to come from the network (which is InfiniBand in this case).
You can find a set of files in attachments:
- mybench.sh: the bench script
- benches.txt: output of my "bench"
- profile.txt: gluster volume profile during the "bench"
- vol_status.txt: gluster volume status
- vol_info.txt: gluster volume info
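(profile.txt was captured with the usual profile commands around the bench run, i.e. something like:

gluster volume profile vol_home start                # begin collecting per-brick stats
# ... run the bench ...
gluster volume profile vol_home info > profile.txt   # dump the collected stats
gluster volume profile vol_home stop
)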
Can someone help me fix this? It’s very critical, because this volume is on an HPC cluster in production.
Thanks in advance,
Geoffrey
-----------------------------------------------
Geoffrey Letessier
IT manager & systems engineer
CNRS - UPR 9080 - Laboratoire de Biochimie Théorique
Institut de Biologie Physico-Chimique
13, rue Pierre et Marie Curie - 75005 Paris
Tel: 01 58 41 50 93 - eMail: geoffrey.letessier@xxxxxxx
<benches.txt>
<mybench.sh>
<profile.txt>
<vol_info.txt>
<vol_status.txt>
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users