On 20/08/2016 9:28 PM, Pranith Kumar Karampuri wrote:
> Lindsay,
> Please do "gluster volume set <volname>
> data-self-heal-algorithm full" to prevent diff self-heals (checksum
> computations on the files), which use a lot of CPU, if not already set.
I'll give that a spin and see how it works out - toss up as to which is
a bigger resource problem, CPU or bandwidth :)
> One more thing that could have led to a lot of CPU usage is full directory
> heals on .shard. Krutika recently implemented a feature called
> granular entry self-heal, which should address this issue in future. We
> have a throttling feature coming along in future as well, to play nice
> with the rest of the system.
I already have "cluster.granular-entry-heal: on" and
"cluster.locking-scheme: granular" set, or are you saying that feature
has improvements yet to come?
Anyway, I'm not really looking at CPU hogging (well not much anyway :)),
rather I was trying to find why heals were not starting. With my first
test I had 25000 shards needing healing and nothing happened for over 3
hours until I shut down all VMs on the node and restarted it.
I did the same test yesterday:
- killed all gluster processes on a node
- waited until the heal-count rose to 1500
- restarted gluster on that node
- nothing happened for 45 minutes (heal-count stayed at 1500).
- I shut down all VMs on that node
- healing started within several minutes and completed in under half an
hour
Which leads me to wonder if having active local I/O on a gluster node
when you crash and restart the gluster processes (as opposed to
rebooting the node) blocks the heals from starting.
If so, not a huge issue for me - typically that will never happen as
gluster never actually crashes on me :) The most likely scenario is
rolling upgrades or hard reboots.
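For anyone wanting to repeat the test above, here is a rough sketch of how the heal-count could be watched while waiting for the heal to start. The function names are made up for illustration. It assumes "gluster volume heal <volname> statistics heal-count" prints one "Number of entries: N" line per brick, so check that against the output on your version first.

```shell
#!/bin/sh
# Sum the per-brick "Number of entries: N" lines read from stdin.
# Assumption: that is the line format the heal-count command prints.
# Non-numeric values are skipped rather than counted.
sum_heal_count() {
    awk -F': *' '
        /^Number of entries:/ && $2 ~ /^[0-9]+$/ { total += $2 }
        END { print total + 0 }
    '
}

# Hypothetical helper: print a timestamped total every INTERVAL seconds
# until the total reaches zero.
# Caveat: if the gluster command itself fails, the total also reads as
# zero and the loop stops.
watch_heal_count() {
    volname=$1
    interval=${2:-60}
    while :; do
        count=$(gluster volume heal "$volname" statistics heal-count | sum_heal_count)
        printf '%s heal-count=%s\n' "$(date '+%H:%M:%S')" "$count"
        [ "$count" -eq 0 ] && break
        sleep "$interval"
    done
}
```

For the volume below that would be run as "watch_heal_count datastore4 60". A total that sits at the same value for a long stretch is the stuck-heal behaviour described above.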
gluster v info
Volume Name: datastore4
Type: Replicate
Volume ID: 0ba131ef-311d-4bb1-be46-596e83b2f6ce
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: vnb.proxmox.softlog:/tank/vmdata/datastore4
Brick2: vng.proxmox.softlog:/tank/vmdata/datastore4
Brick3: vna.proxmox.softlog:/tank/vmdata/datastore4
Options Reconfigured:
cluster.locking-scheme: granular
cluster.granular-entry-heal: on
performance.readdir-ahead: on
cluster.self-heal-window-size: 1024
cluster.data-self-heal: on
features.shard: on
cluster.quorum-type: auto
cluster.server-quorum-type: server
nfs.disable: on
nfs.addr-namelookup: off
nfs.enable-ino32: off
performance.strict-write-ordering: off
performance.stat-prefetch: on
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
cluster.eager-lock: enable
network.remote-dio: enable
features.shard-block-size: 64MB
cluster.background-self-heal-count: 16
--
Lindsay Mathieson