Hi,

On Thu, 2012-02-23 at 11:56 -0500, Greg Mortensen wrote:
> Hi.
>
> I'm testing a two-node virtual-host CentOS 6.2 (2.6.32-220.4.2.el6.x86_64)
> GFS2 cluster running on the following hardware:
>
> Two physical hosts, running VMware ESXi 5.0.0
> EqualLogic PS6000XV iSCSI SAN
>
> I have exported a 200GB shared LUN that the virtual hosts have mounted
> as a Mapped Raw LUN (physical compatibility mode) using the LSI Logic
> Parallel adapter. The hosts are using clvmd.
>
> When I have not explicitly set any fence devices, the throughput is
> quite fast. Testing the nodes concurrently with bonnie++ (the
> sequential block testing is representative of what my real-life
> workload will be) shows that it's almost as fast as a "local" ext4
> device.
>
> When fence_scsi is used (my cluster.conf is included below), the
> throughput drops to 1/10th of the no fencing test. Is this normal?
> I've tried enabling SCSI debugging while the test was in progress,
> and nothing popped out at me. I have tried both manually setting
> the arguments to fence_scsi and allowing it to determine them on its
> own, with the same results. I also get the same results with one node
> brought down, and the other node mounting the filesystem
> with lock_nolock. The node network traffic (through a secondary
> interface) is minimal. I have tried different mount options (noatime)
> and schedulers (deadline works best), but they offer only modest
> performance gains.
>

That is very strange. I can't see why fence_scsi (or any other fence
device) would have an effect on throughput, since if the node is not
actually fenced, the path to the device should be the same. Was that the
only change between the two tests?

Steve.

> I don't know if the following is normal or worth mentioning, but I have
> seen that nodes will sometimes register their keys multiple times.
> Here node1 has done it (but node2 has done it before as well):
>
> # sg_persist --read-keys /dev/sdb
>   EQLOGIC   100E-00   5.2
>   Peripheral device type: disk
>   PR generation=0x1323d, 4 registered reservation keys follow:
>     0x131e0001
>     0x131e0001
>     0x131e0001
>     0x131e0002
>
> I'd be interested in hearing if anyone else has experienced poor
> throughput with fence_scsi, or if this is a result of my
> misconfiguration of cluster.conf. I had wanted to do in-band fencing to
> simplify my configuration, but I will consider out-of-band fencing
> (perhaps using VMware) if I can't resolve this issue.
>
> Thanks.
>
> Best regards,
> Greg
>
>
> No explicit fencing:
> Version 1.03        ------Sequential Output------ --Sequential Input- --Random-
>                     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
> Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
> node1        15744M 70741  82 130502  25 45511  10 53395  75 102164   4  1162   3
>                     ------Sequential Create------ --------Random Create--------
>                     -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
>               files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
>                  16  1806  28 +++++ +++ 12847  54  1755  31 +++++ +++ 12129  52
> node1,15744M,70741,82,130502,25,45511,10,53395,75,102164,4,1161.5,3,16,1806,28,+++++,+++,12847,54,1755,31,+++++,+++,12129,52
>
> Version 1.03        ------Sequential Output------ --Sequential Input- --Random-
>                     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
> Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
> node2        15744M 71529  82 122064  20 45717   7 62156  68 102410   5 892.6   2
>                     ------Sequential Create------ --------Random Create--------
>                     -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
>               files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
>                  16  1663  27 +++++ +++ 16188  59  1685  30 +++++ +++ 10400  52
> node2,15744M,71529,82,122064,20,45717,7,62156,68,102410,5,892.6,2,16,1663,27,+++++,+++,16188,59,1685,30,+++++,+++,10400,52
>
>
>
> With fence_scsi fencing:
> Version 1.03        ------Sequential Output------ --Sequential Input- --Random-
>                     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
> Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
> node1        15744M  9939  12 10372   2  5042   1 11753  17 12220   0 756.4   2
>                     ------Sequential Create------ --------Random Create--------
>                     -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
>               files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
>                  16   587  10 +++++ +++ 10646  44   597  12 +++++ +++ 11742  52
> node1,15744M,9939,12,10372,2,5042,1,11753,17,12220,0,756.4,2,16,587,10,+++++,+++,10646,44,597,12,+++++,+++,11742,52
>
> Version 1.03        ------Sequential Output------ --Sequential Input- --Random-
>                     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
> Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
> node2        15744M 10335  12 10740   1  4960   0 11761  13 12197   0 730.9   1
>                     ------Sequential Create------ --------Random Create--------
>                     -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
>               files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
>                  16   603  10 +++++ +++ 11602  49   610  11 +++++ +++ 12739  51
> node2,15744M,10335,12,10740,1,4960,0,11761,13,12197,0,730.9,1,16,603,10,+++++,+++,11602,49,610,11,+++++,+++,12739,51
>
>
>
> /etc/cluster/cluster.conf:
> <?xml version="1.0"?>
> <cluster name="centralcluster" config_version="2">
>
>   <cman two_node="1" expected_votes="1"/>
>   <clusternodes>
>     <clusternode name="node1" votes="1" nodeid="1">
>       <fence>
>         <method name="scsi">
>           <device name="scsi_dev" key="131e0001" action="off"/>
>         </method>
>       </fence>
>       <unfence>
>         <device name="scsi_dev" key="131e0001" action="on"/>
>       </unfence>
>     </clusternode>
>     <clusternode name="node2" votes="1" nodeid="2">
>       <fence>
>         <method name="scsi">
>           <device name="scsi_dev" key="131e0002" action="off"/>
>         </method>
>       </fence>
>       <unfence>
>         <device name="scsi_dev" key="131e0002" action="on"/>
>       </unfence>
>     </clusternode>
>   </clusternodes>
>
>   <fencedevices>
>     <fencedevice agent="fence_scsi" name="scsi_dev" devices="/dev/sdb"/>
>   </fencedevices>
>
>   <rm>
>     <failoverdomains/>
>     <resources/>
>   </rm>
>
>   <logging>
>     <logging_daemon name="corosync" debug="on"/>
>     <logging_daemon name="fenced" debug="on"/>
>     <logging_daemon name="dlm_controld" debug="on"/>
>     <logging_daemon name="gfs_controld" debug="on"/>
>   </logging>
>
> </cluster>
>
> --
> Linux-cluster mailing list
> Linux-cluster@xxxxxxxxxx
> https://www.redhat.com/mailman/listinfo/linux-cluster
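
As a rough way to put numbers on the drop between the two bonnie++ runs
quoted above, here is a minimal sketch in Python that parses the CSV summary
lines and prints the slowdown factor for the block-oriented tests. It assumes
the bonnie++ 1.03 CSV field order implied by the column headers in the quoted
output (machine, size, then K/sec and %CP pairs). The constant and function
names are illustrative only and are not part of bonnie++.

```python
# Field positions assume the bonnie++ 1.03 CSV layout implied by the quoted
# headers: machine,size,putc,%cp,put_block,%cp,rewrite,%cp,getc,%cp,
# get_block,%cp,seeks,%cp,...  (an assumption, not taken from bonnie++ docs)
BLOCK_WRITE, REWRITE, BLOCK_READ = 4, 6, 10

# CSV summary lines copied from the quoted bonnie++ output.
NO_FENCE = {
    "node1": "node1,15744M,70741,82,130502,25,45511,10,53395,75,102164,4,"
             "1161.5,3,16,1806,28,+++++,+++,12847,54,1755,31,+++++,+++,12129,52",
    "node2": "node2,15744M,71529,82,122064,20,45717,7,62156,68,102410,5,"
             "892.6,2,16,1663,27,+++++,+++,16188,59,1685,30,+++++,+++,10400,52",
}
FENCE_SCSI = {
    "node1": "node1,15744M,9939,12,10372,2,5042,1,11753,17,12220,0,"
             "756.4,2,16,587,10,+++++,+++,10646,44,597,12,+++++,+++,11742,52",
    "node2": "node2,15744M,10335,12,10740,1,4960,0,11761,13,12197,0,"
             "730.9,1,16,603,10,+++++,+++,11602,49,610,11,+++++,+++,12739,51",
}


def field(csv_line, index):
    """Return one numeric field (K/sec) from a bonnie++ CSV summary line."""
    return float(csv_line.split(",")[index])


def slowdown(before, after, index):
    """Ratio of throughput without fencing to throughput with fence_scsi."""
    return field(before, index) / field(after, index)


def report():
    """Build (node, test, factor) rows for the three block-oriented tests."""
    tests = (("block write", BLOCK_WRITE),
             ("rewrite", REWRITE),
             ("block read", BLOCK_READ))
    rows = []
    for node in sorted(NO_FENCE):
        for label, index in tests:
            factor = slowdown(NO_FENCE[node], FENCE_SCSI[node], index)
            rows.append((node, label, round(factor, 1)))
    return rows


if __name__ == "__main__":
    for node, label, factor in report():
        print(f"{node} {label}: {factor}x slower with fence_scsi")
```

Running the same comparison on results taken after changing a single setting
at a time would help confirm whether the fence configuration was the only
difference between the two runs.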