Hi,

On Mon, 2012-02-13 at 15:06 +0100, Laszlo Beres wrote:
> Hi Steven,
>
> On Mon, Feb 13, 2012 at 10:38 AM, Steven Whitehouse <swhiteho@xxxxxxxxxx> wrote:
>
> > I'd be very interested to know what has not worked for you. Please open
> > a ticket with our support team if you are a Red Hat customer. We are
> > keen to ensure that people don't run into such issues, but we'll need a
> > bit more information in order to investigate,
>
> Well, we're still in the investigation phase, so as long as I don't
> have direct evidence in my hand I don't want to bother Red Hat
> support. Let me briefly summarize our case.
>
It is still worth talking to our support team, since they may well be
able to suggest things to look into, or may have solved a similar
problem. They are there to assist even if you don't actually have a bug
as such to report.

> I set up a two-node cluster, both nodes on RHEL 5.7 x86_64 on HP DL380 G7
> (96 GB RAM, 4 Intel Xeon X5675 / 6 cores) with EMC VMAX storage (on
> dm-multipath). We launched our Tibco service in December which has
> been working so far without any issues. A few days ago our customers
> reported that the service is "slow" - as a system engineer I could not
> really do anything with such a statement, so our application guys
> deployed a small tool which sends messages to the message bus, and
> also measures the message delivery delay.
> Most of the time it's around 1-2 ms, but we surprisingly found
> very high values, which affect dependent applications:
>
> 2012-02-08 22:12:31,919 INFO [Main] Message sent in: 1ms
> 2012-02-08 22:12:33,974 ERROR [Main] Message sent in: 1053ms
> 2012-02-08 22:12:34,978 INFO [Main] Message sent in: 2ms
> 2012-02-08 22:12:35,980 INFO [Main] Message sent in: 1ms
>
> At the same time iostat shows high utilization:
>
> 2012-02-08 22:12:33 avg-cpu:   %user   %nice %system %iowait  %steal   %idle
> 2012-02-08 22:12:33             0,12    0,00    0,37    4,00    0,00   95,51
> 2012-02-08 22:12:33
> 2012-02-08 22:12:33 Device:   rrqm/s   wrqm/s      r/s      w/s    rMB/s    wMB/s avgrq-sz avgqu-sz    await    svctm    %util
> 2012-02-08 22:12:33 sdb         0,00     0,00     0,00     0,00     0,00     0,00     0,00     0,00     0,00     0,00     0,00
> 2012-02-08 22:12:33 sde         0,00     0,00     0,00     0,00     0,00     0,00     0,00     0,00     0,00     0,00     0,00
> 2012-02-08 22:12:33 sdh         0,00     0,00     0,00     0,00     0,00     0,00     0,00     0,00     0,00     0,00     0,00
> 2012-02-08 22:12:33 sdk         0,00     0,00     0,00     0,00     0,00     0,00     0,00     0,00     0,00     0,00     0,00
> 2012-02-08 22:12:33 sdn         0,00     0,00     0,00     0,00     0,00     0,00     0,00     0,00     0,00     0,00     0,00
> 2012-02-08 22:12:33 sdq         0,00     0,00     0,00     0,00     0,00     0,00     0,00     0,00     0,00     0,00     0,00
> 2012-02-08 22:12:33 sdt         0,00     0,00    56,00     0,00     0,34     0,00    12,43     0,12     2,21     2,20    12,30
> 2012-02-08 22:12:33 sdw         2,00    39,00   337,00    28,00     1,84     0,12    10,98     2,40     2,75     2,34    85,40
> 2012-02-08 22:12:33 dm-5        0,00     0,00   394,00   201,00     2,18     0,79    10,19     2,98     1,94     1,65    97,90
>
> From our service point of view these are off-peak times, fewer messages
> were directed to EMS.
>
> All unimportant services are disabled, no jobs are scheduled in system.
>
> Also the occurrence pattern is quite strange:
>
> 2012-02-10 03:01:24,333 ERROR [Main] Message sent in: 2306ms
> 2012-02-10 03:02:04,953 ERROR [Main] Message sent in: 1221ms
> 2012-02-10 03:11:29,725 ERROR [Main] Message sent in: 1096ms
> 2012-02-10 03:31:35,195 ERROR [Main] Message sent in: 1051ms
> 2012-02-10 03:36:36,943 ERROR [Main] Message sent in: 1263ms
> 2012-02-10 04:01:17,059 ERROR [Main] Message sent in: 1585ms
> 2012-02-10 04:01:41,953 ERROR [Main] Message sent in: 1790ms
> 2012-02-10 04:02:42,953 ERROR [Main] Message sent in: 1307ms
> 2012-02-10 04:06:03,181 ERROR [Main] Message sent in: 2305ms
> 2012-02-10 04:08:48,844 ERROR [Main] Message sent in: 1294ms
> 2012-02-10 04:12:52,282 ERROR [Main] Message sent in: 1350ms
> 2012-02-10 05:01:00,411 ERROR [Main] Message sent in: 1143ms
> 2012-02-10 06:01:01,550 ERROR [Main] Message sent in: 1291ms
> 2012-02-10 06:01:48,957 ERROR [Main] Message sent in: 1092ms
>
Do you have any backup scripts running and/or any other cron jobs which
might touch the GFS2 filesystem at certain times? That is usually the
first thing to look into,

Steve.

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
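
One quick way to check whether the slow sends line up with scheduled jobs is to count them by minute of the hour and compare the result with the minute fields of the crontab entries. The following is only a rough sketch in Python: the log line format is the one quoted above, while the function name `slow_sends_by_minute` and the 1000 ms threshold are assumptions made for illustration, not part of the measurement tool.

```python
import re
from collections import Counter

# Log format as quoted in this thread, e.g.
# "2012-02-10 03:01:24,333 ERROR [Main] Message sent in: 2306ms"
LINE_RE = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:(\d{2}):\d{2},\d{3} "
    r"(?:INFO|ERROR) \[Main\] Message sent in: (\d+)ms$"
)


def slow_sends_by_minute(lines, threshold_ms=1000):
    """Count sends at or above threshold_ms, keyed by minute of the hour.

    threshold_ms is an assumed cut-off, not a value from the tool itself.
    """
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and int(match.group(2)) >= threshold_ms:
            counts[int(match.group(1))] += 1
    return counts


# The slow sends reported for 2012-02-10 in the quoted message.
SAMPLE = """\
2012-02-10 03:01:24,333 ERROR [Main] Message sent in: 2306ms
2012-02-10 03:02:04,953 ERROR [Main] Message sent in: 1221ms
2012-02-10 03:11:29,725 ERROR [Main] Message sent in: 1096ms
2012-02-10 03:31:35,195 ERROR [Main] Message sent in: 1051ms
2012-02-10 03:36:36,943 ERROR [Main] Message sent in: 1263ms
2012-02-10 04:01:17,059 ERROR [Main] Message sent in: 1585ms
2012-02-10 04:01:41,953 ERROR [Main] Message sent in: 1790ms
2012-02-10 04:02:42,953 ERROR [Main] Message sent in: 1307ms
2012-02-10 04:06:03,181 ERROR [Main] Message sent in: 2305ms
2012-02-10 04:08:48,844 ERROR [Main] Message sent in: 1294ms
2012-02-10 04:12:52,282 ERROR [Main] Message sent in: 1350ms
2012-02-10 05:01:00,411 ERROR [Main] Message sent in: 1143ms
2012-02-10 06:01:01,550 ERROR [Main] Message sent in: 1291ms
2012-02-10 06:01:48,957 ERROR [Main] Message sent in: 1092ms
""".splitlines()

if __name__ == "__main__":
    for minute, count in sorted(slow_sends_by_minute(SAMPLE).items()):
        print(f"minute :{minute:02d}  {count} slow send(s)")
```

For the sample above, 6 of the 14 slow sends fall in minute :01 and 2 more in minute :02, so anything scheduled at or just after the top of the hour would be a reasonable first candidate to look at.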