> From: (Imed Chihi) <imed.chihi@xxxxxxxxx>
> Subject: Re: RHEL4 Sun Java Messaging Server deadlock (was:

Humm.. I have looked at my last message and I think that I assumed too
much.  I believe some parts could sound a bit cryptic.  Here are a few
comments which should shed some light on the reasoning behind my
"theories".

> Based on the above, I could suggest two theories to explain what's happening:
>
> 1. you have a Normal zone starvation
> Try to set vm.lower_zone_protection to something large enough like 100 MB:
> sysctl -w vm.lower_zone_protection=100
> If this theory is correct, then the setting should fix the issue.

On 32-bit platforms, and for historical reasons, physical memory is
divided into 3 "zones" (DMA, Normal and HighMem).  Zones are parts of
the physical memory which need to be managed in different ways.  In your
setup (32-bit with hugemem kernel), the Normal zone is 4GB in size.

The Normal zone is a bit special in the sense that some kernel
allocations can only take place in this zone: typically buffers
allocated for disk and network IO.  Under stress, the system can run out
of memory in this Normal zone alone.  It would then slow to a crawl and
come close to deadlock despite having plenty of free memory overall,
because free memory in the "wrong" zone does not help those allocations.

As the Normal zone can also take allocations for regular processes, we
can instruct the memory allocators to avoid filling it with allocations
that can be serviced elsewhere (the HighMem zone).  In your case, you
seem to have exhausted the Normal zone (0.5% free), hence the suggested
parameter.

> 2. you have a pagecache flushing storm
> A huge size of dirty pages from the IO of large data sets would stall
> the system while being sync'ed to disk.  This typically occurs once
> the pagecache size has grown to significant sizes.  Mounting the
> filesystem in sync mode (mount -oremount,sync /dev/device) would "fix"
> the issue.
> However, synchronous IO is painfully slow, but the test
> would at least tell where the problem is.  If this turns out to be the
> problem, then we could think of other less annoying options for a
> bearable fix.

The Linux virtual memory manager caches file system IO as long as there
is free memory.  This cache goes into the pagecache, which is a set of
memory pages dynamically sized to accommodate the requirements.

When there is "memory pressure", that is, a situation where memory in at
least one zone becomes seriously scarce, the VM tries to free pages
aggressively.  Typically, this translates into behaviours like direct
reclaim (freeing memory while a request to allocate memory is waiting),
scanning all pages in a zone repeatedly, etc.

This "aggressiveness" can saturate the IO subsystem for quite some time
while the VM flushes a very large pagecache to disk in the hope of
freeing memory.  With a system like yours this pagecache could be
something like 10 or 20GB, so writing that much data to the disks would
stall the system for quite a long time.

The suggested sync mount removes the biggest cause of filling the
pagecache with dirty pages by forcing synchronous IO, which avoids
storms of very large writes to the disk.

I hope this is less confusing.

 -Imed

-- 
Imed Chihi
http://perso.hexabyte.tn/ichihi/
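
For anyone wanting to watch both theories while the load is running, here
is a rough sketch, not a definitive diagnostic.  It assumes the kernel
exposes LowTotal/LowFree (only present on 32-bit highmem kernels) and
Dirty/Writeback in /proc/meminfo.  Note that LowFree covers the DMA and
Normal zones together, so it is only an approximation of the Normal zone
figure discussed above.  The function name mem_pressure_report is made
up for this example.

```shell
#!/bin/sh
# Sketch: summarise low-memory headroom and dirty pagecache from a
# meminfo-style file (defaults to /proc/meminfo).
# mem_pressure_report is a hypothetical helper name, not a standard tool.
# LowTotal/LowFree only exist on 32-bit highmem kernels; when they are
# absent the low-memory line is reported as "n/a".
mem_pressure_report() {
    awk '
        $1 == "LowTotal:"  { low_total = $2 }
        $1 == "LowFree:"   { low_free  = $2 }
        $1 == "Dirty:"     { dirty     = $2 }
        $1 == "Writeback:" { writeback = $2 }
        END {
            if (low_total > 0)
                printf "LowFree: %.1f%% of %d kB\n", 100 * low_free / low_total, low_total
            else
                print "LowFree: n/a"
            printf "Dirty: %d kB\n", dirty
            printf "Writeback: %d kB\n", writeback
        }
    ' "${1:-/proc/meminfo}"
}
```

Called with no argument, mem_pressure_report reads /proc/meminfo.  A
LowFree percentage that stays near zero while the system stalls would
point towards theory 1; a Dirty figure in the gigabytes that drains
slowly would point towards theory 2.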