
COSS causing squid Segment Violation on FreeBSD 6.2S

Hi,
I'm just in the process of putting a small percentage of our web requests through 3 new caches to test them. However, I'm encountering SEGVs, seemingly due to COSS. Two of the caches ran for about a day and then failed, e.g.:

2007/04/26 10:39:26| storeCossCompletePendingReloc: got failure (-1)
FATAL: Received Segment Violation...dying.

When the caches restart, they read the COSS dir, and as soon as they finish reading it they die with the same error again. Both now seem to be stuck in this restart loop forever. However, the third cache is still running happily (perhaps just luck?).
  The COSS drives were completely blanked before use.
They are all configured the same: Dell PowerEdge 2650, PERC 4/DC RAID controller, 4GB RAM, 2x3.2GHz Xeon, 5x72GB 15Krpm drives:

2x72GB	RAID 1	OS and everything except cache_dir
1x72GB	JBOD	COSS
1x72GB	JBOD	aufs
1x72GB	JBOD	aufs

Very recent 32-bit FreeBSD 6.2-STABLE #94: Fri Apr 20 11:22:18 BST 2007.
Using the FreeBSD squid port, which is currently 2.6-STABLE12. As both COSS and aufs are specified, I believe both use the internal AIO code in squid, so the FreeBSD VFS_AIO module should not be required. Is that right?

cache_dir coss /dev/amrd1 65000 max-size=16384 block-size=4096
cache_dir aufs /2 56000 16 256
cache_dir aufs /3 56000 16 256
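As a sanity check on the COSS line, here is a minimal sketch of the size ceiling that block-size implies. It assumes, as I understand the squid.conf notes on block-size, that COSS uses 24-bit file numbers as block numbers; the function and variable names are my own for illustration.

```python
# Sketch: upper bound on a COSS cache_dir size implied by block-size.
# Assumption: COSS addresses blocks with 24-bit file numbers, so the
# largest usable partition is block_size << 24 bytes.
FILE_NUMBER_BITS = 24


def coss_max_size_mb(block_size_bytes):
    """Largest COSS cache_dir size in MB for a given block-size."""
    return (block_size_bytes << FILE_NUMBER_BITS) // (1024 * 1024)


configured_mb = 65000                # size from the cache_dir coss line
limit_mb = coss_max_size_mb(4096)    # block-size=4096 from the same line

print("limit (MB):", limit_mb)
print("configured size fits:", configured_mb <= limit_mb)
```

If that assumption holds, 65000 MB sits just under the ceiling for block-size=4096, so the configured size by itself should not be the problem.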

I switched squid to the libthr library using /etc/libmap.conf:

# ldd /usr/local/sbin/squid
/usr/local/sbin/squid:
        libcrypt.so.3 => /lib/libcrypt.so.3 (0x88158000)
        libm.so.4 => /lib/libm.so.4 (0x88171000)
        libpthread.so.2 => /usr/lib/libthr.so.2 (0x88187000)
        libc.so.6 => /lib/libc.so.6 (0x8819a000)

I always map to libthr, as in the past it has been more stable: libpthread causes crashes with MySQL, whereas libthr is OK.
  A quick look at the squid.core gives (I'm no debugging expert :( ):

(gdb) where
#0  0x88264bb7 in memset () from /lib/libc.so.6
#1  0x00004144 in ?? ()
#2  0x8826194e in calloc () from /lib/libc.so.6
#3  0x080fa334 in xcalloc (n=28, sz=2284278208) at util.c:561
#4  0x080e7a7d in storeCossDirWriteCleanStart (sd=0x82a4000) at coss/store_dir_coss.c:409
#5  0x080c94b5 in storeDirWriteCleanLogs (reopen=0) at store_dir.c:426
#6  0x080cc4dc in death (sig=0) at tools.c:314
#7  0xbfbfff94 in ?? ()
#8  0x0000000b in ?? ()
#9  0x0000000c in ?? ()
#10 0xbfbfe610 in ?? ()
#11 0x08047ffc in ?? ()
#12 0x080cc495 in uniqueHostname () at tools.c:556
#13 0x080e729f in aioCheckCallbacks (SD=0x82a40b0) at aufs/async_io.c:319
#14 0x080c97b2 in storeDirCallback () at store_dir.c:508
#15 0x0807b710 in comm_select (msec=10) at comm_generic.c:377
#16 0x080a568a in main (argc=2, argv=0xbfbfeea4) at main.c:837
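One thing that stands out in frame #3 is the size passed to xcalloc. The following is only a rough arithmetic check of the n and sz values exactly as gdb printed them; gdb may be misreporting the arguments, so treat this as a hint rather than a diagnosis.

```python
# Sketch: how large is the xcalloc request shown in frame #3?
# Values are copied from the gdb output; they may not be the real arguments.
n = 28
sz = 2284278208

requested = n * sz          # total bytes calloc would need to zero
addr_space = 2 ** 32        # upper bound for a 32-bit process

print("requested bytes:", requested)
print("exceeds 32-bit address space:", requested > addr_space)
print("sz in hex:", hex(sz))
```

If those values are real, the request is far beyond what a 32-bit process can address, which would suggest a corrupt count reaching storeCossDirWriteCleanStart rather than a genuine allocation.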

I just tried it with libpthread and got the same error once it had read the COSS dir. The backtrace gives:

(gdb) where
#0  0x881a5abf in pthread_testcancel () from /lib/libpthread.so.2
#1  0x8819df3b in pthread_mutexattr_init () from /lib/libpthread.so.2
#2  0x882a3450 in ?? ()

Any ideas? Or is there more info I can gather to help nail this down?
  Many thanks for any pointers.

--
Mark Powell - UNIX System Administrator - The University of Salford
Information Services Division, Clifford Whitworth Building,
Salford University, Manchester, M5 4WT, UK.
Tel: +44 161 295 4837  Fax: +44 161 295 5888  www.pgp.com for PGP key
