On Tue, Mar 30, 2010 at 04:16, Mike Williams <mike.williams@xxxxxxxxxx> wrote:
> Thanks Alex, good to know I've not screwed up the kernel somehow.
>
> I've been using 2.6.32 with grsecurity-2.1.14-2.6.32.9-201002231820 applied.

Looks like the first instance I had of this problem was with 2.6.31.1-rc1-grsec. I know I tried 2.6.32-grsec and various 2.6.32.X-grsec kernels, but all of those hit this issue at some point. Currently I'm on a mostly stock 2.6.33.1 with no problems. I have not had the nerve to try a -grsec kernel on it again.

For reference, here are the errors I got:

    could not open segment 3 of relation base/4440720/8003730

    COPY public.page_loads (cgi, content_length, date_created, defunct,
        host, ip, page_load_id, protocol, proxy_ip, referrer,
        request_method, sessionid, url, user_id, audit_tid,
        user_agent_id, action, server) TO '/tmp/blah.sql';
    ERROR: invalid memory alloc request size 18446744073709551613

There were more "could not open segment" errors, but I seem to have lost them.

Normally I would think the above is corrupt data, but it would sometimes work. It *always* worked on the non-grsec kernel. So instead it smells like bad RAM. Well, it's got ECC RAM and survived multiple runs of memtest and various versions of memtest86+. [ Yeah, I know people, including me, have seen RAM that passes all that and is still bad. ]

Since you are having similar problems with a -grsec kernel, it sounds like there might be some kind of memory corruption bug in it. I would recommend trying a stock kernel and seeing if the problem goes away. I also think the general attitude here is that if you run crazy security patches, you get to keep both pieces. :)

Another fact that seemed to point to bad RAM or some kind of kernel corruption was what happened when I tried to find the bad row COPY reported above:

    SELECT count(*) from (select * from page_loads order by page_load_id desc limit 937980) as foo;
    ERROR: could not open segment 3 of relation base/4440720/8003730 (target block 4680336): No such file or directory

    SELECT count(*) from (select * from page_loads order by page_load_id desc limit 937970) as foo;
     count
    --------
     937970

    <70-79 snipped, all worked>

    SELECT count(*) from (select * from page_loads order by page_load_id desc limit 937979) as foo;
     count
    --------
     937979

    -- Uhh this was just broken...
    SELECT count(*) from (select * from page_loads order by page_load_id desc limit 937980) as foo;
     count
    --------
     937980

I thought I had some stack traces, but they are not in my notes. I do remember tracing through them and coming to the conclusion that it's most likely some kind of kernel bug. (That error can only happen if we try to open a file that does not exist, but we can only get that far if the file existed, or some such.) Sorry, I'm a bit hazy; this was back in September.

-- 
Sent via pgsql-admin mailing list (pgsql-admin@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin