Csaba Nagy wrote:
Richard, Martijn,
Thanks for answering, but I had to kill the process in the meantime. I
tried kill -11 in the hope it would produce a core dump at least, but
it either didn't dump core or I don't know where to look for it, as I
can't find it.
In any case, this is the second time I have experienced such a lock-up
this week, so I will definitely need to find out what's going on.
I would exclude hardware failure, as it happened with exactly the same
process, involving exactly the same queries/table and the same failure
symptoms, which is not characteristic of hardware failures (those should
be more random).
So, in order to find out what's going on, what should I do if it happens
again? Use gdb, and do what?
Strace is a good idea, I'll do that too if there is a next time.
Well, I've had time to read your previous message too.
The first time, you seemed to imply the machine slowed down across all
processes - ssh etc. Was that the case this time?
When you say "locked", do you mean it was waiting on locks, using all
the CPU, unresponsive, or just taking a long time over the query?
To prepare for next time I'd:
1. Leave ssh logged in and run screen to get three sessions
2. Leave "top" running in the first - that'll show you process
activity/general load
3. Run "vmstat 10" in the second - that'll show you overall
memory/swap/disk/cpu usage.
4. The third session is then free to work in, if neither of the first
two shows anything useful.
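
If you would rather not rely on the screen scrollback, something along
these lines could keep timestamped copies of the same output on disk, so
there is a record to look at even if you have to kill things again. It is
only a sketch: the log directory, the file names and the "stamp" and
"start_monitors" helpers are made up for illustration, and it assumes a
procps-style top that accepts -b (batch mode) and -d (delay).

```shell
#!/bin/sh
# Sketch only: LOGDIR, the log file names and both helper functions are
# illustrative assumptions, not anything PostgreSQL provides.
LOGDIR="${LOGDIR:-/tmp/lockup-logs}"

# Prefix every line read from stdin with the current date and time.
stamp() {
    while IFS= read -r line; do
        printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$line"
    done
}

# Start vmstat and top in the background, sampling every 10 seconds and
# appending timestamped output to files under LOGDIR.
start_monitors() {
    mkdir -p "$LOGDIR"
    vmstat 10 | stamp >> "$LOGDIR/vmstat.log" &
    top -b -d 10 | stamp >> "$LOGDIR/top.log" &
}

# Nothing is started unless the script is called with "start".
if [ "${1:-}" = "start" ]; then
    start_monitors
fi
```

Run it with "start" from inside one of the screen sessions and leave it
going. The timestamps should make it easier to line up the moment of the
lock-up with whatever the load and swap figures were doing at the time.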
--
Richard Huxton
Archonet Ltd