On Thu, Aug 19, 2010 at 1:20 PM, Scott Marlowe <scott.marlowe@xxxxxxxxx> wrote:
SNIP
It's almost certainly not ruby's fault. Have they done anything
strange like kill the instance and restart it without letting the db
shut down? I'd tend to suspect Amazon's fsyncing is amiss and they
did something that triggered it.
--
To understand recursion, one must first understand recursion.
They haven't done anything like that, as far as we know. However, they do have a process that kills off all waiting (and only waiting) postgres processes whenever there are more than 1000 locks. Could that be an issue?
If Amazon's fsyncing is the problem and they're doing something to trigger it, how would we go about debugging that?
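
For reference, here is a minimal sketch of the selection logic a lock-watching process like the one described above might use. It is not their actual script. It assumes "waiting" means a backend holding at least one ungranted entry in pg_locks, and that the rows have already been fetched with something like "SELECT pid, granted FROM pg_locks". The names LockRow and pids_to_kill are made up for illustration, and how the chosen PIDs are then signalled (pg_cancel_backend(), a plain kill, or kill -9) is not stated here and is left out.

```python
from typing import Iterable, List, NamedTuple


class LockRow(NamedTuple):
    """One row of pg_locks, reduced to the two columns used here."""
    pid: int        # backend process ID holding or awaiting the lock
    granted: bool   # False means the backend is still waiting for it


def pids_to_kill(rows: Iterable[LockRow], max_locks: int = 1000) -> List[int]:
    """Return the PIDs of waiting backends, but only once the total
    number of lock entries exceeds max_locks (assumed threshold)."""
    rows = list(rows)
    if len(rows) <= max_locks:
        return []  # under the threshold: leave everything alone
    # Only backends with at least one ungranted lock count as waiting.
    waiting = {row.pid for row in rows if not row.granted}
    return sorted(waiting)


if __name__ == "__main__":
    # 600 granted locks from pid 101, 401 ungranted requests from pid 202.
    sample = [LockRow(101, True)] * 600 + [LockRow(202, False)] * 401
    print(pids_to_kill(sample))
```

Which signal the real process sends to those PIDs is probably the part worth finding out, since a cancel request and a hard kill are handled very differently by the server.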