On 2/23/2019 16:13, Peter Geoghegan wrote:
On Sat, Feb 23, 2019 at 1:06 PM Gunther <raj@xxxxxxxx> wrote:
I thought to keep my index tight, I would define it like this:
CREATE UNIQUE INDEX Queue_idx_pending ON Queue(jobId) WHERE pending;
so that only pending jobs are in that index.
When a job is done, follow up work is often inserted into the Queue as pending, thus adding to that index.
How many distinct jobIds are there in play, roughly? Would you say
that there are many fewer distinct Jobs than distinct entries in the
index/table? Is the number of jobs fixed at a fairly low number, that
doesn't really grow as the workload needs to scale up?
Jobs start on another, external queue, there were about 200,000 of them
waiting when I started the run.
When the SQL Queue is empty, the workers pick one job from the external
queue and add it to the SQL queue.
When that happens immediately 2 more jobs are created on that queue.
Let's cal it phase 1 a and b
When phase 1 a has been worked off, another follow-up job is created.
Let' s call it phase 2.
When phase 2 has been worked off, a final phase 3 job is created.
When that is worked off, nothing new is created, and the next item is
pulled from the external queue and added to the SQL queue.
So this means, each of the 200,000 items add (up to) 4 jobs onto the
queue during their processing.
But since these 200,000 items are on an external queue, the SQL queue
itself is not stuffed full at all. It only slowly grows, and on the main
index where we have only the pending jobs, there are only probably than
20 at any given point in time. When I said 7 jobs per second, it meant
7/s simultaneously for all these 3+1 phases, i.e., 28 jobs per second.
And at that rate it takes little less than 30 min for the index to
deteriorate. I.e. once about 50,000 queue entries have been processed
through that index it has deteriorated to become nearly unusable until
it is rebuilt.
thanks,
-Gunther