On Wed, 6 Jul 2022 at 11:55, Florents Tselai <florents.tselai@xxxxxxxxx> wrote:
> Also, fwiw looking at top the CPU% and MEM% activity, looks like it does data crunching work.
...
> >> On 06.07.22 10:42, Florents Tselai wrote:
> >>> I have a beefy server (40+ worker processes, 40GB+ shared buffers) and a table holding (key text, text text) of around 50M rows.
> >>> These are text fields extracted from 4-5 page pdfs each.

How big is your table? From your query it seems you expect rows longer than 1M-1 characters ( the left(...) ), but if you have very big text columns it may be spending a lot of time fully decompressing / reading them ( I'm not sure whether left(...) on toasted values is optimized to stop after reading enough ).

Also, it has to rewrite a lot of data to insert the columns. If it takes some milliseconds per row, which I would not rule out: 50M rows * 1 ms/row = 50k secs ~= 13.9 hours per millisecond-per-row, so at 2 ms/row ( which may be right for reading a big row, calculating the vector and writing an even bigger row ) it would take more than a day to finish, which I would not rule out given you are asking for a heavy thing.

If you have stopped it, I would try doing a 1000-row sample in a copied table to get a speed idea. Otherwise, with this query, I would normally monitor disk usage of the database files as an indication of progress; I'm not sure there is anything else you could look at without disturbing it.

FWIW, I would consider high memory usage normal in this kind of query. High CPU would depend on what you call it, but it wouldn't surprise me if it has at least one CPU running at full speed detoasting and computing vectors. I do not know if ALTER TABLE can go parallel.

Francisco Olarte.
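For what it's worth, the back-of-envelope arithmetic above, and the extrapolation from a 1000-row sample, can be sketched as follows (the 1 ms and 2 ms per-row costs are assumptions, not measurements):

```python
# Back-of-envelope runtime estimate for a full-table rewrite of 50M rows,
# following the reasoning above. Per-row costs are assumed, not measured.
ROWS = 50_000_000

def total_hours(ms_per_row: float, rows: int = ROWS) -> float:
    """Wall-clock hours if each row costs ms_per_row milliseconds."""
    return rows * ms_per_row / 1000 / 3600

def extrapolate_hours(sample_seconds: float, sample_rows: int = 1000,
                      total_rows: int = ROWS) -> float:
    """Estimate total hours from timing a small sample in a copied table."""
    return sample_seconds / sample_rows * total_rows / 3600

print(f"{total_hours(1.0):.1f} h at 1 ms/row")   # ~13.9 h
print(f"{total_hours(2.0):.1f} h at 2 ms/row")   # ~27.8 h, more than a day
# e.g. if a 1000-row sample took 1.5 s, the whole table would take:
print(f"{extrapolate_hours(1.5):.1f} h")         # ~20.8 h
```

So timing even a small copied-table sample gives a usable prediction before committing to the full rewrite.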