Re: very poor performance of rock cache ipc

On 2023-10-16 16:24, Julian Taylor wrote:
On 15.10.23 05:42, Alex Rousskov wrote:
On 2023-10-14 12:04, Julian Taylor wrote:
On 14.10.23 17:40, Alex Rousskov wrote:
On 2023-10-13 16:01, Julian Taylor wrote:


The reproducer uses a single request, but the very same thing can be observed on a very busy Squid

If a busy Squid sends lots of IPC messages between worker and disker, then either there is a Squid bug we do not know about OR that disker is just not as busy as one might expect it to be.

In Squid v6+, you can observe disker queues using the mgr:store_queues cache manager report. In your environment, do those queues always have lots of requests when Squid is busy? Feel free to share (a pointer to) a representative sample of those reports from your busy Squid.

N.B. Besides worker-disker IPC messages, there are also worker-worker cache synchronization IPC messages. They also have the same "do not send IPC messages if the queue has some pending items already" optimization.

I checked the queues while running the configuration from my initial mail with the number of workers increased. The queues are generally short, around 1-10 items, when sending around 100 parallel requests reading data files of about 100 MB. Here is a sample: https://dpaste.com/8SLNRW5F8. Also, at this higher request rate (compared with the single curl), throughput for most of the work was more than doubled by increasing the block size.

What are the queues supposed to look like on a busy Squid that is not spending a large portion of its time doing notify IPC?

The queues are supposed to look "not empty" -- a non-empty queue does not result in IPC notifications. Needless to say, the further away from "empty" the queues are, the lower the chance they will become empty when the cache manager report is _not_ "looking" at them.
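
Editorial illustration: a toy model of the "do not send IPC messages if the queue has some pending items already" rule described above. This is a simplified sketch, not Squid's queue code; the class and function names are hypothetical. It counts how many notifications a pusher would send when requests arrive in a burst versus strictly one at a time.

```python
from collections import deque


class NotifyOnEmptyQueue:
    """Toy model: the pusher signals the reader only if the queue was empty."""

    def __init__(self):
        self.items = deque()
        self.notifications = 0  # number of IPC-style wakeups "sent"

    def push(self, item):
        was_empty = not self.items
        self.items.append(item)
        if was_empty:
            # The reader may be idle, so it needs a wakeup.
            self.notifications += 1

    def pop(self):
        return self.items.popleft()


def burst(n):
    """Push n requests before the reader pops any of them."""
    q = NotifyOnEmptyQueue()
    for i in range(n):
        q.push(i)
    while q.items:
        q.pop()
    return q.notifications


def one_at_a_time(n):
    """Push one request, let the reader drain it, then push the next."""
    q = NotifyOnEmptyQueue()
    for i in range(n):
        q.push(i)
        q.pop()
    return q.notifications


print(burst(10))          # 1
print(one_at_a_time(10))  # 10
```

In this model, a queue that never drains between pushes costs one notification in total, while a queue that is empty before every push costs one notification per request.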


Increasing the number of parallel requests does decrease the overhead, but it is still pretty large: I measured about 10%-30% CPU overhead in the worker and disker with 100 parallel requests served from cache.
Here is a snippet of a profile:
--22.34%--JobDialer<AsyncJob>::dial(AsyncCall&)
    |
    |--21.19%--Ipc::UdsSender::start()
    |       |
    |        --21.13%--Ipc::UdsSender::write()
    |           |
    |           |--16.12%--Ipc::UdsOp::conn()
    |           |          |
    |           |           --15.84%--comm_open_uds(int, int, sockaddr_un*, int)
    |           |                |--1.70%--commSetCloseOnExec(int)
    |           |                 --1.56%--commSetNonBlocking(int)
    ...
--12.98%--comm_close_complete(int)

Clearing and constructing the large Ipc::TypedMsgHdr is also very noticeable.

That the overhead is so high and the maximum throughput so low for not-so-busy Squids (say 1-10 requests per second, but with requests averaging more than 1 MiB) is IMO also a reason for concern and could be improved.

I agree.


If I understand correctly how it works: when the worker gets a request, it splits it into 4k blocks and enqueues read requests into the IPC queue, and if the queue is empty, it emits a notify IPC so that the disker starts popping from the queue.

Yes, at some level of abstraction, the above summary is not wrong. However, please keep in mind that, for a single HTTP transaction, most of the disk read requests are queued by worker, read by disk, and received by worker one read request at a time. There is no disk read "prefetching" (yet?).


On large requests that are answered immediately from the disker the problem seems to be that the queue is mostly empty and it sends an ipc ping pong for each 4k block.

Due to lack of prefetching, the total size of the HTTP response does not really affect the queue length. Only the transaction concurrency level does; on average, that is determined by mean response time multiplied by the I/O request rate from a particular worker to a particular disker.
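
Editorial illustration: the relationship above (concurrency level is roughly mean response time multiplied by I/O request rate) can be worked through with a few numbers. The rates and response times below are made-up assumptions chosen only to show the arithmetic; they are not measurements from this thread.

```python
def mean_outstanding_requests(io_rate_per_sec, mean_response_time_sec):
    """Average number of I/O requests in flight between one worker and one disker."""
    return io_rate_per_sec * mean_response_time_sec


# Hypothetical fast disker: 2,000 reads/s, each answered in 0.1 ms.
fast = mean_outstanding_requests(2_000, 0.0001)

# Hypothetical slow disker: 20,000 reads/s, each answered in 1 ms.
slow = mean_outstanding_requests(20_000, 0.001)

print(f"fast disker: about {fast:.1f} requests in flight")
print(f"slow disker: about {slow:.1f} requests in flight")
```

Under these assumptions, the fast case averages well under one request in flight, so the queue is usually empty and most pushes would trigger a notification regardless of how large each HTTP response is.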


So my thought was: when the request is larger than 4k, enqueue multiple pending reads in the worker and only notify after a certain number have been added to the queue, and vice versa in the disker.

So I messed around a bit trying to reduce the notifications by delaying the Notify call in src/DiskIO/IpcIo/IpcIoFile.cc for larger requests, but it ended up blocking after the first queue push with no notify. If I understand the queue correctly, this is because the reader requires a notify to start initially, so simply pushing multiple read requests onto the queue without notifying will not work

You are correct.


Is this approach feasible or am I misunderstanding how it works?

Prefetching is feasible in principle, but is not easy to implement well and will probably require configuration options (because it will slow down busy Squids that do not have the time to prefetch but may not know that).

I would consider increasing I/O size (and shared memory page size) instead, at least as the first step. Doing so well is not trivial either, but may be easier and beneficial to more use cases.
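
Editorial illustration: a back-of-the-envelope count of disk read requests per response at different I/O sizes. It assumes the 4k block size and roughly 100 MB objects mentioned earlier in this thread (taken here as 100 MiB), and that reads are issued one at a time per transaction. The larger sizes are arbitrary examples, not known Squid settings.

```python
KIB = 1024
MIB = 1024 * KIB


def read_requests(response_bytes, io_size_bytes):
    """Number of sequential read requests needed to fetch one response."""
    return -(-response_bytes // io_size_bytes)  # ceiling division


response = 100 * MIB  # assumed object size
for io_size in (4 * KIB, 16 * KIB, 32 * KIB):
    print(f"{io_size // KIB:>2} KiB I/O size: {read_requests(response, io_size)} reads")
```

If each read on a mostly empty queue costs a notification in each direction, the number of IPC messages per response shrinks in proportion to the I/O size.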


I also tried to reuse the IPC connection between calls so that the major source of overhead, tearing down and reestablishing the connection, is removed. That also turned out to be difficult because the connections are closed in various places and because of the general complexity of the code.

Yes, that would be nice. Reusing sockets is especially difficult to get right with startup/bootstrapping, reconfigurations, and kid death/restarts problems in mind. On the other hand, it is probably much easier to optimize this than to implement disk hit "prefetching".
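
Editorial illustration: the difference between opening a socket per notification and reusing one socket, shown outside of Squid. This sketch assumes datagram Unix domain sockets and uses hypothetical function names and a temporary socket path. It does not address the startup, reconfiguration, or kid restart issues mentioned above.

```python
import os
import socket
import tempfile


def send_opening_each_time(path, messages):
    """Open, use, and close a fresh socket for every message."""
    opened = 0
    for msg in messages:
        with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
            opened += 1
            sock.sendto(msg, path)
    return opened


def send_reusing_one_socket(path, messages):
    """Open one socket, connect it once, and send every message through it."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.connect(path)
        for msg in messages:
            sock.send(msg)
    return 1


def demo(count=5):
    messages = [b"notify"] * count
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "receiver.sock")  # hypothetical socket name
        with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as receiver:
            receiver.bind(path)
            opened_each = send_opening_each_time(path, messages)
            got_each = [receiver.recv(64) for _ in messages]
            opened_reuse = send_reusing_one_socket(path, messages)
            got_reuse = [receiver.recv(64) for _ in messages]
    return opened_each, opened_reuse, len(got_each), len(got_reuse)


if __name__ == "__main__":
    print(demo())
```

Both variants deliver the same messages; the second avoids the per-message socket creation and close that dominate the profile quoted earlier.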

There may be some other, more efficient IPC notification mechanisms available on your OS that Squid can be enhanced to support. I have not surveyed what is available these days.


HTH,

Alex.

_______________________________________________
squid-users mailing list
squid-users@xxxxxxxxxxxxxxxxxxxxx
https://lists.squid-cache.org/listinfo/squid-users



