Yes it will need some changes but I don't think they are big changes. I
think the functions to decode/encode already exist. We just need to
move encoding/decoding into tasks and run them as synctasks.
I was also thinking about sleeping fops. Currently, when they are resumed, they are processed in the same thread that was processing another fop. This can add latency to fops or unnecessary delays in lock management. If they could be scheduled to run on another thread, these delays would be drastically reduced.
On the other hand, splitting the computation of EC encoding across multiple threads is bad because the current implementation takes advantage of the CPU's internal memory cache, which is really fast. We should compute all fragments of a single request in the same thread. Multiple independent computations could be executed by different threads.
Xavi,
A long time back we chatted a bit about the synctask code and you wanted
the scheduling to be done by the kernel or something. Apart from that, do you
see any other issues? At least if the tasks are synchronous, i.e. nothing
goes out on the wire, task scheduling = thread scheduling by the kernel and it
works exactly like the thread pool you were referring to. It does
multi-tasking only if the tasks are asynchronous in nature.
How would this work? Would we have to create a new synctask for each background function we want to execute? I think this has significant overhead, since each synctask requires its own stack, creates a frame that we don't really need in most cases, and causes context switches.
Yes, we will have to create a synctask. Yes, it does have the overhead of its own stack because it assumes the task will pause at some point. I think when the synctask framework was written, the smallest thing it was expected to execute was a fop over the network. It was mainly written to do replace-brick using the 'pump' xlator, which is now deprecated. If we know upfront that the task will never pause, there is absolutely no need to create a new stack. In that case it just executes the function and moves on to the next task.
We could have hundreds or thousands of requests per second. In some cases they could even require more than one background task per request. I'm not sure if synctasks are the right choice in this case.
For each request we need to create a new synctask. It will be placed in the queue of tasks that are ready to execute. There will be 16 threads (in the stressful scenario) waiting for new tasks; one of them will pick it up and execute it.
I think that a thread pool is more lightweight.
I think a small write-up of your thoughts on how it should be would be a good start for us.
In my head a thread pool is a set of threads waiting for incoming tasks. Each thread picks up a new task and executes it; upon completion it moves on to the next task that needs to be executed.
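To make that description concrete, here is a minimal sketch of such a pool using plain pthreads. It is not GlusterFS code: every name in it (pool_create, pool_submit, pool_destroy, pool_demo, count_one) is made up for illustration, and error handling is kept to the bare minimum.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* All names here are hypothetical; this is not GlusterFS code. */
typedef void (*task_fn)(void *arg);

struct task { task_fn fn; void *arg; struct task *next; };

struct pool {
    pthread_mutex_t lock;
    pthread_cond_t ready;      /* signalled when a task is queued */
    struct task *head, *tail;  /* FIFO of pending tasks */
    int stopping, nthreads;
    pthread_t *threads;
};

/* Worker loop: take a task, run it to completion, take the next one. */
static void *worker(void *data)
{
    struct pool *p = data;
    for (;;) {
        pthread_mutex_lock(&p->lock);
        while (p->head == NULL && !p->stopping)
            pthread_cond_wait(&p->ready, &p->lock);
        struct task *t = p->head;
        if (t == NULL) {               /* stopping and queue drained */
            pthread_mutex_unlock(&p->lock);
            return NULL;
        }
        p->head = t->next;
        if (p->head == NULL)
            p->tail = NULL;
        pthread_mutex_unlock(&p->lock);
        t->fn(t->arg);                 /* runs on the worker's own stack */
        free(t);
    }
}

struct pool *pool_create(int nthreads)
{
    assert(nthreads > 0);
    struct pool *p = calloc(1, sizeof(*p));
    if (p == NULL)
        return NULL;
    p->threads = calloc((size_t)nthreads, sizeof(*p->threads));
    if (p->threads == NULL) { free(p); return NULL; }
    pthread_mutex_init(&p->lock, NULL);
    pthread_cond_init(&p->ready, NULL);
    for (int i = 0; i < nthreads; i++) {
        if (pthread_create(&p->threads[i], NULL, worker, p) != 0)
            break;                     /* keep the threads we did get */
        p->nthreads++;
    }
    return p;
}

int pool_submit(struct pool *p, task_fn fn, void *arg)
{
    struct task *t = malloc(sizeof(*t));
    if (t == NULL)
        return -1;
    t->fn = fn; t->arg = arg; t->next = NULL;
    pthread_mutex_lock(&p->lock);
    if (p->tail) p->tail->next = t; else p->head = t;
    p->tail = t;
    pthread_cond_signal(&p->ready);
    pthread_mutex_unlock(&p->lock);
    return 0;
}

/* Lets the workers drain the queue, then joins them and frees the pool. */
void pool_destroy(struct pool *p)
{
    pthread_mutex_lock(&p->lock);
    p->stopping = 1;
    pthread_cond_broadcast(&p->ready);
    pthread_mutex_unlock(&p->lock);
    for (int i = 0; i < p->nthreads; i++)
        pthread_join(p->threads[i], NULL);
    pthread_mutex_destroy(&p->lock);
    pthread_cond_destroy(&p->ready);
    free(p->threads);
    free(p);
}

struct counter { pthread_mutex_t lock; int value; };

static void count_one(void *arg)       /* stand-in for a background job */
{
    struct counter *c = arg;
    pthread_mutex_lock(&c->lock);
    c->value++;
    pthread_mutex_unlock(&c->lock);
}

/* Usage example: run ntasks trivial tasks and return how many executed. */
int pool_demo(int nthreads, int ntasks)
{
    struct counter c = { PTHREAD_MUTEX_INITIALIZER, 0 };
    struct pool *p = pool_create(nthreads);
    if (p == NULL)
        return -1;
    for (int i = 0; i < ntasks && pool_submit(p, count_one, &c) == 0; i++)
        ;
    pool_destroy(p);
    return c.value;
}
```

The per-task cost in this sketch is one small allocation plus a lock/unlock pair; there is no per-task stack and no user-level context switch, which is the sense in which a plain pool is lighter.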
The synctask framework is also a thread pool waiting for incoming tasks. Each thread picks up a task from the readyq and executes it. If the task has to pause in the middle, the thread puts it in the wait-q and moves on to the next one. If the task never pauses, the thread completes its execution and moves on to the next task.
So synctask is more complex than a thread pool because it assumes the tasks will pause. I am wondering if we can 1) break the complexity out into a thread pool, which is more lightweight, and add the synctask framework on top of it, or alternatively 2) optimize the synctask framework to perform synchronous tasks without any stack creation and execute them on the thread's own stack.
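A rough sketch of what option 2 could look like, assuming the submitter can tell the framework up front whether a task may pause. The may_pause flag, run_task and the stack size are all invented for this example and are not part of the existing synctask API; the POSIX ucontext calls only stand in for whatever context-switching mechanism the framework uses. The pausing itself (switching back to the scheduler in the middle of a task) is left out; the point is only that the stack allocation and context switch happen on one path and not the other.

```c
#include <assert.h>
#include <stdlib.h>
#include <ucontext.h>

#define TASK_STACK_SIZE (64 * 1024)   /* arbitrary size for this sketch */

typedef void (*task_fn)(void *arg);

struct task {
    task_fn fn;
    void *arg;
    int may_pause;   /* hypothetical hint supplied by the submitter */
};

/* Task currently being started on its own stack, one per worker thread. */
static _Thread_local struct task *running;

static void trampoline(void)
{
    running->fn(running->arg);
    /* Returning here resumes the context stored in uc_link. */
}

/* Returns 0 if the task ran inline on the caller's stack,
 * 1 if it ran on a separately allocated stack, -1 on error. */
int run_task(struct task *t)
{
    assert(t != NULL && t->fn != NULL);

    if (!t->may_pause) {
        /* Fast path: no stack allocation, no context switch. */
        t->fn(t->arg);
        return 0;
    }

    /* Slow path: give the task its own stack so it could be suspended. */
    ucontext_t sched, ctx;
    void *stack = malloc(TASK_STACK_SIZE);
    if (stack == NULL)
        return -1;
    if (getcontext(&ctx) != 0) {
        free(stack);
        return -1;
    }
    ctx.uc_stack.ss_sp = stack;
    ctx.uc_stack.ss_size = TASK_STACK_SIZE;
    ctx.uc_link = &sched;              /* where to continue when fn returns */
    running = t;
    makecontext(&ctx, trampoline, 0);
    int ret = swapcontext(&sched, &ctx);
    free(stack);                       /* back on the caller's stack here */
    return ret == 0 ? 1 : -1;
}

/* Trivial task body used for the usage example. */
void bump(void *arg)
{
    (*(int *)arg)++;
}
```

For example, submitting { bump, &n, 0 } runs the function directly in the worker thread, while { bump, &n, 1 } pays for a malloc and two context switches to do the same work.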
Xavi
_______________________________________________ Gluster-devel mailing list Gluster-devel@xxxxxxxxxxx http://www.gluster.org/mailman/listinfo/gluster-devel