[PATCH v9] pipe-source: [DRAFT] implement autosuspend behavior

rommer@xxxxxxxxxxx (Raman Shishniou) · Mon, 26 Feb 2018 13:03:08 +0300

On 02/23/2018 05:06 PM, Georg Chini wrote:
> On 23.02.2018 13:54, Raman Shishniou wrote:
>> On 02/23/2018 01:26 PM, Georg Chini wrote:
>>> On 23.02.2018 11:03, Raman Shishniou wrote:
>>>> On 02/23/2018 11:38 AM, Georg Chini wrote:
>>>>
>>>>> But now I have another issue:
>>>>> You are polling the pipe and running the loop even if the source is user suspended.
>>>>> This seems like a waste of CPU (even more than accepting some POLLIN spam
>>>>> during wakeup transition). I know you do it to discard data that is written during
>>>>> suspend, but I wonder if there is no better way to discard that data without running
>>>>> the loop permanently.
>>>>> I am thinking of draining the pipe in the SET_STATE handler. If you are setting
>>>>> events = 0 and open the corkfd on user suspend, nothing except messages
>>>>> should wake us up. Now, when the state changes to running, you can drain the
>>>>> pipe in the SET_STATE handler. The thread loop will just run through on the first
>>>>> iteration after user suspend, because revents = 0 and chunk.length = 0. Now,
>>>>> because the source is open again, you can set events = POLLIN.
>>>>> After that, you are back to normal.
>>>>> You can safely assume writer_connected=false during user suspend (you do
>>>>> not notice when the writer disconnects if you set events = 0). If the writer
>>>>> is still connected after suspend, writer_connected will get set when you read
>>>>> the first data. It will cause an unnecessary unsuspend message, but this will
>>>>> have no effect because neither the suspend cause nor the state change.
>>>>>
>>>>> I would also suggest to use a flag like set_pollin in the comparison, set and reset
>>>>> the flag in the appropriate places and explain why in a comment. This is one of
>>>>> the situations, where a little bit more code could make the concept clearer. I don't
>>>>> mind keeping it as is however, if you think it's not worth the effort.
>>>>>
>>>> We will face two main problems if we do something like that:
>>>>
>>>> First problem - we don't know how writer will react to full pipe.
>>>> If it open pipe in non-blocking mode, it will get EAGAIN on every
>>>> write() while pipe stays full. If it open pipe in blocking mode,
>>>> it will just stuck at write() until user unsuspend the source.
>>>> I think both behaviors are bad for writer - it should contain a
>>>> special code to deal with it.
>>> I guess that should be the problem of the writer. If it is intended
>>> to write to a pipe, it must be able to deal with the situation that
>>> the pipe is full.
>>>
>> Ok, but it can just ignore EAGAIN/EWOULDBLOCK while writing to pipe
>> and we'll can get broken stream when user unsuspend source.
>> Writer should do the same handling - if it got EAGAIN/EWOULDBLOCK,
>> it should drop the buffer keeping the last frame, move it to the head
>> of next chunk and try to write next chunk. Each writer should do
>> something like this in order to be ready for pipe suspend.
>>
>> Actually this it very cheap to do poll, read data, copy some bytes
>> inside a buffer and poll again. This is about several megabits.
>>
>> For example, I do this on my laptop:
>> $ mkfifo testfifo
>> $ dd if=testfifo of=/dev/null bs=512
>>
>> And on second terminal:
>> $ timeout 5 cat /dev/zero > testfifo
>>
>> Results:
>> 6991136+0 records in
>> 6991136+0 records out
>> 3579461632 bytes (3.6 GB) copied, 5.0019 s, 716 MB/s
>>
>> I.e. 6991136 * 8 * 512 / 5 / 1000000000 ~ 5.72 Gbps
>> And 6991136 / 5 ~ 1398227 read() -> write() pairs per second
> 
> This is on your machine. I just talked to somebody who is
> using PA on a 300 MHz single core MIPS system. And there
> I guess the load would be significant.
>>

I did some tests on Allwinner H3 @ 312MHz:

$ cat poll-read-test.c 
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <fcntl.h>
#include <stdio.h>
#include <poll.h>
#include <string.h>

int main(void) {
    struct pollfd pollfd;
    char buffer[4096];
    int cnt = 0;
    ssize_t ret;

    pollfd.fd = open("/tmp/fifo", O_RDONLY);
    pollfd.events = POLLIN;

    if (pollfd.fd < 0) {
        perror("open");
        return 1;
    }

    for (;;) {
        ret = poll(&pollfd, 1, 0);

        if (ret < 0) {
            perror("poll");
            return 1;
        } else if (ret > 0) {
            ret = read(pollfd.fd, buffer, sizeof(buffer));

            if (ret < 0) {
                perror("read");
                return 1;
            } else if (ret == 0)
                break;

            memcpy(buffer, buffer + sizeof(buffer) - 3, 3);

            cnt++;
        }
    }

    printf("Total iterations: %d\n", cnt);
    return 0;
}

# uname -a
Linux orangepiplus2e 3.4.113-sun8i #4 SMP PREEMPT Wed Nov 22 13:45:28 CET 2017 armv7l armv7l armv7l GNU/Linux
# cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
performance
# echo 312000 > /sys/devices/system/cpu/cpu0/cpufreq/scaling_min_freq 
# echo 312000 > /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq 
# cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_cur_freq 
312000
312000
312000
312000

$ mkfifo /tmp/fifo
$ timeout 5 cat /dev/zero > /tmp/fifo

$ ./poll-read-test 
Total iterations: 94026

I.e. 94026*4096*8/5 ~ 616 Mb/s

At full speed (1200MHz):

$ ./poll-read-test 
Total iterations: 304572

I.e. 304572*4096*8/5 ~ 1996 Mb/s

So we are talking about 1-2% of CPU for audio speeds (< 10 Mb/s)
even for Arm CPU @312MHz

>>>> The second problem is hidden now because I temporary dropped
>>>> a part of code that keep frame boundaries. If we drain the pipe
>>>> as soon as user resume the source - we'll loose frame boundaries.
>>>> Audio stream will be broken for any case except s8/u8 mono.
>>>> Actually we have to read every time while suspended by user and drop
>>>> whole chunk except for not completely read last frame, move it to
>>>> the head of memchunk and do next read() at position where this frame ends.
>>> We could work around this I think. You just need to have the last
>>> read fragment available in the SET_STATE handler. Then you do
>>> not loose frame boundaries, because you continue to read where
>>> you have stopped.
>>>
>> Yep, but after resume it will be part of another frame. And only
>> if writer does right things while we a suspended.
> 
> No, you pick up reading exactly where you stop. There is no other reader.
> So you can do the same thing you would do when you post the data, which
> is discard full frames starting at the head of the buffer and then move the
> remaining fragment to the head.
> If the writer disconnects during suspend, we can assume that it delivered
> full frames, so in that case there should be no fragment left.
> 
> You are however right that there is a situation where it is possible that we
> loose frame boundaries. Whenever the pipe gets filled completely, the
> writer might do the wrong thing or it might disconnect, leaving some
> fragment in the pipe.
> 
> So, yes, I agree to continue polling during suspend. Or could we make that
> behavior optional? I'm not sure how much more work it is to implement both
> alternatives.
> 
>>
>>>> BTW, currently pipe-source PA just crashed if I try to write s24le to pipe:
>>>>
>>>> I decided to do not open a bug because almost whole pipe-source should
>>>> be rewritten, and this is what I'm doing now.
>>>>
>>> Yes, makes sense. If your patch fixes a crash bug, it has a good
>>> chance to get into 12.0 (which it would not have otherwise).
>> It depends on whether Tanu's patches come to the master or not.
>>
> They surely will because they are fixing a release-blocker bug.

--
Raman