Re: Is this a bug in the current serial8250_rx_chars()?

"George Spelvin" <linux@xxxxxxxxxxx> · 9 Mar 2015 18:30:08 -0400

> The tty_buffer is a lockless design because there is only ever a
> single producer (the uart driver) and a single consumer (the
> flush_to_ldisc() input worker).
> 
> The uart driver produces filled (or partially filled) buffers at the
> tail and the input worker consumes buffers at the head.
> 
> The lockless design requires one buffer in the list at all times,
> even if it's empty and consumed; the sentinel guarantees that condition
> when the tty is first created.

I omitted the explanations because I'd figured it out; I'm actually quite
familiar with lockless designs.

However, while the lockless design requires that the last buffer
never disappear, it's actually quite possible to *start* with a
NULL head pointer.

An alternative would be to make the sentinel non-zero size and just
let it be recycled through the free list like everything else.

> I suppose you're right about "tail->used" and "head->read" but please
> analyze each and every use if you plan to optimize those.

I promise.  The entire reason I'm asking all these questions is an
excess of caution.  I stare at the code for a few days and *then* ask
"am I missing something?".

>> Anyway, the steady state will typically be two 512-byte (size = 256)
>> buffers that are ping-ponged.  So there should always be at least
>> 256 bytes of buffer space available without needing to allocate anything.

> The flip buffer design was ditched years ago for good reason; what
> happens when the input worker has not even run yet on the previous input?

I'm afraid I don't understand the point your rhetorical question is
trying to make.  I understand that it's not a simple fixed-size buffer,
but in the usual case, when the ldisc is keeping up with the input,
you end up with two buffers acting a lot like flip buffers.

Since the 8250 driver only ever requests space one byte at a time,
all buffers will be MIN_TTYB_SIZE, and minimum-size buffers are always
reallocated and not recycled.

One buffer is allocated as soon as the first input byte arrives.  Because
buffers are not pushed to the ldsic until the FIFO has been emptied,
it is very likely that a full FIFO will straddle the end of a buffer
and force the allocation of a second.

Once that steady state is reached, a third buffer will not be allocated
unless the ldisc gets way behind, or a FIFO read has straddled both
buffers when an error occurs and forces the allocation of a !TTYB_NORMAL
buffer.

>> I'll have to add some features to the tty_buffer interface to support a
>> no-allocation interface that reports if space is available and uses it
>> if it is.  A bit invasive, but it seems like the only clean way.

> You should be able to use the tty_buffer interface without modification
> from console write; the only other producer is the irq handler which is
> excluded by the console write with the port->lock.

I know you're the expert here, but I really don't think that's right.
console_write may be called from somewhere deep in the kernel with
arbitrary locks held.  So I can't safely call kmalloc().

And there's currently no way to make the call chain
tty_insert_flip_string_fixed_flag
-> __tty_buffer_request_room
   -> tty_buffer_alloc
*not* call kmalloc().

Fundamentally, what I want to do is ask "how much space can you give
me with a guarantee of no kmalloc?" 

If there's no space, the fallback is don't empty the hardware FIFO and
risk overrun.

> Yep, I realize since I initially wrote about that the back in October,
> it's been lost in the sands of time (and you weren't in the conversation
> at the time), but I thought lock dropping should be removed.

Whew!  I spent a long time staring at the code thinking that I was missing
something significant.

>> Another possible fix would be to test --max_count *before*
>> re-reading the LSR, so the errors would stay in the FIFO.

> This would be my vote.

It Will Be Done.

>> Regarding the need for the unlock in the first place, I'm really
>> not seeing it.  tty_flip_buffer_push() is just an alias for
>> tty_schedule_flip(), which just fills in port->buf.tail->commit and
>> calls schedule_work(port->buf.work).

> The unlock is a remnant of the low_latency steering which was
> buggy and has been removed. I left it there because the -RT patches
> don't schedule the input worker but run it directly, and I didn't
> want to break that when there was no harm in leaving it.

Ah!  So there is a reason.  I don't mind just adding a comment and
leaving it.

But... looking at
https://www.kernel.org/pub/linux/kernel/projects/rt/3.18/
I don't see the change you're referring to.

- The call to tty_flip_buffer_push isn't changed
- tty_flip_buffer_push() isn't changed
- tty_schedule_flip() isn't changed
- schedule_work() isn't changed
- queue_work() isn't changed
- queue_work_on() is only changed very slightly (locking tweaks)
- __queue_work is only changed very slightly (locking tweaks)

Does this still apply?

By the way, the patches there to

drivers/tty/serial/amba-pl011.c
drivers/tty/serial/omap-serial.c

Look like they should be in mainline.

> Well, dropping the unlock allows normal tx to occur so that could
> break up the console output.

It reorders the two, but it's a race condition anyway, so the ordering
isn't well-defined in the first place.

Is there a more significant consequence I'm missing?

Thank you very much for your thoughtful reply.
--
To unsubscribe from this list: send the line "unsubscribe linux-serial" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html