> The tty_buffer is a lockless design because there is only ever a > single producer (the uart driver) and a single consumer (the > flush_to_ldisc() input worker). > > The uart driver produces filled (or partially filled) buffers at the > tail and the input worker consumes buffers at the head. > > The lockless design requires one buffer in the list at all times, > even if it's empty and consumed; the sentinel guarantees that condition > when the tty is first created. I omitted the explanations because I'd figured it out; I'm actually quite familiar with lockless designs. However, while the lockless design requires that the last buffer never disappear, it's actually quite possible to *start* with a NULL head pointer. An alternative would be to make the sentinel non-zero size and just let it be recycled through the free list like everything else. > I suppose you're right about "tail->used" and "head->read" but please > analyze each and every use if you plan to optimize those. I promise. The entire reason I'm asking all these questions is an excess of caution. I stare at the code for a few days and *then* ask "am I missing something?". >> Anyway, the steady state will typically be two 512-byte (size = 256) >> buffers that are ping-ponged. So there should always be at least >> 256 bytes of buffer space available without needing to allocate anything. > The flip buffer design was ditched years ago for good reason; what > happens when the input worker has not even run yet on the previous input? I'm afraid I don't understand the point your rhetorical question is trying to make. I understand that it's not a simple fixed-size buffer, but in the usual case, when the ldisc is keeping up with the input, you end up with two buffers acting a lot like flip buffers. Since the 8250 driver only ever requests space one byte at a time, all buffers will be MIN_TTYB_SIZE, and minimum-size buffers are always reallocated and not recycled. One buffer is allocated as soon as the first input byte arrives. Because buffers are not pushed to the ldsic until the FIFO has been emptied, it is very likely that a full FIFO will straddle the end of a buffer and force the allocation of a second. Once that steady state is reached, a third buffer will not be allocated unless the ldisc gets way behind, or a FIFO read has straddled both buffers when an error occurs and forces the allocation of a !TTYB_NORMAL buffer. >> I'll have to add some features to the tty_buffer interface to support a >> no-allocation interface that reports if space is available and uses it >> if it is. A bit invasive, but it seems like the only clean way. > You should be able to use the tty_buffer interface without modification > from console write; the only other producer is the irq handler which is > excluded by the console write with the port->lock. I know you're the expert here, but I really don't think that's right. console_write may be called from somewhere deep in the kernel with arbitrary locks held. So I can't safely call kmalloc(). And there's currently no way to make the call chain tty_insert_flip_string_fixed_flag -> __tty_buffer_request_room -> tty_buffer_alloc *not* call kmalloc(). Fundamentally, what I want to do is ask "how much space can you give me with a guarantee of no kmalloc?" If there's no space, the fallback is don't empty the hardware FIFO and risk overrun. > Yep, I realize since I initially wrote about that the back in October, > it's been lost in the sands of time (and you weren't in the conversation > at the time), but I thought lock dropping should be removed. Whew! I spent a long time staring at the code thinking that I was missing something significant. >> Another possible fix would be to test --max_count *before* >> re-reading the LSR, so the errors would stay in the FIFO. > This would be my vote. It Will Be Done. >> Regarding the need for the unlock in the first place, I'm really >> not seeing it. tty_flip_buffer_push() is just an alias for >> tty_schedule_flip(), which just fills in port->buf.tail->commit and >> calls schedule_work(port->buf.work). > The unlock is a remnant of the low_latency steering which was > buggy and has been removed. I left it there because the -RT patches > don't schedule the input worker but run it directly, and I didn't > want to break that when there was no harm in leaving it. Ah! So there is a reason. I don't mind just adding a comment and leaving it. But... looking at https://www.kernel.org/pub/linux/kernel/projects/rt/3.18/ I don't see the change you're referring to. - The call to tty_flip_buffer_push isn't changed - tty_flip_buffer_push() isn't changed - tty_schedule_flip() isn't changed - schedule_work() isn't changed - queue_work() isn't changed - queue_work_on() is only changed very slightly (locking tweaks) - __queue_work is only changed very slightly (locking tweaks) Does this still apply? By the way, the patches there to drivers/tty/serial/amba-pl011.c drivers/tty/serial/omap-serial.c Look like they should be in mainline. > Well, dropping the unlock allows normal tx to occur so that could > break up the console output. It reorders the two, but it's a race condition anyway, so the ordering isn't well-defined in the first place. Is there a more significant consequence I'm missing? Thank you very much for your thoughtful reply. -- To unsubscribe from this list: send the line "unsubscribe linux-serial" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html