Re: Is this a bug in the current serial8250_rx_chars()?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 03/10/2015 06:49 PM, George Spelvin wrote:
>>> However, while the lockless design requires that the last buffer
>>> never disappear, it's actually quite possible to *start* with a
>>> NULL head pointer.
> 
>> But there is no value to it; on the contrary, it makes the code a
>> mess to check for that initial condition everywhere.
> 
> There's also code to handle the size==0 sentinel specially.
> 
> I think you're right the NULL check would be messier, but it's
> not a huge mess.  (Not that I necessarily intend to do this;
> it's just one alternative to explore and see what's cleanest.)
> 
>>> An alternative would be to make the sentinel non-zero size and just
>>> let it be recycled through the free list like everything else.
> 
>> Which would mean tty_buffer_init() could fail allocating the memory,
>> which would mean tty_port_init() could fail...
> 
> Er, that wasn't the idea.  The sentinel would still be part of the
> tty_bufhead structure, and thus preallocated with it; it would just have
> some actual data[] appended.
> 
> The only change to tty_buffer_init would be
> -	tty_buffer_reset(&buf->sentinel, 0);
> +	tty_buffer_reset(&buf->sentinel, MIN_TTYB_SIZE);

Ok, I'll wait to reserve opinion until I see the code that implements
either of these alternatives.


>>>>> Anyway, the steady state will typically be two 512-byte (size = 256)
>>>>> buffers that are ping-ponged.  So there should always be at least
>>>>> 256 bytes of buffer space available without needing to allocate anything.
>  
>>>> The flip buffer design was ditched years ago for good reason; what
>>>> happens when the input worker has not even run yet on the previous input?
>  
>>> I'm afraid I don't understand the point your rhetorical question is
>>> trying to make.  I understand that it's not a simple fixed-size buffer,
>>> but in the usual case, when the ldisc is keeping up with the input,
>>> you end up with two buffers acting a lot like flip buffers.
> 
>> On non-preemptive kernel with high line speed, you'll lose data.
>> Same on preemptive kernel with a high load.
>>
>> IOW, the flip buffer construct requires the input worker to be
>> run within a time deadline that the CFS does not guarantee.
> 
> I'm still not getting it.  It seems that you're describing the problems
> with the old flip buffer that led to it getting replaced.  I understand
> the reasons for the change.
> 
> What I don't understand is *why* you're describing old problems in
> code that doesn't exist any more.  How do they affect tweaks to the
> current design?

Perhaps I misunderstand what you propose, so again, I'll just wait
to see what it looks like.


> My point is that Prasad's use case requires, at most, receive
> buffer space equal to the size of serial8250_console_write output.
> 
> And the current design allocates (and never frees) buffer space equal
> to the peak usage plus one buffer.  (512 bytes with no errors; 256
> bytes with error flags).  So if I'm willing to settle for "probably",
> then when serial8250_console_write() is called, there's probably that
> much space available.
> 
> I'm hoping that's enough preallocation for Pradad's use case.
> 
> 
>>> Since the 8250 driver only ever requests space one byte at a time,
>>> all buffers will be MIN_TTYB_SIZE, and minimum-size buffers are always
>>> reallocated and not recycled.
> 
> By the way, I words two swapped.  Minimum-size buffers *are*
> recycled (via the tty_bufhead's free list) and *not* de/reallocated.
> 
> 
>>> Fundamentally, what I want to do is ask "how much space can you give
>>> me with a guarantee of no kmalloc?" 
>>>
>>> If there's no space, the fallback is don't empty the hardware FIFO and
>>> risk overrun.
> 
>> That looks to be a difficult and thorny problem to overcome.
> 
> It's not really that difficult, but it *does* require revision to the
> tty_buffer interface.  Since the polling code only adds one byte at a
> time, I can do it with one function added to tty_buffer.h:
> 
> static unsigned char *
> char_buf_ptr_noalloc(struct tty_bufhead *buf, char flag);
> 
> This takes the proposed flag, and either returns a pointer to the
> space to put the character, or NULL.
> 
> If it returns NULL, the rx_char() function will stash the flags in
> lsr_saved_flags and stop.
> 
> If it returns non-null, the rx_char() function will read the UART_RX
> register, insert it at the returned pointer, and increment buf->head.used.
> 
> (Alternatively, it could return a boolean and get the pointer from
> char_buf_ptr(buf->head, buf->head.used++).)

The char_buf_ptr() isn't meant to be used as part of the tty_buffer interface.
The tty_buffer implementation is only visible because struct tty_port embeds
struct tty_bufhead.


> By the way, there's also a performance bug in tty_insert_flip_char.
> If (tb->flags & TTYB_NORMAL), then tb->used is allowed to go up
> to 2*tb->size, as the __tty_buffer_request_room code makes clear.
> But tty_insert_flip_char() doesn't know that and will take the slow path
> for the second half of every buffer.
> 
> 
>>> - The call to tty_flip_buffer_push isn't changed
>>> - tty_flip_buffer_push() isn't changed
>>> - tty_schedule_flip() isn't changed
>>> - schedule_work() isn't changed
>>> - queue_work() isn't changed
>>> - queue_work_on() is only changed very slightly (locking tweaks)
>>> - __queue_work is only changed very slightly (locking tweaks)
>>>
>>> Does this still apply?
> 
>> The patch you're looking at applies to -stable-rt 3.14; that's where
>> the scheduler/workqueue changes are.
> 
> Er, I checked again and I really don't think so.
> 
> First of all, the patch at
> https://www.kernel.org/pub/linux/kernel/projects/rt/3.18/patch-3.18.9-rt4.patch.xz
> applies cleanly to v3.18.9.

No, you're right; I'm not sure what I was thinking there.

> Second, looking at v3.14.33-rt30 from 
> https://git.kernel.org/cgit/linux/kernel/git/rt/linux-stable-rt.git/log/?h=v3.14-rt
> The only changes to the queuing functions in kernel/workqueue.c and
> include/linux/workqueue.h are the ones I described: the addition of a new
> "pendingb_lock" and a lot of
> 
> -	local_irq_save(flags);
> +	local_lock_irqsave(pendingb_lock, flags);
> 
> -	local_irq_restore(flags);
> +	local_unlock_irqrestore(pendingb_lock, flags);
> 
> Spedifically, I reviewed the output of
> 
> git log -p v3.14.34..v3.14.34-rt31 -- kernel/workqueue.c include/linux/workqueue.h
> 
> There are additional locking changes to the worker tasks that take work
> off the queues, but nothing that looks remotely like conversion of queued
> work to immediate calls.

Here's a 3.10-rt stacktrace. I doubt it's changed.

>> [  416.153174]        CPU0
>> [  416.153176]        ----
>> [  416.153180]   lock(&ldata->output_lock);
>> [  416.153185]   lock(&ldata->output_lock);
>> [  416.153187]
>> [  416.153187]  *** DEADLOCK ***
>> [  416.153187]
>> [  416.153189]  May be due to missing lock nesting notation
>> [  416.153189]
>> [  416.153193] 2 locks held by dropbear/306:
>> [  416.153209]  #0:  (&tty->atomic_write_lock){+.+...}, at: [<802b9b04>] tty_write_lock+0x1c/0x5c
>> [  416.153223]  #1:  (&ldata->output_lock){+.+...}, at: [<802bce2c>] n_tty_write+0x148/0x464
>> [  416.153225]
>> [  416.153225] stack backtrace:
>> [  416.153232] CPU: 3 PID: 306 Comm: dropbear Tainted: G           O 3.10.63-rt65-svn68 #1
>> [  416.153237] Backtrace:
>> [  416.153261] [<80011a60>] (dump_backtrace+0x0/0x108) from [<80011c70>] (show_stack+0x18/0x1c)
>> [  416.153273]  r6:809dac30 r5:808b7944 r4:809dac30 r3:00000000
>> [  416.153290] [<80011c58>] (show_stack+0x0/0x1c) from [<80615888>] (dump_stack+0x24/0x28)
>> [  416.153313] [<80615864>] (dump_stack+0x0/0x28) from [<80071c88>] (__lock_acquire+0x1d04/0x2018)
>> [  416.153325] [<8006ff84>] (__lock_acquire+0x0/0x2018) from [<80072790>] (lock_acquire+0x68/0x7c)
>> [  416.153340] [<80072728>] (lock_acquire+0x0/0x7c) from [<80618e74>] (_mutex_lock+0x38/0x48)
>> [  416.153350]  r7:8c9d1000 r6:8c9d0000 r5:8cd5d58e r4:8c9d12a8
>> [  416.153364] [<80618e3c>] (_mutex_lock+0x0/0x48) from [<802bc728>] (process_echoes+0x48/0x2b4)
>> [  416.153368]  r4:8c9d12f8
>> [  416.153381] [<802bc6e0>] (process_echoes+0x0/0x2b4) from [<802beb1c>] (n_tty_receive_buf+0x1040/0x1044)
>> [  416.153396] [<802bdadc>] (n_tty_receive_buf+0x0/0x1044) from [<802c26e4>] (flush_to_ldisc+0x11c/0x16c)
>> [  416.153407] [<802c25c8>] (flush_to_ldisc+0x0/0x16c) from [<802c2778>] (tty_flip_buffer_push+0x44/0x48)
>> [  416.153419] [<802c2734>] (tty_flip_buffer_push+0x0/0x48) from [<802c38a0>] (pty_write+0x5c/0x6c)
>> [  416.153425]  r5:8c573800 r4:00000001
>> [  416.153436] [<802c3844>] (pty_write+0x0/0x6c) from [<802bce44>] > (n_tty_write+0x160/0x464)
>> [  416.153446]  r6:8c573800 r5:8c9d1400 r4:00000001 r3:802c3844
>> [  416.153459] [<802bcce4>] (n_tty_write+0x0/0x464) from [<802b9c60>] (tty_write+0x11c/0x2e0)
>> [  416.153475] [<802b9b44>] (tty_write+0x0/0x2e0) from [<800d1e10>] (vfs_write+0xb8/0x194)
>> [  416.153484] [<800d1d58>] (vfs_write+0x0/0x194) from [<800d2394>] (SyS_write+0x44/0x80)
>> [  416.153498]  r9:00000000 r8:00000000 r7:013002cc r6:00000000 r5:00000001
>> [  416.153498] r4:8c6cf540
>> [  416.153511] [<800d2350>] (SyS_write+0x0/0x80) from [<8000e580>] (ret_fast_syscall+0x0/0x48)
>> [  416.153524]  r9:8c9a8000 r8:8000e744 r7:00000004 r6:012f9128 r5:00000001
>> [  416.153524] r4:012f9668


>>> It reorders the two, but it's a race condition anyway, so the ordering
>>> isn't well-defined in the first place.
>>>
>>> Is there a more significant consequence I'm missing?
> 
>> Currently, complete lines are output by the console write.
> 
> Er, yes, and not dropping the lock in non-console write won't change that,
> so I don't understand how it's a change of any significance.
> 
> Since console writes disable serial interrupts for the duration,
> they're always atomic.
> 
> All that will happen is, if the console write happens to race with a call
> to serial8250_rx_chars, serial8250_handle_irq might output one more Tx
> FIFO worth of bytes before the console write.

Um, no.

The interrupt handler could be parked on the spin lock already, so
you drop the lock and the interrupt handler outputs in the middle of
the console line.

Regards,
Peter Hurley
--
To unsubscribe from this list: send the line "unsubscribe linux-serial" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Kernel Newbies]     [Security]     [Netfilter]     [Bugtraq]     [Linux PPP]     [Linux FS]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Video 4 Linux]     [Linmodem]     [Device Mapper]     [Linux Kernel for ARM]

  Powered by Linux