Re: [Bug #14015] pty regressed again, breaking expect and gcc's testsuite

Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> writes:

> On Tue, 1 Sep 2009, Rafael J. Wysocki wrote:
>> On Tuesday 01 September 2009, Mikael Pettersson wrote:
>> > 
>> > Starting with 2.6.31-rc8 and reverting
>> > 
>> > 85dfd81dc57e8183a277ddd7a56aa65c96f3f487 pty: fix data loss when stopped (^S/^Q)
>> > d945cb9cce20ac7143c2de8d88b187f62db99bdc pty: Rework the pty layer to use the normal buffering logic
>> > 
>> > in that order gives me a kernel that works on both x86 and powerpc64.
>> > 
>> > So the bug is definitely limited to the pty buffering logic change.
>> 
>> Thanks a lot for this information, adding some CCs to the list.
>
> Mikael, is there any way to get the gcc testsuite to show the "expected" 
> vs "result" cases when the failures occur, so that we can see what the 
> pattern is ("it drops one character every 8kB" or something like that).
>
> However, I get the feeling that it's really the same bug that 
> OGAWA-san already fixed - and that his fix just doesn't always do 100% 
> of the job. 
>
> So what Ogawa did was to make sure that we flush any pending data whenever 
> we're checking "do we have any data left". He did that by calling out to 
> tty_flush_to_ldisc(), which should flush the data through to the ldisc. 
>
> The keyword here being "should". In flush_to_ldisc(), we have at least one 
> case where we say "we'll delay it a bit more":
>
> 		if (!tty->receive_room) {
> 			schedule_delayed_work(&tty->buf.work, 1);
> 			break;
> 		}
>
> and while I think this _should_ be ok (because if there is no 
> receive-room, then we'll hopefully always return non-zero from 
> "input_available_p()"), we do have this really odd case that the 
> reader side will do "n_tty_set_room()" only _after_ having checked for 
> input_available_p(), and so maybe we do sometimes trigger the case that
>
>  - input_available_p() tries to flush to the input buffer before checking 
>    how much data is available, by calling 'tty_flush_to_ldisc()'
>
>  - but 'tty_flush_to_ldisc()' won't do anything, because tty->receive_room 
>    is zero.
>
>  - so now input_available_p will say "I don't have any data", even though 
>    there was data in the write buffers.
>
>  - we'll notice that the other end has hung up, and return EOF/EIO.
>
>  - which is very WRONG, because the other end may have hung up, but before 
>    it did that, it wrote data that is still in the write queues, and we 
>    should have returned that data.
>
> Anyway, I'm not at all sure that the "receive_room == 0" case can happen 
> at all, but maybe it can. Ogawa-san?

Unless I'm missing something, I don't think this differs much from the
old code. But I would need to check more deeply.

Hmm, if "receive_room == 0 && tty->read_cnt == 0" is possible, I wonder
why reverting the buffer handling fixes the problem.

Well, anyway, I'd like to reproduce this on my machine. Could you tell
me the versions of the tools you are using? I assume the gcc testsuite
is run from gcc's source tree (which svn revision?), together with
expect, dejagnu, and tcl. (BTW, I'm using Debian testing. If this can be
reproduced under kvm, I can install the distro version that you are
using.)

Thanks.
-- 
OGAWA Hirofumi <hirofumi@xxxxxxxxxxxxxxxxxx>