Bug 214611 - UM: stdout output ceases under certain conditions

Anton Ivanov anton.ivanov at kot-begemot.co.uk
Thu Oct 7 00:00:46 PDT 2021


On 06/10/2021 22:28, Glenn Washburn wrote:
> On Wed, 6 Oct 2021 20:52:40 +0100
> Anton Ivanov <anton.ivanov at kot-begemot.co.uk> wrote:
>
>> On 06/10/2021 20:48, Glenn Washburn wrote:
>>> On Wed, 6 Oct 2021 19:53:48 +0100
>>> Anton Ivanov <anton.ivanov at kot-begemot.co.uk> wrote:
>>>
>>>> On 06/10/2021 19:05, Glenn Washburn wrote:
>>>>> On Wed, 6 Oct 2021 17:44:14 +0100
>>>>> Anton Ivanov <anton.ivanov at kot-begemot.co.uk> wrote:
>>>>>
>>>>>> On 06/10/2021 16:57, Anton Ivanov wrote:
>>>>>>> On 04/10/2021 17:54, Glenn Washburn wrote:
>>>>>>>> On Mon, 04 Oct 2021 14:48:34 +0200
>>>>>>>> Johannes Berg <johannes at sipsolutions.net> wrote:
>>>>>>>>
>>>>>>>>> On Sat, 2021-10-02 at 21:00 -0500, Glenn Washburn wrote:
>>>>>>>>>> Hi list,
>>>>>>>>>>
>>>>>>>>>> I'm notifying the list of a bug report[1] I created in the kernel
>>>>>>>>>> bugzilla. I'm not subscribed to this list, so please add my email
>>>>>>>>>> to any replies to this email.
>>>>>>>>>> [1] https://bugzilla.kernel.org/show_bug.cgi?id=214611
>>>>>>>>> This really has nothing to do with UBD or something. What's happening is
>>>>>>>>> that you're using the command line badly.
>>>>>>>>>
>>>>>>>>> What do you expect this:
>>>>>>>>>
>>>>>>>>>       ... < <(cat /dev/null)
>>>>>>>>>
>>>>>>>>> to do?
>>>>>>>> This was just a way to trigger the issue I was seeing. I have a bash
>>>>>>>> script which was doing something like the following:
>>>>>>>>
>>>>>>>> grep "search" /path/to/file |
>>>>>>>> while read VAR; do
>>>>>>>>       run_some_script_which_eventually_runs_uml $VAR;
>>>>>>>> done
>>>>>>>>
>>>>>>>> I was confused why running this script caused UML to lose output always
>>>>>>>> when mounting the ubd in the UML mount script. And it didn't happen
>>>>>>>> when I ran "run_some_script_which_eventually_runs_uml" alone. Since the
>>>>>>>> amount of data returned by the grep was small, this issue was triggered
>>>>>>>> all the time. If the output were a lot of data, I might have noticed
>>>>>>>> that early runs of run_some_script_which_eventually_runs_uml would not
>>>>>>>> have output disappear after mounting. Thanks for debugging this.
>>>>>>>>
>>>>>>>>> What happens is that the shell creates a pipe. This pipe is connected on
>>>>>>>>> the one side to fd:0 in UML (stdin) and on the other to stdout of 'cat'.
>>>>>>>>>
>>>>>>>>> Now this is all fine, but 'cat' will *quit immediately* since it cannot
>>>>>>>>> read anything from /dev/null (reads from it return EOF right away).
>>>>>>>>>
>>>>>>>>> Therefore, the fd:0 in UML will be invalidated pretty much immediately,
>>>>>>>>> receiving EPOLLHUP.
>>>>>>>>>
>>>>>>>>> This is detected by the epoll code, raising an interrupt into the line
>>>>>>>>> level code, and the line code then closes the stdio console channel
>>>>>>>>> entirely, including stdout.
>>>>>>>> This seems like it could be a bug. Couldn't the console not be closed,
>>>>>>>> but the console handling code internally mark stdin as closed? Perhaps
>>>>>>>> there could even be logic to detect if stdin and stdout are from the
>>>>>>>> same fd, then close the console, otherwise don't. From a user
>>>>>>>> perspective, thinking of UML as a normal process, it doesn't make sense
>>>>>>>> that closing stdin would close stdout as well.
>>>>>>> There is an even more convoluted case where the stdin is a socket (which
>>>>>>> is possible - you pass it to UML as a fd:N). That can be half-closed.
>>>>>>>
>>>>>>> Looking at it at the moment, but to be honest, separating the logic for in
>>>>>>> and out if the fd is the same is going to be quite difficult (if at all
>>>>>>> possible). It all ends as EPOLL events at the bottom. Even if you handle IN
>>>>>>> and OUT separately in the upper layers, the kernel will handle them as the
>>>>>>> same fd and any event (f.e. closure) will show up on both.
>>>>>> Further to this, the same holds even if we start playing games with multiple
>>>>>> EPOLL descriptors, dup-ing fds, etc, the event will still show up on all of
>>>>>> them.
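
The point about dup-ing fds and multiple EPOLL descriptors can be
checked from userspace. This is only a minimal sketch, assuming Linux
and Python's select.epoll; the function name is made up for the
illustration and this is not the UML IRQ code. A pipe read end is
dup'ed, each copy is registered with its own epoll instance, and the
writer is then closed.

```python
import os
import select


def hup_seen_on_duplicates():
    """Close a pipe's writer and report whether each dup'ed read fd sees EPOLLHUP."""
    r, w = os.pipe()
    r_dup = os.dup(r)  # second fd referring to the same open file description

    ep_a = select.epoll()
    ep_b = select.epoll()
    ep_a.register(r, select.EPOLLIN)
    ep_b.register(r_dup, select.EPOLLIN)

    os.close(w)  # the writer goes away, as when the feeding process exits

    seen_a = dict(ep_a.poll(timeout=1))
    seen_b = dict(ep_b.poll(timeout=1))

    ep_a.close()
    ep_b.close()
    os.close(r)
    os.close(r_dup)

    return (
        bool(seen_a.get(r, 0) & select.EPOLLHUP),
        bool(seen_b.get(r_dup, 0) & select.EPOLLHUP),
    )
```

Both copies refer to the same underlying file, so the hangup is
reported through both epoll instances.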
>>>>> Thanks for looking into this. If I'm understanding correctly, you're
>>>>> looking at the case where the UML process has STDIN and STDOUT to the
>>>>> same file descriptor. However, the situation is when STDIN is to a pipe
>>>>> that gets closed and STDOUT is to something else (pty, tty, file,
>>>>> different pipe, etc..). Does your logic still hold true in this case?
>>>> No. They should be on different IRQs.
>>>>
>>>> Question:
>>>>
>>>> Have you tried using con0=null,fd:1 ?
>>>>
>>>> Assign null explicitly to the input instead of a fd which is closed?
>>>>
>>>> A
>>> I just tried that and it does not trigger the bug, which I'd expect.
>>> This would be another work around, but I think it would be good to fix
>>> the bug. What if you want to pipe some data to stdin? Then when the program
>>> on the write side of the pipe exits because it's done sending data,
>>> UML will stop sending data to stdout because the pipe gets closed.  How
>>> hard do you think this would be to fix?
>> I will see if we can do something about it.
>>
>> I'd rather have this as an option instead of always enabled, because
>> having it always on will break error handling elsewhere.
> Interesting, for posterity, specifically what error handling would be
> broken by this?

The detection that a file "behind" a device is closed presently
happens in the interrupt controller - it maps EPOLL events on
the fds to IRQs. All file IO gets multiplexed through there.
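
As a rough userspace illustration of what shows up at that level (a
sketch only, assuming Linux and Python's select.epoll; the function
name is hypothetical and none of this is the actual UML code): one
pipe stands in for stdin and a second one for stdout. When the writer
of the "stdin" pipe goes away, epoll reports EPOLLHUP on its read end,
while the separate "stdout" fd stays writable.

```python
import os
import select


def poll_after_writer_exit():
    """Return (stdin_fd, stdout_fd, events) after the stdin writer is closed."""
    in_r, in_w = os.pipe()    # stands in for the pipe feeding stdin
    out_r, out_w = os.pipe()  # stands in for a separate stdout pipe

    ep = select.epoll()
    ep.register(in_r, select.EPOLLIN)
    ep.register(out_w, select.EPOLLOUT)

    os.close(in_w)  # the writer exits, e.g. 'cat /dev/null' finishing

    # EPOLLHUP is always reported, even though it was not requested.
    events = dict(ep.poll(timeout=1))

    ep.close()
    for fd in (in_r, out_r, out_w):
        os.close(fd)
    return in_r, out_w, events
```

The hangup is tied to the input fd only; closing the output side as
well is a decision taken above this layer.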

Alternatively, we can try playing with this in the upper layer - the tty 
system channels. Some of them use external helpers and correct handling 
of file close is essential for them to work properly. We had bugs along 
these lines and had to fix them. For example, commit
9b1c0c0e25dcccafd30e7d4c150c249cc65550eb in the kernel tree fixed an
actual tty handling bug.

All in all, if we add this functionality to the console/tty channels it 
should be an extra option, because it is opposite to existing behavior.

Brgds,

>
> Glenn
>

-- 
Anton R. Ivanov
https://www.kot-begemot.co.uk/



