Bug 214611 - UM: stdout output ceases under certain conditions

Glenn Washburn development at efficientek.com
Wed Oct 6 14:28:27 PDT 2021


On Wed, 6 Oct 2021 20:52:40 +0100
Anton Ivanov <anton.ivanov at kot-begemot.co.uk> wrote:

> On 06/10/2021 20:48, Glenn Washburn wrote:
> > On Wed, 6 Oct 2021 19:53:48 +0100
> > Anton Ivanov <anton.ivanov at kot-begemot.co.uk> wrote:
> >
> >> On 06/10/2021 19:05, Glenn Washburn wrote:
> >>> On Wed, 6 Oct 2021 17:44:14 +0100
> >>> Anton Ivanov <anton.ivanov at kot-begemot.co.uk> wrote:
> >>>
> >>>> On 06/10/2021 16:57, Anton Ivanov wrote:
> >>>>> On 04/10/2021 17:54, Glenn Washburn wrote:
> >>>>>> On Mon, 04 Oct 2021 14:48:34 +0200
> >>>>>> Johannes Berg <johannes at sipsolutions.net> wrote:
> >>>>>>
> >>>>>>> On Sat, 2021-10-02 at 21:00 -0500, Glenn Washburn wrote:
> >>>>>>>> Hi list,
> >>>>>>>>
> >>>>>>>> I'm notifying the list of a bug report[1] I created in the kernel
> >>>>>>>> bugzilla. I'm not subscribed to this list, so please add my email
> >>>>>>>> address to any replies to this email.
> >>>>>>>> [1] https://bugzilla.kernel.org/show_bug.cgi?id=214611
> >>>>>>> This really has nothing to do with UBD or something. What's happening is
> >>>>>>> that you're using the command line badly.
> >>>>>>>
> >>>>>>> What do you expect this:
> >>>>>>>
> >>>>>>>      ... < <(cat /dev/null)
> >>>>>>>
> >>>>>>> to do?
> >>>>>> This was just a way to trigger the issue I was seeing. I have a bash
> >>>>>> script which was doing something like the following:
> >>>>>>
> >>>>>> grep "search" /path/to/file |
> >>>>>> while read -r VAR; do
> >>>>>>      run_some_script_which_eventually_runs_uml "$VAR"
> >>>>>> done
> >>>>>>
> >>>>>> I was confused why running this script always caused UML to lose output
> >>>>>> when mounting the ubd in the UML mount script. And it didn't happen
> >>>>>> when I ran "run_some_script_which_eventually_runs_uml" alone. Since the
> >>>>>> amount of data returned by the grep was small, this issue was triggered
> >>>>>> all the time. If the output were a lot of data, I might have noticed
> >>>>>> that early runs of run_some_script_which_eventually_runs_uml would not
> >>>>>> have output disappear after mounting. Thanks for debugging this.
> >>>>>>
> >>>>>>> What happens is that the shell creates a pipe. This pipe is connected on
> >>>>>>> the one side to fd:0 in UML (stdin) and on the other to stdout of 'cat'.
> >>>>>>>
> >>>>>>> Now this is all fine, but 'cat' will *quit immediately* since it cannot
> >>>>>>> read anything from /dev/null (reads from it return EOF at once).
> >>>>>>>
> >>>>>>> Therefore, the fd:0 in UML will be invalidated pretty much immediately,
> >>>>>>> receiving EPOLLHUP.
> >>>>>>>
> >>>>>>> This is detected by the epoll code, raising an interrupt into the line
> >>>>>>> level code, and the line code then closes the stdio console channel
> >>>>>>> entirely, including stdout.
> >>>>>> This seems like it could be a bug. Couldn't the console not be closed,
> >>>>>> but the console handling code internally mark stdin as closed? Perhaps
> >>>>>> there could even be logic to detect if stdin and stdout are from the
> >>>>>> same fd, then close the console, otherwise don't. From a user
> >>>>>> perspective, thinking of UML as a normal process, it doesn't make sense
> >>>>>> that closing stdin would close stdout as well.
> >>>>> There is an even more convoluted case where the stdin is a socket (which
> >>>>> is possible - you pass it to UML as a fd:N). That can be half-closed.
> >>>>>
> >>>>> Looking at it at the moment, but to be honest, separating the logic for in
> >>>>> and out if the fd is the same is going to be quite difficult (if at all
> >>>>> possible). It all ends as EPOLL events at the bottom. Even if you handle IN
> >>>>> and OUT separately in the upper layers, the kernel will handle them as the
> >>>>> same fd and any event (f.e. closure) will show up on both.
> >>>> Further to this, the same holds even if we start playing games with multiple
> >>>> EPOLL descriptors, dup-ing fds, etc, the event will still show up on all of
> >>>> them.
> >>> Thanks for looking into this. If I'm understanding correctly, you're
> >>> looking at the case where the UML process has STDIN and STDOUT to the
> >>> same file descriptor. However, the situation is when STDIN is to a pipe
> >>> that gets closed and STDOUT is to something else (pty, tty, file,
> >>> different pipe, etc..). Does your logic still hold true in this case?
> >> No. They should be on different IRQs.
> >>
> >> Question:
> >>
> >> Have you tried using con0=null,fd:1 ?
> >>
> >> Assign null explicitly to the input instead of a fd which is closed?
> >>
> >> A
> > I just tried that and it does not trigger the bug, which I'd expect.
> > This would be another work around, but I think it would be good to fix
> > the bug. What if you want to pipe some data to stdin? Then when the
> > program on the write side of the pipe exits because it's done sending
> > data, the UML will stop sending data to stdout because the pipe gets
> > closed. How hard do you think this would be to fix?
> 
> I will see if we can do something about it.
> 
> I'd rather have this as an option instead of always enabled, because 
> having it always on will break error handling elsewhere.

Interesting. For posterity, what error handling specifically would be
broken by this?

Glenn
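
The pipe behaviour described in the quoted explanation can be checked
outside UML. What follows is a minimal sketch in Python, assuming a Linux
host. The function name poll_after_writer_exit is made up for illustration,
and none of this is UML code. It creates a pipe, closes the write end to
stand in for 'cat' exiting, and asks epoll what it reports on the read end.

```python
import os
import select


def poll_after_writer_exit():
    """Return the epoll event mask seen on a pipe's read end once the writer is gone."""
    read_fd, write_fd = os.pipe()
    # Stand-in for the writer ('cat' in the thread) exiting without writing.
    os.close(write_fd)

    ep = select.epoll()
    try:
        # Only input readiness is requested; hang-up is reported regardless.
        ep.register(read_fd, select.EPOLLIN)
        events = ep.poll(timeout=1)
    finally:
        ep.close()
        os.close(read_fd)

    # events is a list of (fd, mask) pairs; 0 means nothing was reported.
    return dict(events).get(read_fd, 0)


mask = poll_after_writer_exit()
print("EPOLLHUP set:", bool(mask & select.EPOLLHUP))
```

On Linux the mask is expected to include EPOLLHUP even though only EPOLLIN
was registered. That is the event the thread says the line code reacts to
by closing the whole console channel, stdout included.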


