Race between SIGIO and epoll from SMP host

Anton Ivanov anton.ivanov at kot-begemot.co.uk
Wed Apr 21 16:45:31 BST 2021



On 21/04/2021 14:35, YiFei Zhu wrote:
> On Wed, Apr 21, 2021 at 7:32 AM Anton Ivanov
> <anton.ivanov at kot-begemot.co.uk> wrote:
>>> Considering that this is a race on the host, what would be the best
>>> way to fix this?
>>
>> Interesting one. I need to think.
>>
>> One option would be to wait for epoll events with a timeout which is larger than zero - f.e. HZ.
> 
> I was about to say I could reproduce it even with a timeout of 1ms,
> then I realized that code I pasted above already used 1ms timeout.
> Assertion failures using 1ms timeout seems much rarer than 0 timeout
> however.
> 
> For reference my CONFIG_HZ on the host is 1000. I also use
> CONFIG_NO_HZ_IDLE if that's relevant (I'm not too familiar with how
> the kernel ticking works).
> 
>> If we have received a SIGIO there is an epoll event on the way. The fact that it is not in the queue right now means that we are due to process it shortly.

This seems to be limited to ttys. I still need to figure out why.

If this turns out to be tty-specific, we can re-enable the tty workaround that dates from when ttys did not produce SIGIO on write correctly.

That workaround ends up disabled on most modern machines, because modern kernels produce SIGIO on write correctly for ttys.

With the workaround enabled, a helper thread runs its own poll loop and produces an extra IO event after the notification appears there, so the stall should never happen.
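
For illustration only, the non-zero timeout idea quoted above could be sketched roughly as follows. This is not the UML code: wait_after_sigio() and MAX_EVENTS are hypothetical names, and the timeout value is whatever the caller picks. As noted in the quoted text, a 1ms timeout made the failure rarer but did not remove it, so this narrows the window rather than closing it.

```c
#include <errno.h>
#include <sys/epoll.h>
#include <unistd.h>

#define MAX_EVENTS 8	/* hypothetical batch size, not taken from UML */

/*
 * Hypothetical helper, meant to be called once a SIGIO has been seen.
 * Instead of polling with a zero timeout (which can find the queue
 * empty if the epoll event has not been queued yet), block for up to
 * timeout_ms so the pending event has a chance to arrive.
 *
 * Returns the number of ready events, 0 on timeout, or -1 on error.
 */
static int wait_after_sigio(int epfd, struct epoll_event *events,
			    int max_events, int timeout_ms)
{
	int n;

	do {
		/* Restarting after EINTR restarts the full timeout. */
		n = epoll_wait(epfd, events, max_events, timeout_ms);
	} while (n < 0 && errno == EINTR);

	return n;
}
```

A caller would pass 0 for the current behaviour and a small positive value for the variant discussed here.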

A.

>>
>> A.
> 
> YiFei Zhu
> 
> _______________________________________________
> linux-um mailing list
> linux-um at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-um
> 

-- 
Anton R. Ivanov
https://www.kot-begemot.co.uk/
