Boot hang with SiFive PLIC when routing I2C-HID level-triggered interrupts

Eva Kurchatova nyandarknessgirl at gmail.com
Thu Mar 14 20:33:06 PDT 2024


On Thu, Mar 14, 2024 at 11:46 PM Conor Dooley <conor at kernel.org> wrote:
> This immediately seemed odd to me, but I have no reason to disbelieve
> you, given you say this was discovered in RVVM which is an emulator and
> you should know whether or not registers are accessed.
> The very first action taken by the ocores i2c controller driver when it
> gets an interrupt though is to read a register:
>
>         u8 stat = oc_getreg(i2c, OCI2C_STATUS);
>
> I would expect that this handler would be called, and therefore you'd
> see the register read, had the probe function of that driver run to
> completion. I'd also expect that the interrupt would not even be
> unmasked if that probe function had failed.
> In your case though, you can see that the interrupt is not masked,
> since it is being raised and handled repeatedly by the PLIC driver.
> Has the i2c controller driver probed in the period of boot that you say
> this problem manifests?

The ocores I2C driver is not the problem here. The I2C-HID device
itself has a separate IRQ line, which is level-triggered.
It signals the host that input is available without polling,
since I2C is a master-driven bus with no "data available" notifications.
So in reality the I2C-HID driver should handle the interrupt; it then
uses the I2C controller to access the I2C-HID slave registers (via I2C)
and read the incoming HID input report. I2C controller interrupts are
unrelated; the controller is just the link between the HID device and
the host, and it doesn't seem to be touched at all inside the I2C-HID
IRQ handler (so what I observe is just a pair of Claim/Complete
actions). ocores I2C interrupts are not generated at that point (and
shouldn't be), because no I2C transfer was initiated at all.
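
To illustrate the sequence described above, here is a minimal model in C
of a single level-triggered source behind a PLIC-style gateway. It is
only a sketch: struct plic_src, plic_sample(), plic_claim(),
plic_complete() and run_rounds() are hypothetical names made up for this
example, not taken from the kernel or from RVVM. It assumes the line
stays high until the handler reads the HID input report.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical one-source model; all names are illustrative only. */
struct plic_src {
    bool line_high;   /* level of the device's IRQ wire */
    bool pending;     /* request latched by the gateway */
    bool in_service;  /* claimed but not yet completed */
};

/* Gateway: forward a level request only if none is already in flight. */
static void plic_sample(struct plic_src *s)
{
    if (s->line_high && !s->in_service)
        s->pending = true;
}

/* Claim: returns true if there was a pending request to take. */
static bool plic_claim(struct plic_src *s)
{
    if (!s->pending)
        return false;
    s->pending = false;
    s->in_service = true;
    return true;
}

/* Complete: reopen the gateway; a still-high line becomes pending again. */
static void plic_complete(struct plic_src *s)
{
    s->in_service = false;
    plic_sample(s);
}

/*
 * Count Claim/Complete rounds until the source goes quiet, capped at
 * max_rounds. service_device models whether the handler actually reads
 * the input report (assumed here to be what deasserts the line).
 */
static int run_rounds(bool service_device, int max_rounds)
{
    struct plic_src s = { .line_high = true };
    int rounds = 0;

    assert(max_rounds > 0);
    plic_sample(&s);
    while (rounds < max_rounds && plic_claim(&s)) {
        if (service_device)
            s.line_high = false;  /* report read, device drops the line */
        plic_complete(&s);
        rounds++;
    }
    return rounds;
}
```

In this model run_rounds(true, 100) stops after one round, while
run_rounds(false, 100) runs into the cap: a bare Claim/Complete pair
with no device access in between re-raises the interrupt every time.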

There is a way to make the I2C-HID device edge-triggered in the RVVM
emulation code, but it's not actually spec compliant. It gets rid of the
hang too; however, the same Claim/Complete actions without any
handling inside the IRQ handler are observed at least once, which
technically means a lost interrupt (pressing a key somewhere early in
boot thus doesn't propagate the keypress to the guest until you press
another key later, after which both HID reports are read), so it's not
how I'd like to mitigate this in the emulator code.
I and another developer from the Haiku OS team (X512) are almost sure
this is a kernel bug related to level-triggered IRQs with the PLIC, or a
specific incompatibility of PLIC/I2C-HID (not the ocores I2C controller).
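
The lost-interrupt effect of the edge-triggered workaround can be
sketched the same way. This is a simplified model with hypothetical
names (struct hid_dev, hid_press(), irq_round()); it assumes each
keypress queues one report and produces one edge, and that an edge is
consumed by a Claim/Complete round whether or not the handler reads
anything.

```c
#include <stdbool.h>

/* Hypothetical edge-triggered HID device model; names are illustrative. */
struct hid_dev {
    int  queued_reports;  /* input reports waiting to be read over I2C */
    bool edge_latched;    /* one edge latched per keypress */
};

/* A keypress queues a report and raises one edge. */
static void hid_press(struct hid_dev *d)
{
    d->queued_reports++;
    d->edge_latched = true;
}

/*
 * One Claim/Complete round. Returns how many reports the handler read.
 * The edge is consumed either way; with handler_reads == false the
 * report stays queued and nothing signals it again.
 */
static int irq_round(struct hid_dev *d, bool handler_reads)
{
    int n = 0;

    if (!d->edge_latched)
        return 0;
    d->edge_latched = false;
    if (handler_reads) {
        n = d->queued_reports;
        d->queued_reports = 0;
    }
    return n;
}
```

After one hid_press() and an irq_round(&d, false), the report sits in
the queue with no interrupt outstanding. Only a second hid_press()
followed by irq_round(&d, true) drains it, returning both reports at
once, which matches the delayed keypress described above.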

This hang is not reproducible with a Haiku OS guest in any way and
most of the drivers involved seem to be FreeBSD based or written by
X512 (Specifically the PLIC and I2C-HID drivers are). This person also
believes that this Claim/Complete behavior on PLIC side without any
other actions made in between is erroneous kernel behavior too.

I am open to discussing what specifically could be wrong with the VM,
since one of my end goals is to just make HID devices work without
issues there; however, if a simple 2-line patch (whose implications I'm
not entirely sure of) that removes the return path at line 223 in the
PLIC driver resolves the issue (I kept a guest in a 24 hr reboot loop
whilst spamming fake I2C-HID input and no hang was observed), then it
does lead me to believe that it's at least not some blatant emulation
issue.

I came here to collect some kernel developers' opinions, since we have
been debugging this for about 2 weeks already. Your initial
understanding that something is wrong with the ocores I2C controller is
not what I meant, sorry that my explanation was lacking.

> Are interrupts unmasked by default on RVVM?

By default all PLIC ENABLE registers are set to zero. All PRIO and
THRESHOLD registers are zero on reset as well. So all PLIC state is
simply zeroed on reset, as can be seen here:
https://github.com/LekKit/RVVM/blob/f81df57a2af77cbae25fd3cc65d07106d9505e23/src/devices/plic.c#L265
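
As a rough illustration of what an all-zero reset state implies, the
sketch below models one PLIC context. It is not RVVM's actual code:
struct plic_state, SRC_MAX, plic_reset() and plic_would_deliver() are
hypothetical names, and the single-context, 32-source layout is an
assumption made for brevity.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SRC_MAX 32  /* hypothetical number of sources, source 0 unused */

/* Hypothetical single-context PLIC state; not RVVM's real layout. */
struct plic_state {
    uint32_t prio[SRC_MAX];  /* per-source priority */
    uint32_t enable;         /* per-source enable bits for the context */
    uint32_t threshold;      /* context priority threshold */
    uint32_t pending;        /* per-source pending bits */
};

/* Reset: everything zeroed, so every source is disabled at priority 0. */
static void plic_reset(struct plic_state *p)
{
    memset(p, 0, sizeof(*p));
}

/*
 * A source reaches the hart only if it is pending, enabled for the
 * context, and its priority is strictly above the threshold.
 */
static bool plic_would_deliver(const struct plic_state *p, unsigned src)
{
    uint32_t bit;

    if (src == 0 || src >= SRC_MAX)
        return false;
    bit = UINT32_C(1) << src;
    return (p->pending & bit) && (p->enable & bit) &&
           p->prio[src] > p->threshold;
}
```

Right after plic_reset(), plic_would_deliver() is false for every
source even if its pending bit is set; the guest has to set both an
ENABLE bit and a non-zero priority before anything is delivered.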

> Have you checked that this actually affects any actual hardware?

I might very soon, if no one has immediate ideas about what is wrong;
the problem is that I don't have hardware that exposes PLIC IRQ lines
to the user. It might be possible to use some FPGA, or at least to
reproduce it in other simulators.


