Question about potential missed IPI events

Bo Gan ganboing at
Thu Nov 16 00:41:09 PST 2023

On 11/15/23 7:12 PM, Xiang W wrote:
> On Wed, 2023-11-15 at 16:33 -0800, Bo Gan wrote:
>> On 11/10/23 10:05 PM, Bo Gan wrote:
>>> Right before Hart A calls ipi_dev->ipi_clear() for the msip write-1 issued by
>>> hart B, Hart C did an ipi_dev->ipi_send to hart A. Hart A then cleared the msip
>>> bit, so both IPIs (from B and C) were coalesced into a single one. In this case,
>>> A will observe the ipi_type set by B, due to the smp_wmb + __io_bw, and the IPI
>>> was indeed generated by B's write to msip. However, will the barrier ensure that
>>> A also observes the write to ipi_type from hart C? For the coalescing to happen,
>>> the clint must observe C's write-1 before A's write-0. Does that imply that A's
>>> xchg will observe C's write to ipi_type?
> Does the picture below represent what you mean?
>   B             A             C
>   |             |             |
>   | xchg        |             |
>   | send------->|             |
>   |             | xchg        |
>   |             |             | xchg
>   |             |<------------| send
>   |             | clear       |

Hi Xiang, thank you so much for replying. I'd like to revise your chart a little:

Case 1, xchg_ulong in `ipi_process` gets reordered before ipi_clear:

   B                 A                C                 CLINT
   |                 |                |                  |
   | bit_set         |                |                  |
   | send -----------|----------------|----------------> |
   |                 | xchg           |                  |
   |                 |                | bit_set          |
   |                 |                | send ----------> |
   |                 | clear----------|----------------> |

In this case, A would observe C's write to ipi_data->ipi_type *eventually*, but
when? A has already consumed ipi_type and then cleared msip, so it won't process
C's IPI request until some other IPI arrives in the future, which means the
request might wait indefinitely. This is not an efficiency problem but a
correctness problem. I think the fix should be to add a specific fence (I still
need to reason about the correct one) between `ipi_dev->ipi_clear` and
`atomic_raw_xchg_ulong` in `ipi_process`.
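
To make Case 1 concrete, here is a minimal sketch of a toy model, not OpenSBI
code. It assumes sequential consistency, assumes B's bit_set and send have
completed before A starts, and replays every interleaving of A's two steps
(xchg, clear) with C's two steps (bit_set, send). The names `EVENT_B`,
`EVENT_C`, `lost_wakeup` and `count_lost_wakeups` are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define EVENT_B (1UL << 0) /* hypothetical event bit set by hart B */
#define EVENT_C (1UL << 1) /* hypothetical event bit set by hart C */

/* Every interleaving of A's two steps with C's two steps, in program order. */
static const char *const schedules[] = {
    "AACC", "ACAC", "ACCA", "CAAC", "CACA", "CCAA",
};

/*
 * Replay one schedule under sequential consistency. B's bit_set and send
 * are assumed complete before A starts. Returns true when an event is left
 * in ipi_type while msip is clear, i.e. nothing will wake A to handle it.
 */
static bool lost_wakeup(const char *sched, bool xchg_first)
{
    unsigned long ipi_type = EVENT_B; /* B: bit_set */
    unsigned int msip = 1;            /* B: send */
    int a_step = 0, c_step = 0;

    for (const char *p = sched; *p != '\0'; p++) {
        if (*p == 'A') {
            /* A's first step is the xchg only when xchg_first is set. */
            bool do_xchg = ((a_step++ == 0) == xchg_first);
            if (do_xchg)
                ipi_type = 0; /* xchg: consume all pending events */
            else
                msip = 0;     /* ipi_clear: write-0 to msip */
        } else {
            if (c_step++ == 0)
                ipi_type |= EVENT_C; /* C: bit_set */
            else
                msip = 1;            /* C: send, write-1 to msip */
        }
    }
    return ipi_type != 0 && msip == 0;
}

/* Count the schedules that strand an event for the given ordering of A. */
static int count_lost_wakeups(bool xchg_first)
{
    int lost = 0;
    for (size_t i = 0; i < sizeof(schedules) / sizeof(schedules[0]); i++)
        lost += lost_wakeup(schedules[i], xchg_first);
    return lost;
}
```

Under these assumptions, `count_lost_wakeups(true)` finds exactly one losing
schedule ("ACCA", which is the Case 1 chart), while `count_lost_wakeups(false)`
finds none. That is only an argument about ordering in a simplified model; it
does not say which RISC-V fence is sufficient on real hardware.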

Case 2, xchg_ulong doesn't get reordered, however:

   B                 A                C                 CLINT
   |                 |                |                  |
   | bit_set         |                |                  |
   | send -----------|----------------|----------------> |
   |                 | xchg           |            /---> |
   |                 | clear----------|-----------/----> |
   |                 |                | bit_set  /       |
   |                 |                | send ----        |

In this case, the same thing can happen: the request from C can be delayed
indefinitely. The cause is that the clint observes memory order differently
from the harts. I'm wondering whether the clint can be viewed as another hart
that follows RVWMO. If so, then this case is likely invalid.
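
Case 2 can be sketched the same way. This is a toy model under an assumption
the question itself leaves open: that the clint may apply A's write-0 and C's
write-1 in an order different from the one the harts see. The function name
`case2_loses_event` and the event bit positions are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Replay the Case 2 chart. The only free choice is which of the two
 * in-flight msip writes the clint applies first. Returns true when C's
 * event is left in ipi_type while msip ends up clear.
 */
static bool case2_loses_event(bool c_send_arrives_first)
{
    unsigned long ipi_type = 0;
    unsigned int msip = 0;

    ipi_type |= 1UL << 0; /* B: bit_set */
    msip = 1;             /* B: send reaches the clint */
    ipi_type = 0;         /* A: xchg consumes B's event */

    /* A issues its clear (write-0); C then does bit_set and issues send. */
    ipi_type |= 1UL << 1; /* C: bit_set, after A's xchg */

    /* The clint applies the two in-flight writes in the chosen order. */
    if (c_send_arrives_first) {
        msip = 1; /* C: write-1 */
        msip = 0; /* A: write-0 lands last and wipes C's IPI */
    } else {
        msip = 0; /* A: write-0 */
        msip = 1; /* C: write-1 lands last, A gets interrupted again */
    }
    return ipi_type != 0 && msip == 0;
}
```

In this model the event is stranded only when C's write-1 is applied before
A's write-0, even though A issued its write first. If the clint behaves like
another hart under RVWMO, that ordering may not be reachable at all.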

