[PATCH v2] PCI: rockchip: Avoid accessing PCIe registers with clocks gated

Robin Murphy robin.murphy at arm.com
Wed Jun 30 02:44:44 PDT 2021


On 2021-06-30 00:14, Bjorn Helgaas wrote:
> On Tue, Jun 29, 2021 at 11:52:44AM +0100, Robin Murphy wrote:
>> On 2021-06-29 07:17, Javier Martinez Canillas wrote:
>>> On 6/29/21 2:38 AM, Bjorn Helgaas wrote:
>>>> On Thu, Jun 24, 2021 at 05:40:40PM -0500, Bjorn Helgaas wrote:
>>>
>>> [snip]
>>>
>>>>>>
>>>>>> So let's just move all the IRQ init before the pci_host_probe() call, that
>>>>>> will prevent issues like this and seems to be the correct thing to do too.
>>>>>
>>>>> Previously we registered rockchip_pcie_subsys_irq_handler() and
>>>>> rockchip_pcie_client_irq_handler() before the PCIe clocks were
>>>>> enabled.  That's a problem because they depend on those clocks being
>>>>> enabled, and your patch fixes that.
>>>>>
>>>>> rockchip_pcie_legacy_int_handler() depends on rockchip->irq_domain,
>>>>> which isn't initialized until rockchip_pcie_init_irq_domain().
>>>>> Previously we registered rockchip_pcie_legacy_int_handler() as the
>>>>> handler for the "legacy" IRQ before rockchip_pcie_init_irq_domain().
>>>>>
>>>>> I think your patch *also* fixes that problem, right?
>>>>
>>>> The lack of consistency in how we use
>>>> irq_set_chained_handler_and_data() really bugs me.
>>>>
>>>> Your patch fixes the ordering issue where we installed
>>>> rockchip_pcie_legacy_int_handler() before initializing data
>>>> (rockchip->irq_domain) that it depends on.
>>>>
>>>> But AFAICT, rockchip still has the problem that we don't *unregister*
>>>> rockchip_pcie_legacy_int_handler() when the rockchip-pcie module is
>>>> removed.  Doesn't this mean that if we unload the module, then receive
>>>> an interrupt from the device, we'll try to call a function that is no
>>>> longer present?
>>>
>>> Good question, I don't know, to be honest. I'll have to dig deeper into
>>> this, but my experience is that module removal (and device unbind) is not
>>> that well tested on ARM device drivers in general.
>>
>> Well, it does use devm_request_irq() so the handler should be unregistered
>> by devres *after* ->remove has finished, however that does still leave a
>> potential race window in which a pending IRQ could be taken during the later
>> part of rockchip_pcie_remove() after it has started turning off critical
>> things. Unless the clocks and regulators can also be delegated to devres, it
>> might be more robust to explicitly manage the IRQs as well. Mixing the two
>> schemes can be problematic when the exact order of both setup and teardown
>> matters.
> 
> I don't understand the devm_request_irq() connection.  I'm looking at
> this irq_set_chained_handler_and_data() call [1]:
> 
>    static int rockchip_pcie_setup_irq(struct rockchip_pcie *rockchip)
>    {
>      ...
>      irq = platform_get_irq_byname(pdev, "legacy");
>      irq_set_chained_handler_and_data(irq,
> 				     rockchip_pcie_legacy_int_handler,
> 				     rockchip);
> 
>      irq = platform_get_irq_byname(pdev, "client");
>      ...
> 
> We look up "irq", pass it to irq_set_chained_handler_and_data(), and
> throw it away without saving it anywhere.  How would anything know how
> to unregister rockchip_pcie_legacy_int_handler()?
> 
> I could imagine irq_set_chained_handler_and_data() saving what's
> needed for unregistration, but I would think that would require a
> device pointer, which we don't give it.
> 
> I'm IRQ-illiterate, so please educate me!

Oh, my mistake, I was looking at the "sys" and "client" IRQs and spoke 
too soon; the "legacy" IRQ does appear to be completely leaked as you 
say, so there are two degrees of issue here.
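
To make the "legacy" case concrete, here is a rough userspace model of
the leak and of one possible fix. It is only a sketch and not kernel
code: every model_* name is made up for illustration, the stub merely
mimics the install/uninstall behaviour of
irq_set_chained_handler_and_data(), and the legacy_irq field is an
assumption (the driver as quoted above throws the IRQ number away). In
the kernel, the usual way to drop a chained handler is to call
irq_set_chained_handler_and_data(irq, NULL, NULL), which is what
model_remove() imitates.

```c
/* Userspace model only; nothing here is the real driver or IRQ core. */
#include <assert.h>
#include <stddef.h>

typedef void (*model_flow_handler_t)(void *data);

struct model_irq_slot {
	model_flow_handler_t handler;
	void *data;
};

/* Tiny stand-in for the IRQ descriptor table. */
static struct model_irq_slot model_irq_table[8];

/*
 * Stand-in for irq_set_chained_handler_and_data(): installs a handler,
 * or uninstalls it when handler is NULL.
 */
static void model_set_chained_handler_and_data(unsigned int irq,
						model_flow_handler_t handler,
						void *data)
{
	assert(irq < 8);
	model_irq_table[irq].handler = handler;
	model_irq_table[irq].data = data;
}

struct model_pcie {
	unsigned int legacy_irq;	/* hypothetical: saved for teardown */
	int legacy_events;		/* counts handler invocations */
};

static void model_legacy_int_handler(void *data)
{
	struct model_pcie *pcie = data;

	pcie->legacy_events++;
}

/* Setup: remember the IRQ number instead of discarding it. */
static void model_setup_irq(struct model_pcie *pcie, unsigned int irq)
{
	pcie->legacy_irq = irq;
	model_set_chained_handler_and_data(irq, model_legacy_int_handler,
					   pcie);
}

/* Teardown: explicitly uninstall the chained handler. */
static void model_remove(struct model_pcie *pcie)
{
	model_set_chained_handler_and_data(pcie->legacy_irq, NULL, NULL);
}

/* Deliver an interrupt; returns 1 if a handler ran, 0 if none is set. */
static int model_raise_irq(unsigned int irq)
{
	struct model_irq_slot *slot = &model_irq_table[irq];

	if (!slot->handler)
		return 0;
	slot->handler(slot->data);
	return 1;
}
```

Calling model_setup_irq(), then model_raise_irq() runs the handler once;
after model_remove() the same model_raise_irq() call finds nothing to
run. Without the model_remove() step the slot would keep pointing at the
handler (and at the pcie data) after the module is gone, which is the
leak being discussed.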

Robin.

> 
> Bjorn
> 
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/pci/controller/pcie-rockchip-host.c?id=v5.13#n562
> 



More information about the Linux-rockchip mailing list