[EXT] Re: [PATCH 1/3] irqchip: imx mu worked as msi controller

Frank Li frank.li at nxp.com
Sat Jul 9 08:23:18 PDT 2022



> -----Original Message-----
> From: Marc Zyngier <maz at kernel.org>
> Sent: Saturday, July 9, 2022 3:16 AM
> To: Frank Li <frank.li at nxp.com>
> Cc: tglx at linutronix.de; robh+dt at kernel.org;
> krzysztof.kozlowski+dt at linaro.org; shawnguo at kernel.org;
> s.hauer at pengutronix.de; kw at linux.com; bhelgaas at google.com; linux-
> kernel at vger.kernel.org; devicetree at vger.kernel.org; linux-arm-
> kernel at lists.infradead.org; linux-pci at vger.kernel.org; Peng Fan
> <peng.fan at nxp.com>; Aisheng Dong <aisheng.dong at nxp.com>;
> jdmason at kudzu.us; kernel at pengutronix.de; festevam at gmail.com; dl-linux-
> imx <linux-imx at nxp.com>; kishon at ti.com; lorenzo.pieralisi at arm.com;
> ntb at lists.linux.dev
> Subject: Re: [EXT] Re: [PATCH 1/3] irqchip: imx mu worked as msi controller
> 
> Caution: EXT Email
> 
> On Fri, 08 Jul 2022 17:26:33 +0100,
> Frank Li <frank.li at nxp.com> wrote:
> >
> >
> >
> > > -----Original Message-----
> > > From: Marc Zyngier <maz at kernel.org>
> > > Sent: Friday, July 8, 2022 3:59 AM
> > > To: Frank Li <frank.li at nxp.com>
> > > Cc: tglx at linutronix.de; robh+dt at kernel.org;
> > > krzysztof.kozlowski+dt at linaro.org; shawnguo at kernel.org;
> > > s.hauer at pengutronix.de; kw at linux.com; bhelgaas at google.com; linux-
> > > kernel at vger.kernel.org; devicetree at vger.kernel.org; linux-arm-
> > > kernel at lists.infradead.org; linux-pci at vger.kernel.org; Peng Fan
> > > <peng.fan at nxp.com>; Aisheng Dong <aisheng.dong at nxp.com>;
> > > jdmason at kudzu.us; kernel at pengutronix.de; festevam at gmail.com; dl-
> linux-
> > > imx <linux-imx at nxp.com>; kishon at ti.com; lorenzo.pieralisi at arm.com;
> > > ntb at lists.linux.dev
> > > Subject: [EXT] Re: [PATCH 1/3] irqchip: imx mu worked as msi controller
> > >
> > > Caution: EXT Email
> > >
> > > On Thu, 07 Jul 2022 22:02:36 +0100,
> > > Frank Li <Frank.Li at nxp.com> wrote:
> > > >
> > > > MU support generate irq by write data to a register.
> > > > This patch make mu worked as msi controller.
> > > > So MU can do doorbell by using standard msi api.
> > > >
> > > > Signed-off-by: Frank Li <Frank.Li at nxp.com>
> > > > ---
> > > >  drivers/irqchip/Kconfig          |   7 +
> > > >  drivers/irqchip/Makefile         |   1 +
> > > >  drivers/irqchip/irq-imx-mu-msi.c | 490
> > > +++++++++++++++++++++++++++++++
> > > >  3 files changed, 498 insertions(+)
> > > >  create mode 100644 drivers/irqchip/irq-imx-mu-msi.c
> > > >
> 
> [...]
> 
> > > > +static void imx_mu_msi_mask_irq(struct irq_data *data)
> > > > +{
> > > > +     struct imx_mu_msi *msi_data = irq_data_get_irq_chip_data(data-
> > > >parent_data);
> > > > +
> > > > +     pci_msi_mask_irq(data);
> > >
> > > What is this? Below, you create a platform MSI domain. Either you
> > > support PCI, and you create a PCI/MSI domain (and the above may make
> > > sense), or you are doing platform MSI, and the above is non-sense.
> >
> > [Frank Li] You are right. This work as platform msi. Needn't call pci_msi_irq()
> 
> OK, hold that thought and see below.
> 
> > > > +static int imx_mu_msi_domain_irq_alloc(struct irq_domain *domain,
> > > > +                                     unsigned int virq,
> > > > +                                     unsigned int nr_irqs,
> > > > +                                     void *args)
> > > > +{
> > > > +     struct imx_mu_msi *msi_data = domain->host_data;
> > > > +     msi_alloc_info_t *info = args;
> > > > +     int pos, err = 0;
> > > > +
> > > > +     pm_runtime_get_sync(&msi_data->pdev->dev);
> > >
> > > The core code already deals with runtime PM. What prevents it from
> > > working, other than the fact you don't populate the device in the
> > > top-level domain?
> >
> > [Frank Li]  Do you means power domain or irq domain?
> 
> IRQ domain. See irq_domain_set_pm_device() and how PM is used on
> interrupt request.
> 
> [...]
> 
> > > > +static void imx_mu_msi_domain_irq_free(struct irq_domain *domain,
> > > > +                                    unsigned int virq, unsigned int nr_irqs)
> > > > +{
> > > > +     struct irq_data *d = irq_domain_get_irq_data(domain, virq);
> > > > +     struct imx_mu_msi *msi_data = irq_data_get_irq_chip_data(d);
> > > > +     int pos;
> > > > +
> > > > +     pos = d->hwirq;
> > > > +     if (pos < 0 || pos >= msi_data->irqs_num) {
> > > > +             pr_err("failed to teardown msi. Invalid hwirq %d\n", pos);
> > > > +             return;
> > > > +     }
> > >
> > > How can this happen?
> >
> > I just copy from irq-ls-scfg-msi.c
> 
> I wish you didn't do that.
> 
> > It should be impossible happen if everything work as expected.
> 
> Then it should go.
> 
> [...]
> 
> > > > +static const struct imx_mu_dcfg imx_mu_cfg_imx6sx = {
> > > > +     .xTR    = 0x0,
> > > > +     .xRR    = 0x10,
> > > > +     .xSR    = {0x20, 0x20, 0x20, 0x20},
> > > > +     .xCR    = {0x24, 0x24, 0x24, 0x24},
> > > > +};
> > > > +
> > > > +static const struct imx_mu_dcfg imx_mu_cfg_imx7ulp = {
> > > > +     .xTR    = 0x20,
> > > > +     .xRR    = 0x40,
> > > > +     .xSR    = {0x60, 0x60, 0x60, 0x60},
> > > > +     .xCR    = {0x64, 0x64, 0x64, 0x64},
> > > > +};
> > > > +
> > > > +static const struct imx_mu_dcfg imx_mu_cfg_imx8ulp = {
> > > > +     .type   = IMX_MU_V2,
> > > > +     .xTR    = 0x200,
> > > > +     .xRR    = 0x280,
> > > > +     .xSR    = {0xC, 0x118, 0x124, 0x12C},
> > > > +     .xCR    = {0x110, 0x114, 0x120, 0x128},
> > > > +};
> > > > +
> > > > +static const struct imx_mu_dcfg imx_mu_cfg_imx8ulp_s4 = {
> > > > +     .type   = IMX_MU_V2 | IMX_MU_V2_S4,
> > > > +     .xTR    = 0x200,
> > > > +     .xRR    = 0x280,
> > > > +     .xSR    = {0xC, 0x118, 0x124, 0x12C},
> > > > +     .xCR    = {0x110, 0x114, 0x120, 0x128},
> > > > +};
> > >
> > > What are these? We really don't need more magic numbers.
> >
> > It is register offset.  The difference version MU hardware's
> > register map is difference.
> 
> Then please document what this is, what the various registers are, and
> correctly set type everywhere.
> 
> [...]
> 
> > > If that's hardcoded, why do we need an extra variable? I also question
> > > the usefulness of this driver if the HW can only deal with *4* MSIs...
> > > This looks a bit like a joke.
> >
> > MU don't really MSI controller.  Each MU have 4 channel.
> > I.MX have several MU units.
> 
> Then is it really useful to model that as a MSI controller? This
> smells of a mailbox controller to me instead. 

MU already have driver as mailbox controller. But mailbox interface can't
Match PCI-EP requirement.  

┌───────┐          ┌──────────┐
│       │          │          │
│       │          │ PCI Host │
│       │          │          │
│       │          │          │
│       │          │ Bar0     │
│ PCI   │          │ Bar1     │
│ Func  │          │ Bar2     │
│       │          │ Bar3     │
│       │          │ Bar4     │
│       ├─────────►│          │
└───────┘ MSI      └──────────┘

Many PCI controllers provided Endpoint functions.
Generally PCI endpoint is hardware, which is not running a rich OS, like linux.

But Linux also supports endpoint functions.  PCI Host write bar<n> space like
write to memory. The EP side can't know memory changed by the Host driver. 

PCI Spec has not defined a standard method to do that.  Only define MSI<x> to let
EP notified RC status change. 

The basic idea is to trigger an irq when PCI RC writes to a memory address.  That's
what MSI controller provided.  EP drivers just need to request a platform MSI interrupt, 
struct msi_msg *msg will pass down a memory address and data.  EP driver will
map such memory address to one of PCI bar<n>.  Host just writes such an address to
trigger EP side irq.

> And I really worry that
> this device doesn't correctly preserve the ordering between a device
> doing DMA and generating an interrupt to indicate completion of the
> DMA transaction... Does this block offers such a guarantee?

According to PCI memory model,  the sequence of write is the ordered.
PCI host write data to DDR, then write data to MSI (MU)'s register are ordered. 

PCI->AXI bridge will guarantee the order, otherwise it will block memory model.

> 
> > PCI EP driver need an address as doorbell,  so PCI RC side can write
> > This address to trigger irq.  Ideally,  it use GIC-ITS. But our i.MX chip
> > Have not ITS support yet now.  So we can use MU as simple MSI controller.
> 
> Is that an integrated EP on the same SoC? Or are you talking of two
> SoCs connected over PCIe? 

Two Socs connected over PCIe.

> Also, you explicitly said that this was
> *not* a PCI/MSI controller. So what is this all about?

I think real MSI controller works as
1.  Write data to address to trigger irq.   Data is  DID | irq_number. 
2.  MSI controller can distribute difference irq to difference core. 
3.  There are status bit, which match irq_number.   If DID| <0>  and DID | <1> write
To address at same time,  both <0> an <1> irq can be handled.
4. other features ..

MU is not designed as MSI controller at hardware. 
But if we limit irq_number to 1,  MU can work as MSI controller although only provide 4 irq numbers.

For NTB using case, it is enough.  Because i.MX have not gic-its,  I have to use MU as MSI controller.
So PCIe EP can use  it to improve transfer latency.  Without it,  PCI ep driver have to using polling to check
Status, delay will be bigger than 5ms. With msi,  transfer delay will < 1ms. 

I will put all background information at cover letter at next version. 

> 
> [...]
> 
> > > > +static int imx_mu_msi_remove(struct platform_device *pdev)
> > > > +{
> > > > +     struct imx_mu_msi *msi_data = platform_get_drvdata(pdev);
> > > > +
> > > > +     imx_mu_msi_teardown_hwirq(msi_data);
> > > > +
> > > > +     irq_domain_remove(msi_data->msi_domain);
> > > > +     irq_domain_remove(msi_data->parent);
> > >
> > > How do you ensure that no device is still holding interrupts? Let me
> > > give you a hint: you can't. So removing an interrupt controller module
> > > should not be possible.
> >
> > [Frank Li] I agree. But there are many *_remove under irqchip.
> 
> That doesn't make it right.
> 
> Thanks,
> 
>         M.
> 
> --
> Without deviation from the norm, progress is not possible.


More information about the linux-arm-kernel mailing list