[PATCH v13 6/6] iommu/arm-smmu: add support for PRR bit setup

Rob Clark robdclark at gmail.com
Wed Jul 17 08:00:04 PDT 2024


On Wed, Jul 17, 2024 at 3:27 AM Bibek Kumar Patro
<quic_bibekkum at quicinc.com> wrote:
>
>
>
> On 7/16/2024 1:37 AM, Rob Clark wrote:
> > On Mon, Jul 15, 2024 at 4:00 AM Bibek Kumar Patro
> > <quic_bibekkum at quicinc.com> wrote:
> >>
> >>
> >>
> >> On 7/10/2024 10:31 PM, Rob Clark wrote:
> >>> On Tue, Jul 9, 2024 at 12:43 PM Bibek Kumar Patro
> >>> <quic_bibekkum at quicinc.com> wrote:
> >>>>
> >>>>
> >>>>
> >>>> On 7/4/2024 9:28 PM, Rob Clark wrote:
> >>>>> On Thu, Jul 4, 2024 at 7:46 AM Rob Clark <robdclark at gmail.com> wrote:
> >>>>>>
> >>>>>> On Wed, Jul 3, 2024 at 4:38 AM Bibek Kumar Patro
> >>>>>> <quic_bibekkum at quicinc.com> wrote:
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> On 7/2/2024 2:01 AM, Rob Clark wrote:
> >>>>>>>> On Mon, Jul 1, 2024 at 4:01 AM Bibek Kumar Patro
> >>>>>>>> <quic_bibekkum at quicinc.com> wrote:
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On 6/28/2024 9:14 PM, Rob Clark wrote:
> >>>>>>>>>> On Fri, Jun 28, 2024 at 8:10 AM Bibek Kumar Patro
> >>>>>>>>>> <quic_bibekkum at quicinc.com> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> On 6/28/2024 7:47 PM, Rob Clark wrote:
> >>>>>>>>>>>> On Fri, Jun 28, 2024 at 7:05 AM Bibek Kumar Patro
> >>>>>>>>>>>> <quic_bibekkum at quicinc.com> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Add an adreno-smmu-priv interface for drm/msm to call
> >>>>>>>>>>>>> into arm-smmu-qcom and initiate the PRR bit setup or reset
> >>>>>>>>>>>>> sequence as per request.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> This will be used by GPU to setup the PRR bit and related
> >>>>>>>>>>>>> configuration registers through adreno-smmu private
> >>>>>>>>>>>>> interface instead of directly poking the smmu hardware.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Suggested-by: Rob Clark <robdclark at gmail.com>
> >>>>>>>>>>>>> Signed-off-by: Bibek Kumar Patro <quic_bibekkum at quicinc.com>
> >>>>>>>>>>>>> ---
> >>>>>>>>>>>>>        drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 23 ++++++++++++++++++++++
> >>>>>>>>>>>>>        drivers/iommu/arm/arm-smmu/arm-smmu.h      |  2 ++
> >>>>>>>>>>>>>        include/linux/adreno-smmu-priv.h           |  6 +++++-
> >>>>>>>>>>>>>        3 files changed, 30 insertions(+), 1 deletion(-)
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
> >>>>>>>>>>>>> index bd101a161d04..64571a1c47b8 100644
> >>>>>>>>>>>>> --- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
> >>>>>>>>>>>>> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
> >>>>>>>>>>>>> @@ -28,6 +28,7 @@
> >>>>>>>>>>>>>        #define PREFETCH_SHALLOW       (1 << PREFETCH_SHIFT)
> >>>>>>>>>>>>>        #define PREFETCH_MODERATE      (2 << PREFETCH_SHIFT)
> >>>>>>>>>>>>>        #define PREFETCH_DEEP          (3 << PREFETCH_SHIFT)
> >>>>>>>>>>>>> +#define GFX_ACTLR_PRR          (1 << 5)
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>        static const struct actlr_config sc7280_apps_actlr_cfg[] = {
> >>>>>>>>>>>>>               { 0x0800, 0x04e0, PREFETCH_DEFAULT | CMTLB },
> >>>>>>>>>>>>> @@ -235,6 +236,27 @@ static void qcom_adreno_smmu_resume_translation(const void *cookie, bool termina
> >>>>>>>>>>>>>               arm_smmu_cb_write(smmu, cfg->cbndx, ARM_SMMU_CB_RESUME, reg);
> >>>>>>>>>>>>>        }
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> +static void qcom_adreno_smmu_set_prr(const void *cookie, phys_addr_t page_addr, bool set)
> >>>>>>>>>>>>> +{
> >>>>>>>>>>>>> +       struct arm_smmu_domain *smmu_domain = (void *)cookie;
> >>>>>>>>>>>>> +       struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
> >>>>>>>>>>>>> +       struct arm_smmu_device *smmu = smmu_domain->smmu;
> >>>>>>>>>>>>> +       u32 reg = 0;
> >>>>>>>>>>>>> +
> >>>>>>>>>>>>> +       writel_relaxed(lower_32_bits(page_addr),
> >>>>>>>>>>>>> +                               smmu->base + ARM_SMMU_GFX_PRR_CFG_LADDR);
> >>>>>>>>>>>>> +
> >>>>>>>>>>>>> +       writel_relaxed(upper_32_bits(page_addr),
> >>>>>>>>>>>>> +                               smmu->base + ARM_SMMU_GFX_PRR_CFG_UADDR);
> >>>>>>>>>>>>> +
> >>>>>>>>>>>>> +       reg =  arm_smmu_cb_read(smmu, cfg->cbndx, ARM_SMMU_CB_ACTLR);
> >>>>>>>>>>>>> +       reg &= ~GFX_ACTLR_PRR;
> >>>>>>>>>>>>> +       if (set)
> >>>>>>>>>>>>> +               reg |= FIELD_PREP(GFX_ACTLR_PRR, 1);
> >>>>>>>>>>>>> +       arm_smmu_cb_write(smmu, cfg->cbndx, ARM_SMMU_CB_ACTLR, reg);
> >>>>>>>>>>>>> +
> >>>>>>>>>>>>
> >>>>>>>>>>>> nit, extra line
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> Ack, will remove this. Thanks for pointing out.
> >>>>>>>>>>>
> >>>>>>>>>>>> Also, if you passed a `struct page *` instead, then you could drop the
> >>>>>>>>>>>> bool param, ie. passing NULL for the page would disable PRR.  But I
> >>>>>>>>>>>> can go either way if others have a strong preference for phys_addr_t.
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> Oh okay, this looks simple to reset the prr bit.
> >>>>>>>>>>> But since this page is allocated and is used inside gfx driver
> >>>>>>>>>>> before being utilized for prr bit operation, would it be safe for
> >>>>>>>>>>> drm/gfx driver to keep a reference to this page in smmu driver?
> >>>>>>>>>>>
> >>>>>>>>>>> Since we only need the page address for configuring the
> >>>>>>>>>>> CFG_UADDR/CFG_LADDR registers so passed the phys_addr_t.
> >>>>>>>>>>
> >>>>>>>>>> I don't think the smmu driver needs to keep a reference to the page..
> >>>>>>>>>> we can just say it is the responsibility of the drm driver to call
> >>>>>>>>>> set_prr(NULL) before freeing the page
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> That makes sense. If we go by this NULL page method to disable the PRR,
> >>>>>>>>> we would have to set the address registers to reset value as well.
> >>>>>>>>>
> >>>>>>>>> The sequence would be like the following as per my understaning:
> >>>>>>>>> - Check if it's NULL page
> >>>>>>>>> - Set the PRR_CFG_UADDR/PRR_CFG_LADDR to reset values i.e - 0x0 for
> >>>>>>>>>        these registers
> >>>>>>>>> - Reset the PRR bit in actlr register
> >>>>>>>>>
> >>>>>>>>> Similar to this snippet:
> >>>>>>>>>
> >>>>>>>>> #PRR_RESET_ADDR 0x0
> >>>>>>>>>
> >>>>>>>>> --------------
> >>>>>>>>> reg =  arm_smmu_cb_read(smmu, cfg->cbndx, ARM_SMMU_CB_ACTLR);
> >>>>>>>>> reg &= ~GFX_ACTLR_PRR;
> >>>>>>>>> arm_smmu_cb_write(smmu, cfg->cbndx, ARM_SMMU_CB_ACTLR, reg);
> >>>>>>>>>
> >>>>>>>>> if (!prr_page) {
> >>>>>>>>>             writel_relaxed(PRR_RESET_ADDR,
> >>>>>>>>>                             smmu->base + ARM_SMMU_GFX_PRR_CFG_LADDR);
> >>>>>>>>>             writel_relaxed(PRR_RESET_ADDR),
> >>>>>>>>>                              smmu->base + ARM_SMMU_GFX_PRR_CFG_UADDR);
> >>>>>>>>>             return;
> >>>>>>>>> }
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> writel_relaxed(lower_32_bits(page_to_phys(prr_page)),
> >>>>>>>>>                     smmu->base + ARM_SMMU_GFX_PRR_CFG_LADDR);
> >>>>>>>>>
> >>>>>>>>> writel_relaxed(upper_32_bits(page_to_phys(prr_page)),
> >>>>>>>>>                     smmu->base + ARM_SMMU_GFX_PRR_CFG_UADDR);
> >>>>>>>>>
> >>>>>>>>> reg |= FIELD_PREP(GFX_ACTLR_PRR, 1);
> >>>>>>>>> arm_smmu_cb_write(smmu, cfg->cbndx, ARM_SMMU_CB_ACTLR, reg);
> >>>>>>>>> -----------------
> >>>>>>>>>
> >>>>>>>>> If looks good, will implement the same in next version.
> >>>>>>>>
> >>>>>>>> yeah, that looks like it could work..
> >>>>>>>>
> >>>>>>>> you probably don't need to zero out the PRR_CFG_*ADDR when disabling,
> >>>>>>>> and probably could avoid double writing ACTLR, but that is getting
> >>>>>>>> into bikeshedding
> >>>>>>>>
> >>>>>>>
> >>>>>>> Actually Rob, since you rightly pointed this out.
> >>>>>>> I crosschecked again on these registers.
> >>>>>>> PRR_CFG_*ADDR is a global register in SMMU space but
> >>>>>>> ACTLR register including PRR bit is a per-domain register.
> >>>>>>> There might also be a situation where PRR feature need to be
> >>>>>>> disabled or enabled separately for each domain.
> >>>>>>> So I think it would be cleaner to have two apis, set_prr_addr(),
> >>>>>>> set_prr_bit().
> >>>>>>> set_prr_addr() will be used only to set this PRR_CFG_*ADDR
> >>>>>>> register by passing a 'struct page *'
> >>>>>>> set_prr_bit() will be used as a switch for PRR feature,
> >>>>>>> where required smmu_domain will be passed along with
> >>>>>>> the bool value to set/reset the PRR bit depending on which this
> >>>>>>> feature will be enabled/disabled for the selected domain.
> >>>>>>
> >>>>>> on a related note, adreno has been using arm-smmu for a number of
> >>>>>> generations, I guess not all support PRR?  The drm driver will need to
> >>>>>> know whether PRR is supported (and expose that to userspace to let the
> >>>>>> UMD know whether to expose certain extensions).  How should this work?
> >>>>>
> >>>>> So, I noticed in the x1e80100.dtsi that there is a gpu_prr_mem
> >>>>> reserved section..  maybe we should be connecting this to the smmu
> >>>>> driver in dt, and using that to detect presence of PRR?  Ie. the smmu
> >>>>> driver would configure PRR_CFG_*ADDR based on the reserved mem, and
> >>>>> the interface to drm would just be to enable/disable PRR, returning an
> >>>>> error code if the reserved mem section isn't there.
> >>>>>
> >>>>> This simplifies the interface, and handles the question of how to
> >>>>> detect if PRR is supported.
> >>>>>
> >>>>> BR,
> >>>>> -R
> >>>>>
> >>>>
> >>>> As I checked gpu_prr_mem reserved mem section is not used for mobile
> >>>> targets hence not present for other DT only compute targets like
> >>>> x1e80100.dtsi has the same. PRR looks to be smmu version specific
> >>>> property.
> >>>
> >>> I only see it in gpu_prr_mem in x1e80100.dtsi, but not documented
> >>> anywhere.  I'm only assuming based on the name that it is intended to
> >>> be for PRR (but not sure why it is larger than 0x1000).  Are the
> >>> PRR_CFG_*ADDR regs programmed by the fw (and access blocked in EL1) on
> >>> this device?
> >>>
> >>
> >> As I checked, if the drm/gfx driver allocates the page for drm, then
> >> this reserved-memory region is not required.
> >>
> >> PRR_CFG_*ADDR regs have read and write access in EL1 only for this
> >> device, behavior is same as other devices as well. These are not
> >> programmed by fw.
> >
> > If there is any device which _doesn't_ have EL1 access to these regs,
> > I think going the reserved memory route seems more future proof > Otherwise we later on have to deal with two different ways to do
> > things.  But I'm not sure if there is any such device or risk.
> >
>
> PRR is a bit in ACTLR register which is in SMMU space,
> so is the PRR_CFG_*ADDR registers - with EL1 having access
> to both the registers in all targets released till now with MMU-500.
> It's unlikely that this design would change in future
> for MMU-500 based targets, so I feel this risk is somewhat negligible.

I wasn't worried about the ACTLR register, but the PRR_CFG_*ADDR regs ;-)

IIRC those were in the SMMU global space, why hyp tends to like to own.

> Also would the reserved memory route look a bit hackish?
> Because, since this reserved-memory node is not used when page is
> allocated through drm - so it might turn out to be redundant.
> If we are aiming for a device-tree handle/node for reference then I
> think a better way would be to create a bool parameter inside smmu-node
> indicating presence of PRR ?

tbh, I don't think there is anything better or worse about having the
reserved-memory node vs dynamically allocating it.  (If we dynamically
allocate, we should remove the reserved memory node from
x1e80100.dtsi)

The thing I was more concerned about was whether there was any chance
that some existing or future SoC+fw combo _relied_ on a reserved
memory node and the fw programming PRR_CFG_*ADDR.  If there was any
chance of that, and we went the dynamic allocation route, then we'd
have some devices with a reserved memory node, and some without.  That
seems a bit ugly to me.

If there is no chance of this, then we can go either route.

> Personally,I feel since the PRR enablement mechanism is same for all
> MMU-500 targets - compat string would be a robust approach.

I guess if it is all mmu-500, then we can just pick based on compat
string.  If it turns out some subset of smmu-v2 have PRR, we can just
have a list of compat strings in arm-smmu-qcom.c.. there would only be
a finite # of them ;-)

BR,
-R



More information about the linux-arm-kernel mailing list