[PATCH v6 5/6] iommu/arm-smmu-v3: Add in-kernel support for NVIDIA Tegra241 (Grace) CMDQV

Jason Gunthorpe jgg at nvidia.com
Wed May 1 06:00:42 PDT 2024


On Tue, Apr 30, 2024 at 11:08:55AM -0700, Nicolin Chen wrote:
> (Removing chunks that I simply ack)
> 
> On Tue, Apr 30, 2024 at 01:35:45PM -0300, Jason Gunthorpe wrote:
> > On Mon, Apr 29, 2024 at 09:43:48PM -0700, Nicolin Chen wrote:
> 
> > > +/* MMIO helpers */
> > > +#define cmdqv_readl(reg) \
> > > +	readl(cmdqv->base + TEGRA241_CMDQV_##reg)
> > > +#define cmdqv_readl_relaxed(reg) \
> > > +	readl_relaxed(cmdqv->base + TEGRA241_CMDQV_##reg)
> > > +#define cmdqv_writel(val, reg) \
> > > +	writel((val), cmdqv->base + TEGRA241_CMDQV_##reg)
> > > +#define cmdqv_writel_relaxed(val, reg) \
> > > +	writel_relaxed((val), cmdqv->base + TEGRA241_CMDQV_##reg)
> > 
> > Please don't hide access to a stack variable in a macro, and I'm not
> > keen on the ##reg scheme either - it makes it much harder to search
> > for things.
> 
> I can pass in cmdqv/vintf/vcmdq pointers, if it would be better.
> 
> > Really this all seems like alot of overkill to make a little bit of
> > shorthand. It is not so wordy just to type it out:
> > 
> >   readl(vintf->base + TEGRA241_VINTF_CONFIG) 
> 
> vintf_readl(vintf, CONFIG) is much shorter. Doing so reduced the
> line breaks at quite a lot places, so overall the driver looks a
> lot cleaner to me.

We don't have the strict 80 column limit now, it would be fine to go a
few extra to avoid the breaks.

Certainly preferred to these readability damaging macros.

> I can probably change these logging helpers to inline functions.

Just call the normal logging functions directly.

> > > +#define vintf_warn(fmt, ...) \
> > > +	dev_warn(vintf->cmdqv->dev, "VINTF%u: " fmt, vintf->idx, ##__VA_ARGS__)
> > > +#define vintf_err(fmt, ...) \
> > > +	dev_err(vintf->cmdqv->dev, "VINTF%u: " fmt, vintf->idx, ##__VA_ARGS__)
> > > +#define vintf_info(fmt, ...) \
> > > +	dev_info(vintf->cmdqv->dev, "VINTF%u: " fmt, vintf->idx, ##__VA_ARGS__)
> > > +#define vintf_dbg(fmt, ...) \
> > > +	dev_dbg(vintf->cmdqv->dev, "VINTF%u: " fmt, vintf->idx, ##__VA_ARGS__)
> > > +
> > > +#define vcmdq_warn(fmt, ...)                                                   \
> > > +	({                                                                     \
> > > +		struct tegra241_vintf *vintf = vcmdq->vintf;                   \
> > > +		if (vintf)                                                     \
> > > +			vintf_warn("VCMDQ%u/LVCMDQ%u: " fmt,                   \
> > > +				   vcmdq->idx, vcmdq->lidx,                    \
> > > +				   ##__VA_ARGS__);                             \
> > > +		else                                                           \
> > > +			dev_warn(vcmdq->cmdqv->dev, "VCMDQ%u: " fmt,           \
> > > +				 vcmdq->idx, ##__VA_ARGS__);                   \
> > > +	})
> 
> > Some of these are barely used, is it worth all these macros??
> 
> Only vcmdq_warn isn't called. But I think it would be useful.
> I could also find a place to call it, if that's a must.

Just call the normal logging functions, there are so few callers
typing out the VCMDQ%u is not going to be so bad



> > > +
> > > +/* Configuring and polling helpers */
> > > +#define tegra241_cmdqv_write_config(_owner, _OWNER, _regval)                   \
> > > +	({                                                                     \
> > > +		bool _en = (_regval) & _OWNER##_EN;                            \
> > > +		u32 _status;                                                   \
> > > +		int _ret;                                                      \
> > > +		writel((_regval), _owner->base + TEGRA241_##_OWNER##_CONFIG);  \
> > > +		_ret = readl_poll_timeout(                                     \
> > > +			_owner->base + TEGRA241_##_OWNER##_STATUS, _status,    \
> > > +			_en ? (_regval) & _OWNER##_ENABLED :                   \
> > > +			      !((_regval) & _OWNER##_ENABLED),                 \
> > > +			1, ARM_SMMU_POLL_TIMEOUT_US);                          \
> > > +		if (_ret)                                                      \
> > > +			_owner##_err("failed to %sable, STATUS = 0x%08X\n",    \
> > > +				     _en ? "en" : "dis", _status);             \
> > > +		atomic_set(&_owner->status, _status);                          \
> > > +		_ret;                                                          \
> > > +	})
> > 
> > I feel like this could be an actual inline function without the macro
> > wrapper with a little fiddling.
> 
> It would be unrolled to three mostly identical inline functions:
> 	tegra241_cmdqv_write_config(cmdqv, regval)
> 	tegra241_vintf_write_config(vintf, regval)
> 	tegra241_vcmdq_write_config(vcmdq, regval)

Expand the parameters in the caller:

__do_write_config(owner->base, &owner->status, _CMDQV_EN, TEGRA241_CMDQ_CONFIG,
                  TEGRA241_CMDQ_STATUS, _CMDQ_ENABLED)

> > > +#define cmdqv_write_config(_regval) \
> > > +	tegra241_cmdqv_write_config(cmdqv, CMDQV, _regval)
> > > +#define vintf_write_config(_regval) \
> > > +	tegra241_cmdqv_write_config(vintf, VINTF, _regval)
> > > +#define vcmdq_write_config(_regval) \
> > > +	tegra241_cmdqv_write_config(vcmdq, VCMDQ, _regval)
> > 
> > More hidden access to stack values
> 
> Btw, any reason for forbidding this practice? It will break the
> build if something goes wrong, which seems to be pretty easy to
> catch.

It is the kernel consensus not to do that. function-like-macros should
act like functions and not reach into some other stack frame. It makes
it very hard to follow the calling function if you can't follow where
the references are.

> > > +	/* Use SMMU CMDQ if vintfs[0] is uninitialized */
> > > +	if (!FIELD_GET(VINTF_ENABLED, atomic_read(&vintf->status)))
> > > +		return &smmu->cmdq;
> > > +
> > > +	/* Use SMMU CMDQ if vintfs[0] has error status */
> > > +	if (FIELD_GET(VINTF_STATUS, atomic_read(&vintf->status)))
> > > +		return &smmu->cmdq;
> > 
> > Why atomic_read? The unlocked interaction with
> > tegra241_cmdqv_handle_vintf0_error() doesn't seem especially sane IMHO
> 
> Race between this get_cmdq() and the isr. Any alternative practice?

It doesn't fix any real race, I'm not sure what this is supposed to be
doing. The cmdq becomes broken and you get an ISR, so before the ISR
it will still post but get stuck, during the ISR it will avoid
posting, and after it will go back to posting?

Why? Just always post to the Q and let the ISR fix it?

> > > +static void tegra241_vcmdq_hw_deinit(struct tegra241_vcmdq *vcmdq)
> > > +{
> > > +	u32 gerrorn, gerror;
> > > +
> > > +	if (vcmdq_write_config(0)) {
> > > +		vcmdq_err("GERRORN=0x%X\n", vcmdq_page0_readl_relaxed(GERRORN));
> > > +		vcmdq_err("GERROR=0x%X\n", vcmdq_page0_readl_relaxed(GERROR));
> > > +		vcmdq_err("CONS=0x%X\n", vcmdq_page0_readl_relaxed(CONS));
> > 
> > Less prints, include a unique message about why this is being
> > printed..
> 
> Something must be wrong if disabling VCMDQ fails, so the prints of
> error register values would be helpful. And "failed to disable" is
> already printed by the vcmdq_write_config() call. I can merge them
> into one vcmdq_err call though.

Print on one line
> > > +static void tegra241_vcmdq_free_smmu_cmdq(struct tegra241_vcmdq *vcmdq)
> > > +{
> > > +	struct tegra241_cmdqv *cmdqv = vcmdq->cmdqv;
> > > +	struct arm_smmu_queue *q = &vcmdq->cmdq.q;
> > > +	size_t nents = 1 << q->llq.max_n_shift;
> > > +
> > > +	dmam_free_coherent(cmdqv->smmu->dev, (nents * CMDQ_ENT_DWORDS) << 3,
> > > +			   q->base, q->base_dma);
> > 
> > If we are calling dmam_free, do we really need devm at all?
> 
> Hmm. This is a part of SMMU's probe/device_reset().

But that is a proper device driver, this isn't.

> > > +	struct tegra241_cmdqv *cmdqv;
> > > +
> > > +	cmdqv = tegra241_cmdqv_find_resource(smmu, id);
> > > +	if (!cmdqv)
> > > +		return NULL;
> > > +
> > > +	if (tegra241_cmdqv_probe(cmdqv)) {
> > > +		if (cmdqv->irq > 0)
> > > +			devm_free_irq(smmu->dev, cmdqv->irq, cmdqv);
> > > +		devm_iounmap(smmu->dev, cmdqv->base);
> > > +		devm_kfree(smmu->dev, cmdqv);
> > > +		return NULL;
> > 
> > Oh. Please don't use devm at all in this code then, it is not attached
> > to a probed driver with the proper scope, devm isn't going to work in
> > sensible way.
> 
> Mind elaborating "it is not"? This function is called by
> arm_smmu_device_acpi_probe and arm_smmu_device_probe.

Normal devm usage will unwind the devm allocations when probe fails.

That doesn't happen here, you open coded the unwind above, and then
you have open coded freeing in another place anyhow.

So just don't use it. There is no value if the places where it should
work automatically are not functioning.

Jason



More information about the linux-arm-kernel mailing list