[PATCH 1/3] i3c: mipi-i3c-hci: Make bounce buffer code generic to all DMA transfers

Frank Li Frank.li at nxp.com
Tue Jun 17 08:09:41 PDT 2025


On Tue, Jun 10, 2025 at 01:08:12PM -0400, Frank Li wrote:
> On Tue, Jun 10, 2025 at 04:42:35PM +0300, Jarkko Nikula wrote:
> > On 6/9/25 6:04 PM, Frank Li wrote:
> > > On Mon, Jun 09, 2025 at 05:15:44PM +0300, Jarkko Nikula wrote:
> > > > On 6/6/25 6:02 PM, Frank Li wrote:
> > > > > On Fri, Jun 06, 2025 at 10:16:44AM +0300, Jarkko Nikula wrote:
> > > > > > Hi
> > > > > >
> > > > > > On 6/5/25 6:13 PM, Frank Li wrote:
> > > > > > > On Thu, Jun 05, 2025 at 05:07:19PM +0300, Jarkko Nikula wrote:
> > > > > > > > Hi
> > > > > > > >
> > > > > > > > On 6/4/25 6:00 PM, Frank Li wrote:
> > > > > > > > > On Wed, Jun 04, 2025 at 03:55:11PM +0300, Jarkko Nikula wrote:
> > > > > > > > > > Move DMA bounce buffer code for I3C private transfers to be generic for
> > > > > > > > > > all DMA transfers, and round up the receive bounce buffer size to a
> > > > > > > > > > multiple of DWORDs.
> > > > > > > > > >
> > > > > > > > > > It was observed that when the device DMA is IOMMU mapped and the receive
> > > > > > > > > > length is not a multiple of DWORDs, the last DWORD is padded with stale
> > > > > > > > > > data from the RX FIFO, corrupting 1-3 bytes beyond the expected data.
> > > > > > > > > >
> > > > > > > > > > A similar issue, though less severe, occurs when an I3C target returns
> > > > > > > > > > less data than requested. In this case, the padding does not exceed the
> > > > > > > > > > requested number of bytes, assuming the device DMA is not IOMMU mapped.
> > > > > > > > > >
> > > > > > > > > > Therefore, all I3C private transfer, CCC command payload and I2C
> > > > > > > > > > transfer receive buffers must be properly sized for the DMA being IOMMU
> > > > > > > > > > mapped. Even if those buffers are already DMA safe, their size may not
> > > > > > > > > > be, and I don't have a clear idea how to guarantee this other than
> > > > > > > > > > using a local bounce buffer.
> > > > > > > > > >
> > > > > > > > > > To prepare for the device DMA being IOMMU mapped and to address the
> > > > > > > > > > above issue, implement a local, properly sized bounce buffer for all
> > > > > > > > > > DMA transfers. For now, allocate it only when the buffer is in the
> > > > > > > > > > vmalloc() area to avoid unnecessary copying with CCC commands and
> > > > > > > > > > DMA-safe I2C transfers.
> > > > > > > > > >
> > > > > > > > > > Signed-off-by: Jarkko Nikula <jarkko.nikula at linux.intel.com>
> > > > > > > > > > ---
> > > > > > > > > >      drivers/i3c/master/mipi-i3c-hci/core.c | 34 -------------------
> > > > > > > > > >      drivers/i3c/master/mipi-i3c-hci/dma.c  | 47 +++++++++++++++++++++++++-
> > > > > > > > > >      2 files changed, 46 insertions(+), 35 deletions(-)
> > > > > > > > > >
> > > > > > > > > > diff --git a/drivers/i3c/master/mipi-i3c-hci/core.c b/drivers/i3c/master/mipi-i3c-hci/core.c
> > > > > > > > > > index bc4538694540..24c5e7d5b439 100644
> > > > > > > > > > --- a/drivers/i3c/master/mipi-i3c-hci/core.c
> > > > > > > > > > +++ b/drivers/i3c/master/mipi-i3c-hci/core.c
> > > > > > > > > > @@ -272,34 +272,6 @@ static int i3c_hci_daa(struct i3c_master_controller *m)
> > > > > > > > > >      	return hci->cmd->perform_daa(hci);
> > > > > > > > > >      }
> > > > > > > > > >
> > > > > > > > > ...
> > > > > > > > > >      }
> > > > > > > > > >
> > > > > > > > > > +static void *hci_dma_alloc_safe_xfer_buf(struct i3c_hci *hci,
> > > > > > > > > > +					 struct hci_xfer *xfer)
> > > > > > > > > > +{
> > > > > > > > > > +	if (!is_vmalloc_addr(xfer->data))
> > > > > > > > > > +		return xfer->data;
> > > > > > > > > > +
> > > > > > > > > > +	if (xfer->rnw)
> > > > > > > > > > +		/*
> > > > > > > > > > +		 * Round up the receive bounce buffer length to a multiple of
> > > > > > > > > > +		 * DWORDs. Independently of buffer alignment, DMA_FROM_DEVICE
> > > > > > > > > > +		 * transfers may corrupt the last DWORD when transfer length is
> > > > > > > > > > +		 * not a multiple of DWORDs. This was observed when the device
> > > > > > > > > > +		 * DMA is IOMMU mapped or when an I3C target device returns
> > > > > > > > > > +		 * less data than requested. Latter case is less severe and
> > > > > > > > > > +		 * does not exceed the requested number of bytes, assuming the
> > > > > > > > > > +		 * device DMA is not IOMMU mapped.
> > > > > > > > > > +		 */
> > > > > > > > > > +		xfer->bounce_buf = kzalloc(ALIGN(xfer->data_len, 4),
> > > > > > > > > > +					   GFP_KERNEL);
> > > > > > > > > > +	else
> > > > > > > > > > +		xfer->bounce_buf = kmemdup(xfer->data, xfer->data_len,
> > > > > > > > > > +					   GFP_KERNEL);
> > > > > > > > > > +
> > > > > > > > > > +	return xfer->bounce_buf;
> > > > > > > > >
> > > > > > > > > Why need use bounce_buf? why not use dma_map_single()?, which will check
> > > > > > > > > alignment and size to decide if use swiotlb as bounce buffer.
> > > > > > > > >
> > > > > > > > We do pass the buffer to the dma_map_single(). I've understood swiotlb is
> > > > > > > > transparently used when the DMA cannot directly access the memory but that's
> > > > > > > > not the case here.
> > > > > > >
> > > > > > > why not? even though you have to use internal buf, why not use
> > > > > > > dma_alloc_coherent().
> > > > > > >
> > > > > > I don't think it's suitable for "smallish" per transfer allocations/mappings
> > > > > > and buffer is not accessed concurrently by the CPU and DMA during the
> > > > > > transfer.
> > > > >
> > > > > I still can't understand why need this special handle for i3c. Basic it use
> > > > > dma transfer to transfer data. Can you simple descript hci's data flow?
> > > > >
> > > > Unfortunately my knowledge doesn't go deeply enough in HW but I could guess
> > > > it's somewhere in MIPI I3C HCI DMA engine implementation and/or HCI IP to
> > > > memory bridge/bus integration.
> > > >
> > > > If it would be a cache line issue then I would think we would see corruption
> > > > independently is the DMA IOMMU mapped or not and in larger number of bytes
> > > > than 1-3 last bytes.
> > > >
> > > > Also another finding that if I3C target returns less data than expected then
> > > > there is also padding in the last DWORD with the real data. But not more
> > > > than requested, i.e. 2 bytes expected, target returns 1, only the 2nd byte
> > > > contains stale data and bytes 3-4 untouched. Or 8 bytes expected, target
> > > > returns 1, bytes 2-4 contain stale data and bytes 5-8 untouched.
> > > >
> > > > So something around trailing bytes handling in the HW when the transfer
> > > > length is not multiple of DWORDs as far as I could guess.
> > >
> > > That's reason why should not manually handle dma buffer here. It is
> > > small memory and less proformance requirement.
> > >
> > > If you use dma provided api, you should not take care such detial. DMA
> > > buffer at least one cache line size. otherwise, dma may currupt some data,
> > > which is hard to find.
> > >
> > Do you know what solution would solve the problem here?
> >
> > Let's take a look at the CCC command which is the easiest here. For example
> > GETPID where payload length is 6 bytes and is allocated using the kzalloc()
> > which guarantees the buffer is cache line aligned and ought to be DMA-safe.
> >
> > i3c_master_getpid_locked()
> > 	i3c_ccc_cmd_dest_init() // Allocates 6-byte payload buffer
> > 	i3c_master_send_ccc_cmd_locked()
> > 		i3c_hci_send_ccc_cmd() // mipi-i3c-hci
> > 			xfer[i].data = ccc->dests[i].payload.data;
> > 			hci_dma_queue_xfer()
> > 				// bounce buffer stuff, this patchset
> > 				dma_map_single()
> > 				// setup hw with DMA address
> > 				// and queue the transfer
> >
> > The DMA streaming API dma_map_single() maps buffer's CPU virtual address to
> > the DMA address and we use that to setup the HW and do the DMA.
> >
> > Assume now the DMA is IOMMU mapped. DMA writes correctly 6 bytes to the
> > payload buffer + 2 extra stale bytes. In this case perhaps harmless (due to
> > kzalloc() allocated buffer) but nevertheless corruption which KFENCE may be
> > able to detect.
> >
> > More difficult territory are I3C private transfers and especially I2C
> > transfers. I3C private transfers are expected to pass a DMA-able buffer but
> > I've noticed it may not be always true especially when I3C support is added
> > into existing I2C or SPI target device drivers and that are using
> > regmap_bulk_read(). Those use cases may pass a stack variable (not DMA-able
> > and will be rejected by DMA API), structure member or buffer index and
> > pointer into those come as it via regmap_bulk_read().
> >
> > I2C message buffers are not even expected to be DMA-able.
> > drivers/i2c/i2c-core-base.c has a helper i2c_get_dma_safe_msg_buf() but I
> > see it has the same problem than with the CCC command payload buffer.
> >
> > So far I haven't figured out any better solution than using a local bounce
> > buffer for all of these transfer types.
>
> Thanks, I finially understand the problem you faced. I think other master
> controller face similar issue if they use dma. Let me think more. If alloc
> buffer size is not align to cache line, swtlib will bounce again.
>
> rough idea i3c core layer provide an i3c_(un)map_api to do that.

struct i3c_dma {
	void *bounce;
	dma_addr_t addr;
	enum dma_data_direction;
	void *data;
	size_t data_size;
	size_t bounce_size;
}

struct i3c_dma *i3c_dma_map_single(struct device *dev, void *data, size_t sz, enum dma_data_direction dir)
{
	struct i3c_dma *p

	p = kmalloc(sizeof(*p));
	if (!p)
		return p;

	if (is_vmalloc_addr(data)) {
		p->bounce = kmalloc(align_up(sz, CACHE_LINE_SIZE)); //avoid swtlib bounce again.
		memcpy()
		...
	}

	dma_map_single();
	...
}

void i3c_dma_unmap_single(struct i3c_dma *p)
{
	dma_unmap_single();

	...
	if (data != bounce)
		memcpy();

	kfree(p);
}

#define DEFINE_FREE(i3c_dma_unmap_single, struct i3c_dma *, if (_T)i3c_dma_unmap_single(_T))


Frank



>
> Frank
>
> >
> > --
> > linux-i3c mailing list
> > linux-i3c at lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/linux-i3c
>
> --
> linux-i3c mailing list
> linux-i3c at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-i3c



More information about the linux-i3c mailing list