media: mediatek: vcodec: fix AV1 decode fail for 36bit iova
Nicolas Dufresne
nicolas at ndufresne.ca
Wed Jun 28 09:07:13 PDT 2023
Hi,
On Wednesday, 28 June 2023 at 13:41 +0800, Xiaoyong Lu wrote:
> Decoder hardware will access incorrect iova address when tile buffer is
> 36bit, leading to iommu fault when hardware access dram data.
>
> Fixes: 2f5d0aef37c6 ("media: mediatek: vcodec: support stateless AV1 decoder")
> Signed-off-by: Xiaoyong Lu<xiaoyong.lu at mediatek.com>
> ---
> - Test ok: mt8195 32bit and mt8188 36bit iova.
> ---
> .../platform/mediatek/vcodec/vdec/vdec_av1_req_lat_if.c | 7 ++++---
> 1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/media/platform/mediatek/vcodec/vdec/vdec_av1_req_lat_if.c b/drivers/media/platform/mediatek/vcodec/vdec/vdec_av1_req_lat_if.c
> index 404a1a23fd40..420222c8a56d 100644
> --- a/drivers/media/platform/mediatek/vcodec/vdec/vdec_av1_req_lat_if.c
> +++ b/drivers/media/platform/mediatek/vcodec/vdec/vdec_av1_req_lat_if.c
> @@ -1658,9 +1658,9 @@ static void vdec_av1_slice_setup_tile_buffer(struct vdec_av1_slice_instance *ins
> u32 allow_update_cdf = 0;
> u32 sb_boundary_x_m1 = 0, sb_boundary_y_m1 = 0;
> int tile_info_base;
> - u32 tile_buf_pa;
> + u64 tile_buf_pa;
> u32 *tile_info_buf = instance->tile.va;
> - u32 pa = (u32)bs->dma_addr;
> + u64 pa = (u64)bs->dma_addr;
>
> if (uh->disable_cdf_update == 0)
> allow_update_cdf = 1;
> @@ -1673,7 +1673,8 @@ static void vdec_av1_slice_setup_tile_buffer(struct vdec_av1_slice_instance *ins
> tile_info_buf[tile_info_base + 0] = (tile_group->tile_size[tile_num] << 3);
> tile_buf_pa = pa + tile_group->tile_start_offset[tile_num];
>
> - tile_info_buf[tile_info_base + 1] = (tile_buf_pa >> 4) << 4;
> + tile_info_buf[tile_info_base + 1] = (unsigned int)(tile_buf_pa >> 4) << 4 +
> + ((unsigned int)(tile_buf_pa >> 32) & 0xf);
I'm not clear on how this works. The original code was a roundabout way of
clearing the 4 least significant bits. Something like this would avoid the cast
and make that explicit:
tile_info_buf[tile_info_base + 1] = tile_buf_pa & 0xFFFFFFFFFFFFFFF0ull;
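To illustrate the equivalence for the 4 least significant bits, here is a minimal sketch. The helper names are hypothetical and not part of the driver; the point is only that shifting right then left by 4 gives the same result as masking with ~0xF.

```c
#include <stdint.h>

/* Form used in the original driver code: shift the low 4 bits out, then back. */
static uint32_t clear_low4_shift(uint32_t addr)
{
	return (addr >> 4) << 4;
}

/* Mask form: clear bits 0..3 in a single step. */
static uint32_t clear_low4_mask(uint32_t addr)
{
	return addr & ~UINT32_C(0xF);
}
```

Both helpers return a 16-byte aligned value, so the choice between them is about readability only.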
But in the updated code, if you have a 36-bit address, you store the upper 4 bits
(bits 32 to 35) in the lower part, which was originally cleared. Can you confirm
this is exactly what you wanted? And if so, add a comment? It could also be
written as follows (this is just me finding it more readable; I also prefer
| (or) rather than +, and dislike casting):
tile_info_buf[tile_info_base + 1] = (tile_buf_pa & 0xFFFFFFFFFFFFFFF0ull) |
((tile_buf_pa & 0x0000000F00000000ull) >> 32);
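As a sketch of the packing the patch appears to intend (an assumption, not confirmed against hardware documentation): address bits 31..4 stay in place and address bits 35..32 land in bits 3..0 of the 32-bit word. The helper names below are hypothetical. The second helper spells out how C groups the patch's expression as posted, since + binds tighter than <<.

```c
#include <stdint.h>

/* Hypothetical helper: fold a 36-bit address into one 32-bit word.
 * Word bits 31..4 = address bits 31..4, word bits 3..0 = address bits 35..32.
 */
static uint32_t pack_iova36(uint64_t addr)
{
	uint32_t low = (uint32_t)(addr & 0xFFFFFFF0ull);
	uint32_t high = (uint32_t)((addr >> 32) & 0xF);

	return low | high;
}

/* The expression from the patch with its implicit grouping made explicit:
 * "a << 4 + b" parses as "a << (4 + b)" because + has higher precedence.
 */
static uint32_t patch_grouping(uint64_t addr)
{
	return (uint32_t)(addr >> 4) << (4 + ((uint32_t)(addr >> 32) & 0xf));
}
```

With an address below 4 GiB the high nibble is 0, so both helpers agree; once bits 32..35 are set the two results differ, which is one more reason to prefer | with explicit parentheses.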
> tile_info_buf[tile_info_base + 2] = (tile_buf_pa % 16) << 3;
Is this the same as?
tile_info_buf[tile_info_base + 2] = (tile_buf_pa & 0xFull) << 3;
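For an unsigned value, taking the remainder modulo 16 keeps exactly the low 4 bits, so it matches a mask of 0xF. A minimal sketch with hypothetical helper names:

```c
#include <stdint.h>

/* Form used in the driver: remainder modulo 16, then shift left by 3. */
static uint32_t low4_mod(uint64_t addr)
{
	return (uint32_t)(addr % 16) << 3;
}

/* Mask form: keep bits 0..3, then shift left by 3. */
static uint32_t low4_mask(uint64_t addr)
{
	return (uint32_t)(addr & 0xF) << 3;
}
```

Both forms ignore everything above bit 3, so widening the address to 64 bits does not change this field.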
>
>
> sb_boundary_x_m1 =