[PATCH v3 04/11] media: videobuf2: Stop define VB2_MAX_FRAME as global
Tomasz Figa
tfiga at chromium.org
Wed Sep 6 21:15:23 PDT 2023
On Fri, Jul 28, 2023 at 4:25 PM Hsia-Jun Li <Randy.Li at synaptics.com> wrote:
>
>
>
> On 7/28/23 15:03, Tomasz Figa wrote:
> > CAUTION: Email originated externally, do not click links or open attachments unless you recognize the sender and know the content is safe.
> >
> >
> > On Fri, Jul 28, 2023 at 3:55 PM Hsia-Jun Li <Randy.Li at synaptics.com> wrote:
> >>
> >>
> >>
> >> On 7/28/23 14:46, Tomasz Figa wrote:
> >>> CAUTION: Email originated externally, do not click links or open attachments unless you recognize the sender and know the content is safe.
> >>>
> >>>
> >>> On Mon, Jul 17, 2023 at 4:44 PM Hsia-Jun Li <Randy.Li at synaptics.com> wrote:
> >>>>
> >>>>
> >>>> On 7/12/23 18:48, Tomasz Figa wrote:
> >>>>> CAUTION: Email originated externally, do not click links or open attachments unless you recognize the sender and know the content is safe.
> >>>>>
> >>>>>
> >>>>> On Mon, Jul 03, 2023 at 04:35:30PM +0800, Hsia-Jun Li wrote:
> >>>>>> On 7/3/23 16:09, Benjamin Gaignard wrote:
> >>>>>>> CAUTION: Email originated externally, do not click links or open
> >>>>>>> attachments unless you recognize the sender and know the content is
> >>>>>>> safe.
> >>>>>>>
> >>>>>>>
> >>>>>>> Le 30/06/2023 à 11:51, Hsia-Jun Li a écrit :
> >>>>>>>> On 6/22/23 21:13, Benjamin Gaignard wrote:
> >>>>>>>>> CAUTION: Email originated externally, do not click links or open
> >>>>>>>>> attachments unless you recognize the sender and know the content is
> >>>>>>>>> safe.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> After changing bufs arrays to a dynamic allocated array
> >>>>>>>>> VB2_MAX_FRAME doesn't mean anything for videobuf2 core.
> >>>>>>>> I think make it 64 which is the VB2_MAX_FRAME in Android GKI kernel is
> >>>>>>>> more reasonable.
> >>>>>>>>
> >>>>>>>> It would be hard to iterate the whole array, it would go worse with a
> >>>>>>>> filter. Such iterate may need to go twice because you mix
> >>>>>>>> post-processing buffer and decoding buffer(with MV) in the same array.
> >>>>>>> Here I don't want to change drivers behavior so I keep the same value.
> >>>>>>> If it happens that they need more buffers, like for dynamic resolution
> >>>>>>> change
> >>>>>>> feature for Verisilicon VP9 decoder, case by case patches will be needed.
> >>>>>>>
> >>>>>> I just don't like the idea that using a variant length array here.
> >>>>>>
> >>>>> "I don't like" is not an argument. We had a number of arguments for
> >>>>> using a generic helper (originally idr, but later decided to go with
> >>>>> XArray, because the former is now deprecated) that we pointed out in
> >>>>> our review comments for previous revisions. It wasn't really about the
> >>>>> size being variable, but rather avoiding open coding things in vb2 and
> >>>>> duplicating what's already implemented in generic code.
> >>>>
> >>>> I just want to say I don't think we need a variable length array to
> >>>> store the buffer here.
> >>>>
> >>>> And the below is the reason that such a case could be avoided in the
> >>>> first place.
> >>>>
> >>>>>
> >>>>>> And I could explain why you won't need so many buffers for the performance
> >>>>>> of decoding.
> >>>>>>
> >>>>>> VP9 could support 10 reference frames in dpb.
> >>>>>>
> >>>>>> Even for those frequent resolution changing test set, it would only happen
> >>>>>> to two resolutions,
> >>>>>>
> >>>>>> 32 would be enough for 20 buffers of two resolution plus golden frames. It
> >>>>>> also leaves enough slots for re-order latency.
> >>>>>>
> >>>>>> If your case had more two resolutions, likes low->medium->high.
> >>>>>>
> >>>>>> I would suggest just skip the medium resolutions, just allocate the lower
> >>>>>> one first for fast playback then the highest for all the possible
> >>>>>>
> >>>>>> medium cases. Reallocation happens frequently would only cause memory
> >>>>>> fragment, nothing benefits your performance.
> >>>>>>
> >>>>> We have mechanisms in the kernel to deal with memory fragmentation
> >>>>> (migration/compaction) and it would still only matters for the
> >>>>> pathologic cases of hardware that require physically contiguous memory.
> >>>>> Modern hardware with proper DMA capabilities can either scatter-gather
> >>>>> or are equipped with an IOMMU, so the allocation always happens in page
> >>>>> granularity and fragmentation is avoided.
> >>>>
> >>>> Unfortunately, there are more devices that didn't have a IOMMU attached
> >>>> to it, supporting scatter gather is more odd.
> >>>>
> >>>> It would be more likely that IOMMU would be disabled for the performance
> >>>> reason.
> >>>
> >>> These days IOMMU is totally mandatory if you want to think about
> >>> having any level of security in your system. Sure, there could be some
> >>> systems that are completely isolated from any external environment,
> >>> like some offline industry automation machines, but then arguably
> >>> their running conditions would also be quite static and require very
> >>> little memory re-allocation.
> >> Vendor just decided not to included such hardware.
> >> That is why From ION to DMA-heap, people like to allocate from a cavout
> >> out memory.
> >>>
> >>> I also don't buy the performance reason. CPUs have been behind MMUs
> >>> for ages and nobody is running them with paging disabled for
> >>> performance reasons. Similarly, most of the modern consumer systems
> >> Page lookup would increase the delay. Besides a few upstream devices
> >> prove them only use a level 1 page table without TBL.
> >
> > That's just an excuse for a bad hardware design/implementation. As I
> > said, there are good IOMMU implementations out there that don't suffer
> > from performance issues.
> >
> I could do nothing about that.
> Besides, even with TLB, cache missing could happen frequently,
> especially we need to access many (5~16, 10 usually) buffers and more
> 11MBytes each in a hardware processing.
> You can't have a very large TLB.
Right, but as I wrote in my previous emails, we have the right methods
in the kernel for providing drivers with contiguous memory and those
can be used for those special cases.
> >>> (mobile phones, PCs) run with IOMMUs enabled for pretty much anything
> >>> because of the security reason and they don't seem to be having any
> >> If the page is secure, you can't operate it in a insecure IOMMU or MMU.
> >> The most security way here, we should use a dedicated memory(or a zone
> >> in unified memory).
> >
> > You still need something to enforce that the hardware is not accessing
> > memory that it's not supposed to access. How do you do that without an
> > IOMMU?
> >
> If you know the arm security pipeline and security controller, you could
> found we could reserved a range of memory for a security id(devices in
> secure world may be a different security domain).
> Besides, a MMU or security MPU could mark some pages for the secure
> world access only, it doesn't mean the device need an IOMMU to access
> them. The MPU could filter the access through the AXI id.
> >> I believe there are more users in mobile for DMA-heap than kernel's dma
> >> allocation API.
> >
> > Yes, but that's completely separate from whether there is an IOMMU or
> > not. It's just a different allocation API.
> >
> The memory heap would mean a dedicated memory usually(we don't talk
> about system heap or why there are many vendor heaps). Dedicated memory
> means contiguous memory in the most of cases.
No, and no.
First no - DMA-buf heap doesn't imply dedicated memory and usually one
wants to completely avoid carving out memory, because it becomes
useless if specific use case is not active.
Second no - there are ways to provide dedicated memory regions to the
DMA mapping API, such as shared or restricted DMA pool [1].
[1] https://elixir.bootlin.com/linux/latest/source/Documentation/devicetree/bindings/reserved-memory/shared-dma-pool.yaml
Best regards,
Tomasz
> >>> performance issues. In fact, it can improve the performance, because
> >>> memory allocation is much easier and without contiguous careouts (as
> >>> we used to have long ago on Android devices) the extra memory can be
> >>> used for buffers and caches to improve system performance.
> >>>
> >>> Best regards,
> >>> Tomasz
> >>>
> >>>>
> >>>>>
> >>>>> Best regards,
> >>>>> Tomasz
> >>>>>
> >>>>>>>>> Remove it from the core definitions but keep it for drivers internal
> >>>>>>>>> needs.
> >>>>>>>>>
> >>>>>>>>> Signed-off-by: Benjamin Gaignard <benjamin.gaignard at collabora.com>
> >>>>>>>>> ---
> >>>>>>>>> drivers/media/common/videobuf2/videobuf2-core.c | 2 ++
> >>>>>>>>> drivers/media/platform/amphion/vdec.c | 1 +
> >>>>>>>>> .../media/platform/mediatek/vcodec/vdec/vdec_vp9_req_lat_if.c | 2 ++
> >>>>>>>>> drivers/media/platform/qcom/venus/hfi.h | 2 ++
> >>>>>>>>> drivers/media/platform/verisilicon/hantro_hw.h | 2 ++
> >>>>>>>>> drivers/staging/media/ipu3/ipu3-v4l2.c | 2 ++
> >>>>>>>>> include/media/videobuf2-core.h | 1 -
> >>>>>>>>> include/media/videobuf2-v4l2.h | 4 ----
> >>>>>>>>> 8 files changed, 11 insertions(+), 5 deletions(-)
> >>>>>>>>>
> >>>>>>>>> diff --git a/drivers/media/common/videobuf2/videobuf2-core.c
> >>>>>>>>> b/drivers/media/common/videobuf2/videobuf2-core.c
> >>>>>>>>> index 86e1e926fa45..899783f67580 100644
> >>>>>>>>> --- a/drivers/media/common/videobuf2/videobuf2-core.c
> >>>>>>>>> +++ b/drivers/media/common/videobuf2/videobuf2-core.c
> >>>>>>>>> @@ -31,6 +31,8 @@
> >>>>>>>>>
> >>>>>>>>> #include <trace/events/vb2.h>
> >>>>>>>>>
> >>>>>>>>> +#define VB2_MAX_FRAME 32
> >>>>>>>>> +
> >>>>>>>>> static int debug;
> >>>>>>>>> module_param(debug, int, 0644);
> >>>>>>>>>
> >>>>>>>>> diff --git a/drivers/media/platform/amphion/vdec.c
> >>>>>>>>> b/drivers/media/platform/amphion/vdec.c
> >>>>>>>>> index 3fa1a74a2e20..b3219f6d17fa 100644
> >>>>>>>>> --- a/drivers/media/platform/amphion/vdec.c
> >>>>>>>>> +++ b/drivers/media/platform/amphion/vdec.c
> >>>>>>>>> @@ -28,6 +28,7 @@
> >>>>>>>>>
> >>>>>>>>> #define VDEC_MIN_BUFFER_CAP 8
> >>>>>>>>> #define VDEC_MIN_BUFFER_OUT 8
> >>>>>>>>> +#define VB2_MAX_FRAME 32
> >>>>>>>>>
> >>>>>>>>> struct vdec_fs_info {
> >>>>>>>>> char name[8];
> >>>>>>>>> diff --git
> >>>>>>>>> a/drivers/media/platform/mediatek/vcodec/vdec/vdec_vp9_req_lat_if.c
> >>>>>>>>> b/drivers/media/platform/mediatek/vcodec/vdec/vdec_vp9_req_lat_if.c
> >>>>>>>>> index 6532a69f1fa8..a1e0f24bb91c 100644
> >>>>>>>>> --- a/drivers/media/platform/mediatek/vcodec/vdec/vdec_vp9_req_lat_if.c
> >>>>>>>>> +++ b/drivers/media/platform/mediatek/vcodec/vdec/vdec_vp9_req_lat_if.c
> >>>>>>>>> @@ -16,6 +16,8 @@
> >>>>>>>>> #include "../vdec_drv_if.h"
> >>>>>>>>> #include "../vdec_vpu_if.h"
> >>>>>>>>>
> >>>>>>>>> +#define VB2_MAX_FRAME 32
> >>>>>>>>> +
> >>>>>>>>> /* reset_frame_context defined in VP9 spec */
> >>>>>>>>> #define VP9_RESET_FRAME_CONTEXT_NONE0 0
> >>>>>>>>> #define VP9_RESET_FRAME_CONTEXT_NONE1 1
> >>>>>>>>> diff --git a/drivers/media/platform/qcom/venus/hfi.h
> >>>>>>>>> b/drivers/media/platform/qcom/venus/hfi.h
> >>>>>>>>> index f25d412d6553..bd5ca5a8b945 100644
> >>>>>>>>> --- a/drivers/media/platform/qcom/venus/hfi.h
> >>>>>>>>> +++ b/drivers/media/platform/qcom/venus/hfi.h
> >>>>>>>>> @@ -10,6 +10,8 @@
> >>>>>>>>>
> >>>>>>>>> #include "hfi_helper.h"
> >>>>>>>>>
> >>>>>>>>> +#define VB2_MAX_FRAME 32
> >>>>>>>>> +
> >>>>>>>>> #define VIDC_SESSION_TYPE_VPE 0
> >>>>>>>>> #define VIDC_SESSION_TYPE_ENC 1
> >>>>>>>>> #define VIDC_SESSION_TYPE_DEC 2
> >>>>>>>>> diff --git a/drivers/media/platform/verisilicon/hantro_hw.h
> >>>>>>>>> b/drivers/media/platform/verisilicon/hantro_hw.h
> >>>>>>>>> index e83f0c523a30..9e8faf7ba6fb 100644
> >>>>>>>>> --- a/drivers/media/platform/verisilicon/hantro_hw.h
> >>>>>>>>> +++ b/drivers/media/platform/verisilicon/hantro_hw.h
> >>>>>>>>> @@ -15,6 +15,8 @@
> >>>>>>>>> #include <media/v4l2-vp9.h>
> >>>>>>>>> #include <media/videobuf2-core.h>
> >>>>>>>>>
> >>>>>>>>> +#define VB2_MAX_FRAME 32
> >>>>>>>>> +
> >>>>>>>>> #define DEC_8190_ALIGN_MASK 0x07U
> >>>>>>>>>
> >>>>>>>>> #define MB_DIM 16
> >>>>>>>>> diff --git a/drivers/staging/media/ipu3/ipu3-v4l2.c
> >>>>>>>>> b/drivers/staging/media/ipu3/ipu3-v4l2.c
> >>>>>>>>> index e530767e80a5..6627b5c2d4d6 100644
> >>>>>>>>> --- a/drivers/staging/media/ipu3/ipu3-v4l2.c
> >>>>>>>>> +++ b/drivers/staging/media/ipu3/ipu3-v4l2.c
> >>>>>>>>> @@ -10,6 +10,8 @@
> >>>>>>>>> #include "ipu3.h"
> >>>>>>>>> #include "ipu3-dmamap.h"
> >>>>>>>>>
> >>>>>>>>> +#define VB2_MAX_FRAME 32
> >>>>>>>>> +
> >>>>>>>>> /******************** v4l2_subdev_ops ********************/
> >>>>>>>>>
> >>>>>>>>> #define IPU3_RUNNING_MODE_VIDEO 0
> >>>>>>>>> diff --git a/include/media/videobuf2-core.h
> >>>>>>>>> b/include/media/videobuf2-core.h
> >>>>>>>>> index 77921cf894ef..080b783d608d 100644
> >>>>>>>>> --- a/include/media/videobuf2-core.h
> >>>>>>>>> +++ b/include/media/videobuf2-core.h
> >>>>>>>>> @@ -20,7 +20,6 @@
> >>>>>>>>> #include <media/media-request.h>
> >>>>>>>>> #include <media/frame_vector.h>
> >>>>>>>>>
> >>>>>>>>> -#define VB2_MAX_FRAME (32)
> >>>>>>>>> #define VB2_MAX_PLANES (8)
> >>>>>>>>>
> >>>>>>>>> /**
> >>>>>>>>> diff --git a/include/media/videobuf2-v4l2.h
> >>>>>>>>> b/include/media/videobuf2-v4l2.h
> >>>>>>>>> index 5a845887850b..88a7a565170e 100644
> >>>>>>>>> --- a/include/media/videobuf2-v4l2.h
> >>>>>>>>> +++ b/include/media/videobuf2-v4l2.h
> >>>>>>>>> @@ -15,10 +15,6 @@
> >>>>>>>>> #include <linux/videodev2.h>
> >>>>>>>>> #include <media/videobuf2-core.h>
> >>>>>>>>>
> >>>>>>>>> -#if VB2_MAX_FRAME != VIDEO_MAX_FRAME
> >>>>>>>>> -#error VB2_MAX_FRAME != VIDEO_MAX_FRAME
> >>>>>>>>> -#endif
> >>>>>>>>> -
> >>>>>>>>> #if VB2_MAX_PLANES != VIDEO_MAX_PLANES
> >>>>>>>>> #error VB2_MAX_PLANES != VIDEO_MAX_PLANES
> >>>>>>>>> #endif
> >>>>>>>>> --
> >>>>>>>>> 2.39.2
> >>>>>>>>>
> >>>>>> --
> >>>>>> Hsia-Jun(Randy) Li
> >>>>>>
> >>>> --
> >>>> Hsia-Jun(Randy) Li
> >>>>
> >>
> >> --
> >> Hsia-Jun(Randy) Li
>
> --
> Hsia-Jun(Randy) Li
More information about the Linux-rockchip
mailing list