[PATCH v2] mmc: mtk-sd: reduce CIT for better performance
AngeloGioacchino Del Regno
angelogioacchino.delregno at collabora.com
Thu Jun 1 05:21:16 PDT 2023
Il 01/06/23 14:08, Wenbin Mei (梅文彬) ha scritto:
> On Thu, 2023-06-01 at 12:00 +0200, AngeloGioacchino Del Regno wrote:
>>
>> External email : Please do not click links or open attachments until
>> you have verified the sender or the content.
>> Il 01/06/23 05:16, Wenbin Mei (梅文彬) ha scritto:
>>> On Wed, 2023-05-31 at 10:18 +0200, AngeloGioacchino Del Regno
>> wrote:
>>> External email : Please do not click links or open attachments
>> until you have verified the sender or the content.
>>>
>>> Il 31/05/23 09:32, Wenbin Mei (梅文彬) ha scritto:
>>>
>>>> On Thu, 2023-05-18 at 11:13 +0200, AngeloGioacchino Del Regno
>> wrote:
>>>
>>>>> External email : Please do not click links or open attachments
>> until
>>>
>>>>> you have verified the sender or the content.
>>>
>>>>>
>>>
>>>>>
>>>
>>>>> Il 10/05/23 03:58, Wenbin Mei ha scritto:
>>>
>>>>>> CQHCI_SSC1 indicates to CQE the polling period to use when using
>>>
>>>>>> periodic
>>>
>>>>>> SEND_QUEUE_STATUS(CMD13) polling.
>>>
>>>>>> The default value 0x1000 that corresponds to 150us, let's
>> decrease
>>>
>>>>>> it to
>>>
>>>>>
>>>
>>>>> The default value 0x1000 (4096) corresponds to 4096 * 52.08uS =
>>>
>>>>> 231.33uS
>>>
>>>>> ...so the default is not 150uS.
>>>
>>>>>
>>>
>>>>> If I'm wrong, this means that the CQCAP field is not 0, which
>> would
>>>
>>>>> mean
>>>
>>>>> that the expected 3uS would be wrong.
>>>
>>>>>
>>>
>>>>> Also, since the calculation can be done dynamically, this is what
>> we
>>>
>>>>> should
>>>
>>>>> actually do in the driver, as this gives information to the next
>>>
>>>>> engineer
>>>
>>>>> checking this piece of code.
>>>
>>>>>
>>>
>>>>> Apart from this, by just writing 0x40 to the CQHCI_SSC1 register,
>> you
>>>
>>>>> are
>>>
>>>>> assuming that the CQCAP value requirement is fullfilled, but you
>>>
>>>>> cannot
>>>
>>>>> assume that the bootloader has set the CQCAP's ITCFVAL and
>> ITCFMUL
>>>
>>>>> fields
>>>
>>>>> as you expect on all platforms: this means that implementing this
>>>
>>>>> takes
>>>
>>>>> a little more effort.
>>>
>>>>>
>>>
>>>>> You have two ways to implement this:
>>>
>>>>> *** First ***
>>>
>>>>> 1. Read ITCFMUL and ITCFVAL, then:
>>>
>>>>> tclk_mul = itcfmul_to_mhz(ITCFMUL); /* pseudo function
>>>
>>>>> interprets reg value*/
>>>
>>>>> tclk = ITCFVAL * tclk_mul;
>>>
>>>>>
>>>
>>>>> 2. Set SSC1 so that we get 3nS:
>>>
>>>>> #define CQHCI_SSC1_CIT GENMASK(15, 0)
>>>
>>>>> poll_time = cit_time_ns_to_regval(3);
>>>
>>>>> sscit = FIELD_PREP(CQHCI_SSC1_CIT, poll_time)
>>>
>>>>> cqhci_writel( ... )
>>>
>>>>>
>>>
>>>>> *** Second **
>>>
>>>>>
>>>
>>>>> 1. Pre-set ITCFMUL and ITCFVAL to
>>>
>>>>> ITCFVAL = 192 (decimal)
>>>
>>>>> ITCFMUL = 2 (where 2 == 0.1MHz)
>>>
>>>>>
>>>
>>>>> 2. Set SSC1 so that we get 3nS:
>>>
>>>>> #define CQHCI_SSC1_CIT GENMASK(15, 0)
>>>
>>>>> poll_time = cit_time_ns_to_regval(3);
>>>
>>>>> sscit = FIELD_PREP(CQHCI_SSC1_CIT, poll_time)
>>>
>>>>> cqhci_writel( ... )
>>>
>>>>>
>>>
>>>>> I would implement the first way, as it paves the way to extend
>> this
>>>
>>>>> to different
>>>
>>>>> tclk values if needed in the future.
>>>
>>>>>
>>>
>>>>> Regards,
>>>
>>>>> Angelo
>>>
>>>> Hi Angelo,
>>>
>>>>
>>>
>>>> Sorry for lately reply.
>>>
>>>>
>>>
>>>> For Mediatek mmc host IP, ITCFMUL is 0x2(0x1MHz), ITVFVAL reports
>> 182,
>>>
>>>> and these fields are the same and are readonly for all IC, but
>> since
>>>
>>>> Mediatek CQE uses msdc_hclk(273MHz), CMD13'interval calculation
>> driver
>>>
>>>> should use 273MHz to get the actual time, so the actual clock is
>>>
>>>> 27.3MHz.
>>>
>>>>
>>>
>>>
>>> You're right, I've misread the datasheet, just rechecked and it
>> reports RO.
>>>
>>>
>>>> If CIT is 0x1000 by default, CMD idle time: 0x1000 * 1 / 27.3MHz =
>>>
>>>> around 150us.
>>>
>>>>
>>>
>>>> In addition the bootloader will not set the CQCAP's ITCFVAL and
>> ITCFMUL
>>>
>>>> fields, because these fields of CQCAP register is RO(readonly), so
>> we
>>>
>>>> can ignore the change for the CQCAP's ITCFVAL and ITCFMUL fields.
>>>
>>>>
>>>
>>>
>>> Yes, that's right, again - this means that you should go for the
>> first
>>>
>>> proposed implementation, as future MediaTek SoCs may (or may not)
>> change
>>>
>>> that: if you implement as proposed, this is going to be a one-time
>> thing
>>>
>>> and future SoCs won't need specific changes.
>>>
>>>
>>> That implementation also documents the flow about how we're getting
>> to
>>>
>>> the actual value, which is important for community people reading
>> this
>>>
>>> driver in the future for debugging purposes.
>>>
>>>
>>> Regards,
>>>
>>> Angelo
>>>
>>>
>>>
>>> Thanks for your proposal.
>>>
>>>
>>> I have discussed with our designer, and this fields of CQCAP's
>> ITCFVAL and ITCFMUL will not change.
>>> If we add more code for it, these codes will also affect the
>> execution efficiency, even if it has a very
>>> small effect.
>>> I think if it's just for reading convenience, we can add mode
>> comments to make it easier to read the code.
>>> Do you think it's okay to add more comments?
>>>
>>
>> This isn't a performance path, but anyway, if you think that it will
>> be at some
>> point, you can read the two registers at probe time as part of the
>> MMC_CAP2_CQE
>> if branch, and then cache the invariable values to `struct
>> msdc_host`: this
>> will make you able to never perform register reads for ITCFVAL/FMUL
>> in
>> msdc_cqe_enable(), resolving the efficiency issue.
>>
>> Even better, instead of caching ITCFVAL/FMUL to two variables, since
>> the idle
>> timer value likely won't ever change during runtime, you can directly
>> perform
>> the calculation for SSC1 at probe time and cache that value instead,
>> so that
>> in msdc_cqe_enable() you will have something like...
>>
>> /* Set the send status command idle timer */
>> cqhci_writel(cq_host, host->cq_ssc1_time, CQHCI_SSC1);
>>
>> where cq_ssc1_time is
>> struct msdc_host {
>> .......
>> u32 cq_ssc1_time;
>> ....
>> }
>>
>> and where your probe function is
>>
>> static int msdc_drv_probe(struct platform_device *pdev)
>> {
>> ......
>>
>> if (mmc->caps2 & MMC_CAP2_CQE) {
>> host->cq_host = ......
>> ........
>> read itcfval;
>> read itcfmul;
>> host->cq_ssc1_time = calculated-value;
>> ........
>> }
>>
>> .......
>> }
>>
> Yes, I think it's okay for me.
> Another problem, ITCFVAL reports 182 for MediaTek SoCs, but we can not
> use it to calculate, as i said earlier, since our CQE uses
> msdc_hclk(273MHz), CMD13' interval calculation drivers should use
> 273MHz to get the actual time, not 182MHz.
> If we use ITCFVAL, we will get a wrong value.
> So I think it's meaningless.
clk_get_rate(msdc_hclk) gives you the current msdc_hclk clock rate: use it
in place of reading ITCFVAL, that's your solution.
I would imagine that *at least* ITCFMUL is correct on MediaTek SoCs, so you
can use that one as it is.
Regards,
Angelo
>
> Begards,
> Wenbin
>> Regards,
>> Angelo
>>
>>
>>> Begards,
>>> Wenbin
>>>
>>>> Thanks
>>>
>>>> Wenbin
>>>
>>>>>
>>>
>>>>>> 0x40 that corresponds to 3us, which can improve the performance
>> of
>>>
>>>>>> some
>>>
>>>>>> eMMC devices.
>>>
>>>>>>
>>>
>>>>>> Signed-off-by: Wenbin Mei <wenbin.mei at mediatek.com>
>>>
>>>>>> ---
>>>
>>>>>> drivers/mmc/host/mtk-sd.c | 4 ++++
>>>
>>>>>> 1 file changed, 4 insertions(+)
>>>
>>>>>>
>>>
>>>>>> diff --git a/drivers/mmc/host/mtk-sd.c b/drivers/mmc/host/mtk-
>> sd.c
>>>
>>>>>> index edade0e54a0c..ffeccddcd028 100644
>>>
>>>>>> --- a/drivers/mmc/host/mtk-sd.c
>>>
>>>>>> +++ b/drivers/mmc/host/mtk-sd.c
>>>
>>>>>> @@ -2453,6 +2453,7 @@ static void
>> msdc_hs400_enhanced_strobe(struct
>>>
>>>>>> mmc_host *mmc,
>>>
>>>>>> static void msdc_cqe_enable(struct mmc_host *mmc)
>>>
>>>>>> {
>>>
>>>>>> struct msdc_host *host = mmc_priv(mmc);
>>>
>>>>>> + struct cqhci_host *cq_host = mmc->cqe_private;
>>>
>>>>>>
>>>
>>>>>> /* enable cmdq irq */
>>>
>>>>>> writel(MSDC_INT_CMDQ, host->base + MSDC_INTEN);
>>>
>>>>>> @@ -2462,6 +2463,9 @@ static void msdc_cqe_enable(struct
>> mmc_host
>>>
>>>>>> *mmc)
>>>
>>>>>> msdc_set_busy_timeout(host, 20 * 1000000000ULL, 0);
>>>
>>>>>> /* default read data timeout 1s */
>>>
>>>>>> msdc_set_timeout(host, 1000000000ULL, 0);
>>>
>>>>>> +
>>>
>>>>>> + /* decrease the send status command idle timer to 3us */
>>>
>>>>>> + cqhci_writel(cq_host, 0x40, CQHCI_SSC1);
>>>
>>>>>> }
>>>
>>>>>>
>>>
>>>>>> static void msdc_cqe_disable(struct mmc_host *mmc, bool
>> recovery)
>>>
>>>>>
>>>
>>>>>
>>>
>>>
>>>
>>
More information about the linux-arm-kernel
mailing list