eMMC tuning issue on Odroid C2 and a possible solution

Heiner Kallweit hkallweit1 at gmail.com
Thu Oct 12 11:17:02 PDT 2017


Am 12.10.2017 um 17:22 schrieb Jerome Brunet:
> On Wed, 2017-10-11 at 22:46 +0200, Heiner Kallweit wrote:
>> Hi Jerome,
>>
>> we have the known issue that the latest next kernel still fails on
>> Odroid C2 with a 128GB eMMC card (w/o adjusting the initial tx phase).
>>
>> I found the time to dig a little deeper into it and reason is that
>> there are certain rx/tx phase combinations which are perfectly fine
>> when tuning but fail in real life.
> 
> You did not mention it in this mail, but I going to guess this is done, again,
> in hs400 ...
> 
Sorry. I use the default, hs200 with 200MHz.

>> Don't ask me how this can happen, I just see it happen.
> 
> Oh I have pretty good idea of what is happening here, and to be absolutely it is
> not a regression.
> 
> I think I already explained this but, before the MMC series that went in this
> cycle, the frequency reported by the mmc driver was twice what it really was
> for all DDR mode, including HS400.
> 
> So, up to v4.13, when you thought you were doing hs400 @ 200MHz, the frequency
> was really 100Mhz
> 
> With all the mmc patches applied (including the recent tuning fix) the frequency
> in DDR mode is set as requested by the framework. So on your target, it has
> increased from 100MHz to 166Mhz. This is a huge difference. You did not change
> the DT but the frequency did change.
> 
> The symptoms you are describing, I have seen them while trying SDR104 on boards
> where the layout proved to be unable to cope with the clock rate involved. Some
> cards worked well, some did not at all. When looking at the signal amplitude,
> the problem was clear, the signal quality was just not good enough. The
> communication might be OK for a short while (while doing the tuning) but may
> fail while doing "real life" transfers.
> 
> So why does the tuning succeed and you get errors later on ? Simply because the
> tuning is not the problem. The HW (and the frequency used) is.
> 
> As far as I can tell, your eMMC card + your odroidc2 is simply not able to cope
> (reliably) with 166Mhz. The fact that you continue to have CRC errors with your
> "hand picked phases" is an evidence of this fact.
> 
When setting the default tx phase to 0 I have a rock-stable system w/o any CRC
error (hs200 with DT 200MHz).

When leaving the default tx phase at 270, after tuning I end up with rx phase
90 and tx phase 300. This combination works perfect when tuning but fails
in real life.

I saw in the chip spec that there are few emmc-related calibration values in
SD_EMMC_ADJUST and SD_EMMC_CALOUT. However there's no documentation how to
use them. Looking at the vendor driver might help, though.
Did you have a closer look at these values ?

> This is not the only platform out there which can't cope with hs400 @ 166Mhz,
> the p200 eMMC is hs400 capable, but the platform can't cope with it. That's how
> it is.
> 
> Lower your frequency or change the mode to hs200 (which is the setting in the
> upstream DT BTW)
> 
>>
>> To deal with such cases I added some code to avoid known invalid phase
>> values when retuning. In addition I added some code to deal with the
>> following (more or less corner) case:
>> Let's say we have rx = 0° and tx = 0° and working is only combination
>> rx = 180° and tx = 180°.
>> Then just tuning rx only or tx only will never result in a working
>> combination.
>>
>> Following patch makes my system work. I just see one CRC error when
>> the initally tuned rx/tx phase combination fails and then the
>> retuning results in a stable system w/o further errors.
>>
>> I'd appreciate if you could check that this patch doesn't break any
>> of your systems.
>>
>> Rgds, Heiner
>>
>> ---
>>  drivers/mmc/host/meson-gx-mmc.c | 29 ++++++++++++++++++++++-------
>>  1 file changed, 22 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/mmc/host/meson-gx-mmc.c b/drivers/mmc/host/meson-gx-mmc.c
>> index 85745ef1..95cb439d 100644
>> --- a/drivers/mmc/host/meson-gx-mmc.c
>> +++ b/drivers/mmc/host/meson-gx-mmc.c
>> @@ -120,6 +120,7 @@
>>  #define SD_EMMC_DESC_CHAIN_MODE BIT(1)
>>  
>>  #define MUX_CLK_NUM_PARENTS 2
>> +#define MAX_TUNING_ATTEMPTS 10
>>  
>>  struct sd_emmc_desc {
>>  	u32 cmd_cfg;
>> @@ -687,15 +688,23 @@ static int meson_mmc_find_tuning_point(unsigned long
>> *test)
>>  static int meson_mmc_clk_phase_tuning(struct mmc_host *mmc, u32 opcode,
>>  				      struct clk *clk)
>>  {
>> -	int point, ret;
>> +	int point, ret, old_phase;
>>  	DECLARE_BITMAP(test, CLK_PHASE_POINT_NUM);
>>  
>> +	old_phase = clk_get_phase(clk);
>> +	if (old_phase < 0)
>> +		return old_phase;
>> +
>>  	dev_dbg(mmc_dev(mmc), "%s phase/delay tunning...\n",
>>  		__clk_get_name(clk));
>>  	bitmap_zero(test, CLK_PHASE_POINT_NUM);
>>  
>>  	/* Explore tuning points */
>>  	for (point = 0; point < CLK_PHASE_POINT_NUM; point++) {
>> +		/* when retuning avoid the surrounding of where we failed */
>> +		if (mmc->doing_retune)
>> +			if (abs(point * CLK_PHASE_STEP - old_phase) <= 45)
>> +				continue;
> 
> This is just a fake alteration of the working window.
> The point of this whole tuning is to find the middle of the window to get the
> most stable setting
> 
> If you still get CRC error with the center of the window, the signal quality is
> just too low to cope with the timing set.
> 
> This is patch is just a (convoluted) hack that will force the tuning algorithm
> to come up with new values each times, hoping to pick setting where the problem
> is bit masked.
> 
> 
>>  		clk_set_phase(clk, point * CLK_PHASE_STEP);
>>  		ret = mmc_send_tuning(mmc, opcode, NULL);
>>  		if (!ret)
>> @@ -704,8 +713,11 @@ static int meson_mmc_clk_phase_tuning(struct mmc_host
>> *mmc, u32 opcode,
>>  
>>  	/* Find the optimal tuning point and apply it */
>>  	point = meson_mmc_find_tuning_point(test);
>> -	if (point < 0)
>> +	if (point < 0) {
>> +		/* prevent from getting stuck if we failed */
>> +		clk_set_phase(clk, (old_phase + 90) % 360);
>>  		return point; /* tuning failed */
>> +	}
>>  
>>  	clk_set_phase(clk, point * CLK_PHASE_STEP);
>>  	dev_dbg(mmc_dev(mmc), "success with phase: %d\n",
>> @@ -716,7 +728,7 @@ static int meson_mmc_clk_phase_tuning(struct mmc_host
>> *mmc, u32 opcode,
>>  static int meson_mmc_execute_tuning(struct mmc_host *mmc, u32 opcode)
>>  {
>>  	struct meson_host *host = mmc_priv(mmc);
>> -	int ret;
>> +	int i, ret;
>>  
>>  	/*
>>  	 * If this is the initial tuning, try to get a sane Rx starting
>> @@ -729,11 +741,14 @@ static int meson_mmc_execute_tuning(struct mmc_host
>> *mmc, u32 opcode)
>>  			return ret;
>>  	}
>>  
>> -	ret = meson_mmc_clk_phase_tuning(mmc, opcode, host->tx_clk);
>> -	if (ret)
>> -		return ret;
>> +	for (i = 0; i < MAX_TUNING_ATTEMPTS; i++) {
>> +		meson_mmc_clk_phase_tuning(mmc, opcode, host->tx_clk);
>> +		ret = meson_mmc_clk_phase_tuning(mmc, opcode, host->rx_clk);
>> +		if (!ret)
>> +			return 0;
> 
> With the hack that is going to skip 25% on the phase, that is up to:
> 12 phase * 2 (TX/RX) * 10 * 0.75 = 180 MMC tunes !! that's way too much and
> unnecessary.
> 
>> +	}
>>  
>> -	return meson_mmc_clk_phase_tuning(mmc, opcode, host->rx_clk);
>> +	return ret;
>>  }
>>  
>>  static void meson_mmc_set_ios(struct mmc_host *mmc, struct mmc_ios *ios)
> 
> 




More information about the linux-amlogic mailing list