GPMI iMX6ull timeout on DMA

Miquel Raynal miquel.raynal at bootlin.com
Mon Oct 4 08:33:51 PDT 2021


Hi Michael, Christian,

michael at amarulasolutions.com wrote on Mon, 4 Oct 2021 08:27:54 +0200:

> Hi Christian
> 
> On Mon, Oct 4, 2021 at 7:54 AM Christian Eggers <ceggers at arri.de> wrote:
> >
> > On Monday, 29 July 2019, 08:41:51 CEST, Greg Ungerer wrote:  
> > > Hi Miquel,
> > >
> > > I am experiencing a problem with NAND flash DMA timeouts on
> > > iMX6ull based boards. The problem is very similar to that
> > > described in:
> > >
> > >    https://linux-mtd.infradead.narkive.com/JIUulfFB/gpmi-imx6ull-timeout-on-dma
> > >
> > > That didn't come to any specific resolution that I could see
> > > in that thread.  
> >
> > Hi all,
> >
> > I am joining this thread because I am also affected by this problem. I use
> > kernel 5.10.65-rt53 but I have seen this issue on many previous versions. In the
> > past I only recognized this on my development setup but now this has been found
> > by our testing team.
> >
> > In our test setup we simply perform a reboot every 30s. After 5 to 200 cycles
> > the test stops due to this error.
> >
> > The kernel version I use already includes:
> >  
> > > Han Xu <han.xu at nxp.com>
> > > mtd: rawnand: gpmi: Fix the random DMA timeout issue  
> >
> > Additionally I tried ...
> >  
> > > Michael Trimarchi <michael at amarulasolutions.com>
> > > mtd: nand: Calculate the clock before enable it  
> >
> > ... but the problem still persists.
> >
> > In my case, some registers show different values (annotated below):
> >  
> > >
> > > The boot trace on the console for me looks like this:
> > >
> > > nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xda  
> >   nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xdc  
> > > nand: Micron MT29F2G08ABAEAWP  
> >   nand: Micron MT29F4G08ABADAH4  
> > > nand: 256 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
> > > gpmi-nand 1806000.gpmi-nand: DMA timeout, last DMA
> > > gpmi-nand 1806000.gpmi-nand: Show GPMI registers :
> > > gpmi-nand 1806000.gpmi-nand: offset 0x000 : 0x20830002
> > > gpmi-nand 1806000.gpmi-nand: offset 0x010 : 0x00000000
> > > gpmi-nand 1806000.gpmi-nand: offset 0x020 : 0x00000000
> > > gpmi-nand 1806000.gpmi-nand: offset 0x030 : 0x00000000
> > > gpmi-nand 1806000.gpmi-nand: offset 0x040 : 0x00000000
> > > gpmi-nand 1806000.gpmi-nand: offset 0x050 : 0x00000000
> > > gpmi-nand 1806000.gpmi-nand: offset 0x060 : 0x01c6800c
> > > gpmi-nand 1806000.gpmi-nand: offset 0x070 : 0x00010101
> > > gpmi-nand 1806000.gpmi-nand: offset 0x080 : 0xe0000000
> > > gpmi-nand 1806000.gpmi-nand: offset 0x090 : 0x23023336
> > > gpmi-nand 1806000.gpmi-nand: offset 0x0a0 : 0x000001ee
> > > gpmi-nand 1806000.gpmi-nand: offset 0x0b0 : 0xff000001
> > > gpmi-nand 1806000.gpmi-nand: offset 0x0c0 : 0x00000001  
> >   gpmi-nand 1806000.nand-controller: offset 0x0c0 : 0x00000202  
> > > gpmi-nand 1806000.gpmi-nand: offset 0x0d0 : 0x05020000
> > > gpmi-nand 1806000.gpmi-nand: Show BCH registers :
> > > gpmi-nand 1806000.gpmi-nand: offset 0x000 : 0x00000100  
> >   gpmi-nand 1806000.nand-controller: offset 0x000 : 0x00000000  
> > > gpmi-nand 1806000.gpmi-nand: offset 0x010 : 0x00000010
> > > gpmi-nand 1806000.gpmi-nand: offset 0x020 : 0x00000000
> > > gpmi-nand 1806000.gpmi-nand: offset 0x030 : 0x00000000
> > > gpmi-nand 1806000.gpmi-nand: offset 0x040 : 0x00000000
> > > gpmi-nand 1806000.gpmi-nand: offset 0x050 : 0x00000000
> > > gpmi-nand 1806000.gpmi-nand: offset 0x060 : 0x00000000
> > > gpmi-nand 1806000.gpmi-nand: offset 0x070 : 0x00000000
> > > gpmi-nand 1806000.gpmi-nand: offset 0x080 : 0x030a2080  
> >   gpmi-nand 1806000.nand-controller: offset 0x080 : 0x070a4080  
> > > gpmi-nand 1806000.gpmi-nand: offset 0x090 : 0x083e2080  
> >   gpmi-nand 1806000.nand-controller: offset 0x090 : 0x10da4080  
> > > gpmi-nand 1806000.gpmi-nand: offset 0x0a0 : 0x070a4080
> > > gpmi-nand 1806000.gpmi-nand: offset 0x0b0 : 0x10da4080
> > > gpmi-nand 1806000.gpmi-nand: offset 0x0c0 : 0x070a4080
> > > gpmi-nand 1806000.gpmi-nand: offset 0x0d0 : 0x10da4080
> > > gpmi-nand 1806000.gpmi-nand: offset 0x0e0 : 0x070a4080
> > > gpmi-nand 1806000.gpmi-nand: offset 0x0f0 : 0x10da4080
> > > gpmi-nand 1806000.gpmi-nand: offset 0x100 : 0x00000000
> > > gpmi-nand 1806000.gpmi-nand: offset 0x110 : 0x00000000
> > > gpmi-nand 1806000.gpmi-nand: offset 0x120 : 0x00000000
> > > gpmi-nand 1806000.gpmi-nand: offset 0x130 : 0x00000000
> > > gpmi-nand 1806000.gpmi-nand: offset 0x140 : 0x00000000
> > > gpmi-nand 1806000.gpmi-nand: offset 0x150 : 0x20484342
> > > gpmi-nand 1806000.gpmi-nand: offset 0x160 : 0x01000000
> > > gpmi-nand 1806000.gpmi-nand: offset 0x170 : 0x00000000
> > > gpmi-nand 1806000.gpmi-nand: BCH Geometry :
> > > GF length              : 13
> > > ECC Strength           : 8
> > > Page Size in Bytes     : 2110
> > > Metadata Size in Bytes : 10
> > > ECC Chunk0 Size in Bytes: 512
> > > ECC Chunkn Size in Bytes: 512  
> >   ECC Chunk Size in Bytes: 512  
> > > ECC Chunk Count        : 4
> > > Payload Size in Bytes  : 2048
> > > Auxiliary Size in Bytes: 16
> > > Auxiliary Status Offset: 12
> > > Block Mark Byte Offset : 1999
> > > Block Mark Bit Offset  : 0  
> >
> > Please let me know if further information is required.  
> 
> I need to continue working on this over the next few days. I stopped
> after moving to LTS 4.19.y with my partial revert.
> The problem, as usual, was that some devices had to go to production. Anyway, I
> have a device that shows this problem, and I can
> pick this up again next weekend. One thing I noticed that breaks imx28 is:
> 
>        if (sdr->tRC_min >= 30000) {
>                /* ONFI non-EDO modes [0-3] */
>                hw->clk_rate = 22000000;
>                wrn_dly_sel = BV_GPMI_CTRL1_WRN_DLY_SEL_4_TO_8NS;
>        } else if (sdr->tRC_min >= 25000) {
>                /* ONFI EDO mode 4 */
>                hw->clk_rate = 80000000;
>                wrn_dly_sel = BV_GPMI_CTRL1_WRN_DLY_SEL_NO_DELAY;
>        } else {
>                /* ONFI EDO mode 5 */
>                hw->clk_rate = 100000000;
>                wrn_dly_sel = BV_GPMI_CTRL1_WRN_DLY_SEL_NO_DELAY;
>        }
> 
> This assumes that clk_rate can actually be set to the requested rate,
> but on imx28 the parent clock of the NAND clock cannot
> reach those speeds. Requesting it ends up setting the clock to the wrong
> value, so imx28 was totally broken. The other computations were not based
> on a fixed clock rate but, I think, on clk_get_rate.

Interesting finding. I guess we should try to apply the desired clock
rate, and if the resulting clock rate is too far from the requested one,
we should refuse the requested configuration. The core will then
automatically try the slower, but perhaps working, modes.
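
To make the idea concrete, here is a rough, untested sketch of such a
check, written as standalone C rather than as a driver patch. Everything
in it is an assumption for illustration: mock_round_rate() is only a
stand-in for what the clock framework would report as achievable (it
models a parent clock that caps the rate), the helper names are made up,
and the 10% tolerance used below is arbitrary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for querying the clock framework: models a
 * clock whose parent cannot deliver more than parent_max_hz.
 */
long long mock_round_rate(long long parent_max_hz, long long requested_hz)
{
	return requested_hz > parent_max_hz ? parent_max_hz : requested_hz;
}

/* True if achieved_hz is within tolerance_pct percent of requested_hz. */
bool rate_is_acceptable(long long requested_hz, long long achieved_hz,
			int tolerance_pct)
{
	long long diff;

	assert(requested_hz > 0);
	diff = llabs(requested_hz - achieved_hz);

	/* Integer form of: diff / requested <= tolerance_pct / 100 */
	return diff * 100 <= requested_hz * tolerance_pct;
}

/*
 * Returns 0 if the timing mode can be used, -1 if it must be refused
 * so that the caller can fall back to a slower mode.
 */
int check_timing_mode(long long parent_max_hz, long long requested_hz,
		      int tolerance_pct)
{
	long long achieved = mock_round_rate(parent_max_hz, requested_hz);

	return rate_is_acceptable(requested_hz, achieved, tolerance_pct) ? 0 : -1;
}
```

With a made-up parent limit of 24 MHz, check_timing_mode(24000000,
100000000, 10) returns -1, so the 100 MHz EDO mode 5 setting would be
refused, while check_timing_mode(24000000, 22000000, 10) returns 0 and
the 22 MHz setting would be kept.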

Thanks,
Miquèl


