GPMI iMX6ull timeout on DMA

Han Xu han.xu at nxp.com
Mon Oct 4 09:06:20 PDT 2021


On 21/10/04 05:33PM, Miquel Raynal wrote:
> Hi Michael, Christian,
> 
> michael at amarulasolutions.com wrote on Mon, 4 Oct 2021 08:27:54 +0200:
> 
> > Hi Christian
> > 
> > On Mon, Oct 4, 2021 at 7:54 AM Christian Eggers <ceggers at arri.de> wrote:
> > >
> > > On Monday, 29 July 2019, 08:41:51 CEST, Greg Ungerer wrote:  
> > > > Hi Miquel,
> > > >
> > > > I am experiencing a problem with NAND flash DMA timeouts on
> > > > iMX6ull based boards. The problem is very similar to that
> > > > described in:
> > > >
> > > >    https://linux-mtd.infradead.narkive.com/JIUulfFB/gpmi-imx6ull-timeout-on-dma
> > > >
> > > > That didn't come to any specific resolution that I could see
> > > > in that thread.  
> > >
> > > Hi all,
> > >
> > > I am joining this thread because I am also affected by this problem. I use
> > > kernel 5.10.65-rt53 but I have seen this issue on many previous versions. In the
> > > past I only noticed it on my development setup, but now it has been found
> > > by our testing team.
> > >
> > > In our test setup we simply perform a reboot every 30s. After 5 to 200 cycles
> > > the test stops due to this error.
> > >
> > > The kernel version I use already includes:
> > >  
> > > > Han Xu <han.xu at nxp.com>
> > > > mtd: rawnand: gpmi: Fix the random DMA timeout issue  
> > >
> > > Additionally I tried ...
> > >  
> > > > Michael Trimarchi <michael at amarulasolutions.com>
> > > > mtd: nand: Calculate the clock before enable it  
> > >
> > > ... but the problem still persists.
> > >
> > > In my case, some registers show different values (annotated below):
> > >  
> > > >
> > > > The boot trace on the console for me looks like this:
> > > >
> > > > nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xda  
> > >   nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xdc  
> > > > nand: Micron MT29F2G08ABAEAWP  
> > >   nand: Micron MT29F4G08ABADAH4  
> > > > nand: 256 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
> > > > gpmi-nand 1806000.gpmi-nand: DMA timeout, last DMA
> > > > gpmi-nand 1806000.gpmi-nand: Show GPMI registers :
> > > > gpmi-nand 1806000.gpmi-nand: offset 0x000 : 0x20830002
> > > > gpmi-nand 1806000.gpmi-nand: offset 0x010 : 0x00000000
> > > > gpmi-nand 1806000.gpmi-nand: offset 0x020 : 0x00000000
> > > > gpmi-nand 1806000.gpmi-nand: offset 0x030 : 0x00000000
> > > > gpmi-nand 1806000.gpmi-nand: offset 0x040 : 0x00000000
> > > > gpmi-nand 1806000.gpmi-nand: offset 0x050 : 0x00000000
> > > > gpmi-nand 1806000.gpmi-nand: offset 0x060 : 0x01c6800c
> > > > gpmi-nand 1806000.gpmi-nand: offset 0x070 : 0x00010101
> > > > gpmi-nand 1806000.gpmi-nand: offset 0x080 : 0xe0000000
> > > > gpmi-nand 1806000.gpmi-nand: offset 0x090 : 0x23023336
> > > > gpmi-nand 1806000.gpmi-nand: offset 0x0a0 : 0x000001ee
> > > > gpmi-nand 1806000.gpmi-nand: offset 0x0b0 : 0xff000001
> > > > gpmi-nand 1806000.gpmi-nand: offset 0x0c0 : 0x00000001  
> > >   gpmi-nand 1806000.nand-controller: offset 0x0c0 : 0x00000202  
> > > > gpmi-nand 1806000.gpmi-nand: offset 0x0d0 : 0x05020000
> > > > gpmi-nand 1806000.gpmi-nand: Show BCH registers :
> > > > gpmi-nand 1806000.gpmi-nand: offset 0x000 : 0x00000100  
> > >   gpmi-nand 1806000.nand-controller: offset 0x000 : 0x00000000  
> > > > gpmi-nand 1806000.gpmi-nand: offset 0x010 : 0x00000010
> > > > gpmi-nand 1806000.gpmi-nand: offset 0x020 : 0x00000000
> > > > gpmi-nand 1806000.gpmi-nand: offset 0x030 : 0x00000000
> > > > gpmi-nand 1806000.gpmi-nand: offset 0x040 : 0x00000000
> > > > gpmi-nand 1806000.gpmi-nand: offset 0x050 : 0x00000000
> > > > gpmi-nand 1806000.gpmi-nand: offset 0x060 : 0x00000000
> > > > gpmi-nand 1806000.gpmi-nand: offset 0x070 : 0x00000000
> > > > gpmi-nand 1806000.gpmi-nand: offset 0x080 : 0x030a2080  
> > >   gpmi-nand 1806000.nand-controller: offset 0x080 : 0x070a4080  
> > > > gpmi-nand 1806000.gpmi-nand: offset 0x090 : 0x083e2080  
> > >   gpmi-nand 1806000.nand-controller: offset 0x090 : 0x10da4080  
> > > > gpmi-nand 1806000.gpmi-nand: offset 0x0a0 : 0x070a4080
> > > > gpmi-nand 1806000.gpmi-nand: offset 0x0b0 : 0x10da4080
> > > > gpmi-nand 1806000.gpmi-nand: offset 0x0c0 : 0x070a4080
> > > > gpmi-nand 1806000.gpmi-nand: offset 0x0d0 : 0x10da4080
> > > > gpmi-nand 1806000.gpmi-nand: offset 0x0e0 : 0x070a4080
> > > > gpmi-nand 1806000.gpmi-nand: offset 0x0f0 : 0x10da4080
> > > > gpmi-nand 1806000.gpmi-nand: offset 0x100 : 0x00000000
> > > > gpmi-nand 1806000.gpmi-nand: offset 0x110 : 0x00000000
> > > > gpmi-nand 1806000.gpmi-nand: offset 0x120 : 0x00000000
> > > > gpmi-nand 1806000.gpmi-nand: offset 0x130 : 0x00000000
> > > > gpmi-nand 1806000.gpmi-nand: offset 0x140 : 0x00000000
> > > > gpmi-nand 1806000.gpmi-nand: offset 0x150 : 0x20484342
> > > > gpmi-nand 1806000.gpmi-nand: offset 0x160 : 0x01000000
> > > > gpmi-nand 1806000.gpmi-nand: offset 0x170 : 0x00000000
> > > > gpmi-nand 1806000.gpmi-nand: BCH Geometry :
> > > > GF length              : 13
> > > > ECC Strength           : 8
> > > > Page Size in Bytes     : 2110
> > > > Metadata Size in Bytes : 10
> > > > ECC Chunk0 Size in Bytes: 512
> > > > ECC Chunkn Size in Bytes: 512  
> > >   ECC Chunk Size in Bytes: 512  
> > > > ECC Chunk Count        : 4
> > > > Payload Size in Bytes  : 2048
> > > > Auxiliary Size in Bytes: 16
> > > > Auxiliary Status Offset: 12
> > > > Block Mark Byte Offset : 1999
> > > > Block Mark Bit Offset  : 0  
> > >
> > > Please let me know if further information is required.  

Could you please try disabling the clock before setting the clock rate and
re-enabling it afterwards, in case the rate change causes clock glitches?

clk_disable_unprepare(r->clock[0]);
clk_set_rate(r->clock[0], hw->clk_rate);
clk_prepare_enable(r->clock[0]);
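
For anyone who wants to try this outside the driver first, here is a minimal
userspace sketch of the same gate / set rate / ungate sequence with the return
values checked. It is not driver code: struct clk and the three clk_* stubs
below are simplified stand-ins that only mirror the kernel function
signatures, and gpmi_set_rate_gated() and the rate_changed_while_enabled flag
are names made up for this illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the kernel's struct clk. */
struct clk {
	unsigned long rate;
	bool enabled;
	bool rate_changed_while_enabled;	/* set if a glitch was possible */
};

/* Stubs with the kernel clk API signatures; behaviour is simplified. */
static void clk_disable_unprepare(struct clk *clk)
{
	clk->enabled = false;
}

static int clk_set_rate(struct clk *clk, unsigned long rate)
{
	if (rate == 0)
		return -EINVAL;
	if (clk->enabled)
		clk->rate_changed_while_enabled = true;
	clk->rate = rate;
	return 0;
}

static int clk_prepare_enable(struct clk *clk)
{
	clk->enabled = true;
	return 0;
}

/*
 * Gate the clock, change its rate, then ungate it again.
 * The clock is re-enabled even if clk_set_rate() fails, so the
 * enable state stays balanced; the first error seen is returned.
 */
static int gpmi_set_rate_gated(struct clk *clk, unsigned long rate)
{
	int ret, err;

	assert(clk->enabled);	/* model precondition: clock is running */

	clk_disable_unprepare(clk);
	ret = clk_set_rate(clk, rate);
	err = clk_prepare_enable(clk);

	return ret ? ret : err;
}
```

In the real driver the call would take r->clock[0] and hw->clk_rate as in the
three lines above; whether re-enabling after a failed clk_set_rate() is the
right policy there is an open choice, not something this thread settles.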

> > 
> > I need to continue working on it over the following days. I stopped
> > after moving to LTS 4.19.y with my partial revert.
> > The problem, as usual, was the need to go to production on some devices. Anyway, I
> > have a device that shows this problem. I can
> > restart next weekend. One of the things I noticed that breaks imx28 is:
> > 
> >        if (sdr->tRC_min >= 30000) {
> >                /* ONFI non-EDO modes [0-3] */
> >                hw->clk_rate = 22000000;
> >                wrn_dly_sel = BV_GPMI_CTRL1_WRN_DLY_SEL_4_TO_8NS;
> >        } else if (sdr->tRC_min >= 25000) {
> >                /* ONFI EDO mode 4 */
> >                hw->clk_rate = 80000000;
> >                wrn_dly_sel = BV_GPMI_CTRL1_WRN_DLY_SEL_NO_DELAY;
> >        } else {
> >                /* ONFI EDO mode 5 */
> >                hw->clk_rate = 100000000;
> >                wrn_dly_sel = BV_GPMI_CTRL1_WRN_DLY_SEL_NO_DELAY;
> >        }
> > 
> > This assumes that clk_rate can actually be set to that rate,
> > but on imx28 the parent clock of the NAND clock cannot
> > reach those speeds. Changing it ends up setting a wrong
> > value, so imx28 was totally broken. The other computation was
> > not based on a fixed clock rate but, I think, on clk_get_rate.
> 
> Interesting finding. I guess we should try to apply the desired clock
> rate and, if the rate we actually get is too far from it, refuse the
> requested configuration. The core will then automatically try the
> slower (but perhaps working) modes.
> 
> Thanks,
> Miquèl
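
The check suggested above could look roughly like the following userspace
sketch. The rate thresholds are copied from the snippet quoted earlier (tRC_min
is in picoseconds in struct nand_sdr_timings). Everything else is an
assumption for illustration: the function names gpmi_requested_rate() and
rate_is_acceptable() are made up, and the tolerance is a free parameter, not a
value anyone in this thread proposed. In the driver, the achieved rate would
presumably come from something like clk_round_rate() or clk_get_rate().

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Requested GPMI clock rate in Hz for a given tRC_min (picoseconds),
 * using the same thresholds as the quoted driver snippet.
 */
static unsigned long gpmi_requested_rate(unsigned int trc_min_ps)
{
	if (trc_min_ps >= 30000)
		return 22000000;	/* ONFI non-EDO modes [0-3] */
	if (trc_min_ps >= 25000)
		return 80000000;	/* ONFI EDO mode 4 */
	return 100000000;		/* ONFI EDO mode 5 */
}

/*
 * Hypothetical policy: accept the achieved rate only if it is within
 * tolerance_pct percent of the requested rate, in either direction.
 */
static bool rate_is_acceptable(unsigned long requested,
			       unsigned long achieved,
			       unsigned int tolerance_pct)
{
	unsigned long diff = requested > achieved ?
			     requested - achieved : achieved - requested;

	assert(tolerance_pct <= 100);

	/* Compare diff/requested with tolerance_pct/100 without dividing. */
	return (unsigned long long)diff * 100 <=
	       (unsigned long long)requested * tolerance_pct;
}
```

For example, with a 5% tolerance a request for 100 MHz that yields 96 MHz
would be accepted, while one that yields 24 MHz would be refused, so the
timing mode is rejected and a slower one can be tried instead.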


