[PATCH v2 3/3] mtd: rawnand: Support for sequential cache reads

Miquel Raynal miquel.raynal at bootlin.com
Mon Jul 17 00:19:00 PDT 2023


Hi Måns & Alexander,

mans at mansr.com wrote on Sun, 16 Jul 2023 18:46:26 +0100:

> Miquel Raynal <miquel.raynal at bootlin.com> writes:
> 
> > Hello all,
> >
> > So here is a summary of the situation:
> > - we have two bug reports regarding the use of sequential page reads
> > - both are on TI OMAP platforms: AM33XX and AM3517. I believe both are
> >   using the same omap2.c driver
> > - they use a Micron and a Samsung NAND chip
> >
> > All this information hints that the problem lies in the controller
> > driver, which is doing something silly during the exec_op phase.
> >
> > Alexander and Måns, can you please tell me:
> > - Are you using a GPIO for waitrdy, or do you leverage
> >   nand_soft_waitrdy()? If you are using the GPIO, can you both try
> >   the soft implementation and see if it changes anything?
> 
> There's no gpio specified in the devicetree, so I guess it must be using
> nand_soft_waitrdy().
> 
> > - Are you using any POLL or DMA prefetch mode? Can you please force the
> >   default in and out helpers by using only omap_nand_data_in() and
> >   omap_nand_data_out() to see if it changes anything?
> 
> It was using the default PREFETCH_POLLED mode.  Switching it to POLLED
> (and thus omap_nand_data_in/out()) doesn't change anything.
> 
> > I believe there is something wrong in the timings: while they are
> > properly implemented in theory, there might be cases where we miss a
> > barrier or something like that. I would like to try the following
> > two hacks to see if we can find which timing is not being observed,
> > despite the lack of probing. The first one is a real hack; the
> > second one might actually qualify as a real fix. Please let me know,
> > both of you, if you see different behaviors.
> >
> > *** HACK #1 ***
> >
> > --- a/drivers/mtd/nand/raw/omap2.c
> > +++ b/drivers/mtd/nand/raw/omap2.c
> > @@ -2113,6 +2113,9 @@ static int omap_nand_exec_instr(struct nand_chip *chip,
> >         case NAND_OP_CMD_INSTR:
> >                 iowrite8(instr->ctx.cmd.opcode,
> >                          info->reg.gpmc_nand_command);
> > +               if (instr->ctx.cmd.opcode == NAND_CMD_READCACHESEQ ||
> > +                   instr->ctx.cmd.opcode == NAND_CMD_READCACHEEND)
> > +                       udelay(50);
> >                 break;
> >
> >         case NAND_OP_ADDR_INSTR:
> >
> > *** HACK #2 ***
> >
> > --- a/drivers/mtd/nand/raw/omap2.c
> > +++ b/drivers/mtd/nand/raw/omap2.c
> > @@ -2143,8 +2146,10 @@ static int omap_nand_exec_instr(struct nand_chip *chip,
> >                 break;
> >         }
> >
> > -       if (instr->delay_ns)
> > +       if (instr->delay_ns) {
> > +               mb();
> >                 ndelay(instr->delay_ns);
> > +       }
> >
> >         return 0;
> >  }  
> 
> Neither of these help.

I am also pasting Alexander's answer here:

> We are using a GPIO for waitrdy. However, even when switching to the
> soft implementation, the system still behaves incorrectly.
> We are using prefetch-dma mode for xfer. Changing to the default
> implementation does not result in any improvements.
> 
> Neither patch helps :(

Thanks a lot to both of you for testing.

So, I should have asked for this earlier, but could you please slow
the whole operation down, just to see whether there is something wrong
with the timings or whether we should look in another direction?

Maybe you could add a boolean flag recording whether the last CMD was
a READCACHESEQ, READCACHESTART or READCACHEEND, and, when that flag is
set, sample the time (jiffies) before and after each waitrdy and
delay_ns. Finally, please print the expected delay next to the
measured one and compare them, to see whether something completed
faster than we expected.
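
Something like the below, untested sketch (hunk offsets omitted; the
cache_read_dbg flag is a name I just made up, and I am sampling
ktime_get() rather than raw jiffies so that the measurements have
sub-jiffy resolution):

--- a/drivers/mtd/nand/raw/omap2.c
+++ b/drivers/mtd/nand/raw/omap2.c
@@ ... @@ static int omap_nand_exec_instr(struct nand_chip *chip,
+       /* Hypothetical debug state: set while a cache read sequence runs */
+       static bool cache_read_dbg;
+       ktime_t before, after;
+
        switch (instr->type) {
        case NAND_OP_CMD_INSTR:
                iowrite8(instr->ctx.cmd.opcode,
                         info->reg.gpmc_nand_command);
+               /* Remember whether the last CMD belongs to a cache read */
+               cache_read_dbg = instr->ctx.cmd.opcode == NAND_CMD_READCACHESEQ ||
+                                instr->ctx.cmd.opcode == NAND_CMD_READCACHEEND;
                break;
@@ ... @@ static int omap_nand_exec_instr(struct nand_chip *chip,
-       if (instr->delay_ns)
+       if (instr->delay_ns) {
+               before = ktime_get();
                ndelay(instr->delay_ns);
+               after = ktime_get();
+               if (cache_read_dbg)
+                       pr_info("delay: expected %u ns, measured %lld ns\n",
+                               instr->delay_ns,
+                               ktime_to_ns(ktime_sub(after, before)));
+       }

        return 0;
 }

The same before/after sampling can wrap the WAITRDY case, comparing
the measured duration against instr->ctx.waitrdy.timeout_ms.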

As a second test, you could simply add a udelay(50); at the end of
omap_nand_exec_instr(), as sketched below.
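
Again untested, in the same spirit as the hacks above:

--- a/drivers/mtd/nand/raw/omap2.c
+++ b/drivers/mtd/nand/raw/omap2.c
@@ ... @@ static int omap_nand_exec_instr(struct nand_chip *chip,
        if (instr->delay_ns)
                ndelay(instr->delay_ns);

+       /* Brute-force slowdown: extra settling time after every instruction */
+       udelay(50);
+
        return 0;
 }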

Thanks,
Miquèl


