[BUG] pxa3xx: wait time out when scanning for bb

Greg Cook greg at morpheus.ws
Mon Dec 11 22:01:31 PST 2017


On 12 December 2017 at 05:16, Boris Brezillon
<boris.brezillon at free-electrons.com> wrote:
> On Mon, 11 Dec 2017 13:24:56 -0300
> Ezequiel Garcia <ezequiel at vanguardiasur.com.ar> wrote:
>
>> Greg,
>>
>> On 11 December 2017 at 10:18, Greg Cook <greg at morpheus.ws> wrote:
>> > Sean,
>> >
>> > I am not completely up-to-date on this, but everything in your traces reads
>> > like the same issue I was having on bringup for Armada 385 nand (under 4.9).
>> > I've been stuck on another project, so I haven't had time to follow up
>> > further, but I just diffed against linux-stable v4.12 pxa3xx_nand.c and it
>> > looks like the problem is still there.
>> >
>> > As far as I can see, the driver is broken for OOB reads when BCH is enabled
>> > because the setup in prepare_set_command() results in drain_fifo() not
>> > reading enough words from the read fifo in the nfc2 IP block.
>
> May I ask how you you get to this conclusion? What makes you think
> there's still unread data in the FIFO?

Because I added some debug lines to count the number of bytes read and
I could  see handle_data_pio() was reading 2048 bytes, followed by 32
bytes of spare. If you have Marvell MV-S109094-00 REV C handy, you can
see in Table 36 that this is not correct. The correct number of spare
bytes for the NFCv2 IP block is
- 64 bytes when SPARE_EN==1 and ECC_EN==0
- 32 bytes when SPARE_EN==1 and ECC_EN==1

>> >
>>
>> Yes, this sounds just like the bug I was expecting for OOB reads.
>>
>> > The patch we are using is below. I have the following in my DTS.
>> > nand-keep-config is commented out because I was having some issues with
>> > u-boot at the time and it may no longer be relevant:
>>
>> Probably.
>>
>> > flash at d0000 {
>> > status = "okay";
>> > num-cs = <1>;
>> > //marvell,nand-keep-config;
>> > marvell,nand-enable-arbiter;
>> > nand-on-flash-bbt;
>> > nand-ecc-strength = <4>;
>> > nand-ecc-step-size = <512>;
>> > };
>> >
>> > --- /home/user/build/linux-stable/drivers/mtd/nand/pxa3xx_nand.c
>> > +++
>> > /home/user/build/beam/openwrt/build_dir/target-arm_cortex-a9+vfpv3_musl_eabi/linux-mvebu/linux-4.9.34/drivers/mtd/nand/pxa3xx_nand.c
>> > @@ -668,7 +669,7 @@
>> >
>> >  static void drain_fifo(struct pxa3xx_nand_info *info, void *data, int len)
>> >  {
>> > - if (info->ecc_bch) {
>> > + if (info->use_ecc && info->ecc_bch) {
>
> This one might explain timeouts occurring when you drain the FIFO for
> an operation that does not enable the ECC engine (like
> READ_PARAM_PAGE).
>
> So this fix is indeed valid, but I'm almost sure it won't fix Sean's
> problem.

>From memory, I don't *think* specific line wasn't causing me any
issues, but I noticed it was not correct because of the way use_ecc
and ecc_bch flags are initialised and then used. As you say, without
this patch, you can get timeouts on non-ECC operations when you have
BCH ECC enabled generally.

>
>> >   u32 val;
>> >   int ret;
>> >
>> > @@ -1012,7 +1014,11 @@
>> >
>> >   if (info->cur_chunk < info->nfullchunks) {
>> >   info->step_chunk_size = info->chunk_size;
>> > - info->step_spare_size = info->spare_size;
>> > + if (info->use_ecc) {
>> > + info->step_spare_size = info->spare_size;
>> > + } else {
>> > + info->step_spare_size = info->spare_size + info->ecc_size;
>> > + }
>
> The only case this change would fix is when you try to read/write pages
> in raw mode, and I'm pretty sure this driver does not support raw
> accesses.

Why do you say it would only affect raw mode? This code is not
specific to raw mode, nor
is the related code in handle_data_pio(), which uses the value of
step_spare_size to drain
the hardware FIFO.

>> >   } else {
>> >   info->step_chunk_size = info->last_chunk_size;
>> >   info->step_spare_size = info->last_spare_size;
>> >
>>
>> Looks like a decent change.
>>
>> Boris: do you think this is stable material worth backporting?
>> In other words, does it make sense to fix it, given the reworked driver?
>



More information about the linux-mtd mailing list