Marvell NFC timings on CN9130

Mon May 22 02:03:30 PDT 2023

Hi Chris,

Chris.Packham at alliedtelesis.co.nz wrote on Mon, 22 May 2023 04:53:54
+0000:

> On 22/05/23 10:53, Chris Packham wrote:
> >
> > On 17/05/23 14:22, Chris Packham wrote:  
> >>
> >> On 17/05/23 05:25, Miquel Raynal wrote:  
> >>> Hi Chris!
> >>>
> >>> Chris.Packham at alliedtelesis.co.nz wrote on Tue, 16 May 2023 04:46:38
> >>> +0000:
> >>>  
> >>>> Hi Miquel, Thomas,
> >>>>
> >>>> A hardware colleague reported a concern to me about a new design we 
> >>>> have
> >>>> using the Marvell CN9130 SoC (which I think was called Armada-8K 
> >>>> before
> >>>> they rebranded).
> >>>>
> >>>> Basically their concern is that the tWC timing they observe is faster
> >>>> (~18ns) than the documented minimum in the hardware datasheet for the
> >>>> CN9130 (25ns). Aside from not meeting the datasheet spec we've not
> >>>> observed any other issue (yet).  
> >>> I would have expected the controller to support almost any kind of
> >>> timings, including SDR EDO mode 5. tWC is 25ns with mode 4, but 20 on
> >>> mode 5 (ONFI). So I believe you're running a system with a chip that is
> >>> not compatible with the fastest mode. If that is the case, it may
> >>> explain why you don't see errors with this chip: it may support
> >>> slightly faster timings than it advertises.
> >>>
> >>> Anyway, if your findings are true, it means the current implementation
> >>> is slightly out of spec and the timing calculation might require to be
> >>> tweaked a little bit to reduce tWC.
> >>>  
> >>>> I notice in the marvell_nand.c driver that marvell_nfc_init() sets the
> >>>> NAND Clock Frequency Select bit (0xF2440700:0) to 1 which runs 
> >>>> according
> >>>> to the datasheet the NAND flash at 400MHz . But the calculations in
> >>>> marvell_nfc_setup_interface() use the value from
> >>>> clk_get_rate(nfc->core_clk) which is still 250MHz so I'm wondering if
> >>>> maybe the fact that the NAND flash is being run faster is having an
> >>>> impact on timings that are calculated around the core_clk frequency.  
> >>> What if you reset this bit? Do you observe different timings? I hope
> >>> you do, otherwise this is a dead-end.  
> >> Yes if we clear the bit the timings go from ~18ns to about 30ns.  
> >>> The timings are derived from this clock but I remember seeing different
> >>> rates than the ones I expected with no obvious explanation (see the "*
> >>> 2" in the calculation of period_ns and the comment right below). So
> >>> maybe this is due to the 400MHz vs. 250MHz issue you are reporting, or
> >>> there is an undocumented pre-scaler in-between (this is my original
> >>> guess).  
> >>
> >> I wondered if the * 2 was because of this or because of the comment 
> >> that the ECC_CLK is 2*NF_CLK. That probably also means that a number 
> >> of SoCs are running with an extra *2 that don't need to be (e.g. 
> >> Armada-385).  
> > Interestingly cp110-system-controller.c is aware of the 400MHz option 
> > but that's only effective if it's been set prior to the kernel 
> > booting. I'm not really familiar with clk drivers but I assume it must 
> > be possible to make it look up the frequency dynamically instead of 
> > using a single fixed value.  
> >>  
> >>>> Do you think that the timings calculations should take the NAND Clock
> >>>> Frequency Select setting into account?  
> >>> There is not much about this clock in the manual, so if the clock is
> >>> feeding the logic of the controller generating the signals on the bus,
> >>> then yes. You can verify this with the test mentioned above.
> >>>
> >>> Could you check the values set to tWP and tWH with and without the bit
> >>> and probe the signals in both cases? Maybe the "* 2" in the
> >>> period_ns calculation will vanish if we use 400MHz as input clock 
> >>> rather
> >>> than clk_get_rate() (or better, expose the bit as a mux-clock and use
> >>> it to tell the CCF the right frequency) and you'll get a sharper tWC in
> >>> the end, which hopefully should match the spec this time.  
> >>
> >> I was going to have a look to see if I can get the NAND clock to 
> >> correctly reflect the value when the NAND Clock Frequency Select bit 
> >> is set. In the meantime I'll also do some experiments removing the * 
> >> 2 and hard-coding the frequency at 400MHz.  
> 
> I learnt something over the course of the day. Given timezones I thought 
> it might be worthwhile getting them out there even if I don't have a 
> patch to offer.

Of course :)

> It appears that only the first set of timings calculated by 
> marvell_nfc_setup_interface() are used. This is because 
> marvell_nfc_select_target() returns early if we are addressing the same 
> chip. So even when we take the SDR timings into account we don't make 
> full use of them in NDTR0/1.

The logic in the core has changed in the past, it is possible
that we did not catch a corner case in this driver.

Maybe this would cleanly solve the problem (there is similar operation
somewhere else in the driver):

--- a/drivers/mtd/nand/raw/marvell_nand.c
+++ b/drivers/mtd/nand/raw/marvell_nand.c
@@ -2457,6 +2457,9 @@ static int marvell_nfc_setup_interface(struct nand_chip *chip, int chipnr,
                        NDTR1_WAIT_MODE;
        }
 
+       /* Ensure the next *_select_target() call will write the timing registers */
+       nfc->selected_chip = NULL;
+
        return 0;
 }

> The original problem I reported was from a customized kernel which 
> included a change to write out the NDTR0/1 registers at the end of 
> marvell_nfc_setup_interface(). So I can make my problem go away by 
> removing the writes to NDTR0/1 but then instead of being too fast they 
> are now way too slow. That'd probably keep the HW engineers happy but it 
> feels a bit wrong.

Yeah, not the right approach.

Thanks,
Miquèl