[PATCH 2/2] mtd: nand: driver for Conexant Digicolor NAND Flash Controller

Thu Feb 19 02:52:23 PST 2015

Hi Baruch,

On Thu, 19 Feb 2015 11:43:01 +0200
Baruch Siach <baruch at tkos.co.il> wrote:

> Hi Boris,
> 
> On Thu, Feb 19, 2015 at 12:17:04AM +0100, Boris Brezillon wrote:
> > On Thu, 12 Feb 2015 13:10:19 +0200
> > Baruch Siach <baruch at tkos.co.il> wrote:
> > > This commit adds driver for the NAND flash controller on the CX92755 SoC. 
> > > This
> > > SoC is one of the Conexant Digicolor series, and this driver should support
> > > other SoCs in this series.
> > 
> > I haven't done any coding style review here, so make sure to fix
> > all errors/warnings reported by checkpatch if any.
> > 
> > > Only hardware syndrome ECC mode is currently supported.
> > > 
> > > This driver was tested on the Equinox CX92755 EVK, with the Samsung K9GAG08U0E
> > > NAND chip (MLC, 8K pages, 436 bytes OOB). Test included attach of UBI volume,
> > > mount of UBIFS filesystem, and files read/write.
> > 
> > Could you also run mtd test (kernel module tests) and provide their
> > results in your next version ?
> 
> OK. Will do.
> 
> [...]
> 
> > > +#define CMD_NUMBER_BYTES(n)		((n) << 8) /* ALE command */
> > 
> > Unless you have good reason to keep this name I would name it
> > CMD_ADDR_CYCLES.
> 
> I used the wording of the datasheet. Actually this field also applies to the 
> CLE command, but currently we assume that CLE is always a single byte.

Okay, then adding that information (stating that this field is used for
both CMD and ADDR cycles) in the comment would be interesting.

> 
> [...]
> 
> > > +struct digicolor_nfc {
> > > +	void __iomem		*regs;
> > > +	struct mtd_info		mtd;
> > > +	struct nand_chip	nand;
> > > +	struct device		*dev;
> > > +
> > > +	unsigned long		clk_rate;
> > > +
> > > +	u32 ale_cmd;
> > > +	u32 ale_data;
> > > +	int ale_data_bytes;
> > > +
> > > +	u32 nand_cs;
> > > +	int t_ccs;
> > > +};
> > 
> > Sorry, you're the first one I'll bother with this.
> > Even if most drivers are already mixing the NAND chips and NAND
> > controller concepts, I really think those 2 elements should be properly
> > separated.
> > 
> > Correct me if I'm wrong, but I'm pretty sure _nfc stands for NAND Flash
> > Controller, and as such, NFC related fields should be part of your NAND
> > controller implementation (inherited from the struct nand_hw_control)
> > and not your NAND chip implementation (inherited from nand_chip).
> > 
> > If you need an example of such NAND chip/NAND controller separation,
> > you can take a look at the sunxi driver ;-). 
> 
> I understand you concern. However, since I don't have the needed hardware to 
> test the multi nand chip setup I want to keep the code simple. These two 
> fields are separated from the others by a blank line on purpose, and I'll also 
> add a comment to explain that these are per nand chip files.

I'm not asking you to support multiple chips right now.
What I'm asking here is to dispatch NFC and NAND fields in different
structures:

struct digicolor_nand {
	struct mtd_info		mtd;
	struct nand_chip	nand;

	u32 nand_cs;
	int t_ccs;
}

struct digicolor_nfc {
	struct nand_hw_control	base;
	void __iomem		*regs;
	struct device		*dev;

	unsigned long		clk_rate;

	u32 ale_cmd;
	u32 ale_data;
	int ale_data_bytes;

	/* Only support one NAND for now */
	struct digicolor_nand	nand;
};

You'll keep the same logic except that this separation will be clearly
visible.

> 
> > > +/*
> > > + * Table of BCH configuration options. The index of this table (0 - 5) is set
> > > + * in the BchTconfig field of the NFC_CONTROL register.
> > > + */
> > > +struct bch_configs_t {
> > 
> > I haven't seen any struct defined with an _t, AFAIK _t are reserved for
> > typedef definitions.
> 
> Well, there are some precedents:
> 
> $ git grep 'struct.*_t {$' -- include/ |grep -v typedef
> include/crypto/vmac.h:struct vmac_ctx_t {
> include/linux/dcache.h:struct dentry_stat_t {
> include/linux/dma-buf.h:	struct dma_buf_poll_cb_t {
> include/linux/kgdb.h:struct dbg_reg_def_t {
> include/linux/sctp.h:struct sctp_shutdown_chunk_t {
> include/net/9p/client.h:struct p9_req_t {
> include/net/tso.h:struct tso_t {
> include/uapi/linux/fs.h:struct inodes_stat_t {
> include/uapi/linux/pkt_cls.h:struct tcf_t {
> 
> The 'typedef' variant for structs is somewhat more prevalent but not 
> overwhelmingly so.

I'll let Brian decide on that aspect.

> 
> > > +	int bits;	/* correctable error bits number (strength) per step */
> > > +	int r_bytes;	/* extra bytes needed per step */
> > > +} bch_configs[] = {
> > > +	{  6, 11 },
> > > +	{  7, 13 },
> > > +	{  8, 14 },
> > > +	{ 24, 42 },
> > > +	{ 28, 49 },
> > > +	{ 30, 53 },
> > > +};
> > 
> > Just giving my opinion here, but I don't like when values assignment
> > and struct (or type) definitions are mixed.
> 
> OK. I'll change that.

Thanks.

> 
> [...]
> 
> > > +static int digicolor_nfc_wait_ready(struct digicolor_nfc *nfc, int op)
> > > +{
> > > +	unsigned long timeout = jiffies + msecs_to_jiffies(TIMEOUT_MS);
> > > +	u32 mask;
> > > +
> > > +	switch (op) {
> > > +	case CMD:		mask = NFC_INT_CMDREG_READY; break;
> > > +	case DATA_READ:		mask = NFC_INT_READ_DATAREG_READY; break;
> > > +	case DATA_WRITE:	mask = NFC_INT_WRITE_DATAREG_READY; break;
> > > +	case ECC_STATUS:	mask = NFC_INT_ECC_STATUS_READY; break;
> > > +	}
> > 
> > I had a look at digicolor_nfc_wait_ready callers, and IMHO this
> > op -> mask conversion is pretty much useless.
> > Callers already know what they expect and could easily pass flags
> > directly.
> 
> It's just that the INT flag name is quite long, and it make the code using it 
> less readable IMO. Maybe some macro trickery would do the job here.

Yep, if that's about avoiding over 80 character lines you could define
such a macro:

#define NFC_RDY(flag)		NFC_INT_ ## flag ## _READY

> 
> [...]
> 
> > > +static void digicolor_nfc_cmd_write(struct digicolor_nfc *nfc, u32 data)
> > > +{
> > > +	if (digicolor_nfc_wait_ready(nfc, CMD))
> > > +		return;
> > > +	writel_relaxed(data, nfc->regs + NFC_COMMAND);
> > 
> > Are you sure you shouldn't provide a return code here ?
> > If the wait_ready call times out, you're just assuming it succeed,
> > which is not really safe.
> 
> Right. I'll change that.
> 
> > > +static int digicolor_nfc_ecc_status(struct digicolor_nfc *nfc)
> > > +{
> > > +	u32 status;
> > > +
> > > +	if (digicolor_nfc_wait_ready(nfc, ECC_STATUS))
> > > +		return -1;
> > 
> > 		return -ETIMEDOUT
> > 
> > would be more appropriate (or just returning the
> > digicolor_nfc_wait_ready result if it's not 0).
> 
> Originally I intended for this return value to be an internal error 
> indication. You are right though that the code should handle the timeout error 
> differently.
> 
> > > +	status = readl_relaxed(nfc->regs + NFC_STATUS_1);
> > > +	writel_relaxed(NFC_INT_ECC_STATUS_READY, nfc->regs + NFC_INTFLAG_CLEAR);
> > > +
> > > +	if (status & NFC_STATUS_1_UNCORRECTED_ERROR)
> > > +		return -1;
> > 
> > 		return -EIO;
> > 
> > (or something else, but I don't recall).
> 
> OK.
> 
> [...]
> 
> > > +static int digicolor_nfc_dev_ready(struct mtd_info *mtd)
> > > +{
> > > +	struct nand_chip *chip = mtd->priv;
> > > +	struct digicolor_nfc *nfc = chip->priv;
> > > +	u32 readready;
> > > +	u32 cs = nfc->nand_cs;
> > > +
> > > +	readready = CMD_WAITREADY | CMD_CHIP_ENABLE(cs) | CMD_RB_MASK(cs)
> > > +		| CMD_TOGGLE | CMD_RB_DATA;
> > > +	digicolor_nfc_cmd_write(nfc, readready);
> > > +
> > > +	return 1;
> > 
> > Is your device always ready ?
> 
> I have no control of that. This command goes into a pipe that is managed at 
> the hardware level. If the nand device does not activate the ready/busy 
> signal, the pipe will just stall, and the command FIFO will overflow (the 
> command FIFO has 8 entries). So you will notice the error eventually.

My point is: you should not return 1 if the NAND is not ready,
waiting forever is not what I'm suggesting here, but you should find a
way to return the actual NAND chip R/B status.
If you don't have any solution to do that from your controller then you
might consider relying on the default dev_ready implementation which is
sending a NAND STATUS command to retrieve the R/B status.

What I'll say is in contradiction with what's done in the sunxi
driver :-) (actually I'm considering fixing that), but after taking a
closer look, dev_ready should only return the status of the NAND, and
not wait for the NAND to be ready.
The function responsible for waiting is waitfunc, maybe this is the one
you should overload.

> 
> There is an option to flag this command, and let the hardware indicate when it 
> has completed successfully. But the current upper layer only gives up to 20ms 
> for the nand chip to become ready. Depending on the commands currently waiting 
> for processing it might take longer.

It's a bit more for erase operations (400ms), anyway if you need a
different behavior you should provide your own waitfunc.

> 
> > What if your digicolor_nfc_cmd_write timed out ?
> 
> What should I return in this case? Returning 0 means that the device is not 
> YET ready. But that's not what actually happens here. I'm not sure that the 
> right solution is here.

That's why putting that in waitfunc would be more appropriate.

> 
> > > +static void digicolor_nfc_cmd_ctrl(struct mtd_info *mtd, int cmd,
> > > +				   unsigned int ctrl)
> > > +{
> > > +	struct nand_chip *chip = mtd->priv;
> > > +	struct digicolor_nfc *nfc = chip->priv;
> > > +	u32 cs = nfc->nand_cs;
> > > +
> > > +	if (ctrl & NAND_CLE) {
> > > +		digicolor_nfc_cmd_write(nfc,
> > > +					CMD_CLE | CMD_CHIP_ENABLE(cs) | cmd);
> > > +		if (cmd == NAND_CMD_RNDOUTSTART || cmd == NAND_CMD_RNDIN) {
> > > +			digicolor_nfc_wait_ns(nfc, nfc->t_ccs);
> > > +		} else if (cmd == NAND_CMD_RESET) {
> > > +			digicolor_nfc_wait_ns(nfc, 200);
> > > +			digicolor_nfc_dev_ready(mtd);
> > > +		}
> > 
> > These wait and dev_ready calls are supposed to be part of the generic
> > cmdfunc implementation, did you encounter any issues when relying on the
> > default implementation ?
> 
> The generic code just waits. This doesn't help here. Hardware does all the 
> command processing in its own schedule. To make the hardware wait I must 
> explicitly insert a wait command into the pipe.

I'm not sure to understand here.
There's a difference between:
1/ waiting for a NAND command to be send on the NAND bus
2/ waiting for the NAND to be ready (for DATA read or after a PROGRAM
   operation)

NAND core is taking care of the 2/, but 1/ is your responsibility
(and this is what you have to wait for here).

> 
> [...]
> 
> > > +static uint8_t digicolor_nfc_rw_byte(struct digicolor_nfc *nfc, int byte)
> > > +{
> > > +	bool read = (byte == -1);
> > > +	u32 cs = nfc->nand_cs;
> > > +
> > > +	digicolor_nfc_cmd_write(nfc, read ? CMD_PAGEREAD : CMD_PAGEWRITE
> > > +				| CMD_CHIP_ENABLE(cs));
> > > +	digicolor_nfc_cmd_write(nfc, 1);
> > > +
> > > +	if (digicolor_nfc_wait_ready(nfc, read ? DATA_READ : DATA_WRITE))
> > > +		return 0;
> > > +
> > > +	if (read)
> > > +		return readl_relaxed(nfc->regs + NFC_DATA);
> > > +	else
> > > +		writel_relaxed(byte & 0xff, nfc->regs + NFC_DATA);
> > > +
> > > +	return 0;
> > > +}
> > 
> > Is there a real need to keep read and write handling in the same
> > function ?
> 
> Well, I really hate seeing two functions that are almost identical next to 
> each other. Maybe I take the DRY principle too extremely, but that is not the 
> case here IMO.

I don't think so.
IMO, the code duplication induced by this dispatch is quite limited and
improve readability.
But let's stop bike shedding, we'll wait for other reviews to get more
feedback.

> 
> > You're testing twice the operation type in a ~10 lines function.
> 
> Three times, actually. But I trust the compiler to be smart enough to test it 
> only once when generating code.
> 
> > Just move the appropriate code in the following functions.
> 
> I don't agree. These operations are almost identical. Keeping them together 
> should make future refactoring easier.
> 
> If I added function parameters for CMD_ and DATA_ (thus eliminating two 
> operation tests) would it make the code better in your opinion?

Not really, but again, let's wait other reviews before taking any
decision.

> 
> > > +static uint8_t digicolor_nfc_read_byte(struct mtd_info *mtd)
> > > +{
> > > +	struct nand_chip *chip = mtd->priv;
> > > +	struct digicolor_nfc *nfc = chip->priv;
> > > +
> > > +	return digicolor_nfc_rw_byte(nfc, -1);
> > > +}
> > > +
> > > +static void digicolor_nfc_write_byte(struct mtd_info *mtd, uint8_t byte)
> > > +{
> > > +	struct nand_chip *chip = mtd->priv;
> > > +	struct digicolor_nfc *nfc = chip->priv;
> > > +
> > > +	digicolor_nfc_rw_byte(nfc, byte);
> > > +}
> 
> [...]
> 
> > > +	while (len >= 4) {
> > > +		if (digicolor_nfc_wait_ready(nfc, op))
> > > +			return;
> > > +		if (op == DATA_READ)
> > > +			*pr++ = readl_relaxed(nfc->regs + NFC_DATA);
> > > +		else
> > > +			writel_relaxed(*pw++, nfc->regs + NFC_DATA);
> > > +		len -= 4;
> > > +	}
> > 
> > How about using readsl/writesl here (instead of this loop) ?
> 
> How can I add the wait condition in readsl/writesl?

Are you sure you have to wait after each readl access (is NFC_DATA
interfaced with a FIFO or directly doing PIO access) ?

> 
> > > +	if (len > 0) {
> > > +		if (digicolor_nfc_wait_ready(nfc, op))
> > > +			return;
> > > +		if (op == DATA_READ)
> > > +			buf_tail = readl_relaxed(nfc->regs + NFC_DATA);
> > > +		for (i = 0; i < len; i++) {
> > > +			u8 *tr = (u8 *)pr;
> > > +			const u8 *tw = (const u8 *)pw;
> > > +
> > > +			if (op == DATA_READ) {
> > > +				tr[i] = buf_tail & 0xff;
> > > +				buf_tail >>= 8;
> > > +			} else {
> > > +				buf_tail <<= 8;
> > > +				buf_tail |= tw[i];
> > > +			}
> > > +		}
> > 
> > I still don't get that part, but I guess you have a good reason for
> > doing that.
> > Could add a comment explaining what you're doing and why you're doing
> > it ?
> 
> This code processes tail of the buffer that is the reminder of the /4 
> division. Originally I relied on the fact that pages size are always a 
> multiple of 4. But later I added OOB read/write using this routine, so this 
> assumption is no longer true.
> 
> Anyway, I'll add a comment.

Fine.

> 
> > > +		if (op == DATA_WRITE)
> > > +			writel_relaxed(buf_tail, nfc->regs + NFC_DATA);
> > > +	}
> > > +}
> > 
> > Again, the code in this function should be dispatched in the
> > digicolor_nfc_read/write_buf functions.
> 
> See above.

Ditto

> 
> [...]
> 
> > > +static int digicolor_nfc_read_page_syndrome(struct mtd_info *mtd,
> > > +					    struct nand_chip *chip,
> > > +					    uint8_t *buf, int oob_required,
> > > +					    int page)
> > > +{
> > > +	struct digicolor_nfc *nfc = chip->priv;
> > > +	int step, ecc_stat;
> > > +	struct nand_oobfree *oobfree = &chip->ecc.layout->oobfree[0];
> > > +	u8 *oob = chip->oob_poi + oobfree->offset;
> > > +	unsigned int max_bitflips = 0;
> > > +
> > > +	for (step = 0; step < chip->ecc.steps; step++) {
> > > +		digicolor_nfc_rw_buf(nfc, buf, NULL, chip->ecc.size, true);
> > > +
> > > +		ecc_stat = digicolor_nfc_ecc_status(nfc);
> > 
> > If the returned error is a timeout you might want to stop the whole
> > operation.
> 
> Right. I'll fix that.
> 
> [...]
> 
> > > +static void digicolor_nfc_hw_init(struct digicolor_nfc *nfc)
> > > +{
> > > +	unsigned int ns_per_clk = NSEC_PER_SEC / nfc->clk_rate;
> > > +	u32 timing = 0;
> > > +
> > > +	writel_relaxed(NFC_CONTROL_LOCAL_RESET, nfc->regs + NFC_CONTROL);
> > > +	udelay(10);
> > > +	writel_relaxed(0, nfc->regs + NFC_CONTROL);
> > > +	udelay(5);
> > > +	/*
> > > +	 * Maximum assert/deassert time; asynchronous SDR mode 0
> > > +	 * Deassert time = max(tWH,tREH) = 30ns
> > > +	 * Assert time = max(tRC,tRP,tWC,tWP) = 100ns
> > > +	 * Sample time = 0
> > > +	 */
> > > +	timing |= TIMING_DEASSERT(DIV_ROUND_UP(30, ns_per_clk));
> > > +	timing |= TIMING_ASSERT(DIV_ROUND_UP(100, ns_per_clk));
> > > +	timing |= TIMING_SAMPLE(0);
> > > +	writel_relaxed(timing, nfc->regs + NFC_TIMING_CONFIG);
> > > +	writel_relaxed(NFC_CONTROL_ENABLE, nfc->regs + NFC_CONTROL);
> > 
> > Helper functions are provided to extract timings from ONFI timing modes
> > (either those defined by the chip if it supports ONFI commands, or
> > those extracted from the datasheet):
> > 
> > http://lxr.free-electrons.com/source/include/linux/mtd/nand.h#L976
> 
> I wanted to keep that for future improvement. The NAND chip on the EVK board 
> is actually an old style one, neither ONFI nor JEDEC. I expect to test this 
> driver on our target hardware that should have something newer.

Timings are not only related to ONFI chips, and
onfi_timing_mode_default is extracted from the nand_ids table which is
describing non-ONFI chips.

This is not my call to make, but IMHO, dynamic NAND timing
configuration should be mandatory for new drivers.

> 
> > > +static int digicolor_nfc_ecc_init(struct digicolor_nfc *nfc,
> > > +				  struct device_node *np)
> > > +{
> > > +	struct mtd_info *mtd = &nfc->mtd;
> > > +	struct nand_chip *chip = &nfc->nand;
> > > +	int bch_data_range, bch_t, steps, mode, i;
> > > +	u32 ctrl = NFC_CONTROL_ENABLE | NFC_CONTROL_BCH_KCONFIG;
> > > +	struct nand_ecclayout *layout;
> > > +
> > > +	mode = of_get_nand_ecc_mode(np);
> > > +	if (mode < 0)
> > > +		return mode;
> > > +	if (mode != NAND_ECC_HW_SYNDROME) {
> > > +		dev_err(nfc->dev, "unsupported ECC mode\n");
> > > +		return -EINVAL;
> > > +	}
> > > +
> > > +	bch_data_range = of_get_nand_ecc_step_size(np);
> > > +	if (bch_data_range < 0)
> > > +		return bch_data_range;
> > > +	if (bch_data_range != 512 && bch_data_range != 1024) {
> > > +		dev_err(nfc->dev, "unsupported nand-ecc-step-size value\n");
> > > +		return -EINVAL;
> > > +	}
> > > +	if (bch_data_range == 1024)
> > > +		ctrl |= NFC_CONTROL_BCH_DATA_FIELD_RANGE_1024;
> > > +	steps = mtd->writesize / bch_data_range;
> > > +
> > > +	bch_t = of_get_nand_ecc_strength(np);
> > > +	if (bch_t < 0)
> > > +		return bch_t;
> > 
> > You should fallback to datasheet values (ecc_strength_ds and
> > ecc_step_ds) if ECC strength and step are not specified in the DT.
> 
> I can only choose from a fix number of strength configuration alternatives. 
> Should I choose the lowest value that is larger than ecc_strength_ds?

Yep, this applies to both cases (information extracted from the DT and
_ds values).

> 
> > > +MODULE_DEVICE_TABLE(of, sunxi_nfc_ids);
> > 
> > Hm, let me guess, you've based your work on the sunxi driver, isn't
> > it ? :-)
> 
> You got me. Actually it was the only NAND driver that I found from recent 
> time. All other are 3+ years old. But since last night we also have the 
> hisi504_nand in mainline. Interesting times.
> 
> > That's all I got for now.
> > 
> > I might have missed some details, but all in all I really like the way
> > this driver was designed (but I'm not sure to be objective since this
> > one is based on the sunxi driver ;-)):
> > - pretty straightforward implementation
> > - make use, as much as possible, of the NAND infrastructure (no specific
> >   cmdfunc, seems to support raw accesses, ...)
> > 
> > The only missing parts are:
> > - proper timing configuration
> > - replace active waits (polling) by passive waits (interrupt +
> >   waitqueue)
> > 
> > But that should be fixed quite easily.
> 
> It's on my TODO. I don't think it should block the driver, though.

That's not my decision to make ;-).

Best Regards,

Boris

-- 
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com