[PATCH 1/2] mtd: spi-nor: When a flash memory is missing do not report an error

Michal Suchánek msuchanek at suse.de
Sat Jul 16 01:20:27 PDT 2022


On Fri, Jul 15, 2022 at 02:50:17PM +0530, Pratyush Yadav wrote:
> On 15/07/22 12:07AM, Michal Suchánek wrote:
> > On Thu, Jul 14, 2022 at 11:51:56PM +0200, Michael Walle wrote:
> > > On 2022-07-14 22:55, Michal Suchánek wrote:
> > > > On Thu, Jul 14, 2022 at 09:41:48PM +0200, Michael Walle wrote:
> > > > > Hi,
> > > > > 
> > > > > On 2022-07-14 21:19, Michal Suchanek wrote:
> > > > > > It is normal that devices are designed with multiple types of storage,
> > > > > > and only some types of storage are present.
> > > > > >
> > > > > > The kernel can handle this situation gracefully for many types of
> > > > > > > storage devices such as mmc or ata but it reports an error when spi
> > > > > > flash is not present.
> > > > > >
> > > > > > Only print a notice that the storage device is missing when no response
> > > > > > to the identify command is received.
> > > > > >
> > > > > > Consider reply buffers with all bits set to the same value no response.
> > > > > 
> > > > > > I'm not sure you can compare SPI with ATA and MMC. I'm just speaking of
> > > > > > DT now, but there, for ATA and MMC you just describe the controller and
> > > > > > it will auto-detect the connected storage. Whereas with SPI you describe
> > > > 
> > > > Why does mmc assume storage and SDIO must be described? Why the special
> > > > casing?
> > > 
> > > I can't follow you here. My SDIO wireless card just works in an SD
> > > slot and doesn't have to be described.
> > > 
> > > > > both the controller and the flash. So I'd argue that your hardware
> > > > > description is wrong if it describes a flash which is not present.
> > > > 
> > > > At any rate the situation is the same - the storage may be present
> > > > sometimes. I don't think assuming some kind of device by default is a
> > > > sound practice.
> > > 
> > > Where is the assumption when the DT tells you there is a flash
> > > on a specific chip select but actually there isn't? Shouldn't
> > > the DT then be fixed?
> > 
> > The DT says there isn't a flash on a specific chip select when there is.
> > Shouldn't that be fixed?
> 
> If the board has a flash chip, then DT should describe it. If it does 
> not have one, then DT should not describe it.
> 
> So if DT says there isn't a flash on a specific CS when there is, then 
> DT should be fixed to describe the flash, and then we can probe it. You 
> both seem to be saying the same thing here, and I agree.

The disagreement is about the situation when there is sometimes a flash
chip.

As of now the chip is described in the device tree but is disabled, and
there is no mechanism for enabling it.

If it were enabled the driver could probe it but it is not.

The only real argument against enabling it I heard was that if it is
enabled and missing users would see the kernel printing an error and
come here, wasting everyone's time.

So here is a patch that makes the kernel not print an error when the chip
is missing, and limits the error only to the situation when a chip is
present but not recognized.
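
To make the intended behaviour concrete, here is a minimal userspace sketch
of the three-way distinction. It is not the patch itself. The names
id_result, reply_is_blank and classify_id, the 3-byte ID length and the
table entry are all made up for illustration. A reply with all bits set to
the same value is treated as no response, as the patch description says:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ID_LEN 3	/* assumed ID length for this sketch */

/* Hypothetical result codes, not taken from the spi-nor driver. */
enum id_result {
	ID_NO_RESPONSE,		/* nothing answered: report "chip missing" as a notice */
	ID_UNRECOGNIZED,	/* something answered, but the ID is unknown: error */
	ID_KNOWN,
};

/* Placeholder table; the entry is not a real JEDEC ID. */
static const uint8_t known_ids[][ID_LEN] = {
	{ 0x12, 0x34, 0x56 },
};

/* True when every bit of the reply has the same value (all 0x00 or all 0xff). */
static bool reply_is_blank(const uint8_t *id, size_t len)
{
	assert(len > 0);
	if (id[0] != 0x00 && id[0] != 0xff)
		return false;
	for (size_t i = 1; i < len; i++)
		if (id[i] != id[0])
			return false;
	return true;
}

/* Classify the reply to the identify command. */
static enum id_result classify_id(const uint8_t id[ID_LEN])
{
	if (reply_is_blank(id, ID_LEN))
		return ID_NO_RESPONSE;
	for (size_t i = 0; i < sizeof(known_ids) / sizeof(known_ids[0]); i++)
		if (memcmp(id, known_ids[i], ID_LEN) == 0)
			return ID_KNOWN;
	return ID_UNRECOGNIZED;
}
```

With this split, a blank reply would only warrant a notice, while a
non-blank reply that matches nothing in the table would still be reported
as an error.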

> 
> > 
> > > Maybe I don't understand your problem. What are you trying to
> > > solve? I mean this just demotes an error to an info message.
> > 
> > Many boards provide multiple storage options - you get a PCB designed to
> > carry different kinds of storage, some may be socketed, some can be
> > soldered on in some production batches and not others.
> 
> Usually for non-enumerable components you can plug in or out, you should 
> use DT overlays to add the correct nodes as needed. For example, CSI 
> cameras just plug into a slot on the board. So you can easily remove one 
> and add another. That is why we do not put the camera node in the board 
> dts, but instead apply it as an overlay from the bootloader.

However, here the device is already enumerated in the device tree, the
only missing bit of information is whether the device is present.

Sure, device tree overlays are useful and the right tool for some jobs,
but not this one.

Please don't fall into the thinking that when you have a hammer
everything looks like a nail.

> 
> > 
> > The kernel can handle this for many kinds of storage but not SPI flash.
> > 
> > I don't see any reason why SPI flash should be a second class storage.
> > 
> > > > However, when the board is designed for a specific kind of device which
> > > > is not always present, and the kernel can detect the device, it is
> > > > perfectly fine to describe it.
> > > > 
> > > > The alternative is to not use the device at all, even when present,
> > > > which is kind of useless.
> > > 
> > > Or let the bootloader update your device tree and disable the device
> > > if it's not there?
> > 
> > But then it must be in the device tree?
> > 
> > And then people will complain that if the bootloader does not have this
> > feature then the kernel prints an error message?
> 
> Then add the feature to the bootloader? Adding a node in DT for a flash 
> that is not present is wrong.

And how would the bootloader know that it should look for a flash if
it's not described in the device tree?

> 
> > 
> > > Or load an overlay if it is there?
> > 
> > Or maybe the kernel could just detect if the storage is present?
> 
> It can't. This is a non-enumerable bus, unlike USB or PCI. And there is 
> no way you can actually detect the presence or absence of a flash 
> reliably. For example, TI chips come with a flash that is capable of 
> 8D-8D-8D mode. If the flash is handed to the kernel in 8D-8D-8D mode, 
> the read ID command will return all 0xff bytes since the kernel expects 
> communication in 1S-1S-1S mode. With your patch you will say that no 
> flash memory is detected. In reality the flash is present but the kernel 
> just failed to properly detect it.

This is a strawman argument. If your flash chip starts up in 8D-8D-8D
and you don't tell the kernel to talk to it in 8D-8D-8D it can't talk to
it no matter what. The ID command will fail all the same if the chip is
there and you try to probe its type - all 0xff is not a valid ID. The
problem is not that you are probing the presence of the chip but that
you did not describe the chip to the kernel correctly.

Not to mention that the controller in question does not support any
advanced modes, anyway.

> 
> > 
> > It's not like we don't have an identify command.
> 
> We do, but it does not let us detect the _absence_ of a flash.

Actually, it does.

Thanks

Michal
