[PATCH v3] mtd: nand: Add driver for M-sys / Sandisk diskonchip G4
Mike Dunn
mikedunn at newsguy.com
Thu Nov 10 22:31:34 EST 2011
On 11/10/2011 02:06 PM, Robert Jarzmik wrote:
> Mike Dunn <mikedunn at newsguy.com> writes:
>
>> Hi Robert, thanks for taking a look and commenting.
>>
>> On 11/10/2011 11:49 AM, Robert Jarzmik wrote:
>>>> + doc_nop(docptr);
>>>> + doc_nop(docptr);
>>>> + doc_nop(docptr);
>>>> + doc_nop(docptr);
>>>> + doc_nop(docptr);
>>> Wouldn't that be better doc_nop(docptr, 5) ?
>>
>> No. If it's a function that loops, you're inserting too much delay due to loop
>> overhead (unless the compiler unrolls it, but you don't know that) and you may
>> as well use some generic delay function and not bother with the nop register at
>> all. I wanted to use the precise delay observed in the TrueFFS code to (1)
>> avoid too much delay for the sake of performance, (2) in case the timing is
>> critical and too much delay would cause problems, or (3) "nop" really isn't what
>> it says (i.e. no operation). If there were a C preprocessor directive
>> equivalent to the assembly .rept directive, I would use it.
> As you wish, but :
> (a) The nops are here to add a minimum delay
> (b) The performance would not be an issue, and yes, the compiler could unroll
> the loop as the number is a compile time constant
> (c) nop operation is a delay, that's actually how I understand it
> (d) it does work well on docg3
> (e) and you can do as you wish, it's your driver :)
Then it's probably just a delay, as advertised. I wanted to be paranoid and
duplicate exactly what I was observing during PalmOS operation. In that case,
why not dispense with the nop reg altogether and use some short generic delay?
Anyway, minor issue.
>> No actually ecc.write_oob - which this function defines - is a nand interface
>> function. I never observed oob-only writes while reverse engineering (read
>> oob-only yes, but not write). Can you write oob-only on the G3? Even if it
>> were possible, the nand_write utility wants to write the oob *before* the page
>> data. This hack allows that utility to work. Maybe the comment should say "oob
>> can't be written before the page data".
> Yes, I can write OOB only, page only, or both.
Interesting. We took separate paths in this reverse engineering effort. I
observed activity during operation of the native OS using TrueFFS library, and
you engaged more in trial-and-error, guided by inspection of disassembled code
(if I'm not mistaken). You may have made some more discoveries beyond my
observation. I have to inspect your code. I know that reading oob-only was a
different "sequence" than a normal page read. Never saw oob-only write.
> And I have the same problem with nandwrite.
> Now consider you have 2 nandwrites working in parallel :
> nandwrite1 nandwrite2
> write_oob(OOB1)
> write_oob(OOB2)
> write_page(page1) ------------------------> OOB1 was overrun
> write_page(page2)
>
> That really really bothers me. It's not about your code, I think you have no
> choice. My concern is rather about nandwrite utility. My point is that it could
> have better used write_oob(page1, OOB1).
> And in that respect, I think there's something broken in it.
Ah, yes. Never considered this. If I understand you correctly, you are
pointing out the fact that my hack for deferring oob write until after the page
data is written breaks when multiple nandwrite processes are running. I haven't
tested with access from multiple processes yet. But yes, the hack assumes only
one process is accessing the device.
Another problem with my hack is that the oob-only write is quietly ignored if it
is not immediately followed by a write of the data on the same page. One of the
mtd tests in the kernel tests oob-only writes, and it fails miserably because
they are all ignored. Should probably take this hack out, and fix nand_write,
or just use a simplified, fixed version. Yes, mtd_write should not assume oob
can be written first. Even if you can write oob-only, you can't subsequently
write the page data (with or without writing its ecc bytes), can you?
>>> If it's the same as on docg3, each bit is a marker for one block, and the
>>> following formula could apply:
>>> is_good = bbt[block >> 3] & (1 << (block & 0x7));
>>
>> Thanks. Do any of your devices have bad blocks marked in this table? Do you
>> know how to modify the table?
> No, mine G3 has no such blocks.
Then how do you know it's one bit per block?
> And I feel we can't modify them. After all, they are in the OTP area. My guess
> is that they are set up at chip creation as initial bad blocks (ie. factory bad
> blocks).
> I think there is a difference between bad block (factory bad blocks) and worn
> out blocks (after erases, having more bitflips that ECC can fix).
You're probably right. I had no ambitions of trying to update the table
anyway. Only to read it and update the bbt table in RAM accordingly,
Thanks,
Mike
More information about the linux-mtd
mailing list