flash read performance
tpiepho at freescale.com
Fri Nov 7 00:36:08 EST 2008
On Fri, 7 Nov 2008, Jamie Lokier wrote:
> Based on Andre's observation, I will soon try enabling cache for my
> NOR, and see if it makes a difference to cold-cache read performance.
> I don't expect it, but it's worth a try.
It's possible it could help by doing the reads back-to-back with no wasted
cycles between them. I think caching is necessary if you want to benefit
from page mode, but that's something I haven't tried yet.
> You might also find the write operation to be unreliable, if the
> caching mode is write-back rather than write-through.
Proper use of "sync" instructions, or whatever the arch uses to ensure that
writel() is strictly ordered, should fix that.
> Really, you should use an uncached mapping to write commands to the
> flash, flush the cached mapping (for reads) when commands are written,
> and prevent any access during the writes (this is in MTD normally).
> You could optimise by flushing only the cached read regions which are
> affected by write and erase commands.
Yes, that is probably the best approach. Most NOR flash writing is so slow
that the cache flushes shouldn't be too expensive.
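
To make the scheme above concrete, here is a toy user-space model of it, not
kernel or MTD code. Writes go straight to the array (standing in for the
uncached mapping), reads go through a line cache (standing in for the cached
mapping), and a write invalidates only the lines it touches. All names
(toy_flash, toy_program, and so on) and the sizes are made up for
illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FLASH_SIZE 256u                 /* toy device size in bytes (assumption) */
#define LINE_SIZE   32u                 /* toy cache line size in bytes (assumption) */
#define NUM_LINES  (FLASH_SIZE / LINE_SIZE)

struct toy_flash {
    uint8_t array[FLASH_SIZE];          /* what the chip really holds */
    uint8_t cache[FLASH_SIZE];          /* contents seen through the cached mapping */
    bool    valid[NUM_LINES];           /* which cache lines are currently filled */
};

void toy_init(struct toy_flash *f)
{
    memset(f->array, 0xff, sizeof f->array);   /* erased NOR reads as 0xff */
    memset(f->cache, 0, sizeof f->cache);
    memset(f->valid, 0, sizeof f->valid);
}

/* Read through the "cached mapping": fill the whole line on a miss. */
uint8_t toy_read_cached(struct toy_flash *f, size_t off)
{
    assert(off < FLASH_SIZE);
    size_t line = off / LINE_SIZE;
    if (!f->valid[line]) {
        memcpy(&f->cache[line * LINE_SIZE], &f->array[line * LINE_SIZE], LINE_SIZE);
        f->valid[line] = true;
    }
    return f->cache[off];
}

/* Drop only the cache lines that overlap [off, off + len). */
void toy_invalidate(struct toy_flash *f, size_t off, size_t len)
{
    if (len == 0)
        return;
    assert(off < FLASH_SIZE && len <= FLASH_SIZE - off);
    for (size_t line = off / LINE_SIZE; line <= (off + len - 1) / LINE_SIZE; line++)
        f->valid[line] = false;
}

/* Program through the "uncached mapping", then invalidate the affected
 * range. NOR programming can only clear bits, hence the AND. */
void toy_program(struct toy_flash *f, size_t off, const uint8_t *src, size_t len)
{
    assert(off < FLASH_SIZE && len <= FLASH_SIZE - off);
    for (size_t i = 0; i < len; i++)
        f->array[off + i] &= src[i];
    toy_invalidate(f, off, len);
}
```

In this model, a read after toy_program() misses on the touched lines and
picks up the new data, while untouched lines stay cached. The locking that
keeps readers out while a write is in progress is left out of the sketch.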
>> Mapping                    Speed (MB/sec) (MB = 1048576 bytes)
>> un-cached and guarded      12.30
>> cached and guarded         14.24
>> cached and un-guarded      14.31
>> un-cached and un-guarded   14.66
>> I measured by reading flash linearly from beginning to end, 32 bits at a time.
>> Since the flash is bigger than the cache, every read should have come from the
>> flash. If I just read the same 1k over and over, that would obviously be much
>> faster if it could come from the cache.
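
A minimal sketch of the measurement described above, assuming the flash is
already mapped and reachable through a pointer. The function names are
hypothetical and this is not the code that produced the table. It reads the
region one 32-bit word at a time and converts a byte count and an elapsed
time to MB/sec with MB = 1048576 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read `len` bytes starting at `base`, one 32-bit word at a time.
 * `volatile` keeps the compiler from dropping or merging the loads.
 * Returns the XOR of all words so the result depends on every read. */
uint32_t read_linear32(const volatile uint32_t *base, size_t len)
{
    uint32_t acc = 0;
    size_t words = len / sizeof(uint32_t);

    for (size_t i = 0; i < words; i++)
        acc ^= base[i];
    return acc;
}

/* Convert a byte count and an elapsed time to MB/sec (MB = 1048576 bytes). */
double mb_per_sec(size_t bytes, double seconds)
{
    assert(seconds > 0.0);
    return ((double)bytes / 1048576.0) / seconds;
}
```

The elapsed time would come from whatever timer the platform provides, taken
just before and just after the call to read_linear32(). The region has to be
larger than the cache for every read to reach the flash.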
> That's nice to see that cache helps cold-read performance too, not
> just cached reads. Thanks :-)
Though if the mapping is not in guarded mode, turning cache on hurts
performance. That surprises me.
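
To put numbers on that, a small helper (the name percent_change is made up
here) gives the relative change between two of the table entries:

```c
/* Percentage change from `before` to `after`; positive means faster. */
double percent_change(double before, double after)
{
    return (after - before) / before * 100.0;
}
```

With the figures from the table, percent_change(12.30, 14.24) is roughly
+15.8 for the guarded mappings, and percent_change(14.66, 14.31) is roughly
-2.4 for the un-guarded ones.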
>> The biggest bootup delay I have now is waiting for the ethernet phy to get
>> online, which takes almost 3 seconds.
> Do you need to delay everything else for that, or can you parallelise?
It's parallel. I start the PHY very early in the boot loader, and Linux is
done booting and sitting in userspace waiting for it for a few hundred ms
before it's ready.