[PATCH] Orion NAND: Make dword load asm volatile to avoid GCC optimizing it away
Nicolas Pitre
nico at cam.org
Wed Aug 19 10:47:14 EDT 2009
On Wed, 19 Aug 2009, Simon Kagstrom wrote:
> On Tue, 14 Jul 2009 12:40:33 -0400 (EDT)
> Nicolas Pitre <nico at cam.org> wrote:
>
> > On Tue, 14 Jul 2009, Simon Kagstrom wrote:
> >
> > > GCC 4.3.3 happily removes the dword load instruction in
> > > orion_nand_read_buf. This patch makes the inline assembly volatile to
> > > avoid this situation.
> >
> > If GCC does optimize away this inline asm then this is a serious GCC
> > bug. The inline asm uses an output operand for which a dependency
> > exists on the very next line.
>
> I've been away on vacation, sorry for not replying until now.
>
> Anyway, I spoke to the GCC people at gcc-help, and they insist that the
> inlined assembly is wrong. The problem is that GCC doesn't infer from
> the instruction that it accesses memory, and it can thereby move it out
> of the loop.
Oh! OK. That's different though. In your initial report you said that
the inline asm was _removed_. I can understand why it might be factored
out of the loop though, but it cannot be removed entirely.
> So here comes an updated patch.
>
> // Simon
> --
> Orion NAND: Clobber memory to make sure that GCC does not move ldrd
>
> GCC 4.3.3 happily removes the dword load instruction in
> orion_nand_read_buf. This patch clobbers memory, and makes the
> instruction volatile to avoid the issue. I've discussed this at
> gcc-help, refer to the thread at
>
> http://gcc.gnu.org/ml/gcc-help/2009-08/msg00187.html
>
> Signed-off-by: Simon Kagstrom <simon.kagstrom at netinsight.net>
> ---
> drivers/mtd/nand/orion_nand.c | 2 +-
> 1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/drivers/mtd/nand/orion_nand.c b/drivers/mtd/nand/orion_nand.c
> index 7ad9722..845f9a9 100644
> --- a/drivers/mtd/nand/orion_nand.c
> +++ b/drivers/mtd/nand/orion_nand.c
> @@ -61,7 +61,7 @@ static void orion_nand_read_buf(struct mtd_info *mtd, uint8_t *buf, int len)
> buf64 = (uint64_t *)buf;
> while (i < len/8) {
> uint64_t x;
> - asm ("ldrd\t%0, [%1]" : "=r" (x) : "r" (io_base));
> + asm volatile ("ldrd\t%0, [%1]" : "=&r" (x) : "r" (io_base) : "memory");
I think that the early clobber buys you nothing, and the memory clobber
is way overkill here. The volatile ought to be all that is needed. Can
you confirm that only the addition of volatile makes the code OK? My
gcc version is 4.3.2 and that makes no difference what so ever as the
code as is produces the correct result already in my case.
Nicolas
More information about the linux-mtd
mailing list