[PATCH] ARM:VFPv3:enable {d16-d31} access

Wed May 26 07:43:24 EDT 2010

Russell,

-----Original Message-----
From: Russell King - ARM Linux [mailto:linux at arm.linux.org.uk] 
Sent: Wednesday, May 26, 2010 12:28 AM
To: DebBarma, Tarun Kanti
Cc: linux-omap at vger.kernel.org; linux-arm-kernel at lists.infradead.org
Subject: Re: [PATCH] ARM:VFPv3:enable {d16-d31} access

On Tue, May 25, 2010 at 02:39:17PM +0530, DebBarma, Tarun Kanti wrote:
>  #ifdef CONFIG_VFPv3
>  	@ d16 - d31 registers
> -	.irp	dr,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15
> -1:	mrrc	p11, 3, r0, r1, c\dr	@ fmrrd	r0, r1, d\dr
> +	.irp	dr,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31
> +1:	fmrrd	r0, r1, d\dr

The existing code is correct.

For every fmrrd instruction, there is a corresponding mrrc version which
assembles to exactly the same opcode.

mrrc instructions take:

1. Co-processor number, range 0-15.
2. Opcode number N, range 0-15.
3. Destination register 1, range 0-15.
4. Destination register 2, range 0-15.
5. Co-processor register number R, range 0-15.

For fmrrd encodings, the first 16 registers are encoded using N=1 with
R=0 to 15.  The second 16 registers are encoded using N=3 with R=0 to 15.
Specifying a co-processor register number greater than 15 is illegal,
which is why the 'irp' lists the numbers 0 to 15.
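
As a rough illustration of the mapping described above (not kernel code), the
split of a double register index into the mrrc operands N and R can be
sketched in C. The names struct mrrc_operands and fmrrd_to_mrrc() are
hypothetical and exist only for this example.

```c
#include <assert.h>

/*
 * Hypothetical helper, for illustration only: the mrrc operands that
 * correspond to "fmrrd r0, r1, d<dreg>" on co-processor p11.
 */
struct mrrc_operands {
	unsigned int n;	/* opcode number: 1 for d0-d15, 3 for d16-d31 */
	unsigned int r;	/* co-processor register number, always 0-15 */
};

static struct mrrc_operands fmrrd_to_mrrc(unsigned int dreg)
{
	struct mrrc_operands op;

	assert(dreg < 32);		/* only d0-d31 exist */
	op.n = (dreg < 16) ? 1 : 3;	/* bank select lives in the opcode */
	op.r = dreg & 0xf;		/* register number within the bank */
	return op;
}
```

With this split, d16 comes out as N=3, R=0, so the co-processor register
number never exceeds 15 even for the upper bank.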

If we look at the instruction encodings, for MRRC:

 3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1
 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
|  cond |1 1 0 0 0 1 0 1|  Rn   |  Rm   |CP Num |   N   |   R   |

For FMRRD:

 3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1
 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
|  cond |1 1 0 0 0 1 0 1|  Rn   |  Rm   |1 0 1 1 0 0 M 1|   R   |

where "M" and "R" together define the register.
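
The two layouts above can be compared mechanically. The following is a
minimal sketch, assuming the field positions exactly as drawn in the
diagrams; encode_mrrc() and encode_fmrrd() are hypothetical helpers written
for this example, not functions from the kernel or any assembler.

```c
#include <assert.h>
#include <stdint.h>

/* Build an MRRC opcode word from the generic fields in the diagram. */
static uint32_t encode_mrrc(uint32_t cond, uint32_t rn, uint32_t rm,
			    uint32_t cp, uint32_t n, uint32_t r)
{
	return (cond << 28) | (0xc5u << 20) | (rn << 16) | (rm << 12) |
	       (cp << 8) | (n << 4) | r;
}

/*
 * Build an FMRRD opcode word.  Bits 11-4 are fixed to 1 0 1 1 0 0 M 1,
 * where M is the top bit of the double register index and R (bits 3-0)
 * holds the low four bits.
 */
static uint32_t encode_fmrrd(uint32_t cond, uint32_t rn, uint32_t rm,
			     uint32_t dreg)
{
	uint32_t m = (dreg >> 4) & 1;

	assert(dreg < 32);
	return (cond << 28) | (0xc5u << 20) | (rn << 16) | (rm << 12) |
	       (0xbu << 8) | (m << 5) | (1u << 4) | (dreg & 0xf);
}
```

Under these assumptions, encode_fmrrd(cond, rn, rm, d) produces the same word
as encode_mrrc(cond, rn, rm, 11, N, d & 15) with N=1 for d0-d15 and N=3 for
d16-d31, which is the equivalence the existing 'irp' loop relies on.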

As I said above, the existing code is correct.  What problem are you
actually trying to solve here?

######################
I have a test case which exercises all VFP general purpose registers by writing a known value to each one and reading it back using the vfp_put_double() and vfp_get_double() APIs.

	long long d1=777, d2=0;
	int i=31;
	for (; i>=0; i--){
		vfp_put_double(d1, i);
		d2 = vfp_get_double(i);
		printk("D%d read=%lld\n",i, (long long)d2);
		d2 = 0;	/* d2 is a long long, so reset it with an integer zero */
	}

1) With the existing implementation I am able to correctly write/read the {d0-d15} set but not the {d16-d31} set.

2) With my changes I am able to write/read correctly.

Tarun