recipes/gcc/gcc-4.3.3/ep93xx/arm-crunch-readme.patch


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107

--- gcc-4.3.2-orig/gcc/config/arm/README-Maverick	1970-01-01 01:00:00.000000000 +0100
+++ gcc-4.3.2/gcc/config/arm/README-Maverick	2009-03-10 22:04:26.000000000 +0000
@@ -0,0 +1,104 @@
+Cirrus Logic MaverickCrunch FPU support
+=======================================
+
+The MaverickCrunch is an FPU coprocessor that only exists in combination
+with an arm920t (ARMv4t arch) integer core in the 200MHz EP93xx devices.
+Code generation for it is usually selected with
+	-mcpu=ep9312 -mfpu=maverick (and most likely -mfloat-abi=softfp)
+
+Within GCC, the names "cirrus" "maverick" and "crunch" are used randomly
+in filenames and identifiers, but they all refer to the same thing.
+
+Initial support was mainlined by RedHat in gcc-3 but this never generated
+working code.  Cirrus Logic funded the company Nucleusys to produce a modified
+GCC for it, but this never worked either.  The first set of patches to pass
+the testsuite were made by Hasjim Williams for Open Embedded, though they
+did this by disabling various features and optimisations, therby incurring
+a small negative impact on regular ARM code generation.
+The OE ideas were reimplemented by Martin Guy to produce a working compiler
+with no negwative impact on regular code generation.
+
+The FPU's characteristics
+-------------------------
+Like most ARM coprocessors, it runs in parallel with the ARM though its
+instructions are inserted into the regular ARM instructions stream.
+It has 16 64-bit registers that can be operated as 32-bit or 64-bit integers
+or as 32-bit or 64-bit floats, three 72-bit saturating multiplier-accumulators.
+It can add, sub, mul, cmp, abs and neg these types, convert between them and
+transfer values between its registers and the ARM registers or main memory.
+
+Comparisons performed in the Maverick unit set the condition codes differently
+from the ARM/FPA/VFP instructions:
+
+	ARM/FPA/VFP - (cmp*):		MaverickCrunch - (cfcmp*):
+		N  Z  C  V		        N  Z  C  V
+	A == B  0  1  1  0		A == B  0  1  0  0
+	A <  B  1  0  0  0		A <  B  1  0  0  0
+	A >  B  0  0  1  0		A >  B  1  0  0  1
+	unord   0  0  1  1		unord   0  0  0  0
+
+which means that the same conditions have to be tested for with different ARM
+conditions after a Maverick comparison.  Furthermore, some conditions cannot
+be tested with a single condition.
+This was already true on ARM/FPA/VFP for conditions UNEQ and LTGT; 
+on Maverick comparisons it is GE UNLT ORDERED and UNORDERED that cannot.
+(GE after a floating point comparison, that is, not after an integer comarison)
+
+GCC's use of the Maverick unit
+------------------------------
+GCC only uses the 32-bit and 64-bit floating point modes and the 64-bit
+integer mode. It does not use the 72-bit accumulators or the 32-bit integer
+mode because, from "GCC Machine Descriptions":
+    "It is obligatory to support floating point `movm' instructions
+    into and out of any registers that can hold fixed point values,
+    because unions and structures (which have modes SImode or DImode)
+    can be in those registers and they may have floating point members."
+(search also for "grief" in arm.c).
+
+It does not use the 64-bit integer comparison instruction because it can only
+do a signed or an unsigned comparison, while GCC expect the comparison to set
+the conditions codes for both modes and then to use the signed or unsigned
+mode when the condition code bits are tested.
+
+The different setting of the condition codes is tracked with an additional
+CCMAV mode for the condition code register, set when a comparison is performed
+in the Maverick unit. This always indicates a floating point comparison since
+the Maverick's 64-bit comparison is not used.
+
+Hardware bugs and workarounds
+-----------------------------
+All silicon implementations of the FPU have a dozen hardware bugs, mostly
+timing-dependent bugs that can write garbage into registers or memory or get
+conditional tests wrong, as well as a widespread failure to respect
+denormalised values.  See http://wiki.debian.org/ArmEabiMaverickCrunch
+
+There used to be a -mcirrus-fix-invalid-instructions flag that claimed
+to avoid bugs in revision D0 silicon but its code was broken junk.
+Currently GCC always avoids the timing bugs in revision D1 to E2 silicon,
+while the many extra timing bugs in the now rare revision D0 are not handled.
+
+By default, the instructions that drop denoermalized values are enabled
+so as to obtain maximum speed at lower precision.  By default, the cfnegs
+and cfnegd instrutions are disabled, since they also fail to produce negative
+zero.  They can be enabled with -fno-signed-zeros.
+
+When -mfpu=maverick is selected, an additional -mieee flag is active that
+gives full IEEE precision by performs all the non-denorm-respecting
+floating point instructions in the softfloat library routines or in the
+integer registers.
+
+The 64-bit integer support is still buggy so it is disabled unless the
+-mcirrus-di flag is supplied. As well as the having unidentified
+hardware bugs which make openssl's testsuite fail in */sha/sha512.c and in
+*/bn/bn_adm.c, 64-bit shifts only work up to 31 places left or 32 right.
+
+Other bugs
+----------
+There seems to be no way to configure GCC to select Maverick code generation
+as the default.
+
+--with-arch=ep9312 the assembler barfs saying that ep9312 is not a
+recognised architecture.
+--with-arch=armv4t the build fails when it tries to compile hard FPA
+instructions into libgcc,
+--with-cpu=ep9312 it compiles armv5t instructions into libgcc