*** Armish - An assembler quite contraire

Armish is the Amish of the computer-world. The Amish plough forth through life
under the twin burden of an extreme religion and an accompanying extreme way of
life. As a result they're marginalized and they watch life pass them by. But
they might just hold the seed for a future way of living, you know, with global
warming and all. It has worked in the past; you never know. You never know, I
say.

Armish ploughs through bit-space under the twin burden of being an assembler
and being written in Lisp. As a result it will be marginalized and it will
watch with an unending waxing headache how programmers pass it by, riding their
high-octane steeds of high-level languages. But it might just hold the seed for
a future mass acceptance of languages long forgotten, garbage-collected OSes
of yonder, recursive programming-joys untold... (*sigh*) ... You never
know... You never know, da#@^$it!!


** License - LLGPL, see the LICENSE file

** Authors

- Jeff Massung
- Ties Stuij


** Concept

Armish is an assembler like any other, with parentheses around the instructions.
Armish aims to be a bit on the general side. It assembles arm and thumb
code, but new features of the latest processors are not (yet/ever) supported.
They're quite backwards compatible though, those processors, last time I checked.
Armish should be usable for arm architectures 3 through 5 (up to arm9), and of
those just the core. No enhanced dsp instructions or such. Complaints about
missing features are welcome. See Liards for a lib that actually uses it.


** Installation

To get the latest development version, do a darcs get:

  darcs get http://common-lisp.net/project/armish/darcs/armish

Versioned releases are on hold at the moment.

Armish depends on Umpa-Lumpa, Arnesi, Split-Sequence and FiveAM.

  darcs get http://common-lisp.net/project/liards/darcs/umpa-lumpa
  darcs get http://common-lisp.net/project/bese/repos/arnesi_dev
  darcs get http://common-lisp.net/project/bese/repos/fiveam
  http://ww.telent.net/cclan/split-sequence.tar.gz

Once you've got those wired into your asdf machinery together with Armish,
just fire it up.


** Testing

You can check whether (a subset of) the standard instructions work. If you're
not on Linux or Windows, first build aasm from the supplied C source, and be
sure to name the result aasm. For Linux the executable is already there. Make
the executable executable (`chmod 755 aasm' or equivalent) and then execute:

  (run! 'arm-suite)

There might be errors here, caused by my faulty attempt to support your
implementation or operating system. The test tries to access an external
assembler supplied with Armish to test the instructions; however, the functions
to read from and write to other processes are implementation-dependent. Armish
is only tested on SBCL on Linux, so if you can, please fix this by editing
run-prog and process-output in helpers.lisp and control-check in test.lisp. Oh!
And send a fix if you want! Thanks. Or just remove your implementation from the
two feature-query lists in control-check.

The test compares the output of Armish with that of a reference implementation,
included in this release. This shouldn't be too interesting for the average
user. I would myself at least expect the tested functionality to work. It might
give you a warm fuzzy feeling to know this assembler doesn't just fool you
along, like it would me. More interesting is perhaps that you can check the
syntax of this assembler by looking at the test cases in the test.lisp file.

More practical, if you do feel the need to hack on this assembler-thing (I
for one encourage it), is that you'll find some handy functions in the test.lisp
file that can help with debugging.


** Exported functions

- assemble - assembles forms into a list of opcode bytes for the specified chip
  and processor mode.

  syntax: (assemble chip mode forms) where:

  chip - decides which chipset to assemble for - is a number or a symbol

  At the moment assemble accepts the first value of the pairings below, and
  translates it into the latter. The higher the value, the newer the chip.
  Check the function get-version to see the latest supported symbols or to add
  your own.

    accepted value    translates to

    0                 0
    'all              0
    3                 3
    4                 4
    'version-4        4
    '4t               4.2
    'ARM7TDMI         4.2
    'arm7             4.2
    5                 5
    'version-5        5
    '5TExP            5.3
    '5TE              5.4
    'ARM946E-S        5.4
    'arm9             5.4

  mode - decides if you want to start in arm or thumb mode - accepts one of the
  following symbols: 'arm 'code32 'thumb 'code16

  forms - are, yes, the forms to be assembled

  example usage:

    (assemble 'arm9 'arm
              '(:label (mov r3 r4) (b :label)))

    ==> (4 48 160 225 253 255 255 234)


- emit-asm - emits a list of assembly forms ready to be fed to assemble.
  Emit-asm escapes variables which are not part of the assembler syntax and
  substitutes their values, so we can enrich the assembler with variables while
  still keeping a clean, uniform syntax. Inside emit-asm, escape forms you want
  to evaluate as Common Lisp with (ea ...), as in escape assembler.

  syntax: (emit-asm form1 form2 ...)

  example usage:

    (let ((foo 'r4)
          (bar 'r6))
      (emit-asm
        :loop
        (stmib r3 (r3 foo_bar))
        (b :loop)))

    ==> (:LOOP (STMIB R3 (R3 R4_R6)) (B :LOOP))


- align - aligns a list of bytes to a bytes-byte boundary by padding it with
  zeroes.
  Defaults to four if bytes is not supplied.

  syntax: (align byte-lst &optional bytes)

  example usage: (align '(1 2 3 4 5)) ==> (1 2 3 4 5 0 0 0)


- aligned - returns the next bytes-byte aligned address. Defaults to four if
  bytes is not supplied.

  syntax: (aligned address &optional bytes)

  example usage: (aligned (length '(1 2 3 4 5))) ==> 8

There are some other functions exported (actually just one at the time of
writing), but those are simple helper functions that might aid the programmer
on a general level. I had a separate package for that, but decided to cut it to
keep the package count down. I might reintroduce it, though. Look in
helpers.lisp for functions that might aid you.


- set-armish-string-encoding - sets the Armish string encoding. Armish passes
  the encoding to the arnesi string-to-octets function, which tries to do the
  right thing. From the documentation: "We gurantee that :UTF-8, :UTF-16 and
  :ISO-8859-1 will work as expected. Any other values are simply passed to the
  underlying lisp's function and the results are implementation dependant."

  syntax: (set-armish-string-encoding :keyword)

  example usage: (set-armish-string-encoding :ISO-8859-1) ==> :ISO-8859-1


** Instruction syntax

Basically this assembler follows the arm assembler syntax, but lispified:
wrap the expressions in parentheses, get rid of the commas, replace curly
braces and square brackets with parentheses, replace - with _ in register lists
for multiple load/store instructions, and get rid of the pound signs before
literals and immediates. Barrel shifter modifiers and labels are keywords.

examples:

  standard assembly syntax               lisp syntax

  ldmhied r12!, {r2, r15, r13-r14}   ->  (ldmhied r12! (r2 r15 r13_r14))
  ldrbt r5, [r2], -r1 ror #12        ->  (ldrbt r5 (r2) -r1 :ror 12)

For a pretty extensive, case by case comparison of the instructions, see the
test cases in the test.lisp file.


** Assembler format, features and conventions


* Directives

code16 - assemble as thumb
code32 - assemble as arm

pool - dump the literal pool

align - align code to a 4 byte boundary
align-hw - align code to a 2 byte boundary
(align &optional bytes) - aligns code to a bytes-byte boundary; defaults to
  four if bytes is not supplied

(dcb byte &rest bytes) - define one or more bytes
(byte byte &rest bytes) - same as dcb

(dcw byte &rest bytes) - define one or more 16 bit words
(hword byte &rest bytes) - same as dcw

(dcd byte &rest bytes) - define one or more 32 bit words
(word byte &rest bytes) - same as dcd

(dword byte &rest bytes) - define one or more 64 bit words
(quad byte &rest bytes) - same as dword

(bin byte-size bin-list) - a more general directive than the previous ones.
  First specify how many bytes of storage space each data item of the bin-list
  should take, then supply the list itself. For example, assembling
  (bin 2 (1 2 3 4)) results in (1 0 2 0 3 0 4 0)

(binae list-of-bin-lists) - same as above, but now we specify a list of
  (byte-size bin-list) lists.
  For example, assembling
  (binae ((2 (1 2)) (4 (3 4)))) results in (1 0 2 0 3 0 0 0 4 0 0 0)

(space size &optional (fill 0)) - pad assembly output with size bytes, all of
  value fill

"string" - a literal string is encoded into bytes using *string-encoding*
  (defaults to :utf-8), at arnesi's string-to-octets discretion, and is
  terminated with *string-end* (defaults to 0)

(string &rest strings) - strings will be concatenated and then encoded into
  bytes using *string-encoding* (defaults to :utf-8), at arnesi's
  string-to-octets discretion. If the symbol :null-terminated is present,
  the concatenated (so NOT the individual) strings will be null-terminated.

:label - an unadulterated keyword will be treated as a label

For convenience the :code-end label is appended to the forms to be compiled,
in case one wants to jump to code which might be placed directly after the
compiled code.


* Pseudo-instructions

(ldr register literal)     - loads the value of literal in register. Encodes
                             in two's complement
(ldr register literal :pi) - loads the value of literal in register, as a
                             positive integer (or 0)

(adr register label)       - loads the address of label in register
(nop)                      - no operation; translates into (mov r0 r0) in arm
                             and (mov r8 r8) in thumb


* Register and coprocessor syntax

Registers can be written in the familiar way: rx, where x is a number from 0 to
15. The link register, stack pointer and program counter can be written as lr,
sp and pc.

Coprocessor registers can be written as cx or crx, where x is a number from 0 to
15.

Coprocessors can be written as px or cpx, where x is a number from 0 to 15.


** History

The core of the thing is a file called thumb.lisp, which Jeff Massung was so
kind to dig up from his digital archive.
It was a beta version of a thumb assembler and it assembled thumb opcodes if
you fed it instructions. The Armish team of one has expanded upon it a bit by
modifying the thumb code, adding arm instructions, and adding facilities to
make it more like a traditional assembler. Who knows, maybe you can even have
some use for it.


** Todo

- no arm or thumb adrl pseudo-instruction
- in the arm ldr instruction, encode a constant load more efficiently if
  possible instead of always loading from memory
- write enhanced dsp instructions
- document code; think about a comment-extractor
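
For whoever picks up the ldr item above: an arm data-processing immediate is an
8 bit value rotated right by an even number of bits, so a constant that fits
that shape could be loaded with a plain mov instead of going through the
literal pool. Below is a minimal sketch of such a check. The names
rotate-left-32 and arm-immediate-encoding are made up for this illustration and
are not part of Armish; how (or whether) Armish's ldr should use them is left
open.

```lisp
;; Hypothetical helpers, not part of Armish.

(defun rotate-left-32 (value count)
  "Rotate the low 32 bits of VALUE left by COUNT bits (0 <= COUNT < 32)."
  (let ((word (ldb (byte 32 0) value)))
    (ldb (byte 32 0)
         (logior (ash word count)
                 (ash word (- count 32))))))

(defun arm-immediate-encoding (constant)
  "Return (values imm8 rot4) when CONSTANT equals imm8 rotated right by
(* 2 rot4) bits, with imm8 below 256 and rot4 below 16. Return NIL otherwise."
  (let ((word (ldb (byte 32 0) constant)))
    ;; Undo each possible even right-rotation by rotating left,
    ;; and see whether what remains fits in eight bits.
    (loop for rot4 from 0 below 16
          for imm8 = (rotate-left-32 word (* 2 rot4))
          when (< imm8 256)
            return (values imm8 rot4))))

;; (arm-immediate-encoding 255)        ==> 255, 0
;; (arm-immediate-encoding #xFF000000) ==> 255, 4
;; (arm-immediate-encoding #x101)      ==> NIL
```

A constant for which this returns NIL might still be reachable through mvn with
the bitwise complement; the sketch doesn't try that, and such constants would
keep going through the literal pool as they do now.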