*** Armish - An assembler quite contraire

Armish is the Amish of the computer-world. The Amish plough forth through life
under the twin burden of an extreme religion and an accompanying extreme way of
life. As a result they're marginalized and they watch life pass them by. But
they might just hold the seed for a future way of living, you know, with global
warming and all. It has worked in the past; you never know. You never know, I
say.

Armish ploughs through bit-space under the twin burden of being an assembler
and being written in Lisp. As a result it will be marginalized and it will
watch, with an unending waxing headache, how programmers pass it by, riding
their high-octane steeds of high-level languages. But it might just hold the
seed for a future mass acceptance of languages long forgotten, garbage-collected
OSes of yonder, recursive programming-joys untold... (*sigh*) ... You never
know... You never know, da#@^$it!!



** License - LLGPL, see the LICENSE file

** Authors

- Jeff Massung
- Ties Stuij



** Concept

Armish is an assembler like any other, but with parentheses around the
instructions. Armish's aim is to be a bit on the general side. It assembles arm
and thumb code, but new features of the latest processors are not (yet/ever)
supported. They're quite backwards compatible though, those processors, last
time I checked. Armish should be usable for arm architectures 3 through 5 (up
to arm9), and of those just the core: no enhanced dsp instructions or such.
Complaints about missing features are welcome. See Liards for a lib that
actually uses it.



** Installation

To get the latest development version, do a darcs get:

darcs get http://common-lisp.net/project/armish/darcs/armish

Versioned releases have been put on hold for now.

Armish depends on Umpa-Lumpa, Arnesi, Split-Sequence and FiveAM:

darcs get http://common-lisp.net/project/liards/darcs/umpa-lumpa
darcs get http://common-lisp.net/project/bese/repos/arnesi_dev
darcs get http://common-lisp.net/project/bese/repos/fiveam
http://ww.telent.net/cclan/split-sequence.tar.gz

Once you've got those wired into your asdf machinery together with Armish,
just fire it up.
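
One way the wiring could look, as a rough sketch only. The paths are
hypothetical placeholders, the system name :armish is assumed from the project
name, and your asdf setup may well call for something different:

```lisp
(require :asdf)

;; Hypothetical checkout locations: replace them with your own. The
;; directory names simply follow the darcs/tarball names listed above.
(dolist (dir '("/path/to/armish/"
               "/path/to/umpa-lumpa/"
               "/path/to/arnesi_dev/"
               "/path/to/fiveam/"
               "/path/to/split-sequence/"))
  (pushnew (pathname dir) asdf:*central-registry* :test #'equal))

;; The system name :armish is an assumption, not taken from the sources.
(asdf:load-system :armish)
;; On an older asdf without load-system:
;; (asdf:operate 'asdf:load-op :armish)
```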


** Testing

You can check whether (a subset of) the standard instructions work. If you're
not on Linux or Windows, first build aasm from the supplied C source, and be
sure to name it aasm. For Linux the executable is already there. Make the
executable executable (`chmod 755 aasm' or equivalent) and then execute:

(run! 'arm-suite)

There might be errors here caused by the fact that I faultily tried to support
your implementation or operating system. The test tries to access an external
assembler supplied with Armish to test the instructions; however, the functions
to read from and write to other processes are implementation-dependent. Armish
is only tested on SBCL on Linux, so if you can, please fix it by editing
run-prog and process-output in helpers.lisp and control-check in test.lisp. Oh!
And send a fix if you want! Thanks. Or just remove your implementation from the
two feature-query lists in control-check.

The test suite compares the output of Armish with that of a reference
implementation, included in this release. This shouldn't be too interesting for
the average user. I would myself at least expect the tested functionality to
work. It might give you a warm fuzzy feeling to know this assembler doesn't
just fool you along, like it would me. More interesting is perhaps that you can
check the syntax of this assembler by looking at the test cases in the
test.lisp file.

More practical, if you do feel the need to hack on this assembler-thing (I for
one encourage it), is that you'll find some handy functions in the test.lisp
file that can help with debugging.



** Exported functions

- assemble - assembles forms into a list of opcode bytes for the specified
chip and processor mode.

syntax: (assemble chip mode forms) where:

chip - decides which chipset to assemble for - is a number or a symbol

For now assemble accepts the first value of each pairing below and translates
it into the second. The higher the value, the newer the chip. Check the
function get-version to see the latest supported symbols or to add your own.

accepted value    translates into

0                 0
'all              0
3                 3
4                 4
'version-4        4
'4t               4.2
'ARM7TDMI         4.2
'arm7             4.2
5                 5
'version-5        5
'5TExP            5.3
'5TE              5.4
'ARM946E-S        5.4
'arm9             5.4


mode - decides if you want to start in arm or thumb mode - accepts one of the
following symbols: 'arm 'code32 'thumb 'code16


forms - are, yes, the forms to be assembled

example usage:
(assemble 'arm9 'arm
  '(:label (mov r3 r4) (b :label)))

==> (4 48 160 225 253 255 255 234)
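
The bytes in that result come out in little-endian order, four per arm
instruction. A minimal sketch for eyeballing them as 32-bit words, using only
standard Common Lisp; bytes->words is a made-up helper for illustration, not an
Armish function:

```lisp
;; Hypothetical helper, not part of Armish: regroup a flat list of
;; little-endian bytes into 32-bit words. A trailing partial group is
;; treated as if it were padded with zeroes.
(defun bytes->words (bytes)
  (loop for (b0 b1 b2 b3) on bytes by #'cddddr
        collect (logior (or b0 0)
                        (ash (or b1 0) 8)
                        (ash (or b2 0) 16)
                        (ash (or b3 0) 24))))

;; The output of the assemble example above, regrouped:
(bytes->words '(4 48 160 225 253 255 255 234))
;; equal to (list #xE1A03004 #xEAFFFFFD)
```

Here #xE1A03004 is the arm encoding of mov r3, r4 and #xEAFFFFFD is a branch
with an offset of -3 words: the branch sits at address 4, the pc reads 8 bytes
ahead, so the way back to :label at address 0 is (0 - 12) / 4.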


- emit-asm - emits a list of assembly forms ready to be fed to assemble.
Emit-asm escapes variables which are not part of the assembler syntax and fills
in their values, so we can enrich the assembler with variables while still
keeping a clean, uniform syntax. Inside emit-asm, wrap forms you want to
evaluate as Common Lisp in (ea ...), as in escape assembler.

syntax: (emit-asm form1 form2 ...)

example usage:

(let ((foo 'r4)
      (bar 'r6))
  (emit-asm
   :loop
   (stmib r3 (r3 foo_bar))
   (b :loop)))

==> (:LOOP (STMIB R3 (R3 R4_R6)) (B :LOOP))


- align - pads byte-lst with zeroes until its length is a multiple of bytes.
Bytes defaults to four if not supplied.

syntax: (align byte-lst &optional bytes)

example usage: (align '(1 2 3 4 5)) ==> (1 2 3 4 5 0 0 0)


- aligned - returns the next address that is a multiple of bytes. Bytes
defaults to four if not supplied.

syntax: (aligned address &optional bytes)

example usage: (aligned (length '(1 2 3 4 5))) ==> 8
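
The arithmetic behind the two examples can be sketched in a few lines of plain
Common Lisp. The sketch- names are hypothetical stand-ins, not the Armish
definitions, and the choice to leave an already aligned address unchanged is an
assumption:

```lisp
;; Hypothetical stand-in for aligned: round ADDRESS up to the next
;; multiple of BYTES. An already aligned address is returned as is
;; (an assumption about the intended behaviour).
(defun sketch-aligned (address &optional (bytes 4))
  (* bytes (ceiling address bytes)))

;; Hypothetical stand-in for align: pad BYTE-LST with zeroes until its
;; length is a multiple of BYTES.
(defun sketch-align (byte-lst &optional (bytes 4))
  (let ((len (length byte-lst)))
    (append byte-lst
            (make-list (- (sketch-aligned len bytes) len)
                       :initial-element 0))))

(sketch-aligned 5)            ; => 8
(sketch-align '(1 2 3 4 5))   ; => (1 2 3 4 5 0 0 0)
(sketch-align '(1 2 3) 2)     ; => (1 2 3 0)
```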

There are some other functions exported (actually just one at the time of
writing), but those are simple helper functions that might aid the programmer
on a general level. I had a separate package for that, but decided to cut it to
keep the package-count down. But I might reintroduce it. Look in helpers.lisp
for functions that might aid you.


- set-armish-string-encoding - sets the armish string encoding. Armish passes
the encoding to the arnesi string-to-octets function, which tries to do the
right thing. From the documentation: "We guarantee that :UTF-8, :UTF-16 and
:ISO-8859-1 will work as expected. Any other values are simply passed to the
underlying lisp's function and the results are implementation dependent."

syntax: (set-armish-string-encoding :keyword)

example usage: (set-armish-string-encoding :ISO-8859-1) ==> :ISO-8859-1



** Instruction syntax

Basically this assembler follows the arm assembler syntax, but lispified: wrap
the expressions in parentheses, get rid of the commas, replace curly braces and
square brackets with parentheses, replace - with _ in register lists for
multiple load/store instructions and get rid of the pound signs before
literals and immediates. Barrel shifter modifiers (such as :ror) and labels are
keywords.

examples:
standard assembly syntax            lisp syntax

ldmhied r12!, {r2, r15, r13-r14} -> (ldmhied r12! (r2 r15 r13_r14))
ldrbt r5, [r2], -r1, ror #12     -> (ldrbt r5 (r2) -r1 :ror 12)

For a pretty extensive, case by case comparison of the instructions, see the
test cases in the test.lisp file.



** Assembler format, features and conventions


* Directives

code16 - assemble as thumb
code32 - assemble as arm

pool - dump the literal pool

align - align code to a 4 byte boundary
align-hw - align code to a 2 byte boundary
(align &optional bytes) - aligns code to a multiple of bytes bytes; defaults to
four if bytes is not supplied

(dcb byte &rest bytes) - define one or more bytes
(byte byte &rest bytes) - same as dcb

(dcw byte &rest bytes) - define one or more 16 bit words
(hword byte &rest bytes) - same as dcw

(dcd byte &rest bytes) - define one or more 32 bit words
(word byte &rest bytes) - same as dcd

(dword byte &rest bytes) - define one or more 64 bit words
(quad byte &rest bytes) - define one or more 64 bit words

(bin byte-size bin-list) - a more general directive than the previous ones.
First specify how many bytes of storage space each data item of the bin-list
should take, then supply the list itself. For example, assembling
(bin 2 (1 2 3 4)) results in (1 0 2 0 3 0 4 0)

(binae list-of-bin-lists) - same as above, but now we specify a list of
(byte-size bin-list) lists. For example, assembling
(binae ((2 (1 2)) (4 (3 4)))) results in (1 0 2 0 3 0 0 0 4 0 0 0)
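
The two results above are what you get when every item is laid out
little-endian in byte-size bytes. A small sketch that reproduces them in plain
Common Lisp; sketch-bin and sketch-binae are hypothetical names, and this is
not how Armish itself implements the directives:

```lisp
;; Hypothetical stand-in for the bin directive: expand each integer in
;; BIN-LIST into BYTE-SIZE bytes, least significant byte first.
(defun sketch-bin (byte-size bin-list)
  (loop for value in bin-list
        append (loop for i below byte-size
                     collect (ldb (byte 8 (* 8 i)) value))))

;; Hypothetical stand-in for binae: a list of (byte-size bin-list)
;; entries, each expanded as in sketch-bin and glued together.
(defun sketch-binae (list-of-bin-lists)
  (loop for (byte-size bin-list) in list-of-bin-lists
        append (sketch-bin byte-size bin-list)))

(sketch-bin 2 '(1 2 3 4))               ; => (1 0 2 0 3 0 4 0)
(sketch-binae '((2 (1 2)) (4 (3 4))))   ; => (1 0 2 0 3 0 0 0 4 0 0 0)
```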

(space size &optional (fill 0)) - pad the assembly output with size bytes, all
of value fill

"string" - a literal string will be encoded to bytes using *string-encoding*
(defaults to :utf-8), at arnesi's string-to-octets discretion, and will be
terminated with *string-end* (defaults to 0)

(string &rest strings) - strings will be concatenated and then encoded to bytes
using *string-encoding* (defaults to :utf-8), at arnesi's string-to-octets
discretion. If the symbol :null-terminated is present, the concatenated string
(so NOT the individual strings) will be null-terminated.
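
To make the "concatenated, NOT individual" point concrete, here is a rough
sketch in plain Common Lisp. It is not the Armish code: sketch-string-directive
is a hypothetical name, and char-code stands in for arnesi's string-to-octets,
which only gives the same bytes for plain ASCII text:

```lisp
;; Hypothetical stand-in for the (string ...) directive. Assumption:
;; char-code replaces arnesi's string-to-octets, so this is only
;; faithful for ASCII text on an ASCII-based lisp.
(defun sketch-string-directive (&rest args)
  (let* ((strings (remove :null-terminated args))
         (text (apply #'concatenate 'string strings))
         (octets (map 'list #'char-code text)))
    ;; One terminator for the whole concatenation, none in between.
    (if (member :null-terminated args)
        (append octets (list 0))
        octets)))

(sketch-string-directive "ab" "c" :null-terminated)   ; => (97 98 99 0)
(sketch-string-directive "ab" "c")                    ; => (97 98 99)
```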

:label - an unadulterated keyword will be treated as a label

For convenience the :code-end label is appended to the forms to be compiled, in
case one wants to jump to code which might be placed directly after the
compiled code.


* pseudo-instructions

(ldr register literal) - loads the value of literal into register. Encodes in
two's complement

(ldr register literal :pi) - loads the value of literal into register, encoded
as a Positive Integer (or 0)

(adr register label) - loads the address of label into register

(nop) - no operation; translates into (mov r0 r0) in arm and (mov r8 r8) in
thumb
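
The two's complement remark for ldr can be made concrete with a small sketch.
It assumes a 32-bit word stored little-endian (the byte order the assemble
example output suggests); sketch-literal-bytes is a hypothetical name, not an
Armish function, and it says nothing about how Armish places the literal in
the pool:

```lisp
;; Hypothetical helper: the 32-bit two's complement representation of
;; LITERAL, as four bytes with the least significant byte first.
(defun sketch-literal-bytes (literal)
  (let ((word (ldb (byte 32 0) literal)))   ; wraps negatives into 32 bits
    (loop for i below 4
          collect (ldb (byte 8 (* 8 i)) word))))

(sketch-literal-bytes 1)    ; => (1 0 0 0)
(sketch-literal-bytes -1)   ; => (255 255 255 255)
(sketch-literal-bytes -2)   ; => (254 255 255 255)
```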


* register and coprocessor syntax

Registers can be written in the familiar way: rx, where x is a number from 0 to
15. The link register, stack pointer and program counter can also be written as
lr, sp and pc.

Coprocessor registers can be written as cx or crx where x is a number from 0 to
15.

Coprocessors can be written as px or cpx where x is a number from 0 to 15.
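
As an illustration of the core register names, a sketch that maps a register
symbol to its number. This is not Armish's own parser: the function name is
hypothetical, and the only outside fact used is the usual arm convention that
sp, lr and pc are r13, r14 and r15:

```lisp
;; Hypothetical helper: map a core register symbol to its number, or
;; return NIL when the symbol is not a register name.
(defun sketch-register-number (name)
  (let ((s (string-downcase (symbol-name name))))
    (cond ((string= s "sp") 13)
          ((string= s "lr") 14)
          ((string= s "pc") 15)
          ((and (> (length s) 1) (char= (char s 0) #\r))
           ;; Everything after the r must be a number from 0 to 15.
           (multiple-value-bind (n end)
               (parse-integer s :start 1 :junk-allowed t)
             (when (and n (= end (length s)) (<= 0 n 15))
               n)))
          (t nil))))

(sketch-register-number 'r12)   ; => 12
(sketch-register-number 'lr)    ; => 14
(sketch-register-number 'r16)   ; => NIL
```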



** History

The core of the thing is a file called thumb.lisp, which Jeff Massung was so
kind as to dig up from his digital archive. It was a beta version of a thumb
assembler and it assembled thumb opcodes if you fed it instructions. It has
been expanded upon a bit by the Armish team of one by modifying the thumb code
a bit and by adding arm instructions and facilities to make it more like a
traditional assembler. Who knows, maybe you can even have some use for it.



** Todo

- no arm or thumb adrl pseudo-instruction
- in the arm ldr instruction, encode a constant load more efficiently if
possible, instead of always loading from memory
- write enhanced dsp instructions
- document code; think about a comment-extractor
284 - write enhanced dsp instructions
285 - document code think about a comment-extractor