|
| 1 | + |
| 2 | +See the top level README file for more information on documentation |
| 3 | +and how to run these programs. |
| 4 | + |
| 5 | +Derived from prior blinker examples and uart04, this is an example using |
| 6 | +the system timer interrupt. |
| 7 | + |
| 8 | +For starters, perhaps the ARM should not be using these interrupts, they |
| 9 | +are only halfway documented. As of this writing it appears that the |
| 10 | +GPU is using counter match 0 and 2, so CM1 and CM3 are not being used. |
| 11 | +This example uses CM1. |
| 12 | + |
| 13 | +The documentation says that the status flag will assert when the lower |
| 14 | +half of the system timer matches the counter match register. It also |
| 15 | +says that the software interrupt service routine should change the |
| 16 | +match register. Basically software has to keep putting the counter match |
| 17 | +out in front of the timer. If you want an interrupt every 1234 counts |
| 18 | +then each interrupt you need to add 1234 to the counter match register, |
| 19 | +the hardware won't do it for you. The manual says in one place that |
| 20 | +you write zero and read don't care, but elsewhere says that you write |
| 21 | +one to clear the status bits. The write one to clear appears to be |
| 22 | +how it works. So when the counter status match flag is set we |
| 23 | +write a 1 to that bit location in that register (write a 2 to counter |
| 24 | +status to clear counter match 1). |
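
The bookkeeping described above can be sketched in C. This is a minimal sketch, not the code from this example: the helper names next_match and cs_clear_value are hypothetical, and the functions only compute the values that would be written to the counter match and counter status registers, they do not touch hardware.

```c
#include <stdint.h>

/* Hypothetical helper: given the match value that just fired and the
   desired interval, return the next value for the counter match
   register. Unsigned 32 bit arithmetic wraps around on its own, which
   is what we want since the match is against the lower half of the
   free running timer. */
uint32_t next_match(uint32_t current_match, uint32_t interval)
{
    return current_match + interval;
}

/* Hypothetical helper: value to write to the counter status register
   to clear the match flag for a channel (write one to clear).
   Channel 1 gives 2, as described in the text. */
uint32_t cs_clear_value(unsigned int channel)
{
    return (uint32_t)1 << channel;
}
```

In a handler you would write next_match(old, 1234) to the counter match register and cs_clear_value(1) to counter status, in whatever way your code accesses those registers.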
| 25 | + |
| 26 | +To figure out what interrupt line this was tied to, I used some uart code |
| 27 | +so I could print stuff out, then enabled all interrupts by writing |
| 28 | +0xFFFFFFFF to both interrupt enable registers. Then I read the current |
| 29 | +count, added 0x00400000 and wrote CM1 with that value. Then I went into |
| 30 | +an infinite loop printing the interrupt status registers. By doing this |
| 31 | +both with CM1 and CM3 I figured out that irq 1 goes with CM1 and irq 3 goes |
| 32 | +with CM3. |
| 33 | + |
| 34 | +The last bit of information required is how do you clear the interrupt. |
| 35 | +When writing the 1 to the counter status register to clear the match |
| 36 | +flag, that also clears the pending interrupt in the interrupt status |
| 37 | +register. |
| 38 | + |
| 39 | +This example demonstrates multiple things. First it uses generic polling |
| 40 | +of the system timer to blink the led on and off three times. Then it |
| 41 | +uses the counter match register and the status flag to time four on/off |
| 42 | +blink cycles. The blink rate is twice the speed of the first three. |
| 43 | +Next it enables the interrupt in the interrupt controller, not to the ARM, |
| 44 | +not yet, just to the chip's interrupt controller. When enabled the interrupt |
| 45 | +line reflects the counter status match flag, so basically the next three |
| 46 | +blinks are done the same way as the prior four except it is sampling |
| 47 | +the match status using the interrupt controller status. These blinks |
| 48 | +are slower than the prior four. The last thing it does is enable the |
| 49 | +interrupt to the ARM and use the interrupt to indicate the counter |
| 50 | +match hit. The ARM code then computes a new counter match and waits |
| 51 | +for the next hit. This loop happens to make the blink rate slower each |
| 52 | +time so you can perhaps tell you are in that loop. Eventually the |
| 53 | +timer interval will be so large that it goes back to a small number |
| 54 | +and starts blinking faster then progressively slower. This should take |
| 55 | +a long while to happen. |
| 56 | + |
| 57 | +You really need to get the ARM ARM (ARM Architecture Reference Manual) |
| 58 | +for this architecture and/or get the oldest architecture on their web |
| 59 | +site which is currently the ARMv5 ARM (it includes the ARMv4 as well, |
| 60 | +this is the original ARM ARM before it was split into multiple documents). |
| 61 | +The ARM ARM describes the exception process in more detail. |
| 62 | + |
| 63 | +The short answer is that starting at address 0x00000000 in ARM address |
| 64 | +space there are a number of exception vectors. Unlike many other processors |
| 65 | +these are not addresses for the handlers, these are instructions that get |
| 66 | +executed. Each being one word in size, you probably want those to be |
| 67 | +branch instructions or ldr pc instructions. |
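
For reference, the layout of the classic ARM vector table can be written down as C constants. The offsets are the architectural ones; the identifier names are made up for this sketch and are not taken from the example's source.

```c
/* Classic ARM exception vector offsets from the vector base
   (0x00000000 here). Each slot holds one 32 bit instruction.
   The names are hypothetical, the offsets are architectural. */
enum arm_vector_offset {
    VEC_RESET          = 0x00,
    VEC_UNDEFINED      = 0x04,
    VEC_SWI            = 0x08,
    VEC_PREFETCH_ABORT = 0x0C,
    VEC_DATA_ABORT     = 0x10,
    VEC_UNUSED         = 0x14,
    VEC_IRQ            = 0x18,
    VEC_FIQ            = 0x1C
};

/* Number of one word slots in the table. */
#define ARM_VECTOR_COUNT 8
```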
| 68 | + |
| 69 | +The way I am using the Raspberry Pi is letting the gpu load the arm |
| 70 | +program (kernel.img) at address 0x8000. The gpu then puts an instruction |
| 71 | +at address zero (and some other stuff between 0x0000 and 0x8000 for linux) |
| 72 | +then lets the ARM boot. I am not linux so don't care about the stuff between |
| 73 | +0x0000 and 0x8000. I do need to change at least the memory location for |
| 74 | +the interrupt handler so that when the interrupt occurs the ARM executes |
| 75 | +my handler. |
| 76 | + |
| 77 | +Using basic ARM knowledge and letting the assembler and compiler do some |
| 78 | +of the work I create an exception table at 0x8000 in such a way that |
| 79 | +it can be copied to 0x0000 and still work. |
| 80 | + |
| 81 | +Look at the beginning of vectors.s, which for any of my programs to |
| 82 | +work needs to be compiled and linked such that _start is at address 0x8000, |
| 83 | +the first thing in the .bin file. |
| 84 | + |
| 85 | +The assembly code uses .word to allocate 32 bit memory locations which |
| 86 | +will each hold an address to a handler. |
| 87 | + |
| 88 | +reset_handler: .word reset |
| 89 | + |
| 90 | +reset_handler is the label. .word means I want to allocate 32 bit items |
| 91 | +and reset is the name of another label. The assembler does some of the |
| 92 | +work then the linker does the rest to determine what the final value |
| 93 | +of the reset label's ARM address is. That address is placed in the binary |
| 94 | +in this allocated space, which can be seen in the disassembly: |
| 95 | + |
| 96 | +00008020 <reset_handler>: |
| 97 | + 8020: 00008040 andeq r8, r0, r0, asr #32 |
| 98 | + |
| 99 | +... |
| 100 | + |
| 101 | +00008040 <reset>: |
| 102 | + 8040: e3a00902 mov r0, #32768 ; 0x8000 |
| 103 | + |
| 104 | + |
| 105 | +Here is where the ARM knowledge, or at least more of it, comes in. |
| 106 | +Although the disassembly shows that the instruction is loading the |
| 107 | +value 0x8020 or 0x8040 or whatever, the instruction is actually loading |
| 108 | +from a pc relative address. You can partially tell this from the disassembly, |
| 109 | +[pc,#24] means pc value plus 24 (0x18), it doesn't mean 0x8020, etc. |
| 110 | + |
| 111 | +00008000 <_start>: |
| 112 | + 8000: e59ff018 ldr pc, [pc, #24] ; 8020 <reset_handler> |
| 113 | + 8004: e59ff018 ldr pc, [pc, #24] ; 8024 <undefined_handler> |
| 114 | + 8008: e59ff018 ldr pc, [pc, #24] ; 8028 <swi_handler> |
| 115 | + |
| 116 | +More ARM knowledge. From a programmer's perspective the PC is two |
| 117 | +instructions ahead. You are in arm mode when you hit these exceptions |
| 118 | +so the PC is 8 bytes ahead, so at address 0x8000 the PC is 0x8008 when |
| 119 | +you execute that instruction. Add 24 (0x18) to the PC, 0x8008+0x18 = 0x8020 |
| 120 | +and you get the address 0x8020. The instruction is now ldr pc,[0x8020]. |
| 121 | +Memory location 0x8020 holds the value 0x8040 which is what is loaded into |
| 122 | +the program counter and we begin executing at 0x8040 which is the reset |
| 123 | +handler, that is what we wanted. |
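
The address arithmetic above can be checked with a few lines of C. This is only a sketch for the one instruction form used in the table (ldr pc,[pc,#imm] in ARM mode), not a general decoder, and the function name is made up for illustration.

```c
#include <stdint.h>

/* Return the address that an ARM mode "ldr pc,[pc,#imm]" instruction
   reads from. insn_addr is where the instruction lives, insn is its
   32 bit encoding. Only the immediate offset form with pc as the base
   is handled. */
uint32_t ldr_pc_literal_address(uint32_t insn_addr, uint32_t insn)
{
    uint32_t imm12 = insn & 0xFFFu;       /* bits 11:0, byte offset */
    uint32_t add   = (insn >> 23) & 1u;   /* U bit: 1 add, 0 subtract */
    uint32_t pc    = insn_addr + 8u;      /* pc reads two instructions ahead */

    return add ? pc + imm12 : pc - imm12;
}
```

Feeding it 0x8000 and 0xe59ff018 gives 0x8020, the location of reset_handler in the disassembly.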
| 124 | + |
| 125 | +Here is the tricky bit. What if we copied both the reset handler stuff |
| 126 | +and the list of addresses, all of it, to address 0x0000? (at runtime |
| 127 | +after all the compiling and linking were long over and we are running). |
| 128 | + |
| 129 | +Instead of the addresses being this: |
| 130 | + |
| 131 | + 8018: e59ff018 ldr pc, [pc, #24] ; 8038 <irq_handler> |
| 132 | + ... |
| 133 | +00008038 <irq_handler>: |
| 134 | + 8038: 000080c4 andeq r8, r0, r4, asr #1 |
| 135 | + |
| 136 | +The copy of the data/instructions would now have these addresses: |
| 137 | + |
| 138 | + 0018: e59ff018 ldr pc, [pc, #24] |
| 139 | + ... |
| 140 | +00000038 <irq_handler>: |
| 141 | + 0038: 000080c4 andeq r8, r0, r4, asr #1 |
| 142 | + |
| 143 | +When the interrupt occurs the ARM runs the instruction at address 0x0018 |
| 144 | +which says to take the value of the PC (two ahead remember, so the pc is |
| 145 | +0x20), add 24 (which is 0x18) giving 0x0038 as the address to read from. |
| 146 | +It reads 0x80C4 and loads that into the program counter so that the |
| 147 | +next instruction executed is the one at 0x80C4. Which is where our |
| 148 | +interrupt handler really is. |
| 149 | + |
| 150 | +Basically this is some position independent code with some absolute |
| 151 | +addresses for the handlers, the address stuff is done by the assembler |
| 152 | +and linker so we don't have to. |
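
To convince yourself that the table works from either location, here is a small host-side simulation in C. It is a sketch under assumptions: the table is modeled as a plain array of 16 words (8 instructions followed by 8 handler addresses), every instruction is assumed to be ldr pc,[pc,#imm] with a positive offset, and the function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

#define VEC_WORDS 16u   /* 8 instructions followed by 8 handler addresses */

/* Simulate taking the exception whose vector is at byte offset
   vector_offset within the table: execute the ldr pc,[pc,#imm]
   stored there and return the value loaded into the pc.
   table points at the table contents wherever they currently live,
   all arithmetic is relative to the start of the table. */
uint32_t take_exception(const uint32_t *table, uint32_t vector_offset)
{
    uint32_t insn    = table[vector_offset / 4u];
    uint32_t imm12   = insn & 0xFFFu;
    uint32_t literal = vector_offset + 8u + imm12;

    return table[literal / 4u];
}

/* Copy the whole table, as the startup code does from 0x8000 to 0x0000. */
void relocate_table(uint32_t *dst, const uint32_t *src)
{
    memcpy(dst, src, VEC_WORDS * sizeof(uint32_t));
}
```

The handler address read back is the same before and after the copy because only the offset from the instruction to its data word matters, and the copy preserves that offset.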
| 153 | + |
| 154 | + |
| 155 | +These instructions right after reset perform the copy of instructions and |
| 156 | +data from where our program was loaded and started (0x8000) to where we |
| 157 | +need the exception table (0x0000). |
| 158 | + |
| 159 | + mov r0,#0x8000 |
| 160 | + mov r1,#0x0000 |
| 161 | + ldmia r0!,{r2,r3,r4,r5,r6,r7,r8,r9} |
| 162 | + stmia r1!,{r2,r3,r4,r5,r6,r7,r8,r9} |
| 163 | + ldmia r0!,{r2,r3,r4,r5,r6,r7,r8,r9} |
| 164 | + stmia r1!,{r2,r3,r4,r5,r6,r7,r8,r9} |
| 165 | + |
| 166 | +ldmia means load multiple, the IA means increment after. It uses the |
| 167 | +value in r0 as an address (when executing the first of the two ldmia |
| 168 | +instructions r0 is 0x8000) so it loads 8 words starting at 0x8000 into |
| 169 | +registers r2 through r9. If there is an exclamation point after the |
| 170 | +base register (which there is) then it modifies that register to point |
| 171 | +to the next word after the last one we loaded. We read 8 words or 32 |
| 172 | +bytes at address 0x8000 so the last thing it does (increment after the |
| 173 | +load) is add 0x20 and save, so r0 is now 0x8020. |
| 174 | + |
| 175 | +stmia is like the load but a store, r1 starts off as 0x0000 so it stores |
| 176 | +those 8 words from 0x8000 to 0x0000, then it adds 0x20 to r1. |
| 177 | + |
| 178 | +So the second ldmia is going to read 8 more words from 0x8020 and the |
| 179 | +second stmia is going to write those words to 0x0020. |
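
If the ldmia/stmia behavior is easier to follow in C, the two pairs are roughly equivalent to the sketch below. The function names are hypothetical and the local array merely stands in for registers r2 through r9; the real startup code stays in assembly.

```c
#include <stdint.h>

/* Rough C equivalent of one ldmia r0!,{r2-r9} / stmia r1!,{r2-r9} pair:
   load 8 words through *src, store them through *dst, and leave both
   pointers advanced by 8 words (0x20 bytes), like the "!" writeback. */
void copy8_writeback(uint32_t **dst, const uint32_t **src)
{
    uint32_t regs[8];                 /* stands in for r2..r9 */

    for (int i = 0; i < 8; i++)       /* ldmia r0!,{...} */
        regs[i] = (*src)[i];
    *src += 8;

    for (int i = 0; i < 8; i++)       /* stmia r1!,{...} */
        (*dst)[i] = regs[i];
    *dst += 8;
}

/* Two pairs back to back copy 16 words (0x40 bytes). */
void copy_vectors(uint32_t *dst, const uint32_t *src)
{
    copy8_writeback(&dst, &src);
    copy8_writeback(&dst, &src);
}
```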
| 180 | + |
| 181 | +The second ldm and stm do not have to have the exclamation point as we don't |
| 182 | +care about r0 and r1 afterwards. The ia part of the instruction picks the |
| 183 | +addressing mode, increment or decrement, before or after (two bits in the |
| 184 | +instruction encoding). The exclamation point is a separate bit |
| 185 | +in the instruction that enables or disables the saving of that |
| 186 | +value to the base register. So if you |
| 187 | +were to do this: |
| 188 | + |
| 189 | + ldmia r0!,{r2,r3,r4,r5,r6,r7,r8,r9} |
| 190 | + stmia r1!,{r2,r3,r4,r5,r6,r7,r8,r9} |
| 191 | + ldm r0,{r2,r3,r4,r5,r6,r7,r8,r9} |
| 192 | + stm r1,{r2,r3,r4,r5,r6,r7,r8,r9} |
| 193 | + |
| 194 | +the assembler treats a plain ldm/stm as ldmia/stmia (increment after is |
| 195 | +the default), so when you then disassemble it might look like this: |
| 196 | + |
| 197 | + ldmia r0!,{r2,r3,r4,r5,r6,r7,r8,r9} |
| 198 | + stmia r1!,{r2,r3,r4,r5,r6,r7,r8,r9} |
| 199 | + ldmia r0,{r2,r3,r4,r5,r6,r7,r8,r9} |
| 200 | + stmia r1,{r2,r3,r4,r5,r6,r7,r8,r9} |
| 201 | + |
| 202 | +It was easy to cut and paste the two lines as is, and if I wanted to |
| 203 | +cut and paste more sets to copy more data it is easy. So I left that |
| 204 | +extra info on those latter instructions even though I am not using them. |
| 205 | + |
| 206 | +So what those first 6 instructions did was to basically copy 0x40 bytes |
| 207 | +from 0x8000 to 0x0000. Since these are very early in the boot we are |
| 208 | +not using registers r2 to r9 so that made it easy to use them as scratch |
| 209 | +registers. If we had waited to copy the 0x40 bytes until later a loop |
| 210 | +or some other way of copying that data likely would have happened since |
| 211 | +many of those registers may be used by other code. |
| 212 | + |
| 213 | +Note that the Cortex-M processors from arm, which only execute in thumb |
| 214 | +mode and cannot execute ARM mode instructions, boot differently and have |
| 215 | +different exception tables. The Cortex-M processors have addresses not |
| 216 | +instructions in the table and each flavor of Cortex-M, or worse each |
| 217 | +implementation, has different definitions for each of those entries. The |
| 218 | +first few are the same then it diverges and they can have hundreds of |
| 219 | +entries in the vector table. The classic ARM table though has not varied |
| 220 | +for many flavors of ARM cores and, good or bad, all interrupts funnel into |
| 221 | +the same handler (or handlers if you count the fiq). |
| 222 | + |
| 223 | +The classic ARM design also has separate stack pointers for each mode. |
| 224 | +Interrupt is a mode, when you get an interrupt you switch from whatever |
| 225 | +mode you were in (supervisor in our case) to interrupt mode, which means |
| 226 | +you are using a different stack pointer. This is all described in words |
| 227 | +and pictures in the ARM ARM. This means that if we are going to support |
| 228 | +interrupts not only do we need to set our application stack pointer but |
| 229 | +also need to set aside some memory for the interrupt stack and point |
| 230 | +the interrupt stack pointer to it. How do you change the interrupt stack |
| 231 | +pointer if you are not in interrupt mode? Well you have to be in interrupt |
| 232 | +mode. How do you get into interrupt mode? Well you modify the cpsr |
| 233 | +which contains the mode bits and that magically changes you to that mode. |
| 234 | +You can do this from any mode to any mode except from user mode, you can't |
| 235 | +get out of user mode by changing the bits. We are not in user mode on |
| 236 | +boot and never switch to it in any of my examples so we don't have to |
| 237 | +worry about getting out of it (normally you use an svc/swi instruction and |
| 238 | +have a software interrupt handler that does things that are protected |
| 239 | +from user mode). So the next bit of code after copying the exception |
| 240 | +handler stuff switches into irq mode and fiq mode and sets their |
| 241 | +stack pointers (fiq in case you want to experience that mode, mostly |
| 242 | +the same as irq, you have another bank of registers so you don't |
| 243 | +have to preserve the system registers, making the handler a little faster |
| 244 | +as in fast irq (fiq), I do not demonstrate fiq here). |
| 245 | + |
| 246 | + |
| 247 | + ;@ (PSR_IRQ_MODE|PSR_FIQ_DIS|PSR_IRQ_DIS) |
| 248 | + mov r0,#0xD2 |
| 249 | + msr cpsr_c,r0 |
| 250 | + mov sp,#0x8000 |
| 251 | + |
| 252 | + ;@ (PSR_FIQ_MODE|PSR_FIQ_DIS|PSR_IRQ_DIS) |
| 253 | + mov r0,#0xD1 |
| 254 | + msr cpsr_c,r0 |
| 255 | + mov sp,#0x4000 |
| 256 | + |
| 257 | + ;@ (PSR_SVC_MODE|PSR_FIQ_DIS|PSR_IRQ_DIS) |
| 258 | + mov r0,#0xD3 |
| 259 | + msr cpsr_c,r0 |
| 260 | + mov sp,#0x8000000 |
| 261 | + |
| 262 | + ;@ SVC MODE, IRQ ENABLED, FIQ DIS |
| 263 | + ;@mov r0,#0x53 |
| 264 | + ;@msr cpsr_c, r0 |
| 265 | + |
| 266 | + |
| 267 | +The cpsr is also where you enable or disable the arm interrupt and fast |
| 268 | +interrupt. We want to start off with interrupts disabled so when |
| 269 | +switching back to SVC mode we also make sure that they are disabled. |
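
The magic numbers 0xD2, 0xD1, 0xD3 and 0x53 in the code above are just the mode field ORed with the two interrupt disable bits. A small C sketch shows how they are built. The PSR_ names come from the comments in the assembly; the helper function is hypothetical and only computes the value, it does not write the cpsr.

```c
#include <stdint.h>

/* cpsr mode field values (low 5 bits). */
#define PSR_FIQ_MODE 0x11u
#define PSR_IRQ_MODE 0x12u
#define PSR_SVC_MODE 0x13u

/* cpsr interrupt mask bits: set means disabled. */
#define PSR_IRQ_DIS  0x80u   /* I bit */
#define PSR_FIQ_DIS  0x40u   /* F bit */

/* Hypothetical helper: build the value written with msr cpsr_c. */
uint32_t psr_control(uint32_t mode, int irq_disabled, int fiq_disabled)
{
    uint32_t value = mode & 0x1Fu;

    if (irq_disabled)
        value |= PSR_IRQ_DIS;
    if (fiq_disabled)
        value |= PSR_FIQ_DIS;
    return value;
}
```

For example psr_control(PSR_SVC_MODE, 0, 1) gives 0x53, the commented out "SVC MODE, IRQ ENABLED, FIQ DIS" value.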
| 270 | + |
| 271 | +So the irq stack starts at 0x8000 (first location is 0x7FFC) and the fiq |
| 272 | +stack is at 0x4000 (0x3FFC). If you have re-compiled this program or |
| 273 | +modified your config.txt to have the gpu load you at, say, address 0x0000 |
| 274 | +then these stacks may collide with your program and you need to move |
| 275 | +them. Likewise I have put the SVC stack at 0x8000000 (0x7FFFFFC) and |
| 276 | +if you are using that memory you need to move that as well. Bare metal |
| 277 | +memory management is part of bare metal programming. YOU decide where |
| 278 | +things are and either hard code them in your code or set them |
| 279 | +indirectly through the linker script. |
| 280 | + |
| 281 | +The last thing I am going to say about the interrupt handler is that I |
| 282 | +made it pretty stupid and mostly in C. |
| 283 | + |
| 284 | +irq: |
| 285 | + push {r0,r1,r2,r3,r4,r5,r6,r7,r8,r9,r10,r11,r12,lr} |
| 286 | + bl c_irq_handler |
| 287 | + pop {r0,r1,r2,r3,r4,r5,r6,r7,r8,r9,r10,r11,r12,lr} |
| 288 | + subs pc,lr,#4 |
| 289 | + |
| 290 | +When you read the ARM ARM you will see that the proper way to return |
| 291 | +from an interrupt is using a subs pc,lr,#4. Since you interrupted |
| 292 | +application code which was likely using some or most of the registers you |
| 293 | +need to preserve those registers, in particular the link register, lr. |
| 294 | +So what my assembly wrapper does is preserve all the registers, call a |
| 295 | +C function, upon return from that C function restore the registers, then |
| 296 | +return from interrupt. Just like any other textbook interrupt handler. |
| 297 | + |
| 298 | +You need to remember and understand that C code in an interrupt handler |
| 299 | +needs to be lean and mean, get in, get out, don't mess around. This example |
| 300 | +simply modifies a global variable (has to be declared as volatile to |
| 301 | +be shared properly between the handler and the rest of the app code) and |
| 302 | +the application detects that to know the interrupt happened. That adds |
| 303 | +some latency but it is okay since our eyes will not see that difference |
| 304 | +in the led blinks. |
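
The shared variable pattern looks roughly like this in C. It is a sketch, not the handler from this example: c_irq_handler is the name the assembly wrapper calls, while irq_counter and irq_happened are names made up for illustration, and the hardware accesses a real handler needs are left out.

```c
/* Shared between the interrupt handler and the application, so it
   must be volatile or the compiler may keep a stale copy in a register. */
volatile unsigned int irq_counter;

/* Called from the assembly wrapper. Keep it short. On real hardware
   this would also clear the counter match flag (write 2 to the
   counter status register for CM1) and set the next match value;
   that part is omitted in this sketch. */
void c_irq_handler(void)
{
    irq_counter++;
}

/* Application side: returns 1 if at least one interrupt has happened
   since *last_seen was last updated, and remembers the new count. */
int irq_happened(unsigned int *last_seen)
{
    unsigned int now = irq_counter;

    if (now == *last_seen)
        return 0;
    *last_seen = now;
    return 1;
}
```

The application loop would call irq_happened with its own last_seen variable and toggle the led whenever it returns 1.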