Creating a Bootloader Environment
(Freescale ColdFire Example)

In this article, a bootloader is a program that runs at processor reset and may accept a new firmware image via serial port, USB, Ethernet, etc., which it then uses to reprogram the device's flash. If the bootloader times out while waiting for a new image (typically after a second or two), it exits and launches the 'application code' (the code the embedded system normally runs).

This article describes the mechanics of implementing a bootloader in general; it doesn't delve into the details of writing the actual serial port and flash reprogramming code. It also goes into specific details for setting up the bootloader environment using the Freescale ColdFire MCF5225x family of microcontrollers and the CodeWarrior build environment.

There are two distinct approaches to writing a bootloader: either
a) the bootloader and application code live in one compilable image (simpler build and programming)
b) the bootloader is built as a separate image from the application (eliminates cross calling problems)

This article deals with the case where the bootloader and application are one image.

Bootloader Requirements

Resides in its own region of flash which is never erased by the bootloader
Runs immediately following reset
If it receives a 'secret signal' (e.g. a special sequence of serial port characters, an I/O voltage level, a front panel pushbutton combination, etc.) within a certain amount of time, the bootloader downloads an image via RS232, USB, Ethernet, etc., then reprograms the application area of the flash with the received image.
If it does not receive the 'secret signal' within that time, the bootloader transfers execution to the application
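
The decision described by the last two requirements can be sketched in C as follows. This is only an illustration that runs on a host PC: sboot_decide, signal_poll_fn and the fake_port helper are hypothetical names, and each poll stands in for one tick of the timeout window. A real bootloader would poll a UART, GPIO pin or similar, and would measure time with a hardware timer.

```c
#include <stdbool.h>
#include <stdint.h>

/* Outcome of the boot-time decision. */
typedef enum { BOOT_RUN_APPLICATION, BOOT_REPROGRAM } boot_action_t;

/* Hypothetical hook: returns true once the secret signal has been seen. */
typedef bool (*signal_poll_fn)(void *ctx);

/* Poll for the secret signal at most max_polls times. Each poll stands in
 * for one tick of the timeout window (an assumption of this sketch). */
boot_action_t sboot_decide(signal_poll_fn poll, void *ctx, uint32_t max_polls)
{
    for (uint32_t i = 0; i < max_polls; ++i) {
        if (poll(ctx)) {
            return BOOT_REPROGRAM;      /* signal seen: go download an image */
        }
    }
    return BOOT_RUN_APPLICATION;        /* timed out: start the application */
}

/* Host-side stand-in for hardware: the signal appears on poll number
 * signal_at (0 means it never appears). */
typedef struct {
    uint32_t polls_seen;
    uint32_t signal_at;
} fake_port_t;

bool fake_port_poll(void *ctx)
{
    fake_port_t *port = (fake_port_t *)ctx;
    port->polls_seen++;
    return port->polls_seen == port->signal_at;
}
```

Passing the poll function as a parameter keeps the timeout logic independent of whether the signal arrives over a serial port, an I/O pin or a pushbutton.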

Add More Code and Data Segments

Our bootloader is going to live in the first two 4K pages of flash, so the bootloader will never erase that 8K. I have made my flash erase and programming routines reject any request to erase or program that memory range.
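
A range check of this kind might look like the sketch below. It is not the article's actual routine: flash_request_allowed is a hypothetical name, the 8K boundary comes from the memory layout used in this article, and the 512K flash size assumes the larger part.

```c
#include <stdbool.h>
#include <stdint.h>

#define SBOOT_FLASH_END 0x00002000u  /* first address past the 8K bootloader */
#define FLASH_END       0x00080000u  /* assumes the 512K flash part */

/* Returns true only if [addr, addr + len) lies wholly inside the
 * application area, i.e. never touches the bootloader's two pages. */
bool flash_request_allowed(uint32_t addr, uint32_t len)
{
    if (len == 0u) {
        return false;                    /* nothing to do */
    }
    if (addr < SBOOT_FLASH_END) {
        return false;                    /* starts inside the bootloader */
    }
    if (addr >= FLASH_END) {
        return false;                    /* starts past the end of flash */
    }
    if (len > FLASH_END - addr) {
        return false;                    /* runs off the end; also avoids overflow */
    }
    return true;
}
```

Both the erase and the program routines would call this check first and refuse the request when it returns false.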

We'll be editing the linker command file (LCF), but if you are using Processor Expert, the first thing you need to do is modify the CPU Bean's build settings so that it does not generate an LCF; otherwise Processor Expert will overwrite any changes you make to the memory layout. Next, let's add some new entries to the MEMORY part of the LCF, which defines the segments.

MEMORY  {
   # First 8K (two 4K pages) contains reset vector, IVT, CFM (flash controller) config block,
   # and bootloader. This block never gets erased by the bootloader
   vectortable_sboot  (R)  : ORIGIN = 0x00000000, LENGTH = 0x00000300
   cfmprotrom         (R)  : ORIGIN = 0x00000400, LENGTH = 0x00000020
   code_sboot         (RX) : ORIGIN = 0x00000420, LENGTH = 0x00001BE0
   
   #Jump to the original startup function (_startup_Metroworks) coded in a fixed location
   jmp_to_startup     (RX) : ORIGIN = 0x00002000, LENGTH = 0x00000008

   #Application code
   code        		 (RX) : ORIGIN = 0x00002008, LENGTH = 0x0007DFF8		#512K flash   

   # Overlap the data segments for the bootloader and the application, since they are never
   # used simultaneously 
   data        		 (RW) : ORIGIN = 0x20000000, LENGTH = 0x00010000		#64K ram
   data_sboot  		 (RW) : ORIGIN = 0x20000000, LENGTH = 0x00010000		#64K ram
   ipsbar      		 (RW) : ORIGIN = 0x40000000, LENGTH = 0x0
}

The new entries are the code_sboot, jmp_to_startup and data_sboot segments. code_sboot will be the new code section for the bootloader. Note that on the ColdFire, the interrupt vector table (IVT) initially occupies locations 0x0-0x2FF and the flash controller protection block must occupy 0x400-0x41F. One reason the IVT must occupy that space, even though interrupts are typically disabled for a bootloader, is that when the ColdFire resets, it loads the program counter with the address stored at location 4, the reset vector in the IVT. You wouldn't want this block to be erased, as that could brick your device; putting the bootloader in the same flash pages as the IVT and CFM block solves this problem (as the bootloader never erases itself).
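
As a sanity check on the arithmetic, the flash segments can be written down as a table and tested for overlap. This is a host-side sketch only: the origins and lengths are copied from the MEMORY block above, while segment_t, segment_end and flash_map_is_ordered are hypothetical names introduced for the illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const char *name;
    uint32_t    origin;
    uint32_t    length;
} segment_t;

/* Flash segments, copied from the MEMORY block in the LCF. */
static const segment_t flash_map[] = {
    { "vectortable_sboot", 0x00000000u, 0x00000300u },
    { "cfmprotrom",        0x00000400u, 0x00000020u },
    { "code_sboot",        0x00000420u, 0x00001BE0u },
    { "jmp_to_startup",    0x00002000u, 0x00000008u },
    { "code",              0x00002008u, 0x0007DFF8u },
};

#define FLASH_MAP_COUNT (sizeof flash_map / sizeof flash_map[0])

/* First address past the end of a segment. */
uint32_t segment_end(const segment_t *seg)
{
    return seg->origin + seg->length;
}

/* True if each segment ends at or before the start of the next one. */
bool flash_map_is_ordered(void)
{
    for (size_t i = 1; i < FLASH_MAP_COUNT; ++i) {
        if (segment_end(&flash_map[i - 1]) > flash_map[i].origin) {
            return false;
        }
    }
    return true;
}
```

With these numbers code_sboot ends exactly at 0x2000, where jmp_to_startup begins, and the application's code segment ends at 0x80000, the top of the 512K flash.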

We also add an additional data segment (data_sboot) for the bootloader. The segment defines variables that live in RAM (as well as initialization values where used) and since the bootloader and the application code never run at the same time, it's legitimate to overlap these segments in the same space to conserve RAM.

I have also renamed the vectortable (IVT) segment to vectortable_sboot, since this copy of the IVT belongs to the bootloader. The application code's IVT will be moved to RAM, more about that later.

I've also added a jmp_to_startup segment. Its purpose will be covered in the "Jump to Jump to _startup()" section.

Okay, now that we have code_sboot and data_sboot segments, we need to tell the compiler and linker to use them...

Place Bootloader Code and Data

Getting the linker to place the bootloader's code into the code_sboot segment is fairly straightforward: adding the lines below tells the linker to take the .text section (all the code) and the .rodata section of anything that is resident in sboot.c (the bootloader module) and put them in the newly created code_sboot segment rather than the usual code segment. No modification or pragma is necessary in sboot.c to accomplish this placement.

SECTIONS {
  ...
  .text_sboot :
  {
    sboot.c (.text)
    . = ALIGN(0x4);
    sboot.c (.rodata)
    . = ALIGN(0x4);
     ___ROM_SBOOT_AT = .;
     ___DATA_ROM_SBOOT = .;
  } > code_sboot

We would also like to place all the bootloader's RAM variables into the data_sboot segment rather than the default data segment, for the previously mentioned reason that we want the bootloader's and the application code's data segments to overlap and occupy the same memory space. We do this in a similar fashion, making a new copy of the .data section called .data_sboot and putting the sboot.c module's data, sdata, sbss, and bss sections into it. (Note: the specifics of what is in each of these sections are beyond this article's scope; articles on C compilers will give you more information. For our purposes, these four sections contain all the global and static variables.)

SECTIONS { 
  ...
  .data_sboot : AT( ___ROM_SBOOT_AT )
  {
    ___DATA_RAM_SBOOT = .;
    ___DATA_START_SBOOT =.;
    sboot.c (.data)
    . = ALIGN (0x4);
    ___DATA_END_SBOOT =.;
    
    __SDATA_START_SBOOT =.;
    sboot.c (.sdata)
    . = ALIGN (0x4);
    __SDATA_END_SBOOT =.;

    __SDA_BASE_SBOOT = .;
    
    __START_SBSS_SBOOT = .;
    sboot.c (.sbss)
    . = ALIGN (0x4);
    __END_SBSS_SBOOT = .;

    __START_BSS_SBOOT = .;
    sboot.c (.bss)
    . = ALIGN(0x4);
    __END_BSS_SBOOT = .;
  } > data_sboot

People who know something about the inner workings of the C environment might at this juncture point out two problems:

The global and static initialized C variables won't be initialized
The global and static uninitialized C variables won't necessarily be zeroed on startup

We'll cover how to rectify these issues in the next section...
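
To make the two problems concrete, here is a small sketch of the kinds of variables involved. The names are hypothetical and not taken from the actual sboot.c, and the exact section each object lands in depends on the compiler (small objects may go to .sdata/.sbss instead of .data/.bss).

```c
#include <stddef.h>
#include <stdint.h>

/* Initialized global: lives in RAM (.data or .sdata), but its initial value
 * is stored in flash and must be copied to RAM before C code runs. */
uint32_t baud_rate = 115200u;

/* Uninitialized static: lives in RAM (.bss or .sbss). C code assumes it
 * reads as zero, so startup code must clear it. */
static uint8_t rx_buffer[64];

/* Constant data: stays in flash (.rodata), so it needs no startup work. */
static const char banner[] = "sboot";

/* Sum of the receive buffer; 0 as long as the buffer was zeroed at startup
 * and nothing has been written to it yet. */
uint32_t rx_buffer_sum(void)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < sizeof rx_buffer; ++i) {
        sum += rx_buffer[i];
    }
    return sum;
}

const char *sboot_banner(void)
{
    return banner;
}
```

On a hosted system the C runtime does this setup automatically; on the target, if the startup code skips it for the bootloader's sections, baud_rate and rx_buffer would hold whatever happened to be in RAM.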

Startup Function Replacement

We want our bootloader to run first. Before C code can be run, C runtime setup is performed by an assembler function, often called "_startup." Because the bootloader needs a special version of the _startup function that lives in the code_sboot segment, I found the easiest way to do this was to copy the original _startup function into sboot.c, then change the name of the original _startup function to _startup_Metroworks.

The new startup function is responsible for zeroing the bss and sbss variable sections in the bootloader, and for copying the initial global variable values for data and sdata from ROM to RAM. Because the original _startup function initializes the default .code and .data segments, some modifications (the symbols with the _SBOOT suffix in the listing below) were necessary to make sboot's startup function properly initialize sboot's .data_sboot section.

Additionally, the bootloader's startup function will need to jump to a new main function (main_sboot) that is the start of C code in the bootloader.

__declspec(register_abi) asm void _startup(void)
{  
   //RSW - Full modified C runtime startup, adapted from Metrowerks' code in startcf.c
   // disable interrupts 
   move.w        #0x2700,sr

   // Pre-init SP, in case memory for stack is not valid it should be setup using 
   // MEMORY_INIT before __initialize_hardware is called 
   lea __SP_AFTER_RESET,a7; 

   // initialize memory 
   MEMORY_INIT
 
   // setup the stack pointer
   lea           _SP_INIT,a7

   // setup A6 dummy stackframe 
   movea.l       #0,a6
   link          a6,#0

   // setup A5
   lea           _SDA_BASE_SBOOT,a5

   // zero initialize the .bss section 
   lea           _END_BSS_SBOOT, a0
   lea           _START_BSS_SBOOT, a1
   suba.l        a1, a0
   move.l        a0, d0

   beq           __skip_bss__

   // call clear_mem with base pointer in a0 and size in d0
   lea           _START_BSS_SBOOT, a0
   jsr           clear_mem

__skip_bss__:
   // zero initialize the .sbss section 
   lea           _END_SBSS_SBOOT, a0
   lea           _START_SBSS_SBOOT, a1
   suba.l        a1, a0
   move.l        a0, d0

   beq           __skip_sbss__

   // call clear_mem with base pointer in a0 and size in d0 
   lea           _START_SBSS_SBOOT, a0
   jsr           clear_mem

__skip_sbss__:
   // copy all ROM sections to their RAM locations ... 
   lea           __DATA_RAM_SBOOT, a0
   lea           __DATA_ROM_SBOOT, a1
   cmpa          a0,a1
   beq           __skip_rom_copy__
              
   move.l        #__DATA_END_SBOOT, d0
   sub.l         a0, d0                 
   jsr           __copy_rom_section

__skip_rom_copy__:
   jmp           _main_sboot
}
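
For readers less comfortable with ColdFire assembler, the RAM initialization steps of the listing above can be approximated in C. This is a host-side sketch, not a replacement for the assembler: sboot_layout_t and sboot_init_ram are hypothetical names, and the struct fields stand in for the _SBOOT linker symbols. The clearing and copying loops are written by hand, since the bootloader cannot rely on library routines.

```c
#include <stddef.h>
#include <stdint.h>

/* Hand-written helpers playing the role of clear_mem and
 * __copy_rom_section in the assembler listing. */
static void clear_bytes(uint8_t *dst, size_t n)
{
    while (n--) {
        *dst++ = 0;
    }
}

static void copy_bytes(uint8_t *dst, const uint8_t *src, size_t n)
{
    while (n--) {
        *dst++ = *src++;
    }
}

/* Stand-ins for the linker symbols defined in .data_sboot. */
typedef struct {
    uint8_t       *data_ram;    /* ___DATA_RAM_SBOOT  */
    const uint8_t *data_rom;    /* ___DATA_ROM_SBOOT  */
    uint8_t       *data_end;    /* ___DATA_END_SBOOT  */
    uint8_t       *sbss_start;  /* __START_SBSS_SBOOT */
    uint8_t       *sbss_end;    /* __END_SBSS_SBOOT   */
    uint8_t       *bss_start;   /* __START_BSS_SBOOT  */
    uint8_t       *bss_end;     /* __END_BSS_SBOOT    */
} sboot_layout_t;

/* Same order as the assembler: zero .bss, zero .sbss, then copy the
 * initializers from ROM to RAM unless both addresses are identical. */
void sboot_init_ram(const sboot_layout_t *m)
{
    if (m->bss_end != m->bss_start) {
        clear_bytes(m->bss_start, (size_t)(m->bss_end - m->bss_start));
    }
    if (m->sbss_end != m->sbss_start) {
        clear_bytes(m->sbss_start, (size_t)(m->sbss_end - m->sbss_start));
    }
    if (m->data_ram != m->data_rom) {
        copy_bytes(m->data_ram, m->data_rom,
                   (size_t)(m->data_end - m->data_ram));
    }
}
```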

Okay, now our _startup() assembler function runs, and after it initializes the C runtime environment, it transfers execution to our main_sboot() function in C. What happens in the main_sboot() function is beyond this article's scope, but in a nutshell that function will either decide to reprogram the device, or exit the bootloader and start the application code normally. The next section covers what happens when the bootloader times out, exits, etc. and the application code starts normally...

Jump to Jump to _startup()

When the bootloader decides to exit and start the application code, we'd like to jump to the application code's main() function, right? Actually, no that's not exactly right... we need to reinitialize the C runtime environment for the application code, by calling the original startup function, which we renamed _startup_Metroworks. This is necessary to reinitialize the stack pointer, initialize or zero any global or static variables the application code uses, etc, etc.

But we can't just jump from the bootloader directly to the _startup_Metroworks function: the bootloader may reprogram the device's application code, and any time the application is rebuilt, the address of its startup function can move. To solve this problem, we pick a known location in the application and place an assembler jump to the _startup_Metroworks function there. The bootloader then jumps to the known location, and the jump stored there always knows where the _startup_Metroworks function is, because it is built as part of the application code.
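
The idea of a fixed slot that forwards to a moving target can be simulated on a host PC with a function pointer. This is only an analogy: on the ColdFire the slot holds an actual jmp instruction rather than a pointer, and app_image_t, sboot_launch and the two startup_build functions are hypothetical names.

```c
/* Entry point type for a startup routine. */
typedef void (*entry_fn)(void);

/* Stand-in for the fixed flash location: this slot always sits at the same
 * place, but its contents are rewritten with every application build. */
typedef struct {
    entry_fn jump_to_startup;
} app_image_t;

/* Records which startup routine ran (for demonstration only). */
int last_started;

/* Two application builds whose startup code ended up at different addresses. */
void startup_build_a(void) { last_started = 1; }
void startup_build_b(void) { last_started = 2; }

/* Bootloader side: knows only where the slot is, not what it points to. */
void sboot_launch(const app_image_t *fixed_slot)
{
    fixed_slot->jump_to_startup();
}
```

Reprogramming the application corresponds to overwriting the slot; the bootloader's launch code stays byte-for-byte the same.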

Remember that jmp_to_startup segment we added to the linker command file earlier?

MEMORY  {
...   
   #Jump to the original startup function (_startup_Metroworks) coded in a fixed location
   jmp_to_startup     (RX) : ORIGIN = 0x00002000, LENGTH = 0x00000008
...   
}

Address 0x2000 is just after the end of the bootloader (which takes up the first 8K), so it's also just before the very beginning of the application code, at a known location. So if we add this function to our sboot.c module:

extern unsigned long far __JMP_TO_STARTUP;

#pragma define_section jmp_to_startup ".jmp_to_startup" far_absolute RX

static __declspec(jmp_to_startup) asm  void __declspec(register_abi) _jmp_to_startup(void)
{
	jmp   _startup_Metroworks;
}

while all the rest of the code in sboot.c is located in code_sboot, the special pragma and __declspec combination tells the linker to put the code for this function into the jmp_to_startup segment rather than code_sboot, the default code segment for this module.

main_sboot() does call the jmp_to_startup function

static __declspec(register_abi) void _main_sboot(void)
{
   ...

   asm(jmp __JMP_TO_STARTUP);       // Jump to a fixed location that jumps to the 
}                                   // original Metroworks startup code

however it does so through a linker symbol that refers to the address the function is stored at, rather than calling the function directly. Because the code does not actually directly call the jmp_to_startup function, the optimizer might be tempted to remove it and the entire jmp_to_startup section as well. To make sure that does not happen, we add a KEEP_SECTION directive to the LCF:

  KEEP_SECTION { .vectortable }
  KEEP_SECTION { .vectortable_sboot }
  KEEP_SECTION { .jmp_to_startup }
  KEEP_SECTION { .cfmconfig }

Additionally, because neither sboot's nor the application code's IVT contains functions that are ever called or variables that are ever referenced from the code, the linker will be tempted to remove this data table as an optimization. We would like that not to happen so we also add a KEEP_SECTION directive for the vectortable_sboot section.

Moving the Application Interrupt Vector Table to RAM

The bootloader needs an IVT at location 0 because the IVT contains the startup vector at address 4 which the ColdFire loads into the program counter at reset. The application code can't really use this IVT, because the locations of the Interrupt Service Routines (ISRs) will change when the application code is updated and rebuilt, yet the bootloader flash page which contains its IVT is never reprogrammed by the bootloader. So we need a second IVT for the application which is built and programmed with the application code...

Unfortunately, the ColdFire's interrupt controller also has an additional requirement that the IVT be aligned on a 1M boundary (a multiple of 0x0010_0000). Because the chip only has 256K or 512K of flash, the only location within the flash that lies on a 1M boundary is address zero. So to be aligned on a 1M boundary, the application's interrupt vector table has to be located in RAM, which by default starts at 0x2000_0000 (the origin of the data segment in the LCF).
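
The alignment rule is easy to check with a bit mask. The sketch below is an illustration with a hypothetical function name; the addresses in the usage note come from the LCF in this article.

```c
#include <stdbool.h>
#include <stdint.h>

#define IVT_ALIGNMENT 0x00100000u   /* 1M boundary */

/* True if addr is a multiple of 1M, i.e. its low 20 bits are all zero. */
bool ivt_address_is_aligned(uint32_t addr)
{
    return (addr & (IVT_ALIGNMENT - 1u)) == 0u;
}
```

Address 0 (the bootloader's IVT) and 0x20000000 (the RAM origin in the LCF) pass the check, while 0x00002000, the start of the application's flash area, does not.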

Mechanically what happens is a copy of the application code's IVT is stored in flash, then copied to RAM by the C runtime _startup() function, just like the global and static variable initializers. If we make an addition to the LCF, we can get this to happen without writing any additional code:

  .data : AT(___ROM_AT)         # Follows code section in flash 
  {
    ___DATA_RAM = .;

    # Put a copy of the application's IVT here, this will be copied
    # to the start of RAM by _startup_Metroworks routine in the romp
    # section, which copies .data segment from flash to RAM
    ___VECTOR_RAM = .;          #Symbol tells linker where the IVT will be located in RAM   
    * (.vectortable)            # Needs to be 1M aligned, so at beginning of data is good...
    . = ALIGN(0x4);

    * (.exception)
    . = ALIGN(0x4);
    __exception_table_start__ = .;
    EXCEPTION
    __exception_table_end__   = .;

    ___sinit__ = .;
    STATICINIT

    ___DATA_START =.;

    * (.data)
    . = ALIGN (0x4);
    ___DATA_END   =.;

    __SDATA_START =.;
    * (.sdata)
    . = ALIGN (0x4);
    __SDATA_END = .;

    __SDA_BASE = .;
    . = ALIGN(0x4);
  } > data

We also set up the LCF so the ___VECTOR_RAM symbol is defined to point to the application code's IVT in RAM, and this symbol is what the _startup function loads in the Vector Base Register (VBR) which points to the IVT, so again by editing the LCF we are getting functionality without writing any additional code.
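
The net effect of those LCF changes can be modeled in a few lines of C. This is a host-side model only: fake_cpu_t stands in for the CPU's VBR, which on the target is a core register rather than a C variable, and install_application_ivt and vector_lookup are hypothetical names. On the target the copy is done by the startup code's ordinary ROM-to-RAM copy rather than by a dedicated function.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the Vector Base Register. */
typedef struct {
    const uint32_t *vbr;
} fake_cpu_t;

/* Copy the flash-resident copy of the IVT into RAM, then point the
 * (simulated) VBR at the RAM copy. */
void install_application_ivt(fake_cpu_t *cpu, uint32_t *ram_ivt,
                             const uint32_t *flash_copy, size_t entries)
{
    for (size_t i = 0; i < entries; ++i) {
        ram_ivt[i] = flash_copy[i];
    }
    cpu->vbr = ram_ivt;
}

/* What the CPU does on an exception: fetch the handler address from the
 * table the VBR points to. */
uint32_t vector_lookup(const fake_cpu_t *cpu, size_t vector_number)
{
    return cpu->vbr[vector_number];
}
```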

Running Flash Erase/Program from RAM not Necessary

Some flashes have a restriction that while a page of flash is being erased or programmed, the entire flash cannot be read. If this is the case with your microcontroller, then any functions which erase or program flash will have to be copied to RAM and run from there.

The ColdFire cannot read from a specific 4K logical page of flash while that page is being erased or programmed; however, it can read from a different flash page in the meantime. Because of this, it is not necessary to execute the flash erase or program routines out of RAM on a ColdFire, provided the bootloader never erases the pages where the bootloader code lives...

No Cross Calling/No C Runtime Library

The bootloader cannot call any functions in the application code, because the application code is not guaranteed to be resident in the flash (i.e. the bootloader may be in the process of erasing/reprogramming the flash.)

Also, because our bootloader is in the same image as the application, it may not call any C library functions, which will be in the application's area of flash. Generally with cautious coding this is not a problem, however if you really want/need to use C RTL functions in your code and copying the source for a small number of them into the bootloader module isn't practical, one solution you might consider is making the bootloader a separate compiled image from the application code: this is beyond this article's scope.
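
Where the bootloader needs only a couple of simple library-style routines, local versions are short enough to write by hand. The sketch below shows two such helpers; the sboot_ names are hypothetical and chosen so they cannot be confused with the C library functions that live in the application's area of flash.

```c
#include <stddef.h>

/* Local replacement for memcmp: returns 0 if the first n bytes match,
 * otherwise the difference between the first pair of differing bytes. */
int sboot_memcmp(const void *a, const void *b, size_t n)
{
    const unsigned char *pa = (const unsigned char *)a;
    const unsigned char *pb = (const unsigned char *)b;

    for (size_t i = 0; i < n; ++i) {
        if (pa[i] != pb[i]) {
            return (int)pa[i] - (int)pb[i];
        }
    }
    return 0;
}

/* Local replacement for strlen: counts bytes up to the terminating NUL. */
size_t sboot_strlen(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0') {
        ++n;
    }
    return n;
}
```

Helpers like these could be used, for example, to compare received serial characters against the expected 'secret signal' sequence.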

Moreover, the application must not call bootloader functions either, since the bootloader is never reprogrammed. If you could guarantee that the bootloader would never change over the life of your device and every shipped device, it might technically be safe to call bootloader functions from the application code, but then the bootloader would have to use the same .data segment as the application (we currently have them overlaid). It's really much simpler not to cross call from the application to the bootloader either.

A good way to test that the bootloader doesn't cross call into the application code or its C RTL functions is to erase the entire flash except the bootloader pages, then run the bootloader.
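
The erase loop for such a test might be structured as in the following host-side simulation. The flash array, fake_erase_page and erase_application_area are stand-ins invented for this sketch; the page size, bootloader size and 512K flash size are the figures used in this article, and erased flash is modeled as reading 0xFF.

```c
#include <stdint.h>
#include <string.h>

#define FLASH_PAGE_SIZE 0x1000u      /* 4K logical page */
#define FLASH_SIZE      0x80000u     /* assumes the 512K part */
#define SBOOT_PAGES     2u           /* bootloader occupies the first 8K */
#define TOTAL_PAGES     (FLASH_SIZE / FLASH_PAGE_SIZE)

/* Host-side stand-in for the flash array. */
static uint8_t fake_flash[FLASH_SIZE];

/* Hypothetical page-erase primitive: an erased page reads as all 0xFF. */
static void fake_erase_page(uint32_t page)
{
    memset(&fake_flash[page * FLASH_PAGE_SIZE], 0xFF, FLASH_PAGE_SIZE);
}

/* Erase every page except the bootloader's; returns the number erased. */
uint32_t erase_application_area(void)
{
    uint32_t erased = 0;
    for (uint32_t page = SBOOT_PAGES; page < TOTAL_PAGES; ++page) {
        fake_erase_page(page);
        ++erased;
    }
    return erased;
}
```

With 128 pages in total and 2 reserved for the bootloader, the loop erases 126 pages and leaves everything below 0x2000 untouched.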

Modified Linker Command File

For reference you may download the entire modified ColdFire LCF.

Minimum Hardware Setup

Be aware that when the bootloader starts, the processor is configured in the reset state, so you must configure any hardware you are going to use. In particular, on the ColdFire you will want to set up the CPU and bus clock frequencies as well as the ColdFire Flash Module Clock Divider Register (CFMCLKD). I mention this because if the system clocks or CFMCLKD are improperly configured, it is possible to destroy the ColdFire's flash relatively quickly with erase and program operations. Refer to the ColdFire Flash Module section of the ColdFire Reference Manual for more information.
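
Choosing a divider is a matter of finding a value that puts the divided clock inside the window the flash module requires. The sketch below shows only that arithmetic in generic form: pick_clock_divisor is a hypothetical helper, it does not model the actual register fields, and the frequency window and maximum divisor are parameters that must be taken from the Reference Manual rather than from this example.

```c
#include <stdint.h>

/* Returns the smallest divisor d in [1, max_div] for which
 * min_hz <= input_hz / d <= max_hz, or 0 if no such divisor exists.
 * Comparisons use multiplication in 64 bits so that integer division
 * cannot round an out-of-range frequency into the window. */
uint32_t pick_clock_divisor(uint32_t input_hz, uint32_t min_hz,
                            uint32_t max_hz, uint32_t max_div)
{
    for (uint32_t d = 1; d <= max_div; ++d) {
        if ((uint64_t)max_hz * d < input_hz) {
            continue;               /* still too fast, try a larger divisor */
        }
        if ((uint64_t)min_hz * d <= input_hz) {
            return d;               /* inside the window */
        }
        break;                      /* already too slow; larger d only gets slower */
    }
    return 0;
}
```

A return value of 0 means the requested window cannot be reached with the available divisor range, in which case the bootloader should refuse to erase or program flash at all.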

What if I Release a Mistake?

What happens if you release a bootloader with a mistake in it, and your bootloader is not capable of erasing itself? Well, you could have all customers send the units back to your company to be reprogrammed by BDM or JTAG... but that would certainly make you quite unpopular.

When the bootloader exits, it will execute whatever code the jmp at location 0x2000 points to. It's not a stretch that you could load an application which reprograms the bootloader, run that application, then reboot and use the freshly programmed bootloader to reprogram the application area. Obviously you need to be extremely careful when reprogramming the bootloader, because if the reprogramming cycle is interrupted or the new bootloader you install is faulty, you could wind up with a bricked device.
