Services and Modeling for Embedded Software Development

5.4. Disassembler

The archDisassembler extends the MCDisassembler class and is centered around the getInstruction function. This function reads from a memory region and decodes one instruction along with its operands, storing the result in the provided MCInst.


Information about the MCDisassembler class can be found in LLVM's documentation at

A support function can be defined to read data from the memory object and format it in a way that the decoder can use. In the case of OpenRISC 1000, this function reads 4 bytes (the size of an instruction) and formats them as an unsigned big-endian 32-bit word.

Should a processor support both big-endian and little-endian instructions, or variable-length instructions, this function would instead be written to read a variable number of bytes or to build a word that matches the target's endianness.
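As an illustration of the endianness point above, the following is a minimal sketch of a byte-combining helper that handles either byte order and a variable byte count. The name assembleWord and its parameters are hypothetical and are not part of LLVM; a real target would fold this logic into its own read function.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper, not part of LLVM: combine `count` bytes (1 to 4)
// into one 32-bit word, honouring the requested byte order.
uint32_t assembleWord(const uint8_t *bytes, std::size_t count,
                      bool bigEndian) {
  uint32_t word = 0;
  for (std::size_t i = 0; i < count; ++i) {
    // Big-endian: the first byte in memory is the most significant.
    // Little-endian: the first byte in memory is the least significant.
    std::size_t shift = bigEndian ? 8 * (count - 1 - i) : 8 * i;
    word |= static_cast<uint32_t>(bytes[i]) << shift;
  }
  return word;
}
```

Casting each byte to uint32_t before shifting keeps the arithmetic unsigned, which avoids shifting a set bit into the sign position of a promoted int.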

Note that the function returns Fail if the memory cannot be read, since a successful read is required before disassembly can proceed.

static DecodeStatus readInstruction32(const MemoryObject &region,
                                      uint64_t address,
                                      uint64_t &size,
                                      uint32_t &insn) {
  uint8_t Bytes[4];

  // We want to read exactly 4 bytes of data.
  if (region.readBytes(address, 4, (uint8_t*)Bytes, NULL) == -1) {
    size = 0;
    return MCDisassembler::Fail;
  }

  // Encoded as big-endian 32-bit word in the stream.
  insn = (Bytes[0] << 24) |
         (Bytes[1] << 16) |
         (Bytes[2] <<  8) |
         (Bytes[3] <<  0);

  return MCDisassembler::Success;
}

The getInstruction function should first call the above function to read the instruction word from memory. If the read succeeds, the word is passed to the TableGen-generated function decodearchInstructionSize (where arch is the target name and Size is the instruction width in bits), which does the decoding.

The decoder returns Success if the instruction was successfully decoded; otherwise it returns Fail. The Size parameter provided to getInstruction is set to the size of the instruction that was successfully decoded.

In the case of OpenRISC 1000, only 32-bit instructions are supported, so a valid decode will always set this value to 4.
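The read-then-decode flow described above can be sketched in a self-contained form. Everything here is a stand-in for illustration: SketchInst, SketchMemory, readWord32, decodeSketch32 and getInstructionSketch are hypothetical names, not LLVM classes or functions, and the decoding rule (top 6 bits as the opcode, with 0x3f arbitrarily rejected) is invented so that both outcomes can be exercised.

```cpp
#include <cstdint>
#include <vector>

// Stand-ins for the LLVM types named in the text; illustrative only.
enum DecodeStatus { Fail, Success };

struct SketchInst {            // plays the role of MCInst
  uint32_t opcode = 0;
};

struct SketchMemory {          // plays the role of the memory region
  std::vector<uint8_t> bytes;
};

// Read one big-endian 32-bit word, failing if fewer than 4 bytes remain.
DecodeStatus readWord32(const SketchMemory &mem, uint64_t address,
                        uint64_t &size, uint32_t &insn) {
  if (address > mem.bytes.size() || mem.bytes.size() - address < 4) {
    size = 0;
    return Fail;
  }
  insn = 0;
  for (int i = 0; i < 4; ++i)
    insn = (insn << 8) | mem.bytes[address + i];
  return Success;
}

// Hypothetical decoder standing in for the TableGen-generated function.
// Assumption: the top 6 bits are the opcode, and 0x3f is treated as invalid.
DecodeStatus decodeSketch32(SketchInst &inst, uint32_t insn) {
  uint32_t opcode = insn >> 26;
  if (opcode == 0x3f)
    return Fail;
  inst.opcode = opcode;
  return Success;
}

// Read the word, decode it, and report the size only on success.
DecodeStatus getInstructionSketch(SketchInst &inst, uint64_t &size,
                                  const SketchMemory &mem, uint64_t address) {
  uint32_t insn = 0;
  if (readWord32(mem, address, size, insn) == Fail)
    return Fail;               // readWord32 has already set size to 0
  if (decodeSketch32(inst, insn) == Fail)
    return Fail;               // size is left untouched in this sketch
  size = 4;                    // fixed-width 32-bit instructions
  return Success;
}
```

In this sketch the size is written only after a successful decode, so a caller can rely on it being 4 whenever Success is returned.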
