Hand-crafting an ELF file

This week I suddenly got a burning desire to write some x86 assembly code and have it run on my computer, in as minimal an environment as possible. I decided that it wouldn't be too much to ask my computer to run the code eb fe (that's two bytes of code, written in hexadecimal), which is just an infinite loop, the equivalent of label: goto label. No input, no output, not even an exit, just a single observable side effect: running forever, or at least until Ctrl-C.
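For the curious, here is a small sketch of why those two bytes loop forever. On x86, eb is the opcode of a short jump whose one-byte operand is a signed displacement counted from the end of the instruction; fe is -2, so the jump lands back on itself. The function name short_jump_target and the address 0x1000 are made up for illustration.

```python
def short_jump_target(code: bytes, address: int) -> int:
    """Return the address a two-byte x86 short jump (opcode eb) lands on."""
    opcode, operand = code
    assert opcode == 0xEB, "not a short jmp"
    # The operand is a signed 8-bit displacement...
    displacement = int.from_bytes(bytes([operand]), "little", signed=True)
    # ...measured from the address of the *next* instruction.
    return address + len(code) + displacement


loop = bytes([0xEB, 0xFE])
# 0x1000 is an arbitrary example address; any address gives the same answer.
print(short_jump_target(loop, 0x1000) == 0x1000)  # prints: True
```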

As a first step, I installed the excellent QEMU project and spent a day trying to work out which command-line options would disable most of its bells and whistles but still load a file with some binary code. Unsuccessfully.

The next day, I vaguely remembered having read an excellent blog post about creating tiny ELF files, so I decided that sticking my two bytes of assembly code in a valid ELF file might be easier. The smart thing would have been to immediately Google for this blog post and read it, tweaking as necessary, but instead I tried to look for specs and found:

Eventually I ended up with a file called make-loopy:

#!/usr/bin/python3

s = """
7f 45 4c 46 02 01 01 00
00 00 00 00 00 00 00 00
02 00 3e 00 01 00 00 00
78 00 01 00 00 00 00 00
40 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00
00 00 00 00 40 00 38 00
01 00 40 00 00 00 00 00

01 00 00 00 05 00 00 00
78 00 00 00 00 00 00 00
78 00 01 00 00 00 00 00
00 00 00 00 00 00 00 00
02 00 00 00 00 00 00 00
02 00 00 00 00 00 00 00
00 10 00 00 00 00 00 00

eb fe
"""

open('loopy', 'wb').write(bytes([int(b, 16) for b in s.split()]))

It's just a Python script that writes a bunch of hex bytes to a file called loopy (which needs a one-time chmod +x loopy before it can be run directly), and the glorious result is:

$ ./make-loopy && ./loopy
...wait for however long you like...
^C
$ readelf --file-header --program-headers loopy
ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              EXEC (Executable file)
  Machine:                           Advanced Micro Devices X86-64
  Version:                           0x1
  Entry point address:               0x10078
  Start of program headers:          64 (bytes into file)
  Start of section headers:          0 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           56 (bytes)
  Number of program headers:         1
  Size of section headers:           64 (bytes)
  Number of section headers:         0
  Section header string table index: 0

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  LOAD           0x0000000000000078 0x0000000000010078 0x0000000000000000
                 0x0000000000000002 0x0000000000000002  R E    1000

It works! It's a valid ELF file! Success! And it only took 120 well-crafted bytes of overhead.
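The 120 bytes are the two fixed-size headers back to back: a 64-byte ELF header followed by a single 56-byte program header, which is also why the code sits at file offset 0x78. One way to check the arithmetic is with Python's struct module. The format strings below are my transcription of the ELF64 layouts, not something taken from the script above.

```python
import struct

# ELF64 file header: 16 identification bytes, then type, machine, version,
# entry point, program/section header offsets, flags, and six 16-bit
# size/count fields.
ELF_HEADER = "<16sHHIQQQIHHHHHH"
# ELF64 program header: type, flags, offset, vaddr, paddr, filesz, memsz, align.
PROGRAM_HEADER = "<IIQQQQQQ"

overhead = struct.calcsize(ELF_HEADER) + struct.calcsize(PROGRAM_HEADER)
print(struct.calcsize(ELF_HEADER), struct.calcsize(PROGRAM_HEADER), hex(overhead))
# prints: 64 56 0x78
```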

gdb isn't always helpful

As a side note, my first several attempts at this resulted in

$ ./make-loopy && ./loopy
Segmentation fault

and

$ ./make-loopy && ./loopy
Segmentation fault

and more

$ ./make-loopy && ./loopy
Segmentation fault

but sometimes

$ ./make-loopy && ./loopy
Segmentation fault (core dumped)

instead, before returning some more

$ ./make-loopy && ./loopy
Segmentation fault

so at some point in the process I decided to try using gdb instead of just poking at the hex values semi-randomly. And while gdb is sometimes very helpful in telling you exactly what's going on and letting you prod at your code and memory, in this case all I got was:

$ ./make-loopy && gdb ./loopy
Reading symbols from ./loopy...(no debugging symbols found)...done.
(gdb) run
Starting program: ./loopy
During startup program terminated with signal SIGSEGV, Segmentation fault.
(gdb) info registers
The program has no registers now.
(gdb) x 0x10000
0x10000: Cannot access memory at address 0x10000

Unfortunately, the executable was causing segfaults while being loaded, so there was no program state for gdb to tell me about. À l'impossible nul n'est tenu (no one is bound to do the impossible), so I can't really blame gdb.
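When the debugger has nothing to say, one alternative is to check the header fields against each other before running the file. The sketch below is a hypothetical helper (check_segment is my name, not a real tool) that applies a few consistency rules to a single-segment ELF64 image; the rule that vaddr and offset agree modulo the alignment comes from the ELF specification, the rest are plain sanity checks. I don't know which mistakes caused the segfaults above, so treat this as a list of things worth ruling out rather than a diagnosis.

```python
import struct


def check_segment(image: bytes, page_size: int = 0x1000) -> list:
    """Return human-readable problems found in a single-segment ELF64 image."""
    if image[:4] != b"\x7fELF":
        return ["missing ELF magic"]
    # e_entry and e_phoff live at offsets 0x18 and 0x20, e_phentsize at 0x36.
    entry, phoff = struct.unpack_from("<QQ", image, 0x18)
    (phentsize,) = struct.unpack_from("<H", image, 0x36)
    p_type, flags, offset, vaddr, paddr, filesz, memsz, align = struct.unpack_from(
        "<IIQQQQQQ", image, phoff)

    problems = []
    if phentsize != 56:
        problems.append("program header entry size is not 56")
    if p_type != 1:
        problems.append("first segment is not PT_LOAD")
    if offset + filesz > len(image):
        problems.append("segment extends past the end of the file")
    if memsz < filesz:
        problems.append("memory size smaller than file size")
    if (vaddr - offset) % page_size != 0:
        problems.append("vaddr and offset differ modulo the page size")
    if not vaddr <= entry < vaddr + memsz:
        problems.append("entry point lies outside the segment")
    if not flags & 1:
        problems.append("segment is not executable")
    return problems


# The working bytes from make-loopy, parsed the same way the script does.
s = """
7f 45 4c 46 02 01 01 00  00 00 00 00 00 00 00 00
02 00 3e 00 01 00 00 00  78 00 01 00 00 00 00 00
40 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 40 00 38 00  01 00 40 00 00 00 00 00
01 00 00 00 05 00 00 00  78 00 00 00 00 00 00 00
78 00 01 00 00 00 00 00  00 00 00 00 00 00 00 00
02 00 00 00 00 00 00 00  02 00 00 00 00 00 00 00
00 10 00 00 00 00 00 00  eb fe
"""
loopy = bytes(int(b, 16) for b in s.split())
print(check_segment(loopy))  # prints: []
```

Changing a single byte, for example the low byte of the virtual address at file offset 0x50, is enough to make the list non-empty.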

An annotated version

If you're curious about the details, the following version of make-loopy is probably more informative and tweakable. You too could get your computer to run some lovingly-crafted bytes!

#!/usr/bin/python3.6

magic_bytes  = "7f 45 4c 46" # "This is an ELF file"
elf_class    = "02"          # 64-bit architecture
byte_order   = "01"          # little-endian
file_type    = "02 00"       # an executable
architecture = "3e 00"       # x86-64

# the memory address where execution starts
entry_point = "78 00 01 00 00 00 00 00"

# mark the program's memory as executable and readable
# (for writable as well, use "07 00 00 00")
code_segment_flags = "05 00 00 00"

# the file offset where the code starts
# (right after the headers)
code_offset = "78 00 00 00 00 00 00 00"

# there are two bytes of code
code_size = "02 00 00 00 00 00 00 00"

# 0x1000 = 4096 bytes; this value works on my machine
alignment = "00 10 00 00 00 00 00 00"

s = f"""
{magic_bytes} {elf_class} {byte_order} 01 00
00 00 00 00 00 00 00 00
{file_type} {architecture} 01 00 00 00
{entry_point}
40 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00
00 00 00 00 40 00 38 00
01 00 40 00 00 00 00 00

01 00 00 00 {code_segment_flags}
{code_offset}
{entry_point}
00 00 00 00 00 00 00 00
{code_size}
{code_size}
{alignment}

eb fe
"""

open('loopy', 'wb').write(bytes([int(b, 16) for b in s.split()]))
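As a side sketch (not the script above, just another way to get the same bytes), the headers can also be packed with Python's struct module, which gives every remaining field a name. The field names in the comments are the ones the ELF specification uses; the constant names ENTRY, CODE, PF_X, PF_W and PF_R are my own choices.

```python
import struct

ENTRY = 0x10078              # same entry point as in make-loopy
CODE = bytes([0xEB, 0xFE])   # the two-byte infinite loop
PF_X, PF_W, PF_R = 1, 2, 4   # segment permission bits

elf_header = struct.pack(
    "<16sHHIQQQIHHHHHH",
    b"\x7fELF\x02\x01\x01",  # magic, 64-bit, little-endian, version 1 (zero-padded to 16)
    2,       # e_type: executable
    0x3E,    # e_machine: x86-64
    1,       # e_version
    ENTRY,   # e_entry
    64,      # e_phoff: the program header follows this header
    0,       # e_shoff: no section headers
    0,       # e_flags
    64,      # e_ehsize
    56,      # e_phentsize
    1,       # e_phnum
    64,      # e_shentsize
    0,       # e_shnum
    0,       # e_shstrndx
)
program_header = struct.pack(
    "<IIQQQQQQ",
    1,             # p_type: PT_LOAD
    PF_R | PF_X,   # p_flags: 5, readable and executable
    120,           # p_offset: code starts right after the 64 + 56 header bytes
    ENTRY,         # p_vaddr
    0,             # p_paddr
    len(CODE),     # p_filesz
    len(CODE),     # p_memsz
    0x1000,        # p_align
)
image = elf_header + program_header + CODE
print(len(image))  # prints: 122
```

Writing image to a file in binary mode should give a file identical to the one the hex version produces.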

I only marked a few of the ELF fields, but do look up the Wikipedia page if you're curious about the other values in there. Some points: