Linux "Hello world" in RISC-V GNU assembly

Let's write the smallest possible RISC-V Linux program that:

  • outputs "Hello world" to the standard output
  • exits successfully

Cross-compilation and RISCV-emulation packages on Ubuntu

I'm using a x86_64 machine with Ubuntu 22.04 and a RISC-V GCC toolchain from its repositories, gcc-riscv64-unknown-elf.

In order to get Linux-specific APIs for RISC-V, we'll also use package linux-libc-dev-riscv64-cross that provides RISC-V specific C headers in /usr/riscv64-linux-gnu/include.

Package qemu-user allows to run RV64 binaries in a software-emulated RV64 Linux environment.

Documentation and references:

The assembly code

First of all, the program must exit successfully. On Linux, this is done via the exit system call: https://linux.die.net/man/2/exit

How do we write RISC-V assembly to actually call it?

According to man 2 syscalls, the actual syscall numbers for the host instruction set architecture can be found in /usr/include/asm/unistd.h as __NR_xxx constants (e.g. __NR_exit for the exit syscall).

This RISC-V cross-compilation toolchain defines the actual number in /usr/riscv64-linux-gnu/include/asm-generic/unistd.h as __NR_exit.

According to man 2 syscall, the RISC-V way of making Linux system calls looks like this:

  • put the syscall number into a7: li a7, __NR_exit
  • put the syscall arguments into a0, a1, ..., a5
  • ecall performs the system call
  • returned values can be found in a0, a1

Therefore, _exit(0) in C translates to:

    li  a7, __NR_exit
    li  a0, 0
    ecall

man 2 write describes the syscall arguments:

#define STDOUT_FILENO   1

.text
    # write(STDOUT_FILENO, greeting, greetlen):
    li  a7, __NR_write
    li  a0, STDOUT_FILENO       # `int fd`
    la  a1, greeting            # `const void *buf`
    li  a2, greetlen            # `size_t count`
    ecall

Symbols greeting and greetlen are defined in section .rodata:

.section .rodata

greeting: .asciz "Hello world\n"
.equ greetlen, . - greeting

The complete assembly code in hello.S (capital .S means "assembly source, preprocessed"):

#include <asm-generic/unistd.h>

#define STDOUT_FILENO   1

.section .rodata                    # a section for read-only data

greeting: .asciz "Hello world\n"    # const char *greeting = "Hello world\n";
.equ greetlen, . - greeting         # const size_t greetlen = sizeof greeting;

.text                               # a section for executable code
.globl _start                       # export the program entrypoint symbol for the linker
_start:                             # linkers use `_start` as the default entrypoint
    li  a7, __NR_write
    li  a0, STDOUT_FILENO
    la  a1, greeting
    li  a2, greetlen
    ecall                           # write(STDOUT_FILENO, greeting, greetlen)

    li  a7, __NR_exit
    li  a0, 0
    ecall                           # _exit(0)
1:  j 1b                            # hang, in the very unlikely case `exit` failed

Compilation

By default, riscv64-unknown-elf-gcc tries to link start code from some crt0.o to enable libc functionality. Our program does not need libc, so let's add -nostdlib to LDFLAGS.

To be able to include asm-generic/unistd.h from /usr/riscv64-linux-gnu/include, adjust ASFLAGS to include files from there: -I /usr/riscv64-linux-gnu/include.

The Makefile:

CC = riscv64-unknown-elf-gcc
ASFLAGS += -I /usr/riscv64-linux-gnu/include
LDFLAGS += -nostdlib

# GNU make has an implicit rule for %: %.S which is roughly
#   $(CC) $(ASFLAGS) $(LDFLAGS) $< -o $@
hello: hello.S

# find and clean all the executables here
clean:
    -find -executable -type f -delete

# `clean` is not a file!
.PHONY: clean

Running make produces a 1208-byte ELF executable that runs via qemu-riscv64 and outputs "Hello world" (or just run it directly if qemu-user-binfmt is installed):

$ make hello
riscv64-unknown-elf-gcc -I /usr/riscv64-linux-gnu/include -nostdlib hello.S -o hello

$ stat hello
  File: hello
  Size: 1208            Blocks: 8          IO Block: 4096   regular file
  ...

$ qemu-riscv64 hello
Hello world