RV32I: Memory Access Instructions

Computational instructions operate only on the contents of registers. Memory access instructions exchange 8-, 16-, and 32-bit values ("bytes", "halfwords", "words") between registers and RAM locations.

There are 5 load ("RAM to register") instructions and 3 store ("register to RAM") instructions. They address RAM by byte: the effective address is the value in register rs1 plus a sign-extended 12-bit immediate.
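The address computation can be sketched in C as follows. This is only an illustration: the helper names `sign_extend_imm12` and `effective_address` are made up for this note, and the immediate is assumed to arrive as the raw 12-bit field from the instruction word.

```c
#include <stdint.h>

/* Hypothetical helper: sign-extend the low 12 bits of a field to 32 bits. */
int32_t sign_extend_imm12(uint32_t imm12) {
    imm12 &= 0xFFFu;                          /* keep only imm[11:0] */
    return (int32_t)(imm12 ^ 0x800u) - 0x800; /* flip the sign bit, then subtract its weight */
}

/* Hypothetical helper: effective byte address = rs1 + sext(imm), wrapping modulo 2^32. */
uint32_t effective_address(uint32_t rs1, uint32_t imm12) {
    return rs1 + (uint32_t)sign_extend_imm12(imm12);
}
```

With this model, an immediate field of `0xFFC` means an offset of -4, so the reachable range is rs1 - 2048 to rs1 + 2047.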

2 fence instructions serialize concurrent accesses to RAM from different hardware threads.

Misaligned RAM access is allowed, but can be non-atomic and/or much slower.

Load instructions

instr	description	"C"
lw rd, imm(rs1)	"load word"	`int32_t *p = (int32_t *)(rs1 + (int32_t)imm);` `rd = *p;`
lh rd, imm(rs1)	"load half", sign-extend	`int16_t *p = (int16_t *)(rs1 + (int32_t)imm);` `rd = (int32_t)*p;`
lb rd, imm(rs1)	"load byte", sign-extend	`int8_t *p = (int8_t *)(rs1 + (int32_t)imm);` `rd = (int32_t)*p;`
lhu rd, imm(rs1)	"load half unsigned", zero-extend	`uint16_t *p = (uint16_t *)(rs1 + (int32_t)imm);` `rd = (uint32_t)*p;`
lbu rd, imm(rs1)	"load byte unsigned", zero-extend	`uint8_t *p = (uint8_t *)(rs1 + (int32_t)imm);` `rd = (uint32_t)*p;`
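The difference between the sign-extending and zero-extending loads is easiest to see in a small emulator-style sketch. Everything here is an assumption made for illustration: RAM is modelled as a flat little-endian byte array, `addr` is the already computed effective address, and the `load_*` names are hypothetical.

```c
#include <stdint.h>

/* Assemble nbytes (1, 2 or 4) little-endian bytes starting at addr. */
uint32_t read_le(const uint8_t *ram, uint32_t addr, unsigned nbytes) {
    uint32_t v = 0;
    for (unsigned i = 0; i < nbytes; i++)
        v |= (uint32_t)ram[addr + i] << (8 * i);
    return v;
}

/* Each function returns the 32-bit value that would be written to rd. */
uint32_t load_lw(const uint8_t *ram, uint32_t addr)  { return read_le(ram, addr, 4); }
uint32_t load_lhu(const uint8_t *ram, uint32_t addr) { return read_le(ram, addr, 2); }
uint32_t load_lbu(const uint8_t *ram, uint32_t addr) { return read_le(ram, addr, 1); }

/* Narrowing to a signed type and widening again performs the sign extension
   (this relies on the usual two's complement conversion behaviour). */
uint32_t load_lh(const uint8_t *ram, uint32_t addr) {
    return (uint32_t)(int32_t)(int16_t)read_le(ram, addr, 2);
}
uint32_t load_lb(const uint8_t *ram, uint32_t addr) {
    return (uint32_t)(int32_t)(int8_t)read_le(ram, addr, 1);
}
```

Because the sketch reads byte by byte, it also handles misaligned addresses, which real hardware may handle more slowly or non-atomically.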

Store instructions

instr	description	"C"
sw rs2, imm(rs1)	"store word"	`*(int32_t *)(rs1 + (int32_t)imm) = rs2;`
sh rs2, imm(rs1)	"store half"	`*(int16_t *)(rs1 + (int32_t)imm) = (int16_t)rs2;`
sb rs2, imm(rs1)	"store byte"	`*(int8_t *)(rs1 + (int32_t)imm) = (int8_t)rs2;`
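Stores write only the low 8, 16, or 32 bits of rs2 and leave neighbouring bytes untouched. A minimal sketch, again assuming a flat little-endian byte array as RAM and using hypothetical `store_*` names:

```c
#include <stdint.h>

/* Write the low nbytes (1, 2 or 4) of value to ram[addr...], little-endian. */
void write_le(uint8_t *ram, uint32_t addr, uint32_t value, unsigned nbytes) {
    for (unsigned i = 0; i < nbytes; i++)
        ram[addr + i] = (uint8_t)(value >> (8 * i));
}

/* addr is the effective address, rs2 the value of the source register. */
void store_sw(uint8_t *ram, uint32_t addr, uint32_t rs2) { write_le(ram, addr, rs2, 4); }
void store_sh(uint8_t *ram, uint32_t addr, uint32_t rs2) { write_le(ram, addr, rs2, 2); }
void store_sb(uint8_t *ram, uint32_t addr, uint32_t rs2) { write_le(ram, addr, rs2, 1); }
```

There are no separate signed and unsigned stores because truncation discards the upper bits either way.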

Fence instructions

instr	description
fence pred, succ	an explicit barrier for the specified kinds of concurrent memory accesses
fence.i	an explicit barrier between stores to instruction memory and subsequent instruction fetches on the same hart

When multiple harts (hardware threads, roughly "cores") are present and share the same RAM, it is necessary to control how changes made by one hart are perceived by another.

Some architectures (ahem, x86_64) provide a strong memory model. The strongest, sequential consistency, guarantees that any observed state can be explained by some interleaving of each hart's accesses in program order; x86_64 comes close with total store order, which only lets a store be delayed past later loads. Strong models make it easier to reason about machine code, but can significantly complicate hardware: speculative and out-of-order execution must still maintain an externally visible state that obeys the model.

Since different harts work with different areas of RAM most of the time, RISC-V assumes a relaxed memory model, which requires explicit synchronization when needed.

A fence instruction provides an ordering guarantee between memory accesses before and after the fence. The arguments describe:

the predecessor set: kinds of accesses by prior instructions that must be completed before the fence
the successor set: kinds of accesses by subsequent instructions that must not start before the fence

The kinds of accesses are:

R: "read memory"
W: "write memory"
I: "device input"
O: "device output"

E.g. fence rw, w guarantees that all reads and writes by preceding instructions appear completed before any write by a subsequent instruction becomes visible. Note: reads by subsequent instructions may still happen before this fence.
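In C code these fences are usually reached through C11 atomics rather than written by hand. The sketch below is a hypothetical message-passing pair (`publish`, `try_consume`, `payload` and `ready` are names invented here). It assumes the mapping suggested in the RISC-V specification, under which a release fence lowers to fence rw, w and an acquire fence to fence r, rw; a given compiler may choose differently.

```c
#include <stdatomic.h>
#include <stdint.h>

int32_t payload;   /* ordinary data handed from one hart to another */
atomic_int ready;  /* flag telling the consumer that payload is valid */

/* Producer: write the data, then raise the flag. */
void publish(int32_t value) {
    payload = value;
    /* Assumed lowering: fence rw, w. Earlier reads and writes are ordered
       before the later write to the flag. */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&ready, 1, memory_order_relaxed);
}

/* Consumer: returns 1 and fills *out once the flag has been observed. */
int try_consume(int32_t *out) {
    if (!atomic_load_explicit(&ready, memory_order_relaxed))
        return 0;
    /* Assumed lowering: fence r, rw. The flag read is ordered before
       the later read of payload. */
    atomic_thread_fence(memory_order_acquire);
    *out = payload;
    return 1;
}
```

Without the two fences, a relaxed memory model would allow the consumer to see the flag set and still read a stale payload.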

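fence.i synchronizes data accesses with instruction fetches on the same hart. E.g. after a hart writes instructions to RAM, fence.i guarantees that its subsequent instruction fetches see those stores. It does not by itself make the stores visible to instruction fetches on other harts: each hart that will execute the new code must run its own fence.i after the stores have become visible to it.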

Encoding

Stores are in S-type format:

instr	funct3	opcode
sb	`000`	`01 000 11`
sh	`001`	`01 000 11`
sw	`010`	`01 000 11`
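As an illustration of how these fields combine, the sketch below assembles a store from its parts, assuming the standard S-type layout (imm[11:5] in bits 31:25, rs2 in 24:20, rs1 in 19:15, funct3 in 14:12, imm[4:0] in 11:7, opcode in 6:0). The function name `encode_store` is made up for this note.

```c
#include <stdint.h>

/* Hypothetical encoder for sb/sh/sw: funct3 selects the width (0, 1, 2). */
uint32_t encode_store(uint32_t funct3, uint32_t rs1, uint32_t rs2, int32_t imm) {
    uint32_t uimm = (uint32_t)imm & 0xFFFu;   /* two's complement imm[11:0] */
    return ((uimm >> 5)     << 25)            /* imm[11:5] */
         | ((rs2 & 31u)     << 20)
         | ((rs1 & 31u)     << 15)
         | ((funct3 & 7u)   << 12)
         | ((uimm & 31u)    << 7)             /* imm[4:0] */
         | 0x23u;                             /* opcode 01 000 11 */
}
```

For example, `encode_store(2, 1, 2, 8)` corresponds to sw x2, 8(x1) and yields `0x0020A423`.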

The following instructions are in I-type format:

instr	funct3	opcode
lb	`000`	`00 000 11`
lh	`001`	`00 000 11`
lw	`010`	`00 000 11`
lbu	`100`	`00 000 11`
lhu	`101`	`00 000 11`
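A load can be assembled the same way, assuming the standard I-type layout (imm[11:0] in bits 31:20, rs1 in 19:15, funct3 in 14:12, rd in 11:7, opcode in 6:0). The name `encode_load` is hypothetical.

```c
#include <stdint.h>

/* Hypothetical encoder for lb/lh/lw/lbu/lhu: funct3 comes from the table above. */
uint32_t encode_load(uint32_t funct3, uint32_t rd, uint32_t rs1, int32_t imm) {
    return (((uint32_t)imm & 0xFFFu) << 20)   /* two's complement imm[11:0] */
         | ((rs1 & 31u)              << 15)
         | ((funct3 & 7u)            << 12)
         | ((rd & 31u)               << 7)
         | 0x03u;                             /* opcode 00 000 11 */
}
```

For example, `encode_load(2, 5, 2, -4)` corresponds to lw x5, -4(x2) and yields `0xFFC12283`.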

instr	imm[11:0]	rs1	funct3	rd	opcode
fence	`0000` pred succ	`00000`	`000`	`00000`	`00 011 11`
fence.i	`0000 0000 0000`	`00000`	`001`	`00000`	`00 011 11`

The least significant byte of an encoded instruction is:

03 or 83 for loads (bit 7 is the low bit of rd)
23 or A3 for stores (bit 7 is imm[0])
0F for fences

TODO: clarify encoding of pred/succ masks.
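As a tentative answer to the TODO above: in the base ISA encoding, pred and succ are 4-bit masks with bits ordered I, O, R, W from most to least significant, with pred in imm[7:4] and succ in imm[3:0]. The sketch below encodes a plain fence on that basis; the `FENCE_*` constants and `encode_fence` are names invented for this note, and the upper four immediate bits are simply left at `0000` as in the table.

```c
#include <stdint.h>

/* Hypothetical mask bits for one 4-bit access set (I O R W). */
enum { FENCE_W = 1, FENCE_R = 2, FENCE_O = 4, FENCE_I = 8 };

/* Hypothetical encoder: rs1 = rd = 0, funct3 = 000, opcode = 00 011 11. */
uint32_t encode_fence(uint32_t pred, uint32_t succ) {
    uint32_t imm = ((pred & 0xFu) << 4) | (succ & 0xFu); /* imm[11:8] stays 0000 */
    return (imm << 20) | 0x0Fu;
}
```

Under these assumptions, fence rw, w is `encode_fence(FENCE_R | FENCE_W, FENCE_W)`, which gives `0x0310000F`, and a full fence iorw, iorw gives `0x0FF0000F`.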
