Pydrofoil-RISC-V’s Scripting API
This is still a very experimental feature; we don’t guarantee API stability at this point.
In order to make it easier to interact with a Sail ISA model, we are working on a scripting API using the PyPy Python interpreter. So far this is only enabled for the RISC-V models (but it could in principle be added for the other CPUs with a small amount of effort).
Introduction to the API
Basics of running the simulated CPU
The binary is a normal PyPy executable, but it comes with a new special
built-in module _pydrofoil. The module exposes two classes, RISCV32 and
RISCV64, that represent simulated RISC-V CPUs. Their constructors take a path
to an ELF file as argument, which is loaded into simulated main memory:
>>>> m = _pydrofoil.RISCV64('riscv/input/rv64-linux-4.15.0-gcc-7.2.0-64mb.bbl', dtb=True)
tohost located at 0x80007008
ELF Entry @ 0x80000000
CSR mstatus <- 0x0000000A00000000 (input: 0x0000000000000000)
The .step() method will execute the loaded program for a single instruction:
>>>> m.step()
mem[X,0x0000000000001000] -> 0x0297
mem[X,0x0000000000001002] -> 0x0000
[0] [M]: 0x0000000000001000 (0x00000297) auipc t0, 0x0
x5 <- 0x0000000000001000
>>>> m.step()
mem[X,0x0000000000001004] -> 0x8593
mem[X,0x0000000000001006] -> 0x0202
[1] [M]: 0x0000000000001004 (0x02028593) addi a1, t0, 0x20
x11 <- 0x0000000000001020
>>>> m.step()
mem[X,0x0000000000001008] -> 0x2573
mem[X,0x000000000000100A] -> 0xF140
[2] [M]: 0x0000000000001008 (0xF1402573) csrrs a0, mhartid, zero
CSR mhartid -> 0x0000000000000000
x10 <- 0x0000000000000000
By default, every instruction prints a trace of the memory and registers it
reads and writes, and disassembles the current instruction. For longer-running
programs this is too much output and hides the output of the executed program.
To disable it, you can call the .set_verbosity method:
>>>> m.set_verbosity(0)
>>>> m.step()
>>>> m.step()
>>>> m.step()
To run the program for longer periods, use the .run method, which optionally
takes a number of instructions as argument:
>>>> m.run(500000)
bbl loader
SIFIVE, INC.
5555555555555555555555555
5555 5555
5555 5555
5555 5555
5555 5555555555555555555555
5555 555555555555555555555555
5555 5555
5555 5555
5555 5555
5555555555555555555555555555 55555
55555 555555555 55555
55555 55555 55555
55555 5 55555
55555 55555
55555 55555
55555 55555
55555 55555
55555 55555
555555555
55555
5
SiFive RISC-V Core IP
SUCCESS
Instructions: 500000
Total time (s): 0.033948
Perf: 14728.432171 Kips
If you call .run() without arguments, the CPU will keep executing
indefinitely. You can interrupt it at any point by pressing Ctrl-C:
>>>> m.run()
[ 0.000000] OF: fdt: Ignoring memory range 0x80000000 - 0x80200000
[ 0.000000] Linux version 4.15.0-gfe92d79-dirty (mundkur@dualnic2) (gcc version 7.2.0 (GCC)) #1 SMP Wed Jun 5 14:56:25 PDT 2019
[ 0.000000] bootconsole [early0] enabled
... lots of output skipped
...
[ 0.050569] Freeing unused kernel memory: 4528K
[ 0.050583] This architecture does not have kernel memory protection.
Starting logging: OK
Starting mdev...
modprobe: can't change directory to '/lib/modules': No such file or directory
Initializing random number generator... done.
Starting network...
ip: socket: Function not implemented
ip: socket: Function not implemented
Welcome to Buildroot
buildroot login: ^CCTRL-C was pressed
Instructions: 852835752
Total time (s): 48.992306
Perf: 17407.544607 Kips
Traceback (most recent call last):
File "<python-input-21>", line 1, in <module>
m.run()
KeyboardInterrupt
>>>>
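The Perf line printed after each run appears to be the instruction count divided by the elapsed wall-clock time, expressed in thousands of instructions per second (Kips). A minimal sketch in plain Python that recomputes it from the two runs shown above; the helper name kips is ours and not part of _pydrofoil:

```python
def kips(instructions, seconds):
    """Thousands of simulated instructions per second of wall-clock time."""
    return instructions / seconds / 1000.0

# Figures copied from the "Instructions" and "Total time" lines of the two runs above.
short_run = kips(500000, 0.033948)
linux_boot = kips(852835752, 48.992306)
print(f"{short_run:.1f} Kips, {linux_boot:.1f} Kips")
```

The results agree with the printed Perf values up to the rounding of the printed time.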
Inspecting the CPU state
There are a number of methods to read (and also write) the current state of the simulated CPU. A useful high-level way to do that is to ask for the disassembly of the last executed instruction:
>>>> m.disassemble_last_instruction()
'addi a3, s1, 0x128'
You can also read the CPU’s registers (the register names are the internal names that the Sail code uses):
>>>> m.read_register('pc')
bitvector(64, 0xFFFFFFE0004AFC4A)
>>>> m.read_register('x1')
bitvector(64, 0xFFFFFFE0004AFC4A)
>>>> m.read_register('cur_privilege')
'Supervisor'
To write them, use the .write_register method. For example, if you set the pc
to 0, executing the next instruction will trigger a page fault:
>>>> m.set_verbosity(1)
>>>> m.write_register('pc', 0)
>>>> m.step()
mem[R,0x0000000083A58000] -> 0x0000000020E96C01
mem[R,0x0000000083A5B000] -> 0x0000000020E97801
mem[R,0x0000000083A5E000] -> 0x0000000000000000
trapping from S to S to handle fetch-page-fault
handling exc#0x0C at priv S with tval 0x0000000000000000
CSR mstatus <- 0x0000000A00000180
Similarly, you can read from the simulated main memory:
>>>> m.read_memory(0x80000000, 8)
3747295551104745583
You can also write to it with .write_memory.
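The integer returned by read_memory above is consistent with the eight bytes being interpreted in little-endian order: its lowest seven bits are 0b1101111, the opcode of jal. The following sketch is plain Python, does not need _pydrofoil, and assumes that little-endian interpretation. It splits the value into bytes and into two 32-bit instruction words:

```python
# Value returned by m.read_memory(0x80000000, 8) in the session above.
value = 3747295551104745583

# Assumption: the integer is the little-endian interpretation of the 8 bytes.
raw = value.to_bytes(8, "little")

# Split into two 32-bit words; the word at the lower address comes first.
words = [int.from_bytes(raw[i:i + 4], "little") for i in range(0, 8, 4)]

print(raw.hex())
print([f"0x{w:08X}" for w in words])
```

The words could then be wrapped in 32-bit bitvectors and decoded with the low-level functions described in the next section.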
Accessing the low-level functions of the Sail model directly
In addition to the “high-level” API described so far, we can also call the
functions of the Sail model directly, via the RISCV64.lowlevel namespace. All
the Sail functions that have been compiled into the simulator binary are
exposed there. For example, we can read a register, including the
zero register, using the rX model function:
>>>> m.lowlevel.rX(0)
bitvector(64, 0x0000000000000000)
Or we can call the encdec_compressed_backwards function like this, to decode
the 16-bit bitvector 1, and then use the assembly_forwards function to get the
disassembled string for the operation:
>>>> ast = m.lowlevel.encdec_compressed_backwards(_pydrofoil.bitvector(16, 1))
>>>> print(ast)
C_NOP()
>>>> m.lowlevel.assembly_forwards(ast)
'c.nop'
Unfortunately, not all functions of the Sail model are exposed that way at this point. Only the functions that are needed for the simulator and that aren’t inlined can be accessed from Python at the moment. We plan to fix this in the future.
Creating struct and union instances directly
There is another low-level namespace, RISCV64.types, that can be used to
construct union and struct types directly. For example, we can construct the
AST of a MUL instruction like this:
>>>> ast = m.types.MUL(1, 2, 8, m.types.mul_op(False, True, True))
>>>> print(ast)
MUL(bitvector(5, 0b00001), bitvector(5, 0b00010), bitvector(5, 0b01000), mul_op(False, True, True))
>>>> m.lowlevel.assembly_forwards(ast)
'mul fp, sp, ra'
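In the output above, the register numbers 1, 2 and 8 are printed with their ABI names ra, sp and fp; judging from the disassembly, the destination register is the third constructor argument. The sketch below reproduces the number-to-name mapping of the standard RISC-V calling convention in plain Python. The ABI_NAMES table and the abi_name helper are our own illustration and not part of _pydrofoil. Register x8 has two standard names, s0 and fp; the table uses fp to match the disassembly above.

```python
# ABI names of the 32 integer registers x0..x31.
ABI_NAMES = (
    ["zero", "ra", "sp", "gp", "tp", "t0", "t1", "t2", "fp", "s1"]  # x0..x9
    + [f"a{i}" for i in range(8)]       # x10..x17
    + [f"s{i}" for i in range(2, 12)]   # x18..x27
    + [f"t{i}" for i in range(3, 7)]    # x28..x31
)

def abi_name(regnum):
    """Return the ABI name of integer register x<regnum>."""
    if not 0 <= regnum < 32:
        raise ValueError(f"invalid register number: {regnum}")
    return ABI_NAMES[regnum]

# The three register arguments used above, in the order they are disassembled.
print([abi_name(n) for n in (8, 2, 1)])
```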
Examples
In this section we will show some example scripts and use cases for the scripting API.
A simple PC-value profiler
To collect some simple statistics about how often the pc register takes on
each value during Linux boot, and about the instructions executed, we can use
the following script:
import _pydrofoil
import time
from collections import Counter

cpu = _pydrofoil.RISCV64('riscv/input/rv64-linux-4.15.0-gcc-7.2.0-64mb.bbl', dtb=True)
cpu.set_verbosity(0)  # suppress the per-instruction trace
instructions = 23_000_000
histogram_pcs = Counter()
histogram_instructions = Counter()
histogram_mnemonic = Counter()
t1 = time.time()
try:
    for i in range(instructions):
        cpu.step()
        # The bitvector is converted to a string before being used as a key.
        histogram_pcs[str(cpu.read_register("pc"))] += 1
        dis = cpu.disassemble_last_instruction()
        histogram_instructions[dis] += 1
        histogram_mnemonic[dis.split()[0]] += 1
except KeyboardInterrupt:
    pass  # Ctrl-C stops the run early; the statistics are still printed
t2 = time.time()
# kips: thousands of instructions per second
print(f"instructions: {i+1}, kips: {round((i+1)/(t2-t1)/1000,2)}")
print()
for mnemonic, count in histogram_mnemonic.most_common(20):
    print(mnemonic, count)
print()
for instruction, count in histogram_instructions.most_common(20):
    print(instruction, count)
print()
for pc, count in histogram_pcs.most_common(20):
    print(pc, count)
After every instruction, we read the value of the pc register and store it in
a collections.Counter. We also store the full disassembled instruction, and
just the mnemonic of the instruction, in two other counters. At the end (or when
Ctrl-C is pressed), we print the top 20 most common entries of the three
Counter objects:
$ ./pypy-c-pydrofoil-riscv simpleprofiler.py
tohost located at 0x80007008
ELF Entry @ 0x80000000
CSR mstatus <- 0x0000000A00000000 (input: 0x0000000000000000)
bbl loader
SIFIVE, INC.
5555555555555555555555555
...
instructions: 23000000, kips: 4194.8
c.addi 3043168
c.sdsp 1372473
c.ldsp 1370792
c.lw 1197915
bltu 1162016
sw 1088575
c.mv 1059511
sd 1036455
addi 812684
ld 700569
c.add 557114
c.ld 529161
c.li 474221
c.bnez 427773
c.jr 405676
c.beqz 382647
beq 368920
bne 364287
lbu 351685
andi 329342
c.lw a4, a1, 0x0 959796
c.addi a1, 0x4 956499
c.addi t6, 0x4 956497
sw a4, 0x0(t6) 956495
bltu a1, a3, -0xa 956495
c.jr ra 380815
c.addi a1, 0x1 147848
c.addi sp, -0x10 144001
c.addi sp, 0x10 143980
c.addi4spn s0, 0x10 137044
c.addi a4, 0x1 136720
c.addi a0, 0x1 109330
c.sdsp fp, 0x1 94235
c.ldsp fp, 0x1 94235
c.mv s1, a0 87493
bltu a1, a3, -0xc 82017
c.addi t6, 0x1 82016
lb a4, 0x0(a1) 82015
sb a4, 0x0(t6) 82015
c.addiw a5, 0x0 79210
bitvector(64, 0xFFFFFFE00063A070) 956495
bitvector(64, 0xFFFFFFE00063A072) 956495
bitvector(64, 0xFFFFFFE00063A074) 956495
bitvector(64, 0xFFFFFFE00063A078) 956495
bitvector(64, 0xFFFFFFE00063A07A) 956495
bitvector(64, 0xFFFFFFE00063A080) 82015
bitvector(64, 0xFFFFFFE00063A084) 82015
bitvector(64, 0xFFFFFFE00063A086) 82015
bitvector(64, 0xFFFFFFE00063A08A) 82015
bitvector(64, 0xFFFFFFE00063A08C) 82015
bitvector(64, 0x000000008020229C) 65536
bitvector(64, 0x00000000802022A0) 65536
bitvector(64, 0x00000000802022A2) 65536
bitvector(64, 0x00000000802022A4) 65536
bitvector(64, 0x00000000802022A6) 65536
bitvector(64, 0x00000000802022AA) 65536
bitvector(64, 0x00000000802022AC) 65536
bitvector(64, 0x00000000802022AE) 65536
bitvector(64, 0x00000000802022B0) 65536
bitvector(64, 0xFFFFFFE0005BE198) 48389
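Because the script stores str(cpu.read_register("pc")) as the Counter key, the PC histogram is keyed by strings such as bitvector(64, 0xFFFFFFE00063A070). For further processing (sorting by address, grouping nearby addresses) it can be handy to turn those strings back into integers. A minimal sketch in plain Python, assuming the string format is exactly the one shown in the output above; parse_bitvector is a hypothetical helper and not part of _pydrofoil:

```python
import re

# Matches e.g. "bitvector(64, 0xFFFFFFE00063A070)" or "bitvector(5, 0b00001)".
_BITVECTOR_RE = re.compile(r"bitvector\((\d+), (0x[0-9A-Fa-f]+|0b[01]+)\)")

def parse_bitvector(text):
    """Return (width, value) for a printed bitvector string."""
    match = _BITVECTOR_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a bitvector string: {text!r}")
    width = int(match.group(1))
    value = int(match.group(2), 0)  # base 0 honours the 0x / 0b prefix
    return width, value

width, pc = parse_bitvector("bitvector(64, 0xFFFFFFE00063A070)")
print(width, hex(pc))
```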
Decoding every RISC-V instruction
We can also write a simple script to decode all 2**32 full instructions and
count their types. For this we need to use the encdec mapping from the RISC-V
Sail model, in backwards mode (we plan to make the directionality of mapping
functions work automatically, based on the argument types, but right now the
correct direction needs to be picked manually). encdec_backwards takes a
32-bit bitvector and returns an instance of the AST union. Given the decoded
ast, we compute statistics based on its type.
import time
from collections import Counter
from _pydrofoil import RISCV64, bitvector

t1 = time.time()
m = RISCV64()
c = Counter()  # maps AST class -> number of bit patterns that decode to it
try:
    # The outer loop runs over the low 8 bits, the inner loop over the upper 24.
    for opcode in range(2**8):
        print(hex(opcode))
        for i in range(2**24):
            bv = bitvector(32, (i << 8) | opcode)
            ast = m.lowlevel.encdec_backwards(bv)
            c[type(ast)] += 1
finally:
    # Print the statistics even if the run is interrupted.
    for key, value in c.most_common():
        print(key, value)
    t2 = time.time()
    print("took", t2 - t1, "seconds, stopped at", opcode)
(This is not entirely correct because it doesn’t take compressed RISC-V instructions into account properly.)
Running this takes a while (37 minutes on a Ryzen 9 3900X) and prints:
...
<class 'ILLEGAL'> 4047609720
<class 'UTYPE'> 67108864
<class 'RISCV_JAL'> 33554432
<class 'LOAD'> 29360128
<class 'ITYPE'> 25165824
<class 'BTYPE'> 25165824
<class 'CSR'> 25165824
<class 'STORE'> 16777216
<class 'AMO'> 4718592
<class 'ADDIW'> 4194304
<class 'RISCV_JALR'> 4194304
<class 'FENCEI_RESERVED'> 4194303
<class 'FENCE_RESERVED'> 4193792
<class 'RTYPE'> 327680
<class 'ZBB_RTYPE'> 294912
<class 'ZBS_IOP'> 262144
<class 'STORECON'> 262144
<class 'SHIFTIOP'> 196608
<class 'RTYPEW'> 163840
<class 'MUL'> 131072
<class 'ZBS_RTYPE'> 131072
<class 'SM4ED'> 131072
<class 'SM4KS'> 131072
<class 'ZBA_RTYPEUW'> 131072
<class 'SHIFTIWOP'> 98304
<class 'ZBA_RTYPE'> 98304
<class 'RISCV_RORI'> 65536
<class 'RISCV_SLLIUW'> 65536
<class 'DIV'> 65536
<class 'REM'> 65536
<class 'ZBKB_RTYPE'> 65536
<class 'ZICOND_RTYPE'> 65536
<class 'DIVW'> 65536
<class 'REMW'> 65536
<class 'ZBB_RTYPEW'> 65536
<class 'RISCV_RORIW'> 32768
<class 'RISCV_CLMUL'> 32768
<class 'RISCV_CLMULR'> 32768
<class 'RISCV_CLMULH'> 32768
<class 'RISCV_XPERM4'> 32768
<class 'RISCV_XPERM8'> 32768
<class 'AES64ES'> 32768
<class 'AES64ESM'> 32768
<class 'AES64DS'> 32768
<class 'AES64DSM'> 32768
<class 'AES64KS2'> 32768
<class 'MULW'> 32768
<class 'RISCV_FMINM_S'> 32768
<class 'RISCV_FMAXM_S'> 32768
<class 'RISCV_FLEQ_S'> 32768
<class 'RISCV_FLTQ_S'> 32768
<class 'ZBKB_PACKW'> 31744
<class 'AES64KS1I'> 11264
<class 'LOADRES'> 8192
<class 'RISCV_FROUND_S'> 6144
<class 'RISCV_FROUNDNX_S'> 6144
<class 'ZBB_EXTOP'> 3072
<class 'SHA256SUM0'> 1024
<class 'SHA256SUM1'> 1024
<class 'SHA256SIG0'> 1024
<class 'SHA256SIG1'> 1024
<class 'SHA512SUM0'> 1024
<class 'SHA512SUM1'> 1024
<class 'SHA512SIG0'> 1024
<class 'SHA512SIG1'> 1024
<class 'SM3P0'> 1024
<class 'SM3P1'> 1024
<class 'RISCV_ORCB'> 1024
<class 'AES64IM'> 1024
<class 'RISCV_CLZ'> 1024
<class 'RISCV_CTZ'> 1024
<class 'RISCV_CPOP'> 1024
<class 'RISCV_BREV8'> 1024
<class 'RISCV_REV8'> 1024
<class 'RISCV_CLZW'> 1024
<class 'RISCV_CTZW'> 1024
<class 'RISCV_CPOPW'> 1024
<class 'RISCV_FLI_S'> 1024
<class 'SFENCE_VMA'> 1024
<class 'SINVAL_VMA'> 1024
<class 'FENCE'> 256
<class 'FENCE_TSO'> 256
<class 'RISCV_ZICBOM'> 96
<class 'RISCV_ZICBOZ'> 32
<class 'FENCEI'> 1
<class 'ECALL'> 1
<class 'EBREAK'> 1
<class 'URET'> 1
<class 'SRET'> 1
<class 'WFI'> 1
<class 'SFENCE_W_INVAL'> 1
<class 'SFENCE_INVAL_IR'> 1
<class 'MRET'> 1
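Regarding the caveat about compressed instructions mentioned above: in the RISC-V instruction-length encoding, a standard 32-bit instruction must have its two lowest bits set to 0b11, while the other three values of those bits mark 16-bit (compressed) encodings. Only a quarter of the 2**32 bit patterns are therefore candidates for standard 32-bit instructions. The sketch below is plain Python and independent of _pydrofoil; has_32bit_prefix is a hypothetical helper name. It counts how many values of the outer loop variable opcode fall into that quarter:

```python
def has_32bit_prefix(word):
    """True if the two lowest bits are 0b11, as required for a 32-bit encoding."""
    return (word & 0b11) == 0b11

# The script builds each word as (i << 8) | opcode, so the low byte of the
# word is exactly the outer loop variable.
word = (0x123456 << 8) | 0x33
low_byte = word & 0xFF

candidates = [opcode for opcode in range(2**8) if has_32bit_prefix(opcode)]
print(len(candidates), "of 256 low bytes can start a standard 32-bit instruction")
```

A variant of the decoding script could skip the other low bytes, or feed their lower 16 bits to encdec_compressed_backwards instead; we have not measured how that would change the statistics.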