RISC-V Notes



src/main/scala/uncore/tilelink/Definitions.scala States the following for each steps:

  1. Acquire: used to initiate coherence protocol transactions in order to gain access to a cache blcok’s data with certain permissions enabled. … Acquires may contain data for Put or PutAtomic… After sending acquires, clients must wait for a manager to send them a Uncore Grant message in response
  2. Probe: used to force clients to release data or cede permissions on a cache block. Clients respond to probes with Release messages.
  3. Release: used to release data or permission back to the manager in response to Probe message. Can be used to volunatirly writeback data. (ex. event that dirty data must be evicted on cache miss).
  4. Grant: used to refill data or grant permissions requested of the manger agent via acquire message. Also used to ack the receipt of volunatry writeback from clients in the form of Release.
  5. Finish: used to provide global ordering of Txs. Sent as ack for receipt of grant message. When a Finish message is received, a manager knows it is safe to begin processing other transactions that touch the same cache block.

Cache Miss

On a miss, there is a block of code that adds the miss into the MSHR

// replacement policy
val replacer = p(Replacer)()
val s1_replaced_way_en = UIntToOH(replacer.way)
val s2_replaced_way_en = UIntToOH(RegEnable(replacer.way, s1_clk_en))
val s2_repl_meta = Mux1H(s2_replaced_way_en, wayMap((w: Int) => RegEnable(meta.io.resp(w), s1_clk_en && s1_replaced_way_en(w))).toSeq)

// miss handling
mshrs.io.req.valid := s2_valid_masked && !s2_hit && (isPrefetch(s2_req.cmd) || isRead(s2_req.cmd) || isWrite(s2_req.cmd))
mshrs.io.req.bits := s2_req
mshrs.io.req.bits.tag_match := s2_tag_match
mshrs.io.req.bits.old_meta := Mux(s2_tag_match, L1Metadata(s2_repl_meta.tag, s2_hit_state), s2_repl_meta)
mshrs.io.req.bits.way_en := Mux(s2_tag_match, s2_tag_match_way, s2_replaced_way_en)
mshrs.io.req.bits.data := s2_req.data
when (mshrs.io.req.fire()) { replacer.miss }
io.mem.acquire <> mshrs.io.mem_req

The miss should be processed by the MSHR by issuing an Acquire to the TileLink, and waiting for a Grant that’ll be filled into the mshrs.io.req.bits.way_en way.

In the MSHRFile class, there is a line of code as follows:

val sdq_enq = io.req.valid && io.req.ready && cacheable && isWrite(io.req.bits.cmd)

Thus I’m assuming the sdq stands for a Store Data Queue. Also, as we’re trying to prefetch misses (I think we can ignore this part for now…)

MSHR Issues Acquire Requests

 io.mem_req.valid := state === s_refill_req && fq.io.enq.ready
 io.mem_req.bits := req.old_meta.coh.makeAcquire(
 addr_block = Cat(io.tag, req_idx),
 client_xact_id = Bits(id),
 op_code = req.cmd)

This is a code snippet from the MSHR class.  The individual mshr.io.mem_req are connected via an arbiter in the MSHRFile class mem_req_arb.io. The mem_req_arb‘s out is connected to the io.mem_req which is then connected to the io.mem.acquire in the HellaCache class.

The snippet above sends out a Acquire requests if the state of the current MSHR is a refill request, and is ready to be enqueued into the Finish Queue. The address block is generated from the tag and index, and the op_code carries the command of the request. Also, an id is generated to create an client Transaction ID. However, the id the index of the MSHR in the MSHRFile.

State change: s_refill_req->s_refill_resp

The MSHR state change from the s_refill_req to s_refill_resp happens on an Acquire fire(). The fire() occurs when both the  object’s valid and ready bits are on at the same time.

object ReadyValidIO {
def fire(): Bool = target.ready && target.valid

Thus the acquire has been fed valid data, and the acquire has been switched to ready. The acquire has been successfully fired, and we’re waiting for a grant response from the uncore.

Check the code from line:304

Linux NUMA physical memory layout

Just looked through the pseudo files in the procfs.

/proc/zoneinfo has some interesting information.

There are two zones in my current system: DMA, normal. We also have 2 nodes, and thus the file is shown as:

  • Node0 DMA
  • Node0 DMA32
  • Node0 normal
  • Node1 normal

Each section has how many free pages, present pages there are. At the end of the section there is the start_pfn that shows us the physical address of the beginning of the zone.

Thus we can approximate the physical address space of our system by using the start_pfn * PAGE_SIZE and also using the number of present pages.

NUMA Linux Memory Allocation

Scope of Memory Policies

System Default Policy

Hard coded into the kernel. On system boot it uses interleaved. After bootup it uses local allocation.

Task/Process Policy

Per task policy. Task policies are inherited to child processes. Thus applications like numactl uses this property to propogate the task policy to the child process.

In multi-threaded situation where other threads exist only the thread that calls the MEMORY_POLICY_APIS will set its memory policy. All other exisitng threads will retain the prior policy.

The policy only affects memory allocation after the time the policy is set. All allocations before the change are not affected.

VMA Policy

Only applies to anonymous pages. File mapped VMAs will ignore the VMA policy if it is set to MAP_SHARED. If it is MAP_PRIVATE VMA policy will only be enforced on a write to the mapping (CoW).

VMA policies are shared by threads of the same address space. VMA policies do not persist accross exec() calls (as the address pace is wiped)

Shared Policy

Similar to VMA policy but it is shared among processes. Some more details, but skipped due to irrelevance to my work.

Memory Policies

Default Mode

Specifies the current scope does not follow a policy, fall back to larger scope’s policy. At the root it’ll follow the system policy.

Bind Mode

Memory allocated from the nodes specified by the policy. Proximity is considered first, and if enough free space exists for the closest memory node (to the allocation requestor) it’ll be granted

Preferred Mode

Allocation will be attempted from the preferred (single) node. If it fails, the nodes will be searched for free space in nearest first fashion.
Local allocation is a preferred mode where the node that initiates a page fault is the preferred node.

Interleaved Mode

For Anonymous/shared pages: Node set is indexed using the page offset to the VMA. (Address % node_nums). The indexed node is requested for a page as in Preferered mode. And if it fails follows preferred mode style.

Page cache pages: A node counter is used to wrap around and try to spread out pages among the nodes (that are specified).


Reference: What is Linux Memory Policy


Zedboard Linux Hello world execution failure

No such file or directory!

On trying to execute the hello_world.elf file I ran into a ./hello_world.elf: No such file or directory error.

The problem was due to a missing interpreter that was provided in the elf file.

readelf -a hello_world.elf

results in a big output.

If yo check the Program Headers section you see the following output:

<br />Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
EXIDX 0x0004b4 0x000104b4 0x000104b4 0x00008 0x00008 R 0x4
PHDR 0x000034 0x00010034 0x00010034 0x00100 0x00100 R E 0x4
INTERP 0x000134 0x00010134 0x00010134 0x00019 0x00019 R 0x1
[Requesting program interpreter: /lib/ld-linux-armhf.so.3]
LOAD 0x000000 0x00010000 0x00010000 0x004c0 0x004c0 R E 0x10000
LOAD 0x0004c0 0x000204c0 0x000204c0 0x0011c 0x00120 RW 0x10000
DYNAMIC 0x0004cc 0x000204cc 0x000204cc 0x000e8 0x000e8 RW 0x4
NOTE 0x000150 0x00010150 0x00010150 0x00044 0x00044 R 0x4
GNU_STACK 0x000000 0x00000000 0x00000000 0x00000 0x00000 RW 0x10

The Requesting program interpreter is our problem here.

That file is not available on the zerboard’s /lib directory.

Thus just simply create a symbolic link of the interpreter on the zedboard

ln -s /lib/ld-2.17.so /lib/ld-linux-armhf.so.3


Xilinx SDK Hello World Build Errors

Cannot find libz.so.1

Got an error in the Xilinx SDK where the error message was that the arm gcc compiler was missing libz.so.1

Not sure if this will help but I tried installing the 32-bit libz library

apt-get install zlib1g:i386

Cannot find -lxil

Seems like somebody else had a problem here

And the third message had the answer:

I wasn’t able to solve it. It was a simple project, so I just remade it and it worked.

So I cleaned the build and remade it.

Driving FPGA IPs from Linux

 few links that’ll help me get started:

Zedboard.org forum




Forum Posts:



By Sven Andersson


Xilinx Wiki code


Generating DTS and DTIs from Xilinx SDK

Xilinx Tutorial

DTB compiling help

When building the kernel with the uImage we need a mkimage which is in the uboot. Also uboot requires dtc which is available on the Fetch sources.

At the moment the kernel doesn’t boot after Starting kernel ...

Building the Kernel

The Xilinx manual makes the kernel load at 0x208000 however the u-boot boots from 0x8000; thus changed the

make ARCH=arm xilinx_zynq_defconfig
make ARCH=arm menuconfig
make ARCH=arm uImage LOADADDR=0x00008000

Also, changed the linux-xlnx branch from master to the xilinx-v2016.2 tag.

More problems: Nice forum post


This post was helpful. It first identified that a problem may be in the FSBL

From the u-boot if the following command does not work, then we may need to recompile the FSBL

md.l 0x43c00000

If this does not work, the translation table in the FSBL must be incorrect. (This may be because we use the UCR-Bar’s files.Build the FSBL with the boot.bif in the UCR-Bar repo.

This page tells us how to create the FSBL Boot image, and This page how to build the FSBL from our IP configuration.

Done! – The process


Vivado HLS (2016.02) co-simulation build error on Ubuntu 14.04 (crti.o not found)

Vivado HLS 2016.02 has an error when building binary files (for co-simulation) on Ubuntu 14.04. (Don’t know about other distributions, or other versions of Ubuntu)

I got my help from this blog post.

The problem I got was as follows:

INFO: [HLS 200-10] In directory '/home/username/Vivado_HLS_projects/XorGenerator/XorGenerator/sim/wrapc'
clang: warning: argument unused during compilation: '-fno-builtin-isinf'
clang: warning: argument unused during compilation: '-fno-builtin-isnan'
INFO: [APCC 202-3] Tmp directory is /tmp/apcc_db_username/160961476017075859177
INFO: [APCC 202-1] APCC is done.
Generating cosim.tv.exe
In file included from apatb_do_xor.cpp:18:0:
/opt/Xilinx/Vivado_HLS/2016.2/include/ap_stream.h:70:2: warning: #warning AP_STREAM macros are deprecated. Please use hls::stream<> from "hls_stream.h" instead. [-Wcpp]
/usr/bin/ld: cannot find crt1.o: No such file or directory
/usr/bin/ld: cannot find crti.o: No such file or directory

So the problem was that crt1.o and crti.o are note found. Just to give you a hint, these files are under /usr/lib/x86_64-linux_gnu

However instead of adding that directory as the build path or Library path (which I think is the intuition and the right way to do it…) The workaround in the blog post is to copy the crti.o, crt1.o, and crtn.o file to the Vivado HLS's gcc directories.

The path has changed since Vivado HLS 2012.02 as in the post.

Now the path is at /opt/Xilinx/Vivado_HLS/2016.2/lnx64/tools/gcc/lib/gcc/x86_64-unknown-linux-gnu/4.6.3/

Thus you will want to execute the following command:

cp /usr/lib/x86_64-linux-gnu/crt?.o /opt/Xilinx/Vivado_HLS/2016.2/lnx64/tools/gcc/lib/gcc/x86_64-unknown-linux-gnu/4.6.3/

Cheers 🙂