Driving FPGA IPs from Linux

 few links that’ll help me get started:

Zedboard.org forum




Forum Posts:



By Sven Andersson


Xilinx Wiki code


Generating DTS and DTIs from Xilinx SDK

Xilinx Tutorial

DTB compiling help

When building the kernel with the uImage we need a mkimage which is in the uboot. Also uboot requires dtc which is available on the Fetch sources.

At the moment the kernel doesn’t boot after Starting kernel ...

Building the Kernel

The Xilinx manual makes the kernel load at 0x208000 however the u-boot boots from 0x8000; thus changed the

make ARCH=arm xilinx_zynq_defconfig
make ARCH=arm menuconfig
make ARCH=arm uImage LOADADDR=0x00008000

Also, changed the linux-xlnx branch from master to the xilinx-v2016.2 tag.

More problems: Nice forum post


This post was helpful. It first identified that a problem may be in the FSBL

From the u-boot if the following command does not work, then we may need to recompile the FSBL

md.l 0x43c00000

If this does not work, the translation table in the FSBL must be incorrect. (This may be because we use the UCR-Bar’s files.Build the FSBL with the boot.bif in the UCR-Bar repo.

This page tells us how to create the FSBL Boot image, and This page how to build the FSBL from our IP configuration.

Done! – The process



Vivado HLS (2016.02) co-simulation build error on Ubuntu 14.04 (crti.o not found)

Vivado HLS 2016.02 has an error when building binary files (for co-simulation) on Ubuntu 14.04. (Don’t know about other distributions, or other versions of Ubuntu)

I got my help from this blog post.

The problem I got was as follows:

INFO: [HLS 200-10] In directory '/home/username/Vivado_HLS_projects/XorGenerator/XorGenerator/sim/wrapc'
clang: warning: argument unused during compilation: '-fno-builtin-isinf'
clang: warning: argument unused during compilation: '-fno-builtin-isnan'
INFO: [APCC 202-3] Tmp directory is /tmp/apcc_db_username/160961476017075859177
INFO: [APCC 202-1] APCC is done.
Generating cosim.tv.exe
In file included from apatb_do_xor.cpp:18:0:
/opt/Xilinx/Vivado_HLS/2016.2/include/ap_stream.h:70:2: warning: #warning AP_STREAM macros are deprecated. Please use hls::stream<> from "hls_stream.h" instead. [-Wcpp]
/usr/bin/ld: cannot find crt1.o: No such file or directory
/usr/bin/ld: cannot find crti.o: No such file or directory

So the problem was that crt1.o and crti.o are note found. Just to give you a hint, these files are under /usr/lib/x86_64-linux_gnu

However instead of adding that directory as the build path or Library path (which I think is the intuition and the right way to do it…) The workaround in the blog post is to copy the crti.o, crt1.o, and crtn.o file to the Vivado HLS's gcc directories.

The path has changed since Vivado HLS 2012.02 as in the post.

Now the path is at /opt/Xilinx/Vivado_HLS/2016.2/lnx64/tools/gcc/lib/gcc/x86_64-unknown-linux-gnu/4.6.3/

Thus you will want to execute the following command:

cp /usr/lib/x86_64-linux-gnu/crt?.o /opt/Xilinx/Vivado_HLS/2016.2/lnx64/tools/gcc/lib/gcc/x86_64-unknown-linux-gnu/4.6.3/

Cheers 🙂

Zedboard Block RAM Size

The Zedboard’s device name is Z-7020; and if we take a look at the data sheet of the Zynq-7000, we can find that the Block RAM size is 4.9Mb, or 140 x 36Kb blocks of BRAM.

FIFO can be used as a bridge between the PS and PL, as either will have different frequencies. To prevent data loss, the FIFO is used as a shared buffer as a synchronization circuit. (Ref)

Synology Gitlab Setup SSL over Let’s Encrypt

With Let’s Encrypt and Synology, we need to take an extra step to setup certificates in the Gitlab persistent data.

First off, the synology certificates seem to be at /usr/syno/etc/certificate/system/default

I’m referring to the instructions on github and by P. Behnke.

Also, the gitlab certs directory should be placed at /volume1/docker/gitlab/certs

Three files are required in all, gitlab.key, gitlab.crt, and dhparam.pem. If any of them does not exist, it won’t work.

Generate dhparam.pem by referring to the github info.

openssl dhparam -dsaparam -out dhparam.pem 2048

Also append the fullchain.crt & cert.pem into gitlab.crt. Change the name of privkey.pem into gitlab.key

Thus in the end you’ll have the following three files:

  • dhparam.pem
  • gitlab.crt
  • gitlab.key

Restart the gitlab container, and you should check to see if you get a certificate not found error. If you don’t, then you should be set to go 🙂

Note: synology also has the dh-param files ready. (It takes very long to generate the file, I generated it on my workstation and ferried it over). Anwyays, you can find the synology generated files in the following path: /usr/syno/etc/ssl

Update: When you a SSL certification verification failed error

When using the HTTPS protocol the SSL verification sometimes seems to fail. The reason seems to be gnuTLS being picky about the order of the certificates.

fatal: unable to access ‘https://hostname:port/username/repo.git&#8217; server certificate verification failed. CAfile: /etc/ssl/certs/ca-certificates.crt CRLfile: none

Thus instead of the order that is shown in P.Behnke‘s blog, reverse the order of the fullchain and cert as follows:

$ cat fullchain.crt cert.pem > gitlab.crt

After restarting the docker containers, all seems to work.

Setting up Gitlab on Synology (via Docker)

Synology provides Gitlab from their package manager. However when you log on you see that the build version is too old, and Gitlab asks for upgrade ASAP.

Thus, I checked the docker registry, and downloaded the sameersbn/gitlab.

Synology seems to use Docker API v1 instead of v2, and so you can’t see all the tags. Thus logon to the synology device over ssh, and execute the following command

docker pull sameersbn/gitlab:8.12.3

The 8.12.3 was the latest tag. There is also another tag that is latest and so you may just pull from that tag.

Then you get a totally new docker image on the docker explorer on synology DSM. From here we need to copy some environment variables from the synology gitlab docker.

From the provided 8.6.2 to the 8.12.3, obviously there were a lot of change, and we need to add two more environment variables before gitlab boots properly.

This github issue helps us with the issue. Basically find the .secret file, and the secret value needs to be set as the GITLAB_SECRETS_OTP_KEY_BASE and GITLAB_SECRETS_SECRET_KEY_BASE. After you add these values gitlab should boot up.

Its a shame that the synology version of docker does not suport the pulling the latest version, and we have to do this manually. But still, synology did provide quite a lot of useful variables to refer to.

Finally set GITLAB_HTTPS to true to forward the GITLAB_PORT to HTTPS.

You should be set to go!

Scala Language

In our FPGA project we’ll be using the UC Berkeley Architecture Research team’s Rocket core. (Most likely we’ll be using it) The core is written in the Scala language and is compiled via Chisel into C++ simulator or Verilog for FPGA/ASIC.

Now I need to know Scala to fully understand the rocket core, and so I’m looking throught the Scala Tutorial, offered at the Scala Language site.

Interesting Features

Method call

Methods taking one argument can be used with an infix syntax.

df format now

These have the same meaning syntactically. Seems alot like the style of smalltalk (and Objective-C)

Function are Objects (Passing functions as arguments)

object Timer {
  def oncePerSecond(callback: () => Unit) {
    while (true) { callback(); Thread sleep 1000 }
  def timeFlies() {
    println("time flies like an arrow...")
  def main(args: Array[String]) {

Anonymous Functions

object TimerAnonymous {
  def oncePerSecond(callback: () => Unit) {
    while (true) { callback(); Thread sleep 1000 }
  def main(args: Array[String]) {
    oncePerSecond(() =>
      println("time flies like an arrow..."))

Notice that the () =&gt; denotes an anonymous function. I think it would be a lambda function 😉

Vivado License Ubuntu 14.04 Zeroed NIC Interface problem

During the Activation (License management) of Vivado HLx 2016.2 I encounterd a problem where I could copy the license (received via email), however when viewing license information, I couldn’t find my license.

I checked the host information and found that the Network Interface Card ID was zeroed.
Screen Shot 2016-10-03 at 3.40.45 PM.png

I think a possible reason is because from some point in Ubuntu, they followed Redhat’s convention for the nic names. The NIC names used to be ethX, but now they are emX because the NICs I use are embedded onto the mainboard.

I found that the module that changes the name is biosdevname and can be disabled by setting biosdevname=0 in the linux bootup parameters.

A Stackoverflow Q&A discusses this problem, and we can either:

I decided to try the grub option.

If you use static IP setting, and use /etc/network/interfaces, don’t forget to update your settings before resetting. (You won’t be able to log on, or the system will delay the boot for 60 seconds to try to get connectivity)

This issue is also reported on the Xilinx Forums, however I’m not sure if there is a fix going on.



So there are a few Wooribank branches in Beijing, and the one that’s closest I think is the San yuan Qiao sub-branch. (삼원교지행) Damn Korean sites without direct links!

Anyways, there are a few places in Beijing, and Wooribank is written 友利銀行.

wooribank map.PNG

So we are going to go to Number 7.

We ride the subway line 10 in clockwise, get off at 三元桥 station, (Exit C2) and walk.

I’ll make an account, put my money in the account, and then probably withdraw the money from Korea.

Good thing is the Sanyuanqiao station is connected to the Airport via express way, thus I think I could go from the hotel to Sanyuanqiao to the airport right away.


Prof. Steve Keckler Talk

Architectures for Deep Neural Networks

DNN is here to stay because of the big data that is available. Lots of data to analyze and run from. Also the computational ability of current systems.

Accuracy have been increased drastically. due to DNN.

Now a time for computer architects to use the creativity to improve performance instead of process technologies iproving performance.

How do DNNs work?

Training & inference

  • Training -> Large number of data.  (Back propogate to learn)
  • Inference -> Smaller varied number of incoming data

A lot of linear algebra! Output = Weight * Input

Learn the weight using back propagations

Locality: Batching! Multiple images at a time. Input matrices & output matrices (single images are vectors)

Convolutional stage and fully connected stage

Convolutional stages act as trained feature detection stage

6D loop! Dot product accumulator within the image.

  • Run a larger(deeper 100s~1000s of levels)
  • Run network faster
  • Run network more efficiently

Limit of memory capacity (Bandwidth & capacity tradeoff!)

vNN: Virtualizing training!

Trend is larger and deeper neural networks

Recurrant neural networks! (Temporal factor??) Audio recognition

Training needs to keep the memory of each layer output activations so that it an be used to backpopagation!

Volume of that data is very large! GPU memory usage is proportional to the number of layers

Computing set of gradients!

We don’t need data of the first layers until the training goes to the end, then back propagate! These data take up a lot of space! Thus offload to the DRAM (more capacity)

(forward and backward propagation)

Pipeline Writeback and prefetching

Allows training much larger networks! Incurs little overhead relative to hypothetical large memory GPUs, oeverhead will drop with faster CPU & GPU links.

Optimizing DNNs for Inference

Accessing DRAM is far more expensive than computing. But then to have more accuracy we still need large networks!


  • Reduce numerical precision: FP32->FP16
  • Prune networks: Lots of zeros as weights! Thus remove these unnecessary weights. ALso share weights among other elements in the weight matrix
  • Share weights
  • Compress netowrk
  • Re-architect network

Importance of staying local! LPPDDR 640 Pj/word, SRAM(MB) 50 -> SRAM(KB) 5

 Cost of operations

Accumulate at greater precision than your multiplications!



Lots of the wieghts are zeros! Prune unnecessary weights!

This allows reducing the network dramatically. You can also prune weights that are close to 0. The other wieghts can be used to recover from pruning! Retraining is important

Prune up to 90% of wegiths, but reiterative learning keeps the accuracy pretty high

Pruning can be aggressively done

Factor of 3x performance improvement of pruning


Weight Sharing

Similar weights can be clustered, and using simple bit indexes, point to fin-tuned centroids

Retrain to account for the change as well!

We can reduce weight table significantly!

Huffman encoding to further improve! up to 35x~49x reduction!

Rearchitect netowrks!

SqueezeNet -> Reduce filters! combination of 1×1 filters and 3×3? This allows really shrinking down the networks!

DNN Inference Accelerator Architectures

Eyeriss: ISSCC 2016 & ISCA 2016

Implications of data flow in DNNs

Do we keep filters stationary or input stationary, etc.

Maximize reuse of data within PE

Data compression! Non linear functios applied between activations. Used to be sigmoid. Now it is ReLU! (Negative values are 0)

Lots of negative values are calculated!

Reduce data footprints… Compress them before moving to offschip ememory.

Saves volume of transfered data

If there are multiplications by 0, then just disable the rest of the calculation, reduce the power!

Up to 25x Energy efficient compared to Jetson TK

EIE: Efficient inference Engine (compressed activations)

All weights are compressed and pruned, weight sharing, zero skipping!

Sparse matrix engine, and skip zeros.


Special purpose processors can maximize energy efficiencies