Debugging the Guest Kernel

Getting the vmlinux file of the Ubuntu kernels

I used the answer on superuser.com.

# Add ppas
echo "deb http://ddebs.ubuntu.com $(lsb_release -cs)-updates main restricted universe multiverse
deb http://ddebs.ubuntu.com $(lsb_release -cs)-security main restricted universe multiverse
deb http://ddebs.ubuntu.com $(lsb_release -cs)-proposed main restricted universe multiverse" | \
sudo tee -a /etc/apt/sources.list.d/ddebs.list
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 428D7C01
# Get the actual vmlinux file
sudo apt-get install linux-image-$(uname -r)-dbgsym
file /usr/lib/debug/boot/vmlinux-$(uname -r) # This is the vmlinux file

Setup qemu to export a serial port for debugging

I referred to the following answer on stackoverflow.

Using this, my resulting qemu command is as follows:

sudo qemu-system-x86_64 -m 512M \
-kernel /boot/vmlinuz-$(uname -r) \
-drive file=ubuntu_14.04.img,index=0,media=disk,format=raw \
-append "root=/dev/sda rw console=ttyS0 kgdboc=ttyS0,115200" \
-netdev user,id=hostnet0,hostfwd=tcp::5556-:22 \
-device virtio-net-pci,netdev=hostnet0,id=net0,bus=pci.0,addr=0x3 \
-serial tcp::1234,server,nowait \
--nographic \
--enable-kvm

Notice the kgdboc=ttyS0,115200 on the kernel command line, and the -serial tcp::1234,server,nowait in the qemu arguments.

Start the debugging process

Start up the VM using the command above, then SSH into it:

ssh localhost -p 5556

Trigger a kgdb breakpoint in the VM:

sudo bash -c "echo g > /proc/sysrq-trigger"

Run GDB on your host

$ gdb /usr/lib/debug/boot/vmlinux-$(uname -r)
(gdb) target remote localhost:1234

And your gdb should be connected to the VM’s kernel.

Note: the vmlinux downloaded from the ddebs PPAs does not include the source code, so gdb will complain about source lines not being found.
The source path should be substitutable, but I haven't gotten this to work yet. (will update)
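For reference, gdb can remap the kernel's build-time source directory onto a local source tree with set substitute-path. A minimal sketch, assuming a matching source tree is unpacked locally (the /build/... prefix is a placeholder; check the real one with info source while stopped at a breakpoint):

(gdb) info source
(gdb) set substitute-path /build/linux-abc123 /home/you/linux-source

Once attached, continue resumes the guest kernel; to break in again, re-run the sysrq trigger from inside the VM.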


Making a Simple KVM Image

This tutorial is aimed at quickly making a KVM image.

First off, there are lots of guides available, but I wanted to quickly and effortlessly make a KVM image using debootstrap. Some guides on the net use nbd for reasons I cannot comprehend; this post uses the more sensible loopback interface to connect to the qemu disk and debootstrap into it. This guide is based on the tutorial linked above, but has my personal customizations.

# Create VM foundations
$ qemu-img create -f raw ubuntu_16.04.img 20G
$ mkfs.ext4 ubuntu_16.04.img
$ mkdir mnt
$ sudo mount -o loop ubuntu_16.04.img mnt
$ sudo debootstrap --arch amd64 --include=ssh,vim xenial mnt

# Setup created environment
$ sudo chroot mnt
$ passwd # and change root password
$ adduser username
$ usermod -aG sudo username

$ exit # leave the chroot
$ sudo umount mnt

$ sudo qemu-system-x86_64 -m 512M \
-kernel /boot/vmlinuz-$(uname -r) \
-drive file=ubuntu_16.04.img,index=0,media=disk,format=raw \
-append "root=/dev/sda rw console=ttyS0" \
-netdev user,id=hostnet0,hostfwd=tcp::5556-:22 \
-device virtio-net-pci,netdev=hostnet0,id=net0,bus=pci.0,addr=0x3 \
--nographic \
--enable-kvm

# Setup DHCP for your network in your VM.
# In the **VM console** (provided by qemu)
# login as root
$ ifconfig -a
# find the interface name other than lo
# In my case it was ens3, it could be eth0, etc.

$ vi /etc/network/interfaces
# add the following two lines:
auto ens3
iface ens3 inet dhcp

$ ifup ens3
# Now you have network connectivity!

# connect via SSH to your VM
# Now, back from your **host**
$ ssh localhost -p 5556

Setup HTTPS on your dockerized Gitlab

So I use Sameersbn’s dockerized Gitlab.

I’m not sure if this is the best way to do this, but it works, so I’m sharing it, and also as a reference for myself for future deployments.

BTW, I'm working off Ubuntu 14.04 (Trusty).

First off, get certbot:

$ sudo add-apt-repository ppa:certbot/certbot
$ sudo apt-get update
$ sudo apt-get install python-certbot-nginx

Also, install nginx if you haven’t.

$ sudo certbot --nginx -d example.com -d www.example.com

Be sure to change example.com to your domain name.

Fill in the prompts, and at the end your certificate path will be printed out

/etc/letsencrypt/live/yourdomain/fullchain.pem

The previous command starts up the Ubuntu-installed nginx, so you may want to turn it off using

$ service nginx stop

Now, I use the following script to generate the certificate for Gitlab:

#!/bin/bash

# This script updates the certificate for Gitlab with
# the (hopefully) renewed Let's Encrypt certificate.
# We need to do this because Let's Encrypt certificates
# are only valid for 3 months at a time, and Synology (tries to) renew it
# every month.
# Refer to https://chpresearch.wordpress.com/2016/10/04/synology-gitlab-setup-ssl-over-lets-encrypt/

PATH_TO_SYNOLOGY_CERTIFICATE=/etc/letsencrypt/live/yourdomain/
PATH_TO_STORE_GITLAB_CERTIFICATE=/your/docker/gitlab/root/gitlab/certs

if [[ $# -eq 1 ]]; then
    PATH_TO_STORE_GITLAB_CERTIFICATE=$1
fi

echo "Generating gitlab certificates to ${PATH_TO_STORE_GITLAB_CERTIFICATE}"

TMP_FILENAME=tmp_cert
FILES_REQUIRED=(fullchain.pem cert.pem privkey.pem)

for filename in ${FILES_REQUIRED[@]}
do
    if [ ! -e ${PATH_TO_SYNOLOGY_CERTIFICATE}/$filename ]; then
        echo "${PATH_TO_SYNOLOGY_CERTIFICATE}/$filename does not exist!"
        exit 1
    fi
done

echo "===Generating gitlab.crt==="
cat ${PATH_TO_SYNOLOGY_CERTIFICATE}/fullchain.pem ${PATH_TO_SYNOLOGY_CERTIFICATE}/cert.pem > ${TMP_FILENAME}.crt
cat ${TMP_FILENAME}.crt

echo "===Generating gitlab.key==="
cat ${PATH_TO_SYNOLOGY_CERTIFICATE}/privkey.pem > ${TMP_FILENAME}.key
#cat ${TMP_FILENAME}.key

echo "===Backing up existing Cert & Key==="
if [[ -f ${PATH_TO_STORE_GITLAB_CERTIFICATE}/gitlab.crt ]]; then
    mv -v ${PATH_TO_STORE_GITLAB_CERTIFICATE}/gitlab.crt ${PATH_TO_STORE_GITLAB_CERTIFICATE}/gitlab.crt.backup
fi
if [[ -f ${PATH_TO_STORE_GITLAB_CERTIFICATE}/gitlab.key ]]; then
    mv -v ${PATH_TO_STORE_GITLAB_CERTIFICATE}/gitlab.key ${PATH_TO_STORE_GITLAB_CERTIFICATE}/gitlab.key.backup
fi

echo "===Overwriting Existing Cert & Key==="
mv -v ${TMP_FILENAME}.crt ${PATH_TO_STORE_GITLAB_CERTIFICATE}/gitlab.crt
mv -v ${TMP_FILENAME}.key ${PATH_TO_STORE_GITLAB_CERTIFICATE}/gitlab.key

echo "Done"

Run the script. (I put the script in the /your/docker/gitlab/root/gitlab/certs directory and executed it from there.)

Also, you need to generate the DHE parameters (see http://security.stackexchange.com/a/95184 for the rationale behind -dsaparam). Go to the /your/docker/gitlab/root/gitlab/certs directory and execute the following command:

$ openssl dhparam -dsaparam -out dhparam.pem 2048

Now set up your docker-compose to use HTTPS and not run with self-signed certificates, and you should be up and running.
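For reference, here is a minimal sketch of the relevant docker-compose settings. GITLAB_HTTPS and SSL_SELF_SIGNED are documented sameersbn/docker-gitlab options; the volume path assumes the certs directory populated by the script above lives under the container's data volume:

gitlab:
  environment:
    - GITLAB_HTTPS=true        # serve Gitlab over HTTPS
    - SSL_SELF_SIGNED=false    # we have a real Let's Encrypt certificate
    - GITLAB_PORT=443          # so generated URLs point at the HTTPS port
  volumes:
    # gitlab.crt, gitlab.key and dhparam.pem sit under certs/ in here
    - /your/docker/gitlab/root/gitlab:/home/git/data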

Setting Up Infiniband

So we got some new IB cards, and we needed to set them up on our servers. The servers run Ubuntu 14.04 for this post, but I believe 16.04 should be similar.

Install the Cards Physically

To check whether the system found your cards, enter the following:

lspci -v | grep Mellanox
02:00.0 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3]

You should get something like the above.

Install Infiniband Driver

Refer to the release notes of version v4_2-1_2_0_0. The reference has a list of packages that are required before installation. I found out afterwards that the installer seems to check these dependencies and install them itself, but it doesn't hurt to prepare your system beforehand.

$ apt-get install perl dpkg autotools-dev autoconf libtool automake1.10 automake m4 dkms debhelper tcl tcl8.4 chrpath swig graphviz tcl-dev tcl8.4-dev tk-dev tk8.4-dev bison flex dpatch zlib1g-dev curl libcurl4-gnutls-dev python-libxml2 libvirt-bin libvirt0 libnl-3-dev libglib2.0-dev libgfortran3 automake m4 pkg-config libnuma-dev logrotate ethtool lsof

For the libnuma and libnl-dev packages in that list, the corresponding Ubuntu package names are libnuma-dev and libnl-3-dev.

Afterwards, check out the ConnectX-3 Pro VPI Single and Dual QSFP+ Port Adapter Card User Manual for more help with installation.

Now, go ahead and install the Mellanox OFED. Download the installer from the Mellanox website under Products->Software->Infiniband VPI drivers. Go for Mellanox OFED Linux and click the Download button at the bottom. If nothing shows up and you are using Chrome, make sure to enable unsafe scripts.

Download the tgz file (or the iso if you prefer) for your distribution, and untar it.
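A sketch of the unpacking step; the exact file name depends on the version and distribution you downloaded, so the one below is illustrative only:

$ tar xzf MLNX_OFED_LINUX-4.2-1.2.0.0-ubuntu14.04-x86_64.tgz
$ cd MLNX_OFED_LINUX-4.2-1.2.0.0-ubuntu14.04-x86_64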

Install the Mellanox OFED by executing the following script:

./mlnxofedinstall [OPTIONS if applicable. I didn't need any]

Afterwards, I rebooted the system.

Assigning IP addresses to each IB

Infiniband supports IPoIB, which allows IB endpoints to be addressed with IP addresses. For this part I referred to the following post. Just to make sure IPoIB is installed, run the following command:

lsmod | grep ipoib

There should be an ib_ipoib module loaded.

Now check your IB interface names via the ifconfig -a command, then set your IB IP addresses in the /etc/network/interfaces file.

auto ib0
iface ib0 inet static
    address 10.0.0.1
    netmask 255.255.255.0
    broadcast 10.0.0.255

And bring your network device (ib0) up via

ifup ib0

Setting up the Subnet Manager (If You're Not Using an IB Switch)

Now, if you check the status of your IB cards via ibstat, you may find that the card state is State: Initializing. The Intel developer zone has a guide, Troubleshooting InfiniBand connection issues using OFED tools. Under the state section, I found that the INIT state corresponds to a "HW initialized, but subnet manager unavailable" situation.

If you are in a situation like mine, where you do not have an Infiniband switch and are just connecting nodes directly, you need to start up a software subnet manager. Another Intel guide showed me how to start it:

/etc/init.d/opensmd start

Afterwards, ibstat showed State: Active.

I ran a few tests with ib_send_bw to check the performance between two nodes, and found that my system was working as expected.
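For reference, ib_send_bw runs as a server on one node and as a client on the other; a minimal sketch using the IPoIB address assigned earlier:

# On the first node (server side)
$ ib_send_bw
# On the second node (client side), pointing at the server
$ ib_send_bw 10.0.0.1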

Also, to set up the subnet manager to start at boot, execute the following command:

update-rc.d opensmd defaults

Setup cluster NIS Client

There are some good manuals around, but the key things are the following (a condensed sketch appears after the list):

  1. When installing via apt-get, make sure to specify the domain name of the NIS master.
    1. If things don't work well, use apt-get purge nis to remove nis, then reinstall it to redo the NIS domain setup.
  2. Set the ypserver in /etc/yp.conf.
    1. ypserver [full address]
  3. Add nis to the appropriate lines in /etc/nsswitch.conf.
    1. passwd, group, shadow, hosts
  4. Use yptest to check that things are working.
  5. Xenial has an issue where the rpcbind service does not start up properly. I used the following command to set rpcbind to start at boot.
      1. # systemctl add-wants multi-user.target rpcbind.service
      2. This solution was found on askubuntu.
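Here is a condensed sketch of steps 1-4; nismaster.example.com is a placeholder for your NIS master's address:

$ sudo apt-get install nis                # enter your NIS domain when prompted
$ echo "ypserver nismaster.example.com" | sudo tee -a /etc/yp.conf
# Append "nis" to the passwd, group, shadow and hosts lines
$ sudo sed -i '/^\(passwd\|group\|shadow\|hosts\):/ s/$/ nis/' /etc/nsswitch.conf
$ sudo service nis restart
$ yptest                                  # verify NIS lookups work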

RISC-V Notes


https://riscv.org/wp-content/uploads/2015/01/riscv-rocket-chip-generator-workshop-jan2015.pdf

src/main/scala/uncore/tilelink/Definitions.scala states the following for each message type:

  1. Acquire: used to initiate coherence protocol transactions in order to gain access to a cache block's data with certain permissions enabled. … Acquires may contain data for Put or PutAtomic… After sending Acquires, clients must wait for a manager to send them a Grant message from the uncore in response.
  2. Probe: used to force clients to release data or cede permissions on a cache block. Clients respond to Probes with Release messages.
  3. Release: used to release data or permissions back to the manager in response to a Probe message. Can also be used to voluntarily write back data (e.g. when dirty data must be evicted on a cache miss).
  4. Grant: used to refill data or grant permissions requested of the manager agent via an Acquire message. Also used to ack the receipt of voluntary writebacks from clients in the form of Releases.
  5. Finish: used to provide global ordering of transactions. Sent as an ack for the receipt of a Grant message. When a Finish message is received, a manager knows it is safe to begin processing other transactions that touch the same cache block.

Cache Miss

On a miss, there is a block of code that enters the miss into the MSHRs:

// replacement policy
val replacer = p(Replacer)()
val s1_replaced_way_en = UIntToOH(replacer.way)
val s2_replaced_way_en = UIntToOH(RegEnable(replacer.way, s1_clk_en))
val s2_repl_meta = Mux1H(s2_replaced_way_en, wayMap((w: Int) => RegEnable(meta.io.resp(w), s1_clk_en && s1_replaced_way_en(w))).toSeq)

// miss handling
mshrs.io.req.valid := s2_valid_masked && !s2_hit && (isPrefetch(s2_req.cmd) || isRead(s2_req.cmd) || isWrite(s2_req.cmd))
mshrs.io.req.bits := s2_req
mshrs.io.req.bits.tag_match := s2_tag_match
mshrs.io.req.bits.old_meta := Mux(s2_tag_match, L1Metadata(s2_repl_meta.tag, s2_hit_state), s2_repl_meta)
mshrs.io.req.bits.way_en := Mux(s2_tag_match, s2_tag_match_way, s2_replaced_way_en)
mshrs.io.req.bits.data := s2_req.data
when (mshrs.io.req.fire()) { replacer.miss }
io.mem.acquire <> mshrs.io.mem_req

The miss should be processed by the MSHR by issuing an Acquire over TileLink and waiting for a Grant that will be filled into the way selected by mshrs.io.req.bits.way_en.

In the MSHRFile class, there is a line of code as follows:

val sdq_enq = io.req.valid && io.req.ready && cacheable && isWrite(io.req.bits.cmd)

Thus I'm assuming sdq stands for Store Data Queue. (As for prefetching misses, I think we can ignore that part for now…)

MSHR Issues Acquire Requests

io.mem_req.valid := state === s_refill_req && fq.io.enq.ready
io.mem_req.bits := req.old_meta.coh.makeAcquire(
  addr_block = Cat(io.tag, req_idx),
  client_xact_id = Bits(id),
  op_code = req.cmd)

This is a code snippet from the MSHR class. The individual mshr.io.mem_req ports are connected via an arbiter, mem_req_arb, in the MSHRFile class. The arbiter's output is connected to io.mem_req, which is in turn connected to io.mem.acquire in the HellaCache class.

The snippet above sends out an Acquire request if the current MSHR's state is s_refill_req and the Finish queue is ready to accept an entry. The address block is generated from the tag and index, and the op_code carries the command of the request. An id is also passed along as the client transaction ID; the id is the index of the MSHR within the MSHRFile.

State change: s_refill_req -> s_refill_resp

The MSHR state change from s_refill_req to s_refill_resp happens on an Acquire fire(). The fire() occurs when both the object's valid and ready bits are asserted in the same cycle:

object ReadyValidIO {
  ...
  def fire(): Bool = target.ready && target.valid
}

Thus the Acquire has been fed valid data and the receiver has signalled ready; the Acquire has successfully fired, and we're waiting for a Grant response from the uncore.

Check the code from line 304.

Linux NUMA physical memory layout

Just looked through the pseudo files in the procfs.

/proc/zoneinfo has some interesting information.

There are three zone types in my current system: DMA, DMA32, and Normal. We also have 2 nodes, and thus the file is laid out as:

  • Node0 DMA
  • Node0 DMA32
  • Node0 normal
  • Node1 normal

Each section shows how many free pages and present pages there are. At the end of the section there is start_pfn, the page frame number at which the zone begins (multiply by the page size to get the physical address).

Thus we can approximate the physical address layout of our system using start_pfn * PAGE_SIZE as each zone's starting address, together with the number of present pages for its size.
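For example, the rough sketch below prints each zone's approximate physical start address and size. It assumes 4 KiB pages (the x86-64 default) and the zoneinfo layout described above:

# Remember each zone's present-page count, then print the zone's start
# address and span when its start_pfn line appears.
awk '/^Node/     { zone = $2 " " $4 }
     /present/   { present[zone] = $2 }
     /start_pfn/ { printf "Node %-12s start=0x%x size=%d MiB\n",
                          zone, $2 * 4096, present[zone] / 256 }' /proc/zoneinfo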