Ubuntu KVM Host Setup for Juniper vMX

While looking at the Juniper vMX documentation recently, I noticed that support has now been added for Ubuntu 16.04 with kernel 4.4.0-62-generic (see Table 2). Since I personally prefer Debian-based systems over RHEL, I thought I would give this new release a try.

This guide will take you step by step through the process I used to install the vMX.

The steps below should be done in order.

Pre-Install Checks

Before you proceed with the install process, go into the BIOS for the server. I had to enable SR-IOV at the BIOS level as well as in the device configuration settings for the NICs. Also make sure that the CPU virtualisation extensions are enabled.
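Once an OS is running on the box you can confirm the extensions are actually exposed; the vmx/svm flags in /proc/cpuinfo are standard Linux, but this little wrapper is just a sketch of the check:

```shell
# Check for Intel VT-x (vmx) or AMD-V (svm) CPU flags.
if grep -q -E 'vmx|svm' /proc/cpuinfo; then
    echo "virtualisation extensions enabled"
else
    echo "virtualisation extensions NOT visible - check the BIOS"
fi
```

If the second message appears after you believed the BIOS was set correctly, re-check the CPU settings before continuing.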

Ubuntu Installation

Get a copy of the Ubuntu 16.04.5 server CD image (you can download it from the Ubuntu release page). The image should be named ubuntu-16.04.5-server-amd64.iso. Boot from the image and use these steps:

  1. At the boot selection screen for the CD image, select "Install Ubuntu Server".
  2. Select the language settings/keyboard settings. It is currently a requirement for the vmx.sh script that the locale is en_US.UTF-8. See this page for details about that requirement.
  3. Configure the network settings. These will be used for SSH access; you can change them later if required.
  4. Set up the user that will access the server after installation. I selected "No" when asked whether to encrypt the home directory.
  5. You may be prompted about the time zone. If so, select the appropriate time zone for your server.
  6. The "Partition disks" step should now show. Since I only need a very basic layout for the server (it is dedicated to running the vMX), I selected the "Guided - use entire disk" option. Unlike my previous guide for Ubuntu 14.04, there should be no issues if you decide to use LVM, thanks to the much newer kernel.
  7. Once the partitioning step is complete it should now show "Installing the system...". Once Ubuntu is installed you can continue to the next step.
  8. You may be prompted to enter proxy information for apt. If you are using a proxy, enter the appropriate settings; otherwise continue.
  9. Apt will automatically download and upgrade any packages with newer versions available (as long as the system is connected to the internet).
  10. You should now be at the "Configuring tasksel" prompt. The first question is about automatic updates; I set this to "No automatic updates".
  11. At the prompt for "Software selection" select the following:
    • standard system utilities (default)
    • Virtual Machine host
    • OpenSSH server
  When you continue, the appropriate packages will be downloaded from the internet and installed.
  12. Install the GRUB boot loader and continue.

At this stage the Ubuntu installation is complete; the server should reboot and you will be taken to your freshly installed Ubuntu system, where the host configuration can begin.
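The locale requirement from step 2 can be verified once you log in; the `locale` command is standard, and the grep pattern below is just illustrative:

```shell
# Show the active locale settings; LANG should report en_US.UTF-8
locale | grep -E '^(LANG|LC_ALL)='
```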

Ubuntu Configuration

The host needs to be configured to work with the vMX before the vMX itself can be installed. All commands below are run as the root user.

Update System

First make sure the system is up to date - log in to the system and use apt-get to upgrade all packages:

apt-get update
apt-get dist-upgrade

Install Required Packages

The list of required packages is available here. Install them with apt-get:

apt-get install bridge-utils qemu-kvm libvirt-bin python python-netifaces vnc4server libyaml-dev python-yaml numactl libparted0-dev libpciaccess-dev libnuma-dev libyajl-dev libxml2-dev libglib2.0-dev libnl-3-dev python-pip python-dev libxslt1-dev

NOTE: One of the package requirements is libnl-dev. This package is not available for Ubuntu 16.04; use libnl-3-dev instead. The package libxslt-dev has also been renamed to libxslt1-dev.
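To confirm the key packages actually landed, a quick dpkg status check works; the subset of packages I am querying here is my own choice:

```shell
# Report install state for a few of the critical packages
for p in qemu-kvm libvirt-bin bridge-utils numactl; do
    if dpkg -s "$p" >/dev/null 2>&1; then
        echo "$p: installed"
    else
        echo "$p: MISSING"
    fi
done
```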

I also suggest installing chrony or openntpd to keep the time of the host in sync:

apt-get install chrony

As I am running this on Dell hardware (a Dell R730 specifically), I will also add the Dell OpenManage repository to apt and install the iDRAC Service Module (dcism); the instructions were taken from here:

echo 'deb http://linux.dell.com/repo/community/openmanage/911/xenial xenial main' | tee -a /etc/apt/sources.list.d/linux.dell.com.sources.list
gpg --keyserver pool.sks-keyservers.net --recv-key 1285491434D8786F
gpg -a --export 1285491434D8786F | apt-key add -
apt-get update
apt-get install dcism
service dcismeng start

KVM Huge Pages/KSM

As I will be running the vMX with a large amount of memory (80GB for the vFP and 16GB for the two routing engines), I need to enable huge pages. KSM should also be disabled for optimal performance. These two settings live in the /etc/default/qemu-kvm file and need to be set like this:

KSM_ENABLED=0
KVM_HUGEPAGES=1

APICv/PML

APICv and PML should be disabled for the kvm-intel module (the option line below also enables nested virtualisation):

echo 'options kvm-intel nested=1 enable_apicv=n pml=n' > /etc/modprobe.d/qemu-system-x86.conf
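After the next reboot (or after reloading kvm-intel) the module parameters can be checked under /sys. This is a tolerant sketch of that check; the fallback message is mine and covers the case where the module is not loaded:

```shell
# Expected values after the change: enable_apicv=N, pml=N, nested=Y
for p in enable_apicv pml nested; do
    f="/sys/module/kvm_intel/parameters/$p"
    if [ -r "$f" ]; then
        echo "$p=$(cat "$f")"
    else
        echo "$p=<kvm_intel not loaded>"
    fi
done
```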

Grub Configuration

The Grub configuration needs to be updated:

Edit the grub configuration file, /etc/default/grub. By default you will find the following two empty configuration options:

GRUB_CMDLINE_LINUX_DEFAULT=""
GRUB_CMDLINE_LINUX=""

Replace those two empty configuration settings with the new settings:

GRUB_CMDLINE_LINUX_DEFAULT="processor.max_cstates=1 idle=poll pcie_aspm=off intel_iommu=on"
GRUB_CMDLINE_LINUX="default_hugepagesz=1G hugepagesz=1G hugepages=96"

In the example above I am creating 96 x 1GB pages. You can adjust this to the appropriate setting for your environment.
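The arithmetic behind my figure of 96 is simply the memory plan from earlier: 80GB for the vFP plus 16GB for the two routing engines, and with 1GB pages the page count equals the total in GB. A trivial sketch (the variable names are mine):

```shell
# Hypothetical sizing helper: total guest memory in GB == number of 1G pages
VFP_GB=80           # vFP memory
RE_GB=16            # both routing engines combined
echo "hugepages=$((VFP_GB + RE_GB))"
```

Leave some host memory outside the huge page pool for the OS itself.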

After updating the above settings, the grub configuration needs to be rebuilt:

update-grub
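After the reboot later in this guide you can confirm the kernel picked the settings up; /proc/cmdline and /proc/meminfo are standard, though the exact grep patterns here are mine:

```shell
# The hugepages= option should appear on the kernel command line...
grep -o 'hugepages=[0-9]*' /proc/cmdline || echo "hugepages not on cmdline yet"
# ...and the pool counters should be visible in /proc/meminfo
grep -E '^(HugePages_Total|HugePages_Free|Hugepagesize)' /proc/meminfo
```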

Host Network Setup

This step is optional; in my environment the host network needs to be configured as follows.

The server I am setting up has this interface allocation plan:

Interface Name  Interface Type                       Description
eno1            10G SFP+ (Intel X710 - Onboard)      vMX (SR-IOV) NIC xe-0/0/0
eno2            10G SFP+ (Intel X710 - Onboard)      vMX (SR-IOV) NIC xe-0/0/1
eno3            10G SFP+ (Intel X710 - Onboard)      Host Management - bond0 (LACP)
eno4            10G SFP+ (Intel X710 - Onboard)      Host Management - bond0 (LACP)
enp130s0f0      10/100/1000 Copper (Intel I350-t)    Not Used
enp130s0f1      10/100/1000 Copper (Intel I350-t)    Not Used
enp5s0f0        10G SFP+ (Intel X710 - PCI Card)     vMX (SR-IOV) NIC xe-0/0/2
enp5s0f1        10G SFP+ (Intel X710 - PCI Card)     vMX (SR-IOV) NIC xe-0/0/3
enp5s0f2        10G SFP+ (Intel X710 - PCI Card)     Not Used
enp5s0f3        10G SFP+ (Intel X710 - PCI Card)     Not Used
enp7s0f0        10G SFP+ (Intel X710 - PCI Card)     vMX (SR-IOV) NIC xe-0/0/4
enp7s0f1        10G SFP+ (Intel X710 - PCI Card)     vMX (SR-IOV) NIC xe-0/0/5
enp7s0f2        10G SFP+ (Intel X710 - PCI Card)     Not Used
enp7s0f3        10G SFP+ (Intel X710 - PCI Card)     Not Used

bond0 will be configured as a trunk port with two tagged VLANs (VLAN 50 for host management and VLAN 51 for vMX management).

To keep things neat, I have separated the interface configurations into multiple files. My /etc/network/interfaces file is configured like this:

# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).

source /etc/network/interfaces.d/*

# The loopback network interface
auto lo
iface lo inet loopback

I have configured the interfaces used for the host in /etc/network/interfaces.d/interfaces-host:

# Interfaces that belong to the host (eg. for management)

## Physical Interfaces
### Built In - 10G SFP+
#### Uplink to ??
auto eno3
iface eno3 inet manual
  bond-master bond0
#### Uplink to ??
auto eno4
iface eno4 inet manual
  bond-master bond0

## Bond Interfaces
auto bond0
iface bond0 inet manual
  bond-mode 802.3ad
  bond-miimon 100
  bond-lacp-rate fast
  bond-slaves none

## VLAN Interfaces
### VLAN 50 - Host Management
auto bond0.50
iface bond0.50 inet static
  address x.x.x.x/28
  gateway x.x.x.y
  dns-nameservers x.x.x.x
  dns-search mydomain.com
### VLAN 51 - vMX Management
auto bond0.51
iface bond0.51 inet static
  address x.x.z.z/28

I have configured the interfaces assigned to the vMX in /etc/network/interfaces.d/interfaces-vmx:

# Interfaces belong to the vMX - These are passed through with SR-IOV

## Physical Interfaces
### Built In - 10G SFP+
#### Uplink to ?? - Assigned to vMX xe-0/0/0
auto eno1
iface eno1 inet manual
#### Uplink to ?? - Assigned to vMX xe-0/0/1
auto eno2
iface eno2 inet manual
### PCI-e Card 1 - 10G SFP+
#### Uplink to ?? - Assigned to vMX xe-0/0/2
auto enp5s0f0
iface enp5s0f0 inet manual
#### Uplink to ?? - Assigned to vMX xe-0/0/3
auto enp5s0f1
iface enp5s0f1 inet manual
### PCI-e Card 2 - 10G SFP+
#### Uplink to ?? - Assigned to vMX xe-0/0/4
auto enp7s0f0
iface enp7s0f0 inet manual
#### Uplink to ?? - Assigned to vMX xe-0/0/5
auto enp7s0f1
iface enp7s0f1 inet manual

The remaining interfaces are configured in /etc/network/interfaces.d/interfaces-unassigned:

# These interfaces are not yet assigned.

## Physical Interfaces
### Built In - 10/100/1000 Copper
#### Not Assigned
auto enp130s0f0
iface enp130s0f0 inet manual
#### Not Assigned
auto enp130s0f1
iface enp130s0f1 inet manual
### PCI-e Card 1 - 10G SFP+
#### Not Assigned
auto enp5s0f2
iface enp5s0f2 inet manual
#### Not Assigned
auto enp5s0f3
iface enp5s0f3 inet manual
### PCI-e Card 2 - 10G SFP+
#### Not Assigned
auto enp7s0f2
iface enp7s0f2 inet manual
#### Not Assigned
auto enp7s0f3
iface enp7s0f3 inet manual

I have configured all unassigned interfaces as inet manual, as they are handy to have in the up state with no configuration for troubleshooting purposes.

This configuration will be applied when I reboot the host.
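After the reboot, the bond state can be checked through the kernel's bonding proc interface. The path is standard; the fallback message in this sketch is mine:

```shell
# Show LACP bond health: mode, link state of the bond and each slave
f=/proc/net/bonding/bond0
if [ -r "$f" ]; then
    grep -E 'Bonding Mode|MII Status|Slave Interface' "$f"
else
    echo "bond0 not present (check that the bonding module is loaded)"
fi
```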

Remove Default Bridge Interface

By default a bridge interface is created (virbr0 and virbr0-nic). This will not be used by the vMX and in some instances it can stop the vMX from starting. You can stop the interface and disable it from starting on boot like this:

virsh net-destroy default
virsh net-autostart default --disable
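You can confirm the default network is now inactive with virsh. The fallback echo is only there so the sketch degrades gracefully on a host without libvirt:

```shell
# List all libvirt networks; "default" should show inactive with no autostart
if command -v virsh >/dev/null 2>&1; then
    virsh net-list --all || echo "virsh present but libvirtd not reachable"
else
    echo "virsh not installed"
fi
```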

Reboot

At this stage you should reboot the host to apply the changes made above. It is important that the host is rebooted before continuing. Rebooting also ensures that the running kernel is the latest version (it will have been upgraded by apt-get dist-upgrade in the first steps above).

Intel NIC Drivers

The Juniper vMX deployment documentation says to upgrade the drivers for the Intel X710 NICs (here).

Before upgrading the drivers, the vMX archive file must be present on the server. In my case I have extracted the archive into /home/vMX-18.3R1-S1. The vMX archive contains the Juniper-modified copy of the i40e driver.

i40e

Change the working directory to drivers/i40e-2.1.26/src inside the vMX archive and install the driver:

cd /home/vMX-18.3R1-S1/drivers/i40e-2.1.26/src
make install

If you encounter an SSL message like this, you can safely ignore it (the kernel's module signing key is simply not present, so the module is installed unsigned):

make[1]: Entering directory '/usr/src/linux-headers-4.4.0-140-generic'
  INSTALL /home/vMX-18.3R1-S1/drivers/i40e-2.1.26/src/i40e.ko
At main.c:222:
- SSL error:02001002:system library:fopen:No such file or directory: bss_file.c:175
- SSL error:2006D080:BIO routines:BIO_new_file:no such file: bss_file.c:178
sign-file: certs/signing_key.pem: No such file or directory
  DEPMOD  4.4.0-140-generic
make[1]: Leaving directory '/usr/src/linux-headers-4.4.0-140-generic'

i40evf

The latest version of the i40evf driver on the Intel site is 3.6.10 at the time of writing (available here). To install the i40evf driver:

cd /usr/local/src
wget https://downloadmirror.intel.com/24693/eng/i40evf-3.6.10.tar.gz
tar zxvf i40evf-3.6.10.tar.gz
cd i40evf-3.6.10/src
make install
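modinfo can be used to confirm which driver versions are now staged on disk. The module names i40e and i40evf are the real Intel ones; the fallback text in this sketch is mine:

```shell
# Print the on-disk version of each Intel driver module, if present
for m in i40e i40evf; do
    modinfo "$m" 2>/dev/null | grep -m1 '^version:' \
        || echo "$m: module info not available"
done
```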

NOTE: Since the time of writing there has been a new release of the i40evf driver. I have not tried the latest release but have heard of issues compiling it - I have archived the 3.6.10 release here if you would like to try the release I used.

initrd

The new drivers need to be added to the initrd image:

update-initramfs -u -k `uname -r`

Potential Problems

NIC Issues

When deploying a server I ran into an issue where the onboard Intel X710 NICs were working fine but the X710 PCI-e cards were not. Strangely, the onboard Intel I350 based ports were not working either. From the host, the link status looked like this:

root@server:~# ip link | egrep "^[0-9].*"
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1
2: enp130s0f0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN mode DEFAULT group default qlen 1000
3: enp130s0f1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN mode DEFAULT group default qlen 1000
4: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq portid 246e96920720 state UP mode DEFAULT group default qlen 1000
5: eno2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq portid 246e96920722 state UP mode DEFAULT group default qlen 1000
6: eno3: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 portid 246e96920724 state UP mode DEFAULT group default qlen 1000
7: eno4: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 portid 246e96920726 state UP mode DEFAULT group default qlen 1000
8: enp5s0f0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq portid f8f21e037500 state DOWN mode DEFAULT group default qlen 1000
9: enp5s0f1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq portid f8f21e037502 state DOWN mode DEFAULT group default qlen 1000
10: enp5s0f2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq portid f8f21e037504 state DOWN mode DEFAULT group default qlen 1000
11: enp5s0f3: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq portid f8f21e037506 state DOWN mode DEFAULT group default qlen 1000
12: enp7s0f0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq portid 3cfdfe311fe0 state DOWN mode DEFAULT group default qlen 1000
13: enp7s0f1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq portid 3cfdfe311fe2 state DOWN mode DEFAULT group default qlen 1000
...

I verified that the above NICs were all cabled correctly, so the link status should have been up. Checking the SFP+ module diagnostics I could see that the RX/TX power values looked fine. I had also verified that the optics being used were Intel coded (they were tested in another server with X710s with no issues).
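For reference, this is roughly how the SFP+ module diagnostics can be pulled; `ethtool -m` is the real flag for dumping module EEPROM/DOM data, but the interface name and grep pattern here are just examples:

```shell
# Dump optical module diagnostics (rx/tx power, temperature) for one port
IFACE=enp5s0f0      # example interface - substitute your own
ethtool -m "$IFACE" 2>/dev/null | grep -iE 'power|temperature' \
    || echo "no module EEPROM data readable for $IFACE"
```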

After checking the differences between this server and another working server with the same setup, I found that the NIC firmware on the problem server was newer.

Huge Pages Issue

When deploying the vMX I ran into this error:

Check libvirt support for hugepages...............[Failed]
ls: cannot access '/HugePage_vPFE/libvirt': No such file or directory
Error! Try restarting libvirt

I verified that huge pages support was indeed enabled, so the problem did not appear to be a configuration issue. I restarted the libvirt-bin service and attempted the install again with no problems.
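For completeness, the restart itself; on Ubuntu 16.04 the service is named libvirt-bin, and the fallbacks below just let the sketch degrade gracefully on other setups:

```shell
# Restart libvirt; try systemd first, then SysV-style, then give up politely
systemctl restart libvirt-bin 2>/dev/null \
    || service libvirt-bin restart 2>/dev/null \
    || echo "could not restart libvirt-bin on this host"
```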