NOTE: I originally published this page in 2018; instructions may now be out of date.
While looking at the Juniper vMX documentation recently, I noticed that support has now been added for Ubuntu 16.04 with kernel 4.4.0-62-generic (see Table 2). Since I personally prefer Debian-based systems over RHEL-based ones, I thought I would give this new release a try.
This guide will take you step by step through the process I used to install the vMX, but it makes the following assumptions:
- You want to use SR-IOV for better performance. This guide uses the exact steps I used to get SR-IOV working on Intel XL710 NICs.
- The Ubuntu release you are installing is 16.04. Other versions may have problems that I have not covered.
- The Junos release for the vMX you are installing must be 18.2 or above. For this guide I am installing 18.3R1-S1.
- The host you are setting up has no data on it. This guide will take you through the Ubuntu install process, which means any existing data will be wiped. I recommend starting from scratch, as previous configurations may cause issues. Once you have completed this guide you can then proceed to the Juniper vMX KVM installation guide.
The steps below should be done in order.
Pre-Install Checks
Before you proceed with the install process, go into the BIOS for the server. I had to enable SR-IOV at the BIOS level as well as in the device configuration settings for the NICs. Also make sure that the CPU virtualisation extensions are enabled.
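If you want to sanity-check these BIOS settings from the operating system (either from a live environment or once Ubuntu is installed), the following is a minimal sketch; the interface name eno1 is just an example from my own hardware, and the sriov_totalvfs file only exists for SR-IOV capable devices:

```
# A non-zero count means the CPU virtualisation extensions (VT-x/AMD-V) are visible
egrep -c '(vmx|svm)' /proc/cpuinfo

# Shows how many virtual functions an SR-IOV capable NIC can expose
# (replace eno1 with one of your own interface names)
cat /sys/class/net/eno1/device/sriov_totalvfs
```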
Ubuntu Installation
Get a copy of the Ubuntu 16.04.5 server CD image (you can download it from the Ubuntu release page). The CD image should be named ubuntu-16.04.5-server-amd64.iso. Boot the server from the image and use these steps:
- At the boot selection screen for the CD image, select “Install Ubuntu Server”.
- Select the language and keyboard settings. It is currently a requirement of the vmx.sh script that the locale is en_US.UTF-8 (see this page for details about that requirement); a quick way to verify this after installation is shown below.
- Configure the network settings. These will be used for SSH access; you can change them later if required.
- Set up the user that will access the server after installation. I selected “No” when asked whether to encrypt the home directory.
- You may be prompted about the time zone. If so, select the appropriate time zone for your server.
- The “Partition disks” step should now appear. Since I only need a very basic layout for the server (it is dedicated to running the vMX), I selected the “Guided – use entire disk” option. Unlike my previous guide for Ubuntu 14.04, there should be no issues if you decide to use LVM, thanks to the much newer kernel.
- Once the partitioning step is complete, the installer will show “Installing the system…”. Once Ubuntu is installed you can continue to the next step.
- You may be prompted to enter proxy information for apt. If you are using a proxy, enter the appropriate settings; otherwise just continue.
- Apt will automatically download and upgrade any packages with newer versions available (as long as the system is connected to the internet).
- You should now be at the “Configuring tasksel” prompt. The first question is about automatic updates; I set this to “No automatic updates”.
- At the prompt for “Software selection” select the following:
- standard system utilities (default)
- Virtual Machine host
- OpenSSH server
When you continue, the appropriate packages will be downloaded from the internet and installed.
- Install the GRUB boot loader and continue.
At this stage the Ubuntu installation is complete. The server should reboot and you will be taken to your freshly installed Ubuntu system, where the host configuration can begin.
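Once you log in for the first time, it is also worth confirming the en_US.UTF-8 locale requirement mentioned in the installation steps above. A quick check, plus the command I would use to correct it if needed (log out and back in after changing it):

```
# Show the current locale settings - LANG should be en_US.UTF-8
locale

# If it is not, set it and log in again
update-locale LANG=en_US.UTF-8
```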
Ubuntu Configuration
The host needs to be configured before you can install the vMX. All commands below are run as the root user.
Update System
First make sure the system is up to date – log in to the system and use apt-get to upgrade all packages:

```
apt-get update
apt-get dist-upgrade
```
Install Required Packages
The list of required packages is available here. Install them with apt-get:

```
apt-get install bridge-utils qemu-kvm libvirt-bin python python-netifaces vnc4server \
  libyaml-dev python-yaml numactl libparted0-dev libpciaccess-dev libnuma-dev \
  libyajl-dev libxml2-dev libglib2.0-dev libnl-3-dev python-pip python-dev \
  libxslt-dev
```
NOTE: One of the package requirements is libnl-dev. This package is not available for Ubuntu 16.04; use libnl-3-dev instead. The package libxslt-dev has also been renamed to libxslt1-dev.
I also suggest installing chrony or openntpd to keep the time of the host in sync:

```
apt-get install chrony
```
As I am running this on Dell hardware (a Dell R730 specifically) I will also add the Dell OpenManage repository to apt and install the iDRAC Service Module (dcism); the instructions were taken from here:

```
echo 'deb http://linux.dell.com/repo/community/openmanage/911/xenial xenial main' | tee -a /etc/apt/sources.list.d/linux.dell.com.sources.list
gpg --keyserver pool.sks-keyservers.net --recv-key 1285491434D8786F
gpg -a --export 1285491434D8786F | apt-key add -
apt-get update
apt-get install dcism
service dcismeng start
```
KVM Huge Pages/KSM
As I will be running the vMX with a large amount of memory (80GB for the vFP and 16GB for the two routing engines) I need to enable huge pages. KSM should also be disabled for optimal performance. These two settings live in the /etc/default/qemu-kvm file and need to be set like this:

```
KSM_ENABLED=0
KVM_HUGEPAGES=1
```
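These settings only take effect once the host has been rebooted (see the Reboot section below). A quick way to confirm KSM is actually off afterwards:

```
# 0 means KSM is disabled
cat /sys/kernel/mm/ksm/run
```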
APICv/PML
APIC virtualisation (APICv) and Page Modification Logging (PML) should be disabled for the kvm-intel module (the options line below also enables nested virtualisation via nested=1):

```
echo 'options kvm-intel nested=1 enable_apicv=n pml=n' > /etc/modprobe.d/qemu-system-x86.conf
```
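This also only takes effect once the kvm-intel module is reloaded, which the reboot later in this guide takes care of. If you want to confirm the options afterwards, the module parameters can be read back from sysfs:

```
# Expect N for APICv and PML (disabled) and Y for nested virtualisation
cat /sys/module/kvm_intel/parameters/enable_apicv
cat /sys/module/kvm_intel/parameters/pml
cat /sys/module/kvm_intel/parameters/nested
```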
Grub configuration
The Grub configuration needs to be updated:
- Number and size of huge pages need to be configured
- PCIe ASPM should be disabled
- Processor c-state should be limited to C1 for performance
- IOMMU should be enabled
Edit the grub configuration file, /etc/default/grub. By default you will find the following two empty configuration options:

```
GRUB_CMDLINE_LINUX_DEFAULT=""
GRUB_CMDLINE_LINUX=""
```

Replace those two empty configuration settings with the new settings:

```
GRUB_CMDLINE_LINUX_DEFAULT="processor.max_cstates=1 idle=poll pcie_aspm=off intel_iommu=on"
GRUB_CMDLINE_LINUX="default_hugepagesz=1G hugepagesz=1G hugepages=96"
```
In the example above I am creating 96 x 1GB pages – enough for the 80GB vFP and 16GB of routing engine memory mentioned earlier. You can adjust this to the appropriate setting for your environment.
After updating the above settings the grub configuration needs to be rebuilt:
```
update-grub
```
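After the reboot later in this guide you can verify the new kernel parameters, the huge page allocation and the IOMMU with something like this (the huge page counts will reflect whatever you configured above):

```
# Confirm the new kernel command line was applied
cat /proc/cmdline

# Confirm the 1GB huge pages were allocated
grep -i huge /proc/meminfo

# Confirm the Intel IOMMU was enabled (messages show up as DMAR/IOMMU)
dmesg | grep -i -e dmar -e iommu
```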
Host Network Setup
This step is optional. For my environment I need to configure the network for the host correctly:
- During the setup only a single interface was configured. Instead I need to configure 802.3ad (LACP) on two ports and create a bond interface – this will be used for management.
- The bond interface will need two VLANs – one for management of the server and one for management of the vMX. When following the installation instructions above, the two packages I need (ifenslave for the bond interface and vconfig for the VLAN interfaces) are installed by default; a quick check of the kernel side is shown below.
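A quick way to check the kernel side is to confirm the bonding and 802.1Q VLAN modules are available (both ship with the stock Ubuntu kernel):

```
# Both commands should print module information rather than an error
modinfo bonding | head -n 3
modinfo 8021q | head -n 3
```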
The server I am setting up has this interface allocation plan:
Interface Name | Interface Type | Description |
---|---|---|
eno1 | 10G SFP+ (Intel X710 – Onboard) | vMX (SR-IOV) NIC xe-0/0/0 |
eno2 | 10G SFP+ (Intel X710 – Onboard) | vMX (SR-IOV) NIC xe-0/0/1 |
eno3 | 10G SFP+ (Intel X710 – Onboard) | Host Management – bond0 (LACP) |
eno4 | 10G SFP+ (Intel X710 – Onboard) | Host Management – bond0 (LACP) |
enp130s0f0 | 10/100/1000 Copper (Intel I350-t) | Not Used |
enp130s0f1 | 10/100/1000 Copper (Intel I350-t) | Not Used |
enp5s0f0 | 10G SFP+ (Intel X710 – PCI Card) | vMX (SR-IOV) NIC xe-0/0/2 |
enp5s0f1 | 10G SFP+ (Intel X710 – PCI Card) | vMX (SR-IOV) NIC xe-0/0/3 |
enp5s0f2 | 10G SFP+ (Intel X710 – PCI Card) | Not Used |
enp5s0f3 | 10G SFP+ (Intel X710 – PCI Card) | Not Used |
enp7s0f0 | 10G SFP+ (Intel X710 – PCI Card) | vMX (SR-IOV) NIC xe-0/0/4 |
enp7s0f1 | 10G SFP+ (Intel X710 – PCI Card) | vMX (SR-IOV) NIC xe-0/0/5 |
enp7s0f2 | 10G SFP+ (Intel X710 – PCI Card) | Not Used |
enp7s0f3 | 10G SFP+ (Intel X710 – PCI Card) | Not Used |
bond0 will be configured as a trunk port with two tagged VLANs:

- bond0.50: Host management interface
- bond0.51: vMX management interface
To keep things neat, I have separated the interface configurations into several files. My /etc/network/interfaces file is configured like this:
```
# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).

source /etc/network/interfaces.d/*

# The loopback network interface
auto lo
iface lo inet loopback
```
I have configured the interfaces used for the host in /etc/network/interfaces.d/interfaces-host:
```
# Interfaces that belong to the host (eg. for management)

## Physical Interfaces

### Built In - 10G SFP+

#### Uplink to ??
auto eno3
iface eno3 inet manual
    bond-master bond0

#### Uplink to ??
auto eno4
iface eno4 inet manual
    bond-master bond0

## Bond Interfaces
auto bond0
iface bond0 inet manual
    bond-mode 802.3ad
    bond-miimon 100
    bond-lacp-rate 4
    bond-slaves none

## VLAN Interfaces

### VLAN 50 - Host Management
auto bond0.50
iface bond0.50 inet static
    address x.x.x.x/28
    gateway x.x.x.y
    dns-nameservers x.x.x.x
    dns-search mydomain.com

### VLAN 51 - vMX Management
auto bond0.51
iface bond0.51 inet static
    address x.x.z.z/28
```
I have configured the interfaces assigned to the vMX in /etc/network/interfaces.d/interfaces-vmx:
```
# Interfaces belong to the vMX - These are passed through with SR-IOV

## Physical Interfaces

### Built In - 10G SFP+

#### Uplink to ?? - Assigned to vMX xe-0/0/0
auto eno1
iface eno1 inet manual

#### Uplink to ?? - Assigned to vMX xe-0/0/1
auto eno2
iface eno2 inet manual

### PCI-e Card 1 - 10G SFP+

#### Uplink to ?? - Assigned to vMX xe-0/0/2
auto enp5s0f0
iface enp5s0f0 inet manual

#### Uplink to ?? - Assigned to vMX xe-0/0/3
auto enp5s0f1
iface enp5s0f1 inet manual

### PCI-e Card 2 - 10G SFP+

#### Uplink to ?? - Assigned to vMX xe-0/0/4
auto enp7s0f0
iface enp7s0f0 inet manual

#### Uplink to ?? - Assigned to vMX xe-0/0/5
auto enp7s0f1
iface enp7s0f1 inet manual
```
The remaining interfaces are configured in /etc/network/interfaces.d/interfaces-unassigned:
```
# These interfaces are not yet assigned.

## Physical Interfaces

### Built In - 10/100/1000 Copper

#### Not Assigned
auto enp130s0f0
iface enp130s0f0 inet manual

#### Not Assigned
auto enp130s0f1
iface enp130s0f1 inet manual

### PCI-e Card 1 - 10G SFP+

#### Not Assigned
auto enp5s0f2
iface enp5s0f2 inet manual

#### Not Assigned
auto enp5s0f3
iface enp5s0f3 inet manual

### PCI-e Card 2 - 10G SFP+

#### Not Assigned
auto enp7s0f2
iface enp7s0f2 inet manual

#### Not Assigned
auto enp7s0f3
iface enp7s0f3 inet manual
```
I have configured all unassigned interfaces as inet manual; they are handy to have up (with no configuration) for troubleshooting purposes.
This configuration will be applied when I reboot the host.
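Once the host has been rebooted, the bond and VLAN interfaces can be checked like this (interface names as per the allocation plan above):

```
# Show the LACP state and which slave interfaces have joined the bond
cat /proc/net/bonding/bond0

# Show the VLAN details for the two management interfaces
ip -d link show bond0.50
ip -d link show bond0.51
```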
Remove default bridge interface
By default a bridge interface will be created (virbr0 and virbr0-nic). This will not be used by the vMX, and in some instances it can stop the vMX from starting. You can stop the interface and disable it from starting on boot like this:

```
virsh net-destroy default
virsh net-autostart default --disable
```
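To confirm the default network is now stopped and will not return on boot:

```
# The default network should show as inactive with autostart disabled
virsh net-list --all
```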
Reboot
At this stage you should reboot the host to apply the changes made above. It is important that the host is rebooted before continuing. Rebooting also ensures that the running kernel is the latest version (it will have been upgraded by the apt-get dist-upgrade in the first steps above).
Intel NIC Drivers
The Juniper vMX deployment documentation says to upgrade the drivers for the Intel X710 NICs (here).
Before upgrading the drivers, the vMX archive must be present on the server. In my case I have extracted the archive into /home/vMX-18.3R1-S1. The vMX archive contains the Juniper-modified copy of the i40e driver.
i40e
Change working directory to drivers/i40e-2.1.26/src from the vMX archive and install the driver:

```
cd /home/vMX-18.3R1-S1/drivers/i40e-2.1.26/src
make install
```
If you encounter an SSL message like this you can safely ignore it:

```
make[1]: Entering directory '/usr/src/linux-headers-4.4.0-140-generic'
  INSTALL /home/vMX-18.3R1-S1/drivers/i40e-2.1.26/src/i40e.ko
At main.c:222:
- SSL error:02001002:system library:fopen:No such file or directory: bss_file.c:175
- SSL error:2006D080:BIO routines:BIO_new_file:no such file: bss_file.c:178
sign-file: certs/signing_key.pem: No such file or directory
  DEPMOD 4.4.0-140-generic
make[1]: Leaving directory '/usr/src/linux-headers-4.4.0-140-generic'
```
i40evf
The latest version of the i40evf driver on the Intel site is 3.6.10 at the time of writing (available here). To install the i40evf driver:

```
cd /usr/local/src
wget https://downloadmirror.intel.com/24693/eng/i40evf-3.6.10.tar.gz
tar zxvf i40evf-3.6.10.tar.gz
cd i40evf-3.6.10/src
make install
```
NOTE: Since the time of writing there has been a new release of the i40evf driver. I have not tried the latest release but have heard of issues compiling it – I have archived the 3.6.10 release here if you would like to use the release I used.
initrd
The new drivers need to be added to the initrd image:

```
update-initramfs -u -k `uname -r`
```
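After the next reboot it is worth confirming that the newly built modules are the ones actually in use, rather than the in-tree versions. Something like this works, using ethtool against one of the X710 ports (eno1 in my case):

```
# Check the module versions that will be loaded
modinfo i40e | grep -E '^(filename|version)'
modinfo i40evf | grep -E '^(filename|version)'

# Check the driver version bound to a physical NIC
ethtool -i eno1
```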
Potential Problems
NIC Issues
When deploying a server I ran into an issue where the onboard Intel X710 NICs were working fine but the X710 PCI-e cards were not. Strangely, the onboard Intel I350-based ports were not working either. From the host, the link status looked like this:
```
root@server:~# ip link | egrep "^[0-9].*"
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1
2: enp130s0f0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN mode DEFAULT group default qlen 1000
3: enp130s0f1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN mode DEFAULT group default qlen 1000
4: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq portid 246e96920720 state UP mode DEFAULT group default qlen 1000
5: eno2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq portid 246e96920722 state UP mode DEFAULT group default qlen 1000
6: eno3: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 portid 246e96920724 state UP mode DEFAULT group default qlen 1000
7: eno4: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 portid 246e96920726 state UP mode DEFAULT group default qlen 1000
8: enp5s0f0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq portid f8f21e037500 state DOWN mode DEFAULT group default qlen 1000
9: enp5s0f1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq portid f8f21e037502 state DOWN mode DEFAULT group default qlen 1000
10: enp5s0f2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq portid f8f21e037504 state DOWN mode DEFAULT group default qlen 1000
11: enp5s0f3: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq portid f8f21e037506 state DOWN mode DEFAULT group default qlen 1000
12: enp7s0f0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq portid 3cfdfe311fe0 state DOWN mode DEFAULT group default qlen 1000
13: enp7s0f1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq portid 3cfdfe311fe2 state DOWN mode DEFAULT group default qlen 1000
...
```
I verified that the above NICs were all cabled correctly, so the link status should have been up. Checking the SFP+ module diagnostics, I could see that the RX/TX values looked fine. I had also verified that the optics were Intel-coded (they had been tested in another server with X710s with no issue).
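For reference, the SFP+ module diagnostics mentioned above can be read with ethtool, assuming the optic and driver expose them:

```
# Dump the SFP+ module diagnostics (vendor, RX/TX power, temperature, etc.)
ethtool -m enp5s0f0
```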
After comparing this server with another working server with the same setup, I found that the NIC firmware was newer. This is a Dell server, and the firmware versions that were not working were:

- Intel I350: 18.8.9
- Intel X710: 18.8.9

I first attempted downgrading only the X710 firmware to 18.3.6. After the downgrade the issue was still occurring for both the X710 and I350 NICs. I then downgraded the I350 firmware to the same release, and after a reboot all NICs were working without an issue.
Huge Pages Issue
When deploying the vMX I ran into this error:
```
Check libvirt support for hugepages...............[Failed]
ls: cannot access '/HugePage_vPFE/libvirt': No such file or directory
Error! Try restarting libvirt
```
I verified that huge pages support was indeed enabled, so the problem didn't appear to be a configuration issue. I restarted the libvirt-bin service and attempted the install again with no problems.
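For reference, the restart was just the standard service restart. If you want to double-check that libvirt itself can see the 1GB pages, the host capabilities output is one place to look:

```
# Restart libvirt (Ubuntu 16.04 ships it as the libvirt-bin service)
systemctl restart libvirt-bin

# Optionally confirm libvirt reports 1GB (1048576 KiB) page support
virsh capabilities | grep -i 'pages'
```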