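This article walks through building DPDK from source.
VM preparation
- VMware Fusion
- Ubuntu 18.04.5, 4 vCPUs, 4 GB RAM
Create the VM in VMware and add two network adapters:
- Network adapter 1: ens38, 192.168.179.16/24, used for SSH
- Network adapter 2: ens160, 172.20.0.4/24, the NIC that DPDK runs on
Configure the NIC
Modify the configuration of network adapter 1.
In the directory where the VM is stored, edit dpdk.vmwarevm/dpdk.vmx and change: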
ethernet0.virtualDev = "e1000"
to:
ethernet0.virtualDev = "vmxnet3"
This is needed because the VMware VMXNET3 adapter supports multiple queues. Verify the device type with:
$ lspci | grep Ethernet
02:06.0 Ethernet controller: Intel Corporation 82545EM Gigabit Ethernet Controller (Copper) (rev 01)
03:00.0 Ethernet controller: VMware VMXNET3 Ethernet Controller (rev 01)
Configure hugepages
Ubuntu
Edit /etc/default/grub and append the parameters below to GRUB_CMDLINE_LINUX_DEFAULT.
CentOS
Edit /etc/default/grub and append the parameters below to GRUB_CMDLINE_LINUX.
# Test environment
default_hugepagesz=2M hugepagesz=2M hugepages=1024 isolcpus=3,4
# Production: size to fit the workload
default_hugepagesz=1G hugepagesz=1G hugepages=16 isolcpus=5,8
After the change:
$ cat /etc/default/grub
GRUB_CMDLINE_LINUX_DEFAULT="... default_hugepagesz=2M hugepagesz=2M hugepages=1024 isolcpus=3,4"
Regenerate the GRUB configuration, then reboot so the new kernel parameters take effect:
# Ubuntu
grub-mkconfig -o /boot/grub/grub.cfg
# CentOS (BIOS boot)
grub2-mkconfig -o /boot/grub2/grub.cfg
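The manual edit of /etc/default/grub can also be scripted. The following is a minimal sketch, not part of the original setup: the function name append_grub_args is hypothetical, and it assumes the variable sits on a single double-quoted line and that the arguments contain no "/" or "&" characters, which are special in a sed replacement.

```shell
# Hypothetical helper: append kernel arguments to a quoted GRUB variable.
# Reads the grub defaults file on stdin and writes the edited text to stdout.
#   $1 = variable name, e.g. GRUB_CMDLINE_LINUX_DEFAULT
#   $2 = arguments to append (must not contain "/" or "&")
append_grub_args() {
    sed 's/^\('"$1"'="[^"]*\)"/\1 '"$2"'"/'
}

# Preview the change on a sample line without touching the real file.
printf '%s\n' 'GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"' |
    append_grub_args GRUB_CMDLINE_LINUX_DEFAULT 'hugepagesz=2M hugepages=1024'
```

To apply it for real, one option is to write the output to a temporary file, review it, and only then copy it over /etc/default/grub before regenerating grub.cfg.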
$ mount | grep hugepage
hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime)
$ cat /proc/meminfo | grep Huge
AnonHugePages: 0 kB
ShmemHugePages: 0 kB
FileHugePages: 0 kB
HugePages_Total: 1024
HugePages_Free: 1024
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
Hugetlb: 2097152 kB
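As a cross-check on the output above, HugePages_Total multiplied by Hugepagesize should match the Hugetlb line when only one page size is in use (1024 × 2048 kB = 2097152 kB here). A small sketch of that calculation follows; the function name hugepage_total_kb is made up for illustration.

```shell
# Hypothetical helper: compute total hugepage memory in kB from
# meminfo-formatted text on stdin (HugePages_Total x Hugepagesize).
hugepage_total_kb() {
    awk '/^HugePages_Total:/ { n = $2 }
         /^Hugepagesize:/    { sz = $2 }
         END                 { print n * sz }'
}

# On a live Linux system the input would be /proc/meminfo.
if [ -r /proc/meminfo ]; then
    hugepage_total_kb < /proc/meminfo
fi
```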
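Notes:
- If the configured hugepages do not take effect, or the number allocated differs from the configured value, check the vm.nr_hugepages parameter, which is set through sysctl: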
$ sysctl -a | grep hugepage
...
vm.hugepages_treat_as_movable = 0
vm.nr_hugepages = 0
vm.nr_hugepages_mempolicy = 0
vm.nr_overcommit_hugepages = 0
# Set the value persistently
vi /etc/sysctl.conf
vm.nr_hugepages = 10
# Apply
sysctl -p
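vm.nr_hugepages is a page count, not a size, so the right value depends on the hugepage size. The sketch below converts a memory budget into a page count; hugepages_needed is a hypothetical helper and the numbers are only examples.

```shell
# Hypothetical helper: number of hugepages needed to cover a memory budget.
#   $1 = memory to reserve, in MB
#   $2 = hugepage size in kB (the Hugepagesize value from /proc/meminfo)
# Rounds up so the budget is always covered.
hugepages_needed() {
    echo $(( ($1 * 1024 + $2 - 1) / $2 ))
}

hugepages_needed 2048 2048       # 2 GB with 2 MB pages
hugepages_needed 16384 1048576   # 16 GB with 1 GB pages

# The result could then be applied at runtime as root, for example:
#   sysctl -w vm.nr_hugepages="$(hugepages_needed 2048 2048)"
```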
CPU check
- The CPU must be Intel x86_64; verify with the lscpu command.
Check whether the NIC supports multiple queues
$ cat /proc/interrupts | grep ens160
CPU0 CPU1 CPU2 CPU3
56: 20 55 0 0 PCI-MSI 1572864-edge ens160-rxtx-0
57: 0 5 28 2 PCI-MSI 1572865-edge ens160-rxtx-1
58: 0 9 0 27 PCI-MSI 1572866-edge ens160-rxtx-2
59: 7 0 0 0 PCI-MSI 1572867-edge ens160-rxtx-3
60: 0 0 0 0 PCI-MSI 1572868-edge ens160-event-4
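The entries named ens160-rxtx-<number> (for example PCI-MSI 1572867-edge ens160-rxtx-3) show that ens160 supports multiple queues: each RX/TX queue has its own interrupt vector.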
$ dmesg |grep -i ens160
[ 3.997711] vmxnet3 0000:03:00.0 ens160: renamed from eth0
[ 6.809998] vmxnet3 0000:03:00.0 ens160: intr type 3, mode 0, 5 vectors allocated
[ 6.810372] vmxnet3 0000:03:00.0 ens160: NIC Link is Up 10000 Mbps
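The queue check can be reduced to a single number by counting the rxtx entries for an interface. This is a small sketch; count_rxtx_queues is a hypothetical name, and it assumes the driver names its vectors <interface>-rxtx-<n> as vmxnet3 does in the output above.

```shell
# Hypothetical helper: count per-queue interrupt vectors for an interface.
# Reads /proc/interrupts-formatted text on stdin.
#   $1 = interface name, e.g. ens160
count_rxtx_queues() {
    # grep -c exits non-zero when the count is 0, so "|| true" keeps the
    # function usable under "set -e" while still printing 0.
    grep -c "[[:space:]]$1-rxtx-[0-9]" || true
}

# On a live system:
#   count_rxtx_queues ens160 < /proc/interrupts
```

For the listing shown earlier this reports 4, since the ens160-event-4 vector is not a data queue and is not counted.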
Check the kernel
- DPDK 19.11.7 (LTS) requires kernel version >= 3.16
$ uname -a
Linux xiexianbin-vm 5.4.0-67-generic #75~18.04.1-Ubuntu SMP Tue Feb 23 19:17:50 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
Check that glibc >= 2.7
$ ldd --version
ldd (GNU libc) 2.17
Copyright (C) 2012 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Written by Roland McGrath and Ulrich Drepper.
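Both checks compare dotted version numbers, where plain string comparison gives the wrong answer (as text, 2.17 sorts before 2.7). A minimal sketch using sort -V follows; version_ge is a hypothetical helper, and sort -V is assumed to be available, as it is in GNU coreutils.

```shell
# Hypothetical helper: succeed when version $1 >= version $2.
# sort -V orders version strings numerically per component, so the
# requirement is met when the minimum ($2) sorts first (or both are equal).
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Kernel check: strip the distribution suffix (e.g. 5.4.0-67-generic -> 5.4.0).
kernel=$(uname -r | cut -d- -f1)
if version_ge "$kernel" 3.16; then
    echo "kernel $kernel is new enough"
else
    echo "kernel $kernel is older than 3.16"
fi

# glibc check, using the last field of the first line of "ldd --version":
#   version_ge "$(ldd --version | head -n 1 | awk '{print $NF}')" 2.7
```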
Build DPDK
Install dependencies
# Ubuntu
apt install build-essential numactl libnuma-dev pkg-config libpcap0.8 libpcap-dev -y
# CentOS: the kernel-devel and kernel-headers versions must match the running kernel
yum install kernel-devel-$(uname -r) kernel-headers-$(uname -r) make gcc numactl numactl-devel libpcap libpcap-devel -y
Notes:
- libpcap: a packet capture library (a system-independent interface for user-level packet capture).
- libpcap-devel: needed to build and use the libpcap-based poll mode driver (PMD).
Source package
Download the source from https://core.dpdk.org/download/. This example uses DPDK 19.11.7 (LTS). The steps are:
wget https://fast.dpdk.org/rel/dpdk-19.11.7.tar.xz
tar xf dpdk-19.11.7.tar.xz
cd dpdk-stable-19.11.7/
To speed up the build, edit ./usertools/dpdk-setup.sh and change make install T=${RTE_TARGET} to make install T=${RTE_TARGET} -j 2, where 2 is the number of parallel jobs.
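The same edit can be made non-interactively. This sketch assumes the script contains the line exactly as quoted above, with nothing after ${RTE_TARGET}; add_parallel_jobs is a hypothetical helper that filters the script text from stdin to stdout.

```shell
# Hypothetical helper: append "-j N" to the "make install" line.
#   $1 = number of parallel jobs
# [$] matches a literal dollar sign; the final $ anchors the end of the line,
# so a line that already carries a -j option is left unchanged.
add_parallel_jobs() {
    sed 's/make install T=[$]{RTE_TARGET}$/& -j '"$1"'/'
}

# Preview on a sample line:
printf '%s\n' '    make install T=${RTE_TARGET}' | add_parallel_jobs 2

# Against the real script one might run, from the DPDK source directory:
#   add_parallel_jobs 2 < usertools/dpdk-setup.sh > /tmp/dpdk-setup.sh
```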
Build
Select [44] x86_64-native-linux-gcc. The session looks like this:
$ ./usertools/dpdk-setup.sh
------------------------------------------------------------------------------
RTE_SDK exported as /root/dpdk-stable-19.11.7
------------------------------------------------------------------------------
----------------------------------------------------------
Step 1: Select the DPDK environment to build
----------------------------------------------------------
[1] arm64-armada-linuxapp-gcc
[2] arm64-armada-linux-gcc
[3] arm64-armv8a-linuxapp-clang
[4] arm64-armv8a-linuxapp-gcc
[5] arm64-armv8a-linux-clang
[6] arm64-armv8a-linux-gcc
[7] arm64-bluefield-linuxapp-gcc
[8] arm64-bluefield-linux-gcc
[9] arm64-dpaa-linuxapp-gcc
[10] arm64-dpaa-linux-gcc
[11] arm64-emag-linuxapp-gcc
[12] arm64-emag-linux-gcc
[13] arm64-graviton2-linuxapp-gcc
[14] arm64-graviton2-linux-gcc
[15] arm64-n1sdp-linuxapp-gcc
[16] arm64-n1sdp-linux-gcc
[17] arm64-octeontx2-linuxapp-gcc
[18] arm64-octeontx2-linux-gcc
[19] arm64-stingray-linuxapp-gcc
[20] arm64-stingray-linux-gcc
[21] arm64-thunderx2-linuxapp-gcc
[22] arm64-thunderx2-linux-gcc
[23] arm64-thunderx-linuxapp-gcc
[24] arm64-thunderx-linux-gcc
[25] arm64-xgene1-linuxapp-gcc
[26] arm64-xgene1-linux-gcc
[27] arm-armv7a-linuxapp-gcc
[28] arm-armv7a-linux-gcc
[29] graviton2
[30] i686-native-linuxapp-gcc
[31] i686-native-linuxapp-icc
[32] i686-native-linux-gcc
[33] i686-native-linux-icc
[34] ppc_64-power8-linuxapp-gcc
[35] ppc_64-power8-linux-gcc
[36] x86_64-native-bsdapp-clang
[37] x86_64-native-bsdapp-gcc
[38] x86_64-native-freebsd-clang
[39] x86_64-native-freebsd-gcc
[40] x86_64-native-linuxapp-clang
[41] x86_64-native-linuxapp-gcc
[42] x86_64-native-linuxapp-icc
[43] x86_64-native-linux-clang
[44] x86_64-native-linux-gcc
[45] x86_64-native-linux-icc
[46] x86_x32-native-linuxapp-gcc
[47] x86_x32-native-linux-gcc
----------------------------------------------------------
Step 2: Setup linux environment
----------------------------------------------------------
[48] Insert IGB UIO module
[49] Insert VFIO module
[50] Insert KNI module
[51] Setup hugepage mappings for non-NUMA systems
[52] Setup hugepage mappings for NUMA systems
[53] Display current Ethernet/Baseband/Crypto device settings
[54] Bind Ethernet/Baseband/Crypto device to IGB UIO module
[55] Bind Ethernet/Baseband/Crypto device to VFIO module
[56] Setup VFIO permissions
----------------------------------------------------------
Step 3: Run test application for linux environment
----------------------------------------------------------
[57] Run test application ($RTE_TARGET/app/test)
[58] Run testpmd application in interactive mode ($RTE_TARGET/app/testpmd)
----------------------------------------------------------
Step 4: Other tools
----------------------------------------------------------
[59] List hugepage info from /proc/meminfo
----------------------------------------------------------
Step 5: Uninstall and system cleanup
----------------------------------------------------------
[60] Unbind devices from IGB UIO or VFIO driver
[61] Remove IGB UIO module
[62] Remove VFIO module
[63] Remove KNI module
[64] Remove hugepage mappings
[65] Exit Script
Option: 44
Configuration done using x86_64-native-linux-gcc
== Build lib
== Build lib/librte_kvargs
...
Build complete [x86_64-native-linux-gcc]
Installation cannot run with T defined and DESTDIR undefined
------------------------------------------------------------------------------
RTE_TARGET exported as x86_64-native-linux-gcc
------------------------------------------------------------------------------
Press enter to continue ...
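The build automatically creates the x86_64-native-linux-gcc directory to hold the files it produces. At this point the build is complete.
The DPDK pdump capture tool depends on the libpcap-based PMD. Enable it in the build configuration file, then rebuild:
$ vim dpdk-stable-19.11.7/x86_64-native-linux-gcc/.config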
CONFIG_RTE_LIBRTE_PMD_PCAP=y
CONFIG_RTE_LIBRTE_PDUMP=y
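The two options can also be switched on without opening an editor. The following is a sketch only: enable_config_option is a hypothetical helper that sets KEY=y in a .config-style file, rewriting an existing KEY= line or appending one if the key is absent.

```shell
# Hypothetical helper: force KEY=y in a .config-style file.
#   $1 = option name, e.g. CONFIG_RTE_LIBRTE_PMD_PCAP
#   $2 = path to the .config file
enable_config_option() {
    if grep -q "^$1=" "$2"; then
        # Rewrite the existing line via a temporary file (portable, no sed -i).
        sed "s/^$1=.*/$1=y/" "$2" > "$2.tmp" && mv "$2.tmp" "$2"
    else
        printf '%s=y\n' "$1" >> "$2"
    fi
}

# Example against the build directory used in this article:
#   cfg=~/dpdk-stable-19.11.7/x86_64-native-linux-gcc/.config
#   enable_config_option CONFIG_RTE_LIBRTE_PMD_PCAP "$cfg"
#   enable_config_option CONFIG_RTE_LIBRTE_PDUMP "$cfg"
```

After changing .config, run make in the x86_64-native-linux-gcc directory so the new options are compiled in.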
Directory layout after the build
$ cd ~/dpdk-stable-19.11.7/x86_64-native-linux-gcc
$ ll
total 88
drwxr-xr-x 7 root root 4096 Apr 4 19:33 ./
drwxrwxr-x 16 root root 4096 Apr 4 19:28 ../
drwxr-xr-x 2 root root 4096 Apr 4 19:39 app/ # built executables
drwxr-xr-x 7 root root 4096 Apr 4 19:38 build/ # intermediate build files
-rw-r--r-- 1 root root 21290 Apr 4 19:28 .config # build configuration file
-rw-r--r-- 1 root root 21290 Apr 4 19:28 .config.orig
drwxr-xr-x 3 root root 12288 Apr 4 19:38 include/ # header (.h) files
drwxr-xr-x 2 root root 4096 Apr 4 19:33 kmod/ # built kernel modules that may be needed
drwxr-xr-x 2 root root 4096 Apr 4 19:38 lib/ # built .a archives, used for static linking
-rw-r--r-- 1 root root 279 Apr 4 19:28 Makefile # after editing .config, run make here to rebuild
$ ll app/
total 220504
drwxr-xr-x 2 root root 4096 Apr 4 19:39 ./
drwxr-xr-x 7 root root 4096 Apr 4 19:33 ../
-rwxr-xr-x 1 root root 12612296 Apr 4 19:39 cmdline_test*
-rw-r--r-- 1 root root 5666677 Apr 4 19:39 cmdline_test.map
-rwxr-xr-x 1 root root 12654744 Apr 4 19:39 dpdk-pdump*
-rw-r--r-- 1 root root 5670443 Apr 4 19:39 dpdk-pdump.map
-rwxr-xr-x 1 root root 33312 Apr 4 19:33 dpdk-pmdinfogen*
-rwxr-xr-x 1 root root 12638888 Apr 4 19:38 dpdk-procinfo*
-rw-r--r-- 1 root root 5669974 Apr 4 19:38 dpdk-procinfo.map
-rwxr-xr-x 1 root root 12653464 Apr 4 19:39 dpdk-test-compress-perf*
-rw-r--r-- 1 root root 5675943 Apr 4 19:39 dpdk-test-compress-perf.map
-rwxr-xr-x 1 root root 12685848 Apr 4 19:39 dpdk-test-crypto-perf*
-rw-r--r-- 1 root root 5686128 Apr 4 19:39 dpdk-test-crypto-perf.map
-rwxr-xr-x 1 root root 12715664 Apr 4 19:39 dpdk-test-eventdev*
-rw-r--r-- 1 root root 5708201 Apr 4 19:39 dpdk-test-eventdev.map
-rwxr-xr-x 1 root root 16276480 Apr 4 19:38 test*
-rwxr-xr-x 1 root root 12624440 Apr 4 19:39 testacl*
-rw-r--r-- 1 root root 5663514 Apr 4 19:39 testacl.map
-rwxr-xr-x 1 root root 12777808 Apr 4 19:39 testbbdev*
-rw-r--r-- 1 root root 5673524 Apr 4 19:39 testbbdev.map
-rw-r--r-- 1 root root 6182810 Apr 4 19:38 test.map
-rwxr-xr-x 1 root root 12651312 Apr 4 19:39 testpipeline*
-rw-r--r-- 1 root root 5681291 Apr 4 19:39 testpipeline.map
-rwxr-xr-x 1 root root 13756576 Apr 4 19:38 testpmd*
-rw-r--r-- 1 root root 6071598 Apr 4 19:38 testpmd.map
-rwxr-xr-x 1 root root 12627968 Apr 4 19:39 testsad*
-rw-r--r-- 1 root root 5669310 Apr 4 19:39 testsad.map
$ ll kmod/
total 84
drwxr-xr-x 2 root root 4096 Apr 4 19:33 ./
drwxr-xr-x 7 root root 4096 Apr 4 10:01 ../
-rw-r--r-- 1 root root 20816 Apr 4 19:33 igb_uio.ko
-rw-r--r-- 1 root root 51944 Apr 4 19:33 rte_kni.ko
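Configure the NIC
Load the IGB UIO / VFIO modules
$ modprobe vfio
$ modprobe uio
A NIC used by a DPDK application must be unbound from its native Linux driver and rebound to one of the igb_uio, uio_pci_generic, or vfio-pci kernel modules. This article binds it to igb_uio.
Run ./usertools/dpdk-setup.sh:
- Select [48] Insert IGB UIO module to load the IGB UIO module. Use this option when the NIC is vmxnet3.
- Select [49] Insert VFIO module to load the VFIO module. Use this option when the NIC is e1000.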
Option: 48
Unloading any existing DPDK UIO module
Loading DPDK UIO module
Press enter to continue ...
...
Option: 49
Unloading any existing VFIO module
Loading VFIO module
chmod /dev/vfio
OK
Press enter to continue ...
Or load the modules manually:
$ cd x86_64-native-linux-gcc/kmod
$ lsmod | grep uio
# Load the UIO framework kernel module
$ modprobe uio
$ lsmod | grep uio
uio 20480 0
# Load the igb_uio kernel driver module
$ insmod igb_uio.ko
$ modprobe uio_pci_generic
$ modprobe vfio-pci
$ lsmod | grep uio
uio_pci_generic 16384 0
igb_uio 20480 0
uio 20480 2 igb_uio,uio_pci_generic
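A script that prepares the host may want to confirm a module is loaded before trying to bind a device. This is a minimal sketch; module_loaded is a hypothetical helper that inspects lsmod-formatted text, where the module name is the first column.

```shell
# Hypothetical helper: succeed if a module appears in lsmod output on stdin.
#   $1 = module name as lsmod prints it (underscores, e.g. igb_uio, vfio_pci)
module_loaded() {
    awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

# On a live system:
#   lsmod | module_loaded igb_uio && echo "igb_uio is loaded"
```

An exact match on the first column avoids false positives from the "Used by" column, where igb_uio also appears on the uio line.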
Bind the NIC
Bind the ens160 NIC to the new kernel driver module.
$ ./usertools/dpdk-devbind.py --help
Usage:
------
dpdk-devbind.py [options] DEVICE1 DEVICE2 ....
where DEVICE1, DEVICE2 etc, are specified via PCI "domain:bus:slot.func" syntax
or "bus:slot.func" syntax. For devices bound to Linux kernel drivers, they may
also be referred to by Linux interface name e.g. eth0, eth1, em0, em1, etc.
Options:
--help, --usage:
Display usage information and quit
-s, --status:
Print the current status of all known network, crypto, event
and mempool devices.
For each device, it displays the PCI domain, bus, slot and function,
along with a text description of the device. Depending upon whether the
device is being used by a kernel driver, the igb_uio driver, or no
driver, other relevant information will be displayed:
* the Linux interface name e.g. if=eth0
* the driver being used e.g. drv=igb_uio
* any suitable drivers not currently using that device
e.g. unused=igb_uio
NOTE: if this flag is passed along with a bind/unbind option, the
status display will always occur after the other operations have taken
place.
--status-dev:
Print the status of given device group. Supported device groups are:
"net", "baseband", "crypto", "event", "mempool" and "compress"
-b driver, --bind=driver:
Select the driver to use or "none" to unbind the device
-u, --unbind:
Unbind a device (Equivalent to "-b none")
--force:
By default, network devices which are used by Linux - as indicated by
having routes in the routing table - cannot be modified. Using the
--force flag overrides this behavior, allowing active links to be
forcibly unbound.
WARNING: This can lead to loss of network connection and should be used
with caution.
Examples:
---------
To display current device status:
dpdk-devbind.py --status
To display current network device status:
dpdk-devbind.py --status-dev net
To bind eth1 from the current driver and move to use igb_uio
dpdk-devbind.py --bind=igb_uio eth1
To unbind 0000:01:00.0 from using any driver
dpdk-devbind.py -u 0000:01:00.0
To bind 0000:02:00.0 and 0000:02:00.1 to the ixgbe kernel driver
dpdk-devbind.py -b ixgbe 02:00.0 02:00.1
- Before binding, all NICs are listed under Network devices using kernel driver:
$ ./usertools/dpdk-devbind.py --status
Network devices using kernel driver
===================================
0000:02:06.0 '82545EM Gigabit Ethernet Controller (Copper) 100f' if=ens38 drv=e1000 unused=igb_uio,vfio-pci,uio_pci_generic *Active*
0000:03:00.0 'VMXNET3 Ethernet Controller 07b0' if=ens160 drv=vmxnet3 unused=igb_uio,vfio-pci,uio_pci_generic
No 'Baseband' devices detected
==============================
No 'Crypto' devices detected
============================
No 'Eventdev' devices detected
==============================
No 'Mempool' devices detected
=============================
No 'Compress' devices detected
==============================
No 'Misc (rawdev)' devices detected
===================================
$ ip link set ens160 down
$ ./usertools/dpdk-devbind.py --bind=igb_uio ens160 # after this, ip a no longer lists ens160
- After binding, NIC 03:00.0 (the ID reported by lspci) is listed under Network devices using DPDK-compatible driver:
$ ./usertools/dpdk-devbind.py --status
Network devices using DPDK-compatible driver
============================================
0000:03:00.0 'VMXNET3 Ethernet Controller 07b0' drv=igb_uio unused=vmxnet3,vfio-pci,uio_pci_generic
Network devices using kernel driver
===================================
0000:02:06.0 '82545EM Gigabit Ethernet Controller (Copper) 100f' if=ens38 drv=e1000 unused=igb_uio,vfio-pci,uio_pci_generic *Active*
No 'Baseband' devices detected
==============================
No 'Crypto' devices detected
============================
No 'Eventdev' devices detected
==============================
No 'Mempool' devices detected
=============================
No 'Compress' devices detected
==============================
No 'Misc (rawdev)' devices detected
===================================
$ ll /dev/uio*
crw------- 1 root root 239, 0 Apr 4 10:21 /dev/uio0
$ ll /sys/class/uio/uio*/maps/
total 0
drwxr-xr-x 5 root root 0 Apr 4 10:22 ./
drwxr-xr-x 5 root root 0 Apr 4 10:21 ../
drwxr-xr-x 2 root root 0 Apr 4 10:22 map0/
drwxr-xr-x 2 root root 0 Apr 4 10:22 map1/
drwxr-xr-x 2 root root 0 Apr 4 10:22 map2/
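To confirm the binding from a script rather than by eye, the drv= field can be pulled out of the status listing. The helper below is a sketch with a hypothetical name, bound_driver; it assumes the line format shown in the --status output above, with the full PCI address in the first column.

```shell
# Hypothetical helper: print the driver a PCI device is bound to.
# Reads "dpdk-devbind.py --status" output on stdin.
#   $1 = full PCI address, e.g. 0000:03:00.0
# Prints nothing if the device is not listed or has no drv= field.
bound_driver() {
    awk -v dev="$1" '$1 == dev {
        for (i = 2; i <= NF; i++)
            if ($i ~ /^drv=/) { sub(/^drv=/, "", $i); print $i; exit }
    }'
}

# On a live system, from the DPDK source directory:
#   ./usertools/dpdk-devbind.py --status | bound_driver 0000:03:00.0
```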
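Unbind the NIC
Rebind the device to its original kernel driver, vmxnet3:
$ ./usertools/dpdk-devbind.py -b vmxnet3 03:00.0
Then bring the interface up again (ip link set ens160 up); it can be used normally.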
Verification
helloworld program
$ export RTE_SDK=/root/dpdk-stable-19.11.7/
$ export RTE_TARGET=x86_64-native-linux-gcc
$ cd examples/helloworld/
$ make # when finished, the executable is at ./build/app/helloworld
CC main.o
LD helloworld
INSTALL-APP helloworld
INSTALL-MAP helloworld.map
Run helloworld. Lines of the form hello from core x indicate that the test succeeded:
$ ./build/helloworld
EAL: Detected 4 lcore(s)
EAL: Detected 1 NUMA nodes
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'PA'
EAL: No available hugepages reported in hugepages-1048576kB
EAL: Probing VFIO support...
EAL: VFIO support initialized
EAL: PCI device 0000:02:06.0 on NUMA socket -1
EAL: Invalid NUMA socket, default to 0
EAL: probe driver: 8086:100f net_e1000_em
EAL: PCI device 0000:03:00.0 on NUMA socket -1
EAL: Invalid NUMA socket, default to 0
EAL: probe driver: 15ad:7b0 net_vmxnet3
hello from core 1
hello from core 2
hello from core 0
FAQ
Error: uio module not loaded
Running ./usertools/dpdk-setup.sh and selecting [48] Insert IGB UIO module reports:
## ERROR: Target does not have the DPDK UIO Kernel Module.
To fix, please try to rebuild target.
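To fix it, load the uio module first:
$ modprobe uio
Then run ./usertools/dpdk-setup.sh again to rebuild the target, and load the UIO driver after the build finishes.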
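Before re-running the menu it can help to check whether the module file was actually built. The sketch below assumes the error is raised when igb_uio.ko is missing from the target's kmod directory, which is where the build output listed earlier places it; igb_uio_path and have_igb_uio are hypothetical helper names.

```shell
# Hypothetical helper: expected location of igb_uio.ko for a build target.
#   $1 = SDK directory (RTE_SDK), $2 = target name (RTE_TARGET)
igb_uio_path() {
    printf '%s/%s/kmod/igb_uio.ko\n' "${1%/}" "$2"
}

# Hypothetical helper: succeed if the module file exists.
have_igb_uio() {
    [ -f "$(igb_uio_path "$1" "$2")" ]
}

# Example with the paths used in this article:
if have_igb_uio /root/dpdk-stable-19.11.7 x86_64-native-linux-gcc; then
    echo "igb_uio.ko found; it can be loaded"
else
    echo "igb_uio.ko missing; rebuild the target first"
fi
```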