An introduction to Kubernetes flannel networking.
Introduction
flannel is a network plugin for Kubernetes that supports several backends:
- VxLAN, with two modes
  - VxLAN: packets are encapsulated in a tunnel overlay
  - DirectRouting: when source and destination are on the same layer-2 network, packets are forwarded directly; across layer-3 boundaries the tunnel overlay is used instead
- host-gw (host gateway): drawback: it works only when all nodes share a layer-2 network (where broadcast traffic can become a burden); layer-3 topologies are not supported
- UDP: userspace encapsulation; slow, mainly useful for debugging
flannel configuration parameters (an example combining them follows this list):
- Backend: vxlan, host-gw, or udp
- Network: the CIDR network flannel uses to assign Pod addresses; defaults to 10.244.0.0/16, split into per-node subnets:
  - master: 10.244.0.0/24
  - node-1: 10.244.1.0/24
  - ...
  - node-255: 10.244.255.0/24
- SubnetLen: the prefix length used when splitting Network into per-node subnets; defaults to 24 (i.e. at most 254 usable Pod addresses per node)
- SubnetMin: the lowest subnet that may be assigned to a node, e.g. 10.244.20.0/24
- SubnetMax: the highest subnet that may be assigned to a node, e.g. 10.244.200.0/24
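For illustration, a net-conf.json combining these parameters might look like the following sketch (the SubnetMin/SubnetMax values are just the examples from above, not values from this cluster):
{
  "Network": "10.244.0.0/16",
  "SubnetLen": 24,
  "SubnetMin": "10.244.20.0",
  "SubnetMax": "10.244.200.0",
  "Backend": {
    "Type": "vxlan"
  }
}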
Installation
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
After flannel starts, each node's network configuration is written to /run/flannel/subnet.env.
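A sketch of its typical contents (values differ per node; this assumes the master's 10.244.0.0/24 subnet):
$ cat /run/flannel/subnet.env
FLANNEL_NETWORK=10.244.0.0/16
FLANNEL_SUBNET=10.244.0.1/24
FLANNEL_MTU=1450
FLANNEL_IPMASQ=true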
- kube-flannel-ds CrashLoopBackOff
Inspecting kube-flannel.yml shows that it pulls quay.io/coreos/flannel:v0.13.1-rc1, and quay.io is currently unreachable from mainland China. There are two ways around this:
- change quay.io/coreos/flannel to qcoreos/flannel
- download flanneld-v0.13.1-rc1-amd64.docker from https://github.com/coreos/flannel/releases and load it into docker:
docker load < flanneld-v0.13.1-rc1-amd64.docker
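After loading, it is worth confirming that the image name and tag match what the DaemonSet references (assuming the release archive was saved with its original quay.io tag):
$ docker images | grep flannel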
Environments installed with kubeadm
flannel runs as a DaemonSet by default:
$ kubectl -n kube-system get daemonsets
NAME              DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR            AGE
kube-flannel-ds   2         2         2       2            2           <none>                   6d8h
kube-proxy        2         2         2       2            2           kubernetes.io/os=linux   24d
root@k8s-master:~# kubectl -n kube-system get pods -o wide | grep kube-flannel
kube-flannel-ds-bgccx   1/1   Running   0   6d8h   172.20.0.82   k8s-node-1   <none>   <none>
kube-flannel-ds-z524b   1/1   Running   0   6d8h   172.20.0.81   k8s-master   <none>   <none>
$ kubectl -n kube-system get configmap | grep flannel
kube-flannel-cfg 2 6d8h
$ kubectl -n kube-system get configmap kube-flannel-cfg -o yaml
apiVersion: v1
data:
  cni-conf.json: |
    {
      "name": "cbr0",
      "cniVersion": "0.3.1",
      "plugins": [
        {
          "type": "flannel",
          "delegate": {
            "hairpinMode": true,
            "isDefaultGateway": true
          }
        },
        {
          "type": "portmap",
          "capabilities": {
            "portMappings": true
          }
        }
      ]
    }
  net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "Backend": {
        "Type": "vxlan"          # the default backend is VxLAN
      }
    }
kind: ConfigMap
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","data":{"cni-conf.json":"{\n \"name\": \"cbr0\",\n \"cniVersion\": \"0.3.1\",\n \"plugins\": [\n {\n \"type\": \"flannel\",\n \"delegate\": {\n \"hairpinMode\": true,\n \"isDefaultGateway\": true\n }\n },\n {\n \"type\": \"portmap\",\n \"capabilities\": {\n \"portMappings\": true\n }\n }\n ]\n}\n","net-conf.json":"{\n \"Network\": \"10.244.0.0/16\",\n \"Backend\": {\n \"Type\": \"vxlan\"\n }\n}\n"},"kind":"ConfigMap","metadata":{"annotations":{},"labels":{"app":"flannel","tier":"node"},"name":"kube-flannel-cfg","namespace":"kube-system"}}
  creationTimestamp: "2022-03-19T06:55:55Z"
  labels:
    app: flannel
    tier: node
  name: kube-flannel-cfg
  namespace: kube-system
  resourceVersion: "687997"
  uid: 00950b5a-1853-411b-9ff3-0d13fafa0449
Node network configuration
$ ip a
...
6: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default
    link/ether aa:f8:af:84:91:13 brd ff:ff:ff:ff:ff:ff
    inet 10.244.0.0/32 scope global flannel.1
       valid_lft forever preferred_lft forever
    inet6 fe80::a8f8:afff:fe84:9113/64 scope link
       valid_lft forever preferred_lft forever
7: cni0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default qlen 1000
    link/ether 66:bc:13:18:ef:9a brd ff:ff:ff:ff:ff:ff
    inet 10.244.0.1/24 brd 10.244.0.255 scope global cni0
       valid_lft forever preferred_lft forever
    inet6 fe80::64bc:13ff:fe18:ef9a/64 scope link
       valid_lft forever preferred_lft forever
...
Notes:
- flannel.1 has the unusual address 10.244.0.0/32 (the node subnet's network address with a host mask). Its MTU is 50 bytes less than the physical network's, reserving room for the VxLAN encapsulation headers of the tunnel overlay.
- cni0 has the address 10.244.0.1/24 and is the local bridge Pods attach to; it is created only once a container is running on the node.
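The VxLAN parameters behind flannel.1 can be inspected with ip -d link. A sketch of what this typically shows (VNI 1 and UDP port 8472 are flannel's defaults; exact output varies):
$ ip -d link show flannel.1
6: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN ...
    link/ether aa:f8:af:84:91:13 brd ff:ff:ff:ff:ff:ff
    vxlan id 1 local 172.20.0.81 dev ens33 srcport 0 0 dstport 8472 nolearning ...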
Packet capture verification
---
apiVersion: v1
kind: Service
metadata:
  name: hello-app
  namespace: default
spec:
  selector:
    app: hello-app
    release: canary
  type: ClusterIP
  ports:
  - name: port-80
    port: 80
    targetPort: 8080
    protocol: TCP
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-app-dp
  namespace: default
spec:
  replicas: 2
  selector:
    matchLabels:
      app: hello-app
      release: canary
  template:
    metadata:
      name: hello-app-pod
      labels:
        app: hello-app
        release: canary
    spec:
      containers:
      - name: hello-app
        image: gcriogooglesamples/hello-app:1.0
        ports:
        - name: http
          containerPort: 8080
$ kubectl apply -f deployment-hello-app.yaml
deployment.apps/hello-app-dp created
$ kubectl get pods -o wide
NAME                            READY   STATUS    RESTARTS   AGE     IP             NODE         NOMINATED NODE   READINESS GATES
hello-app-dp-665877bb77-5fgzh   1/1     Running   0          4m55s   10.244.2.2     k8s-node-2   <none>           <none>
hello-app-dp-665877bb77-lgghr   1/1     Running   0          4m55s   10.244.1.102   k8s-node-1   <none>           <none>
root@k8s-node-2:~# ip a
...
4: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default
    link/ether fa:4a:60:e2:25:30 brd ff:ff:ff:ff:ff:ff
    inet 10.244.2.0/32 scope global flannel.1
       valid_lft forever preferred_lft forever
    inet6 fe80::f84a:60ff:fee2:2530/64 scope link
       valid_lft forever preferred_lft forever
5: cni0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default qlen 1000
    link/ether 92:97:d1:6f:52:42 brd ff:ff:ff:ff:ff:ff
    inet 10.244.2.1/24 brd 10.244.2.255 scope global cni0
       valid_lft forever preferred_lft forever
    inet6 fe80::9097:d1ff:fe6f:5242/64 scope link
       valid_lft forever preferred_lft forever
6: veth96c8e0e4@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master cni0 state UP group default
    link/ether 7a:47:27:90:50:d6 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet6 fe80::7847:27ff:fe90:50d6/64 scope link
       valid_lft forever preferred_lft forever
root@k8s-node-2:~# brctl show cni0
bridge name   bridge id           STP enabled   interfaces
cni0          8000.9297d16f5242   no            veth96c8e0e4
root@k8s-master:~/manifests# kubectl exec -it hello-app-dp-665877bb77-5fgzh -- ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
3: eth0@if6: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1450 qdisc noqueue state UP
    link/ether ba:b4:d0:76:68:e3 brd ff:ff:ff:ff:ff:ff
    inet 10.244.2.2/24 brd 10.244.2.255 scope global eth0
       valid_lft forever preferred_lft forever
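The eth0@if6 suffix inside the Pod means eth0's peer is host interface index 6, i.e. veth96c8e0e4 on k8s-node-2 (whose own @if3 points back at the Pod side). One way to resolve such an index on the node (the awk filter is just a convenience sketch):
root@k8s-node-2:~# ip -o link | awk -F': ' '$1 == 6 {print $2}'
veth96c8e0e4@if3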
kubectl exec -it hello-app-dp-665877bb77-lgghr -- ping 10.244.2.2
- Capturing ICMP on the corresponding node's physical NIC shows no packets, because they have already been encapsulated in UDP:
tcpdump -i ens33 -nn icmp
- Capturing ICMP on the cni0 / flannel.1 interfaces of k8s-node-2 shows the packets as expected:
root@k8s-node-2:~# tcpdump -i cni0 -nn icmp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on cni0, link-type EN10MB (Ethernet), capture size 262144 bytes
19:41:24.466656 IP 10.244.1.102 > 10.244.2.2: ICMP echo request, id 12, seq 635, length 64
19:41:24.466699 IP 10.244.2.2 > 10.244.1.102: ICMP echo reply, id 12, seq 635, length 64
...
root@k8s-node-2:~# tcpdump -i flannel.1 -nn
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on flannel.1, link-type EN10MB (Ethernet), capture size 262144 bytes
20:08:01.265650 IP 10.244.1.102 > 10.244.2.2: ICMP echo request, id 12, seq 2361, length 64
20:08:01.265781 IP 10.244.2.2 > 10.244.1.102: ICMP echo reply, id 12, seq 2361, length 64
...
root@k8s-node-2:~# tcpdump -i ens33 -nn host 172.20.0.82
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ens33, link-type EN10MB (Ethernet), capture size 262144 bytes
20:09:29.717858 IP 172.20.0.83.7946 > 172.20.0.82.7946: UDP, length 56
20:09:29.719369 IP 172.20.0.82.7946 > 172.20.0.83.7946: UDP, length 49
20:09:30.025022 IP 172.20.0.82.33364 > 172.20.0.83.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.244.1.102 > 10.244.2.2: ICMP echo request, id 12, seq 2457, length 64
20:09:30.025124 IP 172.20.0.83.54053 > 172.20.0.82.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.244.2.2 > 10.244.1.102: ICMP echo reply, id 12, seq 2457, length 64
20:09:30.950787 IP 172.20.0.82.33364 > 172.20.0.83.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.244.1.102 > 10.244.2.2: ICMP echo request, id 12, seq 2458, length 64
20:09:30.950884 IP 172.20.0.83.54053 > 172.20.0.82.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.244.2.2 > 10.244.1.102: ICMP echo reply, id 12, seq 2458, length 64
...
The traffic path:
container NIC <--> cni0 <--> flannel.1 <--> ens33 (node-2) <--> ens33 (node-1)
On ens33 the overlay tunnel traffic appears as OTV: flannel's VxLAN runs on UDP port 8472 (the Linux kernel default rather than IANA's 4789), and tcpdump decodes that port as OTV, whose header layout matches.
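To see the encapsulated packets directly instead of filtering on ICMP, capture on the VxLAN UDP port (8472, matching the capture above):
root@k8s-node-2:~# tcpdump -i ens33 -nn udp port 8472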
Switching to VxLAN DirectRouting
root@k8s-master:~/manifests# ip r
default via 172.20.0.2 dev ens33 proto static
10.244.0.0/24 dev cni0 proto kernel scope link src 10.244.0.1
10.244.1.0/24 via 10.244.1.0 dev flannel.1 onlink
10.244.2.0/24 via 10.244.2.0 dev flannel.1 onlink
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1
172.20.0.0/24 dev ens33 proto kernel scope link src 172.20.0.81
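The 10.244.1.0/24 and 10.244.2.0/24 routes point at the peers' flannel.1 addresses (onlink). This works because flannel installs permanent ARP and FDB entries mapping each peer's flannel.1 MAC to its node IP; they can be inspected with the bridge tool. A sketch based on the interface data above (node-2's flannel.1 MAC, node-2's node IP):
root@k8s-master:~# bridge fdb show dev flannel.1
fa:4a:60:e2:25:30 dst 172.20.0.83 self permanent
...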
- Modify net-conf.json in kube-flannel.yml as follows (note the key is DirectRouting):
{
  "Network": "10.244.0.0/16",
  "Backend": {
    "Type": "vxlan",
    "DirectRouting": true
  }
}
Then delete the flannel Pods so they are recreated with the new configuration.
# Edit as above; the change alone does not take effect until the flannel Pods are recreated
$ kubectl -n kube-system edit configmaps kube-flannel-cfg
configmap/kube-flannel-cfg edited
# delete each flannel Pod
kubectl -n kube-system delete pod kube-flannel-ds-xxx
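Instead of deleting the Pods one by one, the app=flannel label (visible in the ConfigMap metadata above) can be used to recreate them all at once:
kubectl -n kube-system delete pod -l app=flannel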
root@k8s-master:~# ip r
default via 172.20.0.2 dev ens33 proto static
10.244.0.0/24 dev cni0 proto kernel scope link src 10.244.0.1
10.244.1.0/24 via 172.20.0.82 dev ens33
10.244.2.0/24 via 172.20.0.83 dev ens33
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1
172.20.0.0/24 dev ens33 proto kernel scope link src 172.20.0.81
The previous values were:
10.244.1.0/24 via 10.244.1.0 dev flannel.1 onlink
10.244.2.0/24 via 10.244.2.0 dev flannel.1 onlink
- Keep the ping running and capture on ens33 again: the ICMP packets are now visible, proving that DirectRouting works
root@k8s-node-2:~# tcpdump -i ens33 -nnt icmp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ens33, link-type EN10MB (Ethernet), capture size 262144 bytes
IP 10.244.1.102 > 10.244.2.2: ICMP echo request, id 19, seq 17, length 64
IP 10.244.2.2 > 10.244.1.102: ICMP echo reply, id 19, seq 17, length 64
...
Switching to host-gw networking
"Backend": {
  "Type": "host-gw"
}
The resulting host-gw routes look the same as with VxLAN DirectRouting, so they are not demonstrated again here.
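Should verification be needed anyway, the expected shape of the routing table (using this cluster's addresses) would be:
$ ip r | grep 10.244
10.244.0.0/24 dev cni0 proto kernel scope link src 10.244.0.1
10.244.1.0/24 via 172.20.0.82 dev ens33
10.244.2.0/24 via 172.20.0.83 dev ens33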