Introduction to Installing Kubernetes with kubeadm
Preface
Services running on each Kubernetes node:
- master (control plane)
  - etcd
  - kube-apiserver
  - kube-scheduler (ports: 10251/10259)
  - kube-controller-manager
- node (data plane)
  - kube-proxy (ports: 10256/30080/30443)
  - kubelet
  - container runtime, like docker
  - each node runs at most 110 Pods by default
kubectl describe nodes k8s-worker-1 | grep -i "Capacity\|Allocatable" -A 6
- Set maxPods: 500 in /var/lib/kubelet/config.yaml to raise the limit
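A minimal sketch of the change (maxPods is a KubeletConfiguration field; the restart step assumes the systemd-managed kubelet set up later in this article):
# add or update maxPods in the kubelet config, then restart the kubelet
if grep -q '^maxPods:' /var/lib/kubelet/config.yaml; then
  sed -i 's/^maxPods:.*/maxPods: 500/' /var/lib/kubelet/config.yaml
else
  echo 'maxPods: 500' >> /var/lib/kubelet/config.yaml
fi
systemctl restart kubelet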
PS: the control-plane services run as static Pods.
High Availability
Initialize All Nodes Before Installing
- Nodes
  - k8s-master 192.168.179.81
  - k8s-node-1 192.168.179.82
  - k8s-node-2 192.168.179.83
- CentOS 7.6; every machine in the Kubernetes cluster must have a unique MAC address and product_uuid
ip link
cat /sys/class/dmi/id/product_uuid
ssh-keygen
ssh-copy-id -i ~/.ssh/id_rsa.pub root@ip
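To distribute the key to every node in one pass, a small loop over the node list above (the IPs are the ones used in this article):
for ip in 192.168.179.81 192.168.179.82 192.168.179.83; do
  ssh-copy-id -i ~/.ssh/id_rsa.pub root@${ip}
done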
systemctl stop firewalld
systemctl disable firewalld
# check status
firewall-cmd --state
ufw disable
ufw status
sed -i 's/enforcing/disabled/' /etc/selinux/config
setenforce 0
PS: alternatively, this can be worked around via the KUBELET_KUBEADM_ARGS configuration
cat > /etc/sysctl.d/k8s.conf << EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
# system-wide limit on open file handles; when hit, errors like `Too many open files` or `Socket/File: Can't open so many files` appear
fs.file-max=1000000
# the following three parameters are worth tuning when the kernel ARP table grows large
# minimum number of entries kept in the ARP cache; the garbage collector will not run if there are fewer. Default: 128.
net.ipv4.neigh.default.gc_thresh1=1024
# soft maximum of entries in the ARP cache; the garbage collector allows the count to exceed this for 5 seconds before collecting. Default: 512.
net.ipv4.neigh.default.gc_thresh2=4096
# hard maximum of entries in the ARP cache; once exceeded, the garbage collector runs immediately. Default: 1024.
net.ipv4.neigh.default.gc_thresh3=8192
# maximum number of tracked connections, i.e. connection-tracking entries netfilter can handle simultaneously in kernel memory
net.netfilter.nf_conntrack_max=10485760
# idle timeout (seconds) for established TCP connections in the conntrack table
net.netfilter.nf_conntrack_tcp_timeout_established=300
# conntrack hash table size (read-only at runtime; defaults to 65536 on a 64-bit system with 8 GB RAM, doubling at 16 GB, and so on)
net.netfilter.nf_conntrack_buckets=655360
# maximum number of packets queued when an interface receives packets faster than the kernel can process them
net.core.netdev_max_backlog=10000
# upper limit of inotify instances per real user ID. Default: 128
fs.inotify.max_user_instances=524288
# upper limit of watches per inotify instance. Default: 8192
fs.inotify.max_user_watches=524288
EOF
sysctl --system
# time synchronization
apt install chrony
swapoff -a
and comment out the swap line in /etc/fstab.
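A one-liner sketch for the fstab edit (assumes the swap entry is an uncommented line with a whitespace-delimited "swap" field):
sed -i '/^[^#].*\sswap\s/ s/^/#/' /etc/fstab
# verify swap is off
free -h | grep -i swap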
CentOS:
cat > /etc/sysconfig/modules/ipvs.modules <<EOF
#!/bin/bash
ipvs_modules="ip_vs ip_vs_lc ip_vs_wlc ip_vs_rr ip_vs_wrr ip_vs_lblc ip_vs_lblcr ip_vs_dh ip_vs_sh ip_vs_fo ip_vs_nq ip_vs_sed ip_vs_ftp nf_conntrack_ipv4"
for kernel_module in \${ipvs_modules}; do
  /sbin/modinfo -F filename \${kernel_module} > /dev/null 2>&1
  if [ \$? -eq 0 ]; then
    /sbin/modprobe \${kernel_module}
  fi
done
EOF
Enable the modules:
chmod 755 /etc/sysconfig/modules/ipvs.modules
modprobe nf_conntrack
bash /etc/sysconfig/modules/ipvs.modules
# check
lsmod | grep ip_vs
Ubuntu:
# enable temporarily
$ for i in $(ls /lib/modules/$(uname -r)/kernel/net/netfilter/ipvs|grep -o "^[^.]*");do echo $i; /sbin/modinfo -F filename $i >/dev/null 2>&1 && /sbin/modprobe $i; done
# enable permanently, takes effect after reboot
$ ls /lib/modules/$(uname -r)/kernel/net/netfilter/ipvs|grep -o "^[^.]*" >> /etc/modules
# check
lsmod | grep ip_vs
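Loading the modules only makes IPVS available; kube-proxy still runs in iptables mode unless told otherwise. A hedged sketch of the corresponding KubeProxyConfiguration block, which can be appended to the kubeadm config file generated later in this article:
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "ipvs"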
- Allow iptables to see bridged traffic; the br_netfilter module must be loaded explicitly
# load
sudo modprobe br_netfilter
# verify it is loaded
lsmod | grep br_netfilter
Install Docker and kubeadm/kubelet/kubectl on All Nodes
Docker
wget https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo -O /etc/yum.repos.d/docker-ce.repo
yum install -y docker-ce
# Since Docker 1.13 the default firewall rules changed: the FORWARD chain of the iptables filter table is disabled (policy DROP), which breaks cross-node Pod communication in Kubernetes. Add an ExecStartPost to the [Service] section:
sed -i "13i ExecStartPost=/usr/sbin/iptables -P FORWARD ACCEPT" /usr/lib/systemd/system/docker.service
apt update
apt -y install docker.io containerd
Starting with Kubernetes v1.22, when the kubelet's cgroup driver is not explicitly configured, kubeadm defaults it to systemd, so Docker's cgroup driver must be changed to match:
mkdir -p /etc/docker
cat <<EOF | sudo tee /etc/docker/daemon.json
{
  "registry-mirrors": [
    "https://ghihfm4j.mirror.aliyuncs.com",
    "https://registry.docker-cn.com"
  ],
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "50m"
  },
  "storage-driver": "overlay2",
  "storage-opts": [
    "overlay2.override_kernel_check=true"
  ]
}
EOF
# note: daemon.json is strict JSON and does not allow comments; to raise parallel image pulls, add "max-concurrent-downloads": 10 as an extra key above
systemctl daemon-reload
systemctl enable docker
systemctl start docker
docker info | grep Cgroup
Install cri-dockerd
Since k8s v1.24 removed dockershim support and Docker Engine does not implement the k8s CRI spec by itself, Mirantis cri-dockerd must be installed; pick the release matching your OS:
curl -LO https://github.com/Mirantis/cri-dockerd/releases/download/v0.2.5/cri-dockerd_0.2.5.3-0.ubuntu-focal_amd64.deb
dpkg -i cri-dockerd_0.2.5.3-0.ubuntu-focal_amd64.deb
After installation, cri-docker.service starts automatically.
Configure cri-dockerd
Because cri-dockerd cannot pull the k8s.gcr.io/pause:3.6 image from mainland China, point it at a mirror by editing the ExecStart line in /lib/systemd/system/cri-docker.service (choose one):
ExecStart=/usr/bin/cri-dockerd --container-runtime-endpoint fd:// --pod-infra-container-image registry.aliyuncs.com/google_containers/pause:3.7
ExecStart=/usr/bin/cri-dockerd --container-runtime-endpoint fd:// --pod-infra-container-image gcmirrors/pause:3.7
systemctl daemon-reload
systemctl restart cri-docker.service
kubeadm/kubelet/kubectl
For mainland-China mirror acceleration, see: https://developer.aliyun.com/mirror/kubernetes
cat > /etc/yum.repos.d/k8s.repo << EOF
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
yum install -y kubelet-1.19.0 kubeadm-1.19.0 kubectl-1.19.0
systemctl enable kubelet
curl https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | apt-key add -
echo "deb https://mirrors.aliyun.com/kubernetes/apt kubernetes-xenial main" | sudo tee /etc/apt/sources.list.d/kubernetes.list
apt update
apt install -y kubelet kubeadm kubectl
systemctl enable kubelet
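Optionally pin the versions so a routine upgrade does not move the cluster components unexpectedly (apt-mark hold is a standard apt feature; on yum systems the versionlock plugin plays the same role):
apt-mark hold kubelet kubeadm kubectl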
Installation
Before installing, the images can be pre-pulled with the following commands:
# option 1: via a config file
# default config
$ kubeadm config print init-defaults
$ kubeadm config print init-defaults --component-configs \
KubeProxyConfiguration,KubeletConfiguration > kubeadm-config.yaml
# validate (dry run)
kubeadm init --config kubeadm-config.yaml --dry-run
# pull the images
kubeadm config images pull --config kubeadm-config.yaml
# option 2: via the command line
# list k8s images
kubeadm config images list
# pull k8s images
kubeadm config images pull
# accelerate via a mirror registry
kubeadm config images list --image-repository gcmirrors
kubeadm config images pull --image-repository gcmirrors --cri-socket unix:///var/run/cri-dockerd.sock
kubeadm config images list --image-repository registry.aliyuncs.com/google_containers
kubeadm config images pull --image-repository registry.aliyuncs.com/google_containers --cri-socket unix:///var/run/cri-dockerd.sock
Notes:
- KubeletConfiguration configures the cgroupDriver
- Container runtimes supported by k8s v1.24 (dockershim removed):
  - containerd
  - CRI-O
  - Docker Engine
  - Mirantis Container Runtime
- k8s 1.30 requires a Container Runtime Interface (CRI) compliant runtime
- For image acceleration, see: Kubernetes image mirrors for mainland China
- When pulling from a domestic mirror, the --cri-socket path must be specified:
  - docker: --cri-socket unix:///var/run/cri-dockerd.sock
  - containerd: --cri-socket unix:///run/containerd/containerd.sock
  - CRI-O: --cri-socket unix:///var/run/crio/crio.sock
- CRI-O and containerd manage containers differently; their image stores are not interchangeable
- cgroup drivers, configured via the kubeadm config file:
apiVersion: kubeadm.k8s.io/v1beta3
...
nodeRegistration:
  # for containerd; for docker set unix:///var/run/cri-dockerd.sock
  criSocket: unix:///run/containerd/containerd.sock
...
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
...
cgroupDriver: systemd
Kubernetes Master
Run on the Kubernetes master node:
kubeadm init \
  --apiserver-advertise-address=0.0.0.0 \
  --apiserver-bind-port 6443 \
  --cri-socket unix:///var/run/cri-dockerd.sock \
  --image-repository gcmirrors \
  --kubernetes-version v1.25.0 \
  --pod-network-cidr=10.244.0.0/16 \
  --service-cidr=10.96.0.0/12 \
  --upload-certs \
  --v 5
# optional flags (they cannot be commented inline inside a line-continued command, which would break the continuation):
#   --control-plane-endpoint="k8sapi.kb.cx"
#   --service-dns-domain "cluster.local"
#   --ignore-preflight-errors=NumCPU
The output includes a kubeadm join command; it can be regenerated at any time with:
kubeadm token create --print-join-command
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
# kubectl get nodes
NAME STATUS ROLES AGE VERSION
aliv2.kb.cx Ready master 2m21s v1.17.0
- Fix the scheduler/controller-manager port issue
$ kubectl get componentstatuses
$ kubectl get cs
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME STATUS MESSAGE ERROR
scheduler Healthy ok
controller-manager Healthy ok
etcd-0 Healthy {"health":"true"}
If the component status shows Unhealthy, comment out --port=0 in the following files:
- /etc/kubernetes/manifests/kube-controller-manager.yaml
- /etc/kubernetes/manifests/kube-scheduler.yaml
ref:
https://github.com/kubernetes/kubeadm/issues/2279
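A hedged sketch of the edit, excerpted from kube-scheduler.yaml (the kubelet watches the manifests directory and re-creates the static Pod automatically once the file is saved):
spec:
  containers:
  - command:
    - kube-scheduler
    # - --port=0   # comment this line out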
PS: /etc/kubernetes/manifests holds the static Pod manifests of the control plane.
$ curl --cacert /etc/kubernetes/pki/ca.crt https://127.0.0.1:6443/livez
ok
$ curl --cacert /etc/kubernetes/pki/ca.crt https://127.0.0.1:6443/readyz
ok
Pod Network Add-on (CNI)
Flannel
Calico
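A minimal sketch of installing Flannel (the manifest URL below is the flannel-io upstream release artifact and may change between versions; its Network value must match the --pod-network-cidr passed to kubeadm init, 10.244.0.0/16 in this article):
kubectl apply -f https://github.com/flannel-io/flannel/releases/latest/download/kube-flannel.yml
# watch the pods start (the namespace name depends on the flannel version)
kubectl -n kube-flannel get pods -w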
Add Worker Nodes
Run on each Node the kubeadm join command that kubeadm init printed on the master, to add the node to the cluster:
kubeadm join 192.168.179.81:6443 --token u9079k.qhzueafydelowi05 \
--discovery-token-ca-cert-hash sha256:3f5c505110873644a0645f0037dacfebaeef99d671cd69ec3a5887bb0dc92c4e \
--cri-socket unix:///var/run/cri-dockerd.sock
# kubeadm join --config=<apiVersion: kubeadm.k8s.io/v1beta2\nkind: JoinConfiguration>
u9079k.qhzueafydelowi05 is the token; it can be managed with kubeadm token:
kubeadm token list
kubeadm token create # generates a new token; substitute it into kubeadm join to add more nodes
Remove a Specified Node
kubeadm reset --cri-socket unix:///var/run/cri-dockerd.sock
Test the Kubernetes Cluster
kubectl get componentstatus
- Create a Pod in the Kubernetes cluster and verify it runs correctly:
kubectl create deployment nginx --image=nginx:alpine
kubectl expose deployment nginx --port=80 --type=NodePort
kubectl get pod,svc
Access: http://NodeIP:Port
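A quick verification sketch (the jsonpath expression reads the assigned NodePort; the IP is k8s-node-1 from the node list above):
NODE_PORT=$(kubectl get svc nginx -o jsonpath='{.spec.ports[0].nodePort}')
curl -I http://192.168.179.82:${NODE_PORT}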
Operations
Restart
# after a host reboot, start all stopped containers
docker ps -a | awk '{print $1}' | grep -v CONTAINER | xargs docker start
kubeadm init phase certs
cd /etc/kubernetes/pki
rm apiserver.crt apiserver.key
kubeadm init phase certs apiserver --apiserver-cert-extra-sans "*.xiexianbin.cn"
Inspect the certificate:
openssl x509 -text -noout -in apiserver.crt
# kubeadm init phase certs apiserver -h
Generate the certificate for serving the Kubernetes API, and save them into apiserver.cert and apiserver.key files.
Default SANs are kubernetes, kubernetes.default, kubernetes.default.svc, kubernetes.default.svc.cluster.local, 10.96.0.1, 127.0.0.1
If both files already exist, kubeadm skips the generation step and existing files will be used.
Alpha Disclaimer: this command is currently alpha.
Usage:
kubeadm init phase certs apiserver [flags]
Flags:
--apiserver-advertise-address string The IP address the API Server will advertise it's listening on. If not set the default network interface will be used.
--apiserver-cert-extra-sans strings Optional extra Subject Alternative Names (SANs) to use for the API Server serving certificate. Can be both IP addresses and DNS names.
--cert-dir string The path where to save and store the certificates. (default "/etc/kubernetes/pki")
--config string Path to a kubeadm configuration file.
--control-plane-endpoint string Specify a stable IP address or DNS name for the control plane.
-h, --help help for apiserver
--kubernetes-version string Choose a specific Kubernetes version for the control plane. (default "stable-1")
--service-cidr string Use alternative range of IP address for service VIPs. (default "10.96.0.0/12")
--service-dns-domain string Use alternative domain for services, e.g. "myorg.internal". (default "cluster.local")
Global Flags:
--add-dir-header If true, adds the file directory to the header of the log messages
--log-file string If non-empty, use this log file
--log-file-max-size uint Defines the maximum size a log file can grow to. Unit is megabytes. If the value is 0, the maximum file size is unlimited. (default 1800)
--rootfs string [EXPERIMENTAL] The path to the 'real' host root filesystem.
--skip-headers If true, avoid header prefixes in the log messages
--skip-log-headers If true, avoid headers when opening log files
-v, --v Level number for the log level verbosity
Renew Certificate Validity
- Check whether the certificates under /etc/kubernetes/pki have expired:
$ kubeadm certs check-expiration
[check-expiration] Reading configuration from the cluster...
[check-expiration] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
W0220 14:50:39.905052 26453 utils.go:69] The recommended value for "resolvConf" in "KubeletConfiguration" is: /run/systemd/resolve/resolv.conf; the provided value is: /run/systemd/resolve/resolv.conf
CERTIFICATE EXPIRES RESIDUAL TIME CERTIFICATE AUTHORITY EXTERNALLY MANAGED
admin.conf Feb 20, 2023 05:57 UTC 364d ca no
apiserver Feb 20, 2023 05:57 UTC 364d ca no
apiserver-etcd-client Feb 20, 2023 05:57 UTC 364d etcd-ca no
apiserver-kubelet-client Feb 20, 2023 05:57 UTC 364d ca no
controller-manager.conf Feb 20, 2023 05:57 UTC 364d ca no
etcd-healthcheck-client Feb 20, 2023 05:57 UTC 364d etcd-ca no
etcd-peer Feb 20, 2023 05:57 UTC 364d etcd-ca no
etcd-server Feb 20, 2023 05:57 UTC 364d etcd-ca no
front-proxy-client Feb 20, 2023 05:57 UTC 364d front-proxy-ca no
scheduler.conf Feb 20, 2023 05:57 UTC 364d ca no
CERTIFICATE AUTHORITY EXPIRES RESIDUAL TIME EXTERNALLY MANAGED
ca Feb 20, 2032 05:57 UTC 9y no
etcd-ca Feb 20, 2032 05:57 UTC 9y no
front-proxy-ca Feb 20, 2032 05:57 UTC 9y no
kubeadm certs renew all
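Renewal does not restart the control-plane components; one common (hedged) approach to force the static Pods to reload the new certificates is to move the manifests aside briefly:
mv /etc/kubernetes/manifests /etc/kubernetes/manifests.bak
sleep 30
mv /etc/kubernetes/manifests.bak /etc/kubernetes/manifests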
Update admin.conf:
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
More Addons
FAQ
coredns Pending
[root@aliv2 ~]# kubectl get pod -n kube-system
NAME READY STATUS RESTARTS AGE
coredns-9d85f5447-hmjcq 0/1 Pending 0 10h
coredns-9d85f5447-s9cgc 0/1 Pending 0 10h
etcd-aliv2.kb.cx 1/1 Running 0 10h
kube-apiserver-aliv2.kb.cx 1/1 Running 0 10h
kube-controller-manager-aliv2.kb.cx 1/1 Running 0 10h
kube-proxy-hn5sw 1/1 Running 0 10h
kube-scheduler-aliv2.kb.cx 1/1 Running 0 10h
[root@aliv2 ~]# kubectl describe pod coredns-9d85f5447-hmjcq -n kube-system
Name: coredns-9d85f5447-hmjcq
Namespace: kube-system
Priority: 2000000000
Priority Class Name: system-cluster-critical
Node: <none>
Labels: k8s-app=kube-dns
pod-template-hash=9d85f5447
Annotations: <none>
Status: Pending
IP:
IPs: <none>
Controlled By: ReplicaSet/coredns-9d85f5447
Containers:
coredns:
Image: registry.aliyuncs.com/google_containers/coredns:1.6.5
Ports: 53/UDP, 53/TCP, 9153/TCP
Host Ports: 0/UDP, 0/TCP, 0/TCP
Args:
-conf
/etc/coredns/Corefile
Limits:
memory: 170Mi
Requests:
cpu: 100m
memory: 70Mi
Liveness: http-get http://:8080/health delay=60s timeout=5s period=10s #success=1 #failure=5
Readiness: http-get http://:8181/ready delay=0s timeout=1s period=10s #success=1 #failure=3
Environment: <none>
Mounts:
/etc/coredns from config-volume (ro)
/var/run/secrets/kubernetes.io/serviceaccount from coredns-token-8qmdr (ro)
Conditions:
Type Status
PodScheduled False
Volumes:
config-volume:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: coredns
Optional: false
coredns-token-8qmdr:
Type: Secret (a volume populated by a Secret)
SecretName: coredns-token-8qmdr
Optional: false
QoS Class: Burstable
Node-Selectors: beta.kubernetes.io/os=linux
Tolerations: CriticalAddonsOnly
node-role.kubernetes.io/master:NoSchedule
node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 34s (x430 over 10h) default-scheduler 0/1 nodes are available: 1 node(s) had taints that the pod didn't tolerate.
Solution:
kubectl taint nodes --all node-role.kubernetes.io/master-
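On clusters created by kubeadm v1.24+ the master taint was renamed, so the control-plane variant may be needed instead (the command simply reports that the taint was not found if the key is absent):
kubectl taint nodes --all node-role.kubernetes.io/control-plane-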
coredns CrashLoopBackOff
- kubectl edit cm coredns -n kube-system
- delete the loop line from the Corefile, save, and exit
- kubectl delete pod coredns-xxx-xxxx -n kube-system
cni config uninitialized
Nov 23 02:23:32 k8s-master kubelet: E1123 02:23:32.879492 5980 kubelet.go:2103] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Caused by the CNI plugin not being installed; for installation see the section above: Pod Network Add-on (CNI)
Container runtime network not ready
Issue in an openEuler 20.03 environment:
Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
yum install -y kubernetes-cni
# if dependency resolution fails, retry with --nobest
yum install -y kubernetes-cni --nobest
ref: https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/#check-required-ports
network: error getting ClusterInformation: connection is unauthorized: Unauthorized
- Delete the affected Pod so it is re-created: kubectl -n kube-system delete pod calicoxxx (see reference)
failed to run Kubelet: misconfiguration
"Failed to run kubelet” err="failed to run Kubelet: misconfiguration: kubelet cgroup driver: "systemd” is different from dock…er: "cgroupfs””
- Caused by Docker and the kubelet using different cgroup drivers; fix Docker's:
cat > /etc/docker/daemon.json <<EOF
{"exec-opts": ["native.cgroupdriver=systemd"]}
EOF
Restart docker:
systemctl restart docker
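The kubelet should be restarted as well once Docker is back up (assuming the systemd-managed kubelet from the install above):
systemctl restart kubelet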
flannel network configuration error
E0830 11:18:00.695429 1 main.go:330] Error registering network: failed to acquire lease: node "k8s-node-2" pod cidr not assigned
The kubeadm init --pod-network-cidr flag must be specified, and it must match the network range in flannel.yaml.
kubeadm / kubelet reports a swap error
[ERROR Swap]: running with swap on is not supported. Please disable swap
[preflight] If you know what you are doing, you can make a check non-fatal with...
kubeadm init --ignore-preflight-errors=Swap
- For the kubelet service: run
systemctl cat kubelet
to locate the environment file it loads, then add
KUBELET_EXTRA_ARGS="--fail-swap-on=false"
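A sketch for Debian/Ubuntu, where the kubelet environment file is typically /etc/default/kubelet (on RPM-based systems it is usually /etc/sysconfig/kubelet):
echo 'KUBELET_EXTRA_ARGS="--fail-swap-on=false"' >> /etc/default/kubelet
systemctl restart kubelet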
k8s failed to get imageFs info: non-existent label "docker-images"
- Add the following to /lib/systemd/system/kubelet.service:
[Unit]
After=docker.service
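After editing the unit file, reload systemd and restart the kubelet so the new ordering takes effect:
systemctl daemon-reload
systemctl restart kubelet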