An Introduction to Kubernetes Monitoring Implementations
Overview
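Kubernetes monitoring metrics, classified by source:
- Kubernetes system metrics
- Container metrics: CPU, memory, etc.
- Application/business metrics
Classified by type:
- Resource metrics (core metrics pipeline):
  - Served by kubelet, metrics-server, and the API server
  - Metrics include cumulative CPU usage, real-time memory usage, Pod resource usage, and container disk usage
- Custom metrics:
  - Collect a wide range of metrics from the system and expose them to end users, storage systems, and the HPA; this covers core metrics and many non-core metrics
  - Non-core metrics cannot be interpreted by Kubernetes itself
  - Typical components: Prometheus and k8s-prometheus-adapter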
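API aggregation:
- kube-aggregator
- kube api
- metrics-server: once installed, it serves an API (resource kind: APIService) at /apis/metrics.k8s.io/v1beta1. Components that consume this monitoring data, such as kubectl top, cannot work unless it is installed.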
Container log directory: /var/log/containers/
- Other: for containers, container_memory_rss (RSS) is used as the memory monitoring metric, rather than container_memory_working_set_bytes (WSS)
metrics-server
Installation
The manifest files are under addons/metrics-server; see also kubernetes-sigs/metrics-server.
mkdir metrics-server
cd metrics-server
for f in auth-delegator.yaml auth-reader.yaml metrics-apiservice.yaml metrics-server-deployment.yaml metrics-server-service.yaml resource-reader.yaml; do curl -fsSL -o "$f" "https://raw.githubusercontent.com/kubernetes/kubernetes/master/cluster/addons/metrics-server/$f"; done
# Point the images at a mirror of k8s.gcr.io
sed 's# image: k8s.gcr.io/metrics-server# image: k8sgcriometricsserver#g' -i ./metrics-server-deployment.yaml
sed 's# image: k8s.gcr.io/autoscaling# image: k8sgcrioautoscaling#g' -i ./metrics-server-deployment.yaml
- Update metrics-server-deployment.yaml:
image: k8sgcriometricsserver/metrics-server:v0.5.2
command:
- ...
- --kubelet-insecure-tls
...
command:
- /pod_nanny
- --config-dir=/etc/config
# - --cpu={{ base_metrics_server_cpu }}
- --extra-cpu=0.5m
# - --memory={{ base_metrics_server_memory }}
# - --extra-memory={{ metrics_server_memory_per_node }}Mi
- --threshold=5
- --deployment=metrics-server-v0.5.2
- --container=metrics-server
- --poll-period=30000
- --estimator=exponential
# Specifies the smallest cluster (defined in number of nodes)
# resources will be scaled to.
#- --minClusterSize={{ metrics_server_min_cluster_size }}
# Use kube-apiserver metrics to avoid periodically listing nodes.
- --use-metrics=true
$ kubectl apply -f .
clusterrolebinding.rbac.authorization.k8s.io/metrics-server:system:auth-delegator created
rolebinding.rbac.authorization.k8s.io/metrics-server-auth-reader created
apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io created
serviceaccount/metrics-server created
configmap/metrics-server-config created
deployment.apps/metrics-server-v0.5.2 created
service/metrics-server created
clusterrole.rbac.authorization.k8s.io/system:metrics-server created
clusterrolebinding.rbac.authorization.k8s.io/system:metrics-server created
$ kubectl -n kube-system get pod | grep metrics-server
metrics-server-v0.5.2-855b59c585-kj8hn 2/2 Running 0 48s
$ kubectl get apiservices | grep metrics
v1beta1.metrics.k8s.io kube-system/metrics-server True 8m46s
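The readiness check above can also be scripted instead of polling kubectl get apiservices by hand. The following is a minimal sketch, assuming the APIService name v1beta1.metrics.k8s.io shown in the output above; the function name wait_for_metrics_api and the 120s default timeout are illustrative choices, not part of the original setup.

```shell
# wait_for_metrics_api [TIMEOUT]
# Blocks until the metrics APIService reports the Available condition,
# or fails once TIMEOUT expires (default 120s, an arbitrary assumption).
wait_for_metrics_api() {
  kubectl wait --for=condition=Available \
    apiservice/v1beta1.metrics.k8s.io \
    --timeout="${1:-120s}"
}
```

Calling wait_for_metrics_api 60s right after kubectl apply -f . lets a script continue only once the Metrics API is actually being served.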
Metrics examples
# Terminal 1: start the proxy
$ kubectl proxy --port=8080
# Terminal 2: call the API
$ curl http://localhost:8080/apis/metrics.k8s.io/v1beta1
{
  "kind": "APIResourceList",
  "apiVersion": "v1",
  "groupVersion": "metrics.k8s.io/v1beta1",
  "resources": [
    {
      "name": "nodes",
      "singularName": "",
      "namespaced": false,
      "kind": "NodeMetrics",
      "verbs": [
        "get",
        "list"
      ]
    },
    {
      "name": "pods",
      "singularName": "",
      "namespaced": true,
      "kind": "PodMetrics",
      "verbs": [
        "get",
        "list"
      ]
    }
  ]
}
$ curl http://localhost:8080/apis/metrics.k8s.io/v1beta1/nodes
$ curl http://localhost:8080/apis/metrics.k8s.io/v1beta1/pods
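The two curl calls above list metrics for every node and every pod. The same API also serves a single node, or the pods of a single namespace. Here is a small sketch of how those paths are put together; the helper name metrics_path and the variable METRICS_API_BASE are hypothetical, and the default namespace in the example comment is an assumption.

```shell
# Base path of the Metrics API, as reported by the APIResourceList above.
METRICS_API_BASE=/apis/metrics.k8s.io/v1beta1

# metrics_path nodes [NAME]
# metrics_path pods [NAMESPACE [NAME]]
# Prints the Metrics API path for the requested resource.
metrics_path() {
  case "${1:-}" in
    nodes)
      # Nodes are cluster-scoped: optional node name only.
      printf '%s/nodes%s\n' "$METRICS_API_BASE" "${2:+/$2}"
      ;;
    pods)
      # Pods are namespaced: optional namespace, then optional pod name.
      if [ -n "${2:-}" ]; then
        printf '%s/namespaces/%s/pods%s\n' "$METRICS_API_BASE" "$2" "${3:+/$3}"
      else
        printf '%s/pods\n' "$METRICS_API_BASE"
      fi
      ;;
    *)
      echo "usage: metrics_path nodes [NAME] | pods [NAMESPACE [NAME]]" >&2
      return 1
      ;;
  esac
}

# Example (needs a running cluster, so it is left commented out):
# kubectl get --raw "$(metrics_path pods default stress-demo)"
```

With kubectl get --raw the request goes through the API server directly, so the kubectl proxy window is not needed for this variant.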
$ kubectl top pod stress-demo
NAME CPU(cores) MEMORY(bytes)
stress-demo 1m 1Mi
$ kubectl top node k8s-node-1
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
k8s-node-1 156m 3% 1742Mi 45%
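kubectl top prints CPU in millicores (156m) and memory in Mi, while the raw Metrics API may report the same values with other Kubernetes quantity suffixes, such as n (nanocores) or Ki. The sketch below normalizes the common cases so the two views can be compared. The function names are hypothetical, only integer quantities are handled (a decimal value such as 0.5 is not), and the rounding direction is a choice made here for illustration.

```shell
# cpu_to_millicores QUANTITY
# Converts an integer CPU quantity to millicores, rounding up.
# Supported suffixes: n (nano), u (micro), m (milli), none (whole cores).
cpu_to_millicores() {
  case "$1" in
    *n) echo $(( (${1%n} + 999999) / 1000000 )) ;;
    *u) echo $(( (${1%u} + 999) / 1000 )) ;;
    *m) echo $(( ${1%m} )) ;;
    *)  echo $(( $1 * 1000 )) ;;
  esac
}

# mem_to_mib QUANTITY
# Converts an integer memory quantity to MiB, rounding down.
# Supported suffixes: Ki, Mi, Gi, none (plain bytes).
mem_to_mib() {
  case "$1" in
    *Ki) echo $(( ${1%Ki} / 1024 )) ;;
    *Mi) echo $(( ${1%Mi} )) ;;
    *Gi) echo $(( ${1%Gi} * 1024 )) ;;
    *)   echo $(( $1 / 1048576 )) ;;
  esac
}
```

For example, cpu_to_millicores 156m prints 156 and mem_to_mib 1783808Ki prints 1742, the same magnitudes as the k8s-node-1 row above.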
Heapster
$ kubectl top pod stress-demo
error: Metrics API not available
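This error occurs because no metrics aggregation tool is installed. One option is to install Heapster, which consists of:
- Heapster server
  - It collects memory, CPU, and other metrics through the kubelet's cAdvisor
- InfluxDB, which stores the data
- Grafana, which visualizes the data held in InfluxDB
Because Heapster was deprecated in Kubernetes 1.11 (dropped from the setup scripts in 1.12 and retired in 1.13), it is not covered in further detail here.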