Introduction to PromQL

Prometheus provides PromQL (Prometheus Query Language) for querying and aggregating time series data.

Overview

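PromQL works with two kinds of vectors (a vector is a set of time series): instant vectors and range vectors.

To tell them apart, open the browser developer tools (F12) and check the resultType field of the API response: "vector" means an instant vector, "matrix" means a range vector.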
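
The same check can be done in code instead of the browser. The sketch below is a minimal illustration: the two payloads are hand-written samples shaped like query API responses (not captured from a real server), and `describe` is a hypothetical helper name.

```python
import json

# Hand-written sample payloads (assumed shape of a query API response).
INSTANT = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [{"metric": {"__name__": "up", "job": "node"},
                    "value": [1700000000, "1"]}],
    },
})
RANGE = json.dumps({
    "status": "success",
    "data": {
        "resultType": "matrix",
        "result": [{"metric": {"__name__": "up", "job": "node"},
                    "values": [[1700000000, "1"], [1700000015, "1"]]}],
    },
})


def describe(payload: str) -> str:
    """Map the resultType of a response body to the PromQL data type."""
    kind = json.loads(payload)["data"]["resultType"]
    names = {"vector": "instant vector", "matrix": "range vector"}
    # Other result types (e.g. "scalar", "string") are returned unchanged.
    return names.get(kind, kind)


print(describe(INSTANT))
print(describe(RANGE))
```

Note that an instant vector carries one `value` pair per series, while a range vector carries a list of `values`.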

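Instant vector selectors
- Metric name: selects every series of that metric, e.g. up
- Labels: {label_name="value", ...}
- Label matchers (=, !=, =~, !~); regular expressions are fully anchored
  - prometheus_http_requests_total{code!="200"} series whose code is not 200
  - prometheus_http_requests_total{code=~"40.*"} regex match: codes starting with 40 (40x)
  - prometheus_http_requests_total{code!~"40.*"} negative regex match: codes not starting with 40
  - count(up{instance=~".*web.*"} == 1) count the targets that are up and whose instance contains "web"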
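
To make the four matcher types above concrete, here is a small Python model of how one matcher is applied to a series' labels. It is only an approximation: Prometheus uses RE2 regular expressions while this sketch uses Python's `re`, and `matches` is a hypothetical helper name.

```python
import re


def matches(labels: dict, name: str, op: str, value: str) -> bool:
    """Apply one label matcher to a label set."""
    # A label that is absent behaves like a label with an empty value.
    actual = labels.get(name, "")
    if op == "=":
        return actual == value
    if op == "!=":
        return actual != value
    if op == "=~":
        # fullmatch models the full anchoring of PromQL regex matchers.
        return re.fullmatch(value, actual) is not None
    if op == "!~":
        return re.fullmatch(value, actual) is None
    raise ValueError(f"unknown matcher operator: {op}")


series = [{"code": "200"}, {"code": "404"}, {"code": "1404"}]
print([s["code"] for s in series if matches(s, "code", "=~", "40.*")])
```

Because the pattern is anchored, `40.*` selects 404 but not 1404; an unanchored search would have selected both.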
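Range vector selectors: [duration], with units ms/s/m/h/d/w/y
- up[1h] samples from the last hour
- up[1h:3m] subquery: the last hour, evaluated at a 3-minute resolution
Offset modifier: offset
- prometheus_http_requests_total offset 1h the value one hour ago
- prometheus_http_requests_total[5m] offset 1h the 5-minute range that ends one hour ago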
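
The time window that a range selector with an offset covers can be worked out by hand. The sketch below does that arithmetic in Python; `parse_duration` and `range_window` are hypothetical helpers, a year is taken as 365 days, the ordering rules for compound durations are not enforced, and whether the window boundaries are inclusive is glossed over.

```python
import re

# Seconds per duration unit (a year is treated as 365 days).
UNIT_SECONDS = {"ms": 0.001, "s": 1, "m": 60, "h": 3600,
                "d": 86400, "w": 604800, "y": 31536000}
_PART = re.compile(r"(\d+)(ms|s|m|h|d|w|y)")


def parse_duration(text: str):
    """Convert a duration such as '5m' or '1h30m' to seconds."""
    if not re.fullmatch(r"(?:\d+(?:ms|s|m|h|d|w|y))+", text):
        raise ValueError(f"not a duration: {text!r}")
    return sum(int(n) * UNIT_SECONDS[u] for n, u in _PART.findall(text))


def range_window(eval_time, range_text: str, offset_text: str = "0s"):
    """Return (start, end) timestamps covered by [range] offset <offset>."""
    end = eval_time - parse_duration(offset_text)
    start = end - parse_duration(range_text)
    return start, end


# metric[5m] offset 1h, evaluated at t = 10000 seconds
print(range_window(10000, "5m", "1h"))
```

The offset shifts the end of the window into the past, and the range then extends backwards from that shifted end.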

Arithmetic operators:

up + 5
up - 5
up * 5
up / 5
up % 5
up ^ 3
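Arithmetic between metrics: series on both sides are paired only when their label sets are identical
- node_memory_Buffers_bytes + node_memory_Active_bytes pairs series whose labels all match
- node_memory_Buffers_bytes + on(instance) node_memory_Active_bytes pairs series on the instance label only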
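
The pairing rule above can be modelled in a few lines. This is a simplified sketch of one-to-one matching only: each vector is a list of `(labels, value)` pairs with the metric name already dropped, `add_vectors` is a hypothetical name, and the sample values are made up. Many-to-one matching and duplicate-series errors are not modelled.

```python
def add_vectors(left, right, on=None):
    """Add two instant vectors, pairing series by their label sets."""
    def key(labels):
        # Without on(...), every label takes part in the match.
        if on is None:
            return tuple(sorted(labels.items()))
        return tuple(sorted((k, v) for k, v in labels.items() if k in on))

    index = {key(labels): value for labels, value in right}
    result = []
    for labels, value in left:
        k = key(labels)
        if k in index:  # series without a partner are dropped
            result.append((dict(k), value + index[k]))
    return result


buffers = [({"instance": "a:9100", "job": "node"}, 10.0)]
active = [({"instance": "a:9100", "job": "node-exporter"}, 5.0)]

print(add_vectors(buffers, active))                    # job differs: no pair
print(add_vectors(buffers, active, on=("instance",)))  # paired on instance
```

With `on`, the result keeps only the labels that were used for matching, which is why the second result carries `instance` but no `job`.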

Comparison operators (==, !=, >, <, >=, <=); note that =, =~ and !~ are label matchers, not comparison operators:

Logical operators (and, or, unless): both operands must be instant vectors

Set operations:

Functions:

up shows the scrape targets (1 = reachable, 0 = scrape failed)
up == 0 targets that are down
up{instance="100.80.0.128:9100",job="node"}[10m] samples from the last 10 minutes; visible in the Console tab
(time() - node_boot_time_seconds)/60/60 system uptime in hours
node_cpu_seconds_total{mode="system"}
node_cpu_seconds_total{mode="system", instance="100.80.0.128:9100"}
count(node_cpu_seconds_total{mode="system", instance="100.80.0.128:9100"})
count(node_cpu_seconds_total{mode="system"}) by (instance) number of CPUs on each host
avg by(instance, job) (1 - irate(node_cpu_seconds_total{mode="idle"}[5m]))
node_load1
node_load5
node_load15
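avg(irate(node_cpu_seconds_total{mode="idle"}[10m])) by(instance) average idle ratio per host (irate uses only the last two samples in the 10m window; use rate for a true 10-minute average)
(1 - avg(irate(node_cpu_seconds_total{mode="idle"}[10m])) by(instance)) * 100 average CPU usage per host, in percent
ALERTS pending and firing alerts
ALERTS_FOR_STATE the time each alert became active, as a Unix timestamp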
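
The usage query needs the subtraction to happen before the multiplication by 100, because `*` binds tighter than `-`. The worked example below reproduces the arithmetic in Python with made-up idle counters for two CPUs; `irate` here is a simplified stand-in that ignores counter resets.

```python
def irate(samples):
    """Per-second rate from the last two (timestamp, counter) samples."""
    (t0, v0), (t1, v1) = samples[-2], samples[-1]
    return (v1 - v0) / (t1 - t0)


# Made-up idle-mode CPU counters for one host, 15 seconds apart.
idle = {
    "cpu0": [(0, 100.0), (15, 113.5)],  # idle 13.5 s out of 15 s
    "cpu1": [(0, 200.0), (15, 212.0)],  # idle 12.0 s out of 15 s
}

rates = [irate(s) for s in idle.values()]
avg_idle = sum(rates) / len(rates)

usage_percent = (1 - avg_idle) * 100  # subtract first, then scale
unparenthesized = 1 - avg_idle * 100  # what "1 - x * 100" evaluates to

print(round(usage_percent, 2), round(unparenthesized, 2))
```

The parenthesized form gives a usage of about 15 percent, while the unparenthesized form yields a meaningless negative number.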

Queries

Lifecycle management (reload, quit) requires the --web.enable-lifecycle command-line flag.

The admin API requires the --web.enable-admin-api command-line flag.
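
As a rough sketch of what these two flags unlock, the snippet below builds (but does not send) the corresponding HTTP requests. The base URL `http://localhost:9090` is an assumption, and `lifecycle_request` and `admin_request` are hypothetical helper names; only a few of the available endpoints are covered.

```python
from urllib.request import Request

BASE = "http://localhost:9090"  # assumed address of the Prometheus server


def lifecycle_request(action: str) -> Request:
    """Build a lifecycle call; needs --web.enable-lifecycle on the server."""
    if action not in ("reload", "quit"):
        raise ValueError(f"unsupported lifecycle action: {action}")
    return Request(f"{BASE}/-/{action}", method="POST")


def admin_request(action: str) -> Request:
    """Build a TSDB admin call; needs --web.enable-admin-api on the server."""
    if action not in ("snapshot", "clean_tombstones"):
        raise ValueError(f"unsupported admin action: {action}")
    return Request(f"{BASE}/api/v1/admin/tsdb/{action}", method="POST")


for req in (lifecycle_request("reload"), admin_request("snapshot")):
    print(req.get_method(), req.full_url)
```

To actually send one of these, pass it to `urllib.request.urlopen`; a server started without the matching flag rejects the call.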