1 year ago

#359098

zangw

prometheus-adapter: unable to fetch CPU metrics for pod podName, skipping

We are trying to install Prometheus on AWS EKS through kube-prometheus. However, the kubectl top command returns an error for both pods and nodes.

kubectl top po -n monitoring
error: Metrics not available for pod monitoring/podName, age: 30h10m27.35393s
kubectl top node
error: metrics not available yet

The prometheus-adapter logs show the following errors:

E0331 10:02:47.330109       1 provider.go:191] unable to fetch CPU metrics for pod beta/podName, skipping
E0331 10:02:47.346497       1 provider.go:191] unable to fetch CPU metrics for pod beta/podName, skipping
E0331 10:02:47.346540       1 provider.go:191] unable to fetch CPU metrics for pod beta/podName, skipping
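
To see which pods the adapter is skipping and how often, the log lines can be summarized with a short script. This is a minimal sketch, assuming the log format shown above; the function name `count_skipped_pods` is hypothetical, and only the CPU message appears in the logs quoted here, so matching other resource names is an assumption.

```python
import re
from collections import Counter

# Matches the "unable to fetch ... metrics for pod ns/name, skipping" lines
# quoted above. Only the CPU variant appears in the question; accepting any
# word as the resource name is an assumption.
LINE_RE = re.compile(
    r"unable to fetch (?P<resource>\w+) metrics for pod "
    r"(?P<namespace>[^/\s]+)/(?P<pod>[^,\s]+), skipping"
)


def count_skipped_pods(log_text):
    """Return a Counter keyed by (namespace, pod, resource)."""
    counts = Counter()
    for line in log_text.splitlines():
        match = LINE_RE.search(line)
        if match:
            key = (match["namespace"], match["pod"], match["resource"])
            counts[key] += 1
    return counts


sample = """\
E0331 10:02:47.330109       1 provider.go:191] unable to fetch CPU metrics for pod beta/podName, skipping
E0331 10:02:47.346497       1 provider.go:191] unable to fetch CPU metrics for pod beta/podName, skipping
E0331 10:02:47.346540       1 provider.go:191] unable to fetch CPU metrics for pod beta/podName, skipping
"""

for (namespace, pod, resource), n in count_skipped_pods(sample).items():
    print(f"{namespace}/{pod}: {resource} skipped {n} time(s)")
```

If every pod in every namespace shows up in the summary, the problem is more likely in the adapter's queries or in the scraped labels than in any single workload.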

Next, we checked the registered APIServices:

kubectl get apiservice

v1beta1.events.k8s.io                  Local                           True        6d7h
v1beta1.extensions                     Local                           True        6d7h
v1beta1.flowcontrol.apiserver.k8s.io   Local                           True        6d7h
v1beta1.metrics.k8s.io                 monitoring/prometheus-adapter   True        26m
v1beta1.networking.k8s.io              Local                           True        6d7h
v1beta1.node.k8s.io                    Local                           True        6d7h

This confirms that the metrics.k8s.io API is now served by prometheus-adapter.

The output of kubectl get apiservice v1beta1.metrics.k8s.io -o yaml:

apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"apiregistration.k8s.io/v1","kind":"APIService","metadata":{"annotations":{},"labels":{"app.kubernetes.io/component":"metrics-adapter","app.kubernetes.io/name":"prometheus-adapter","app.kubernetes.io/part-of":"kube-prometheus","app.kubernetes.io/version":"0.9.0"},"name":"v1beta1.metrics.k8s.io"},"spec":{"group":"metrics.k8s.io","groupPriorityMinimum":100,"insecureSkipTLSVerify":true,"service":{"name":"prometheus-adapter","namespace":"monitoring"},"version":"v1beta1","versionPriority":100}}
  creationTimestamp: "2022-03-31T11:28:00Z"
  labels:
    app.kubernetes.io/component: metrics-adapter
    app.kubernetes.io/name: prometheus-adapter
    app.kubernetes.io/part-of: kube-prometheus
    app.kubernetes.io/version: 0.9.0
  name: v1beta1.metrics.k8s.io
  resourceVersion: "2379238"
  uid: 570f7068-ea53-4850-ae65-6bf027457de1
spec:
  group: metrics.k8s.io
  groupPriorityMinimum: 100
  insecureSkipTLSVerify: true
  service:
    name: prometheus-adapter
    namespace: monitoring
    port: 443
  version: v1beta1
  versionPriority: 100
status:
  conditions:
  - lastTransitionTime: "2022-03-31T11:28:31Z"
    message: all checks passed
    reason: Passed
    status: "True"
    type: Available

The aggregated metrics reader ClusterRole installed with prometheus-adapter:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    app.kubernetes.io/component: metrics-adapter
    app.kubernetes.io/name: prometheus-adapter
    app.kubernetes.io/part-of: kube-prometheus
    app.kubernetes.io/version: 0.9.0
    rbac.authorization.k8s.io/aggregate-to-admin: "true"
    rbac.authorization.k8s.io/aggregate-to-edit: "true"
    rbac.authorization.k8s.io/aggregate-to-view: "true"
  name: system:aggregated-metrics-reader
rules:
- apiGroups:
  - metrics.k8s.io
  resources:
  - pods
  - nodes
  verbs:
  - get
  - list
  - watch

The EKS version is 1.21 and the prometheus-adapter version is v0.9.0.

Could someone help us figure out a solution to this issue?


Update 04/06/2022:

After changing instance to node in the resource overrides in the prometheus-adapter ConfigMap, the errors are still there.

kubectl get cm -n monitoring adapter-config -o yaml

apiVersion: v1
data:
  config.yaml: |-
    "resourceRules":
      "cpu":
        "containerLabel": "container"
        "containerQuery": |
          sum by (<<.GroupBy>>) (
            irate (
                container_cpu_usage_seconds_total{<<.LabelMatchers>>,container!="",pod!=""}[120s]
            )
          )
        "nodeQuery": |
          sum by (<<.GroupBy>>) (
            1 - irate(
              node_cpu_seconds_total{mode="idle"}[60s]
            )
            * on(namespace, pod) group_left(node) (
              node_namespace_pod:kube_pod_info:{<<.LabelMatchers>>}
            )
          )
          or sum by (<<.GroupBy>>) (
            1 - irate(
              windows_cpu_time_total{mode="idle", job="windows-exporter",<<.LabelMatchers>>}[4m]
            )
          )
        "resources":
          "overrides":
            "namespace":
              "resource": "namespace"
            "node":
              "resource": "node"
            "pod":
              "resource": "pod"
      "memory":
        "containerLabel": "container"
        "containerQuery": |
          sum by (<<.GroupBy>>) (
            container_memory_working_set_bytes{<<.LabelMatchers>>,container!="",pod!=""}
          )
        "nodeQuery": |
          sum by (<<.GroupBy>>) (
            node_memory_MemTotal_bytes{job="node-exporter",<<.LabelMatchers>>}
            -
            node_memory_MemAvailable_bytes{job="node-exporter",<<.LabelMatchers>>}
          )
          or sum by (<<.GroupBy>>) (
            windows_cs_physical_memory_bytes{job="windows-exporter",<<.LabelMatchers>>}
            -
            windows_memory_available_bytes{job="windows-exporter",<<.LabelMatchers>>}
          )
        "resources":
          "overrides":
            "namespace":
              "resource": "namespace"
            "node":
              "resource": "node"
            "pod":
              "resource": "pod"
      "window": "5m"
kind: ConfigMap
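
One way to narrow this down is to look at the PromQL that results once the `<<.GroupBy>>` and `<<.LabelMatchers>>` placeholders are filled in. The sketch below uses plain string substitution to illustrate that; it is not how the adapter itself renders its templates, `render_query` is a hypothetical helper, and the matcher and group-by values are assumed examples rather than what the adapter actually generates for a given request.

```python
def render_query(template, label_matchers, group_by):
    """Fill in the two placeholders used by the queries in the ConfigMap."""
    return (
        template
        .replace("<<.LabelMatchers>>", label_matchers)
        .replace("<<.GroupBy>>", group_by)
    )


# The CPU containerQuery from the ConfigMap above.
CONTAINER_CPU_QUERY = """\
sum by (<<.GroupBy>>) (
  irate (
    container_cpu_usage_seconds_total{<<.LabelMatchers>>,container!="",pod!=""}[120s]
  )
)"""

# Assumed example values: the real ones are produced by the adapter per request.
rendered = render_query(
    CONTAINER_CPU_QUERY,
    label_matchers='namespace="monitoring",pod="podName"',
    group_by="pod,container",
)
print(rendered)
```

Running the rendered query in the Prometheus UI shows whether any series come back and whether they carry the namespace, pod, and container labels that the overrides map to resources. An empty result there would suggest the issue lies with the scraped metrics or their labels rather than with the APIService registration.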

amazon-web-services

kubernetes

prometheus

amazon-eks

prometheus-operator
