
KJH

Prometheus (Windows exporter)

DevOps

모이스쳐라이징 2023. 10. 20. 23:22

windows_exporter is a tool that exposes metrics from a Windows machine so that Prometheus can scrape them.

 

 

※ It can collect many kinds of metrics; the available collectors are listed in the repository:

https://github.com/prometheus-community/windows_exporter

 


※ There is also a container collector, but be careful: if Hyper-V is turned off, it floods the Event Viewer with errors.

 

Installation script (ps1)

# Once installed, metrics are served at http://localhost:9182/metrics

function Install-WindowsExporter {
    # Look for an existing windows_exporter installation
    $installed = Get-Package | Where-Object { $_.Name -match "windows_exporter" }
    if (-not $installed) {
        $installerUrl = "https://github.com/prometheus-community/windows_exporter/releases/download/v0.24.0/windows_exporter-0.24.0-amd64.msi"
        $installerPath = "C:\temp\windows_exporter.msi"

        # Create the download folder if it does not exist yet
        $folderPath = Split-Path -Path $installerPath -Parent
        if (-not (Test-Path $folderPath)) {
            New-Item -Path $folderPath -ItemType Directory | Out-Null
        }

        Invoke-WebRequest -Uri $installerUrl -OutFile $installerPath

        # Install the MSI with the selected collectors and listen port
        Start-Process -Wait -FilePath "msiexec" -ArgumentList "/i `"$installerPath`" ENABLED_COLLECTORS=`"cpu,memory,cs,logical_disk,net,os,service,system,process,logon`" LISTEN_PORT=9182"
        Write-Host "windows_exporter has been installed successfully."
    } else {
        # Already installed: just restart the service
        Restart-Service -Name "windows_exporter"
        Write-Host "Service $($installed.Name) already exists; restarted it."
    }
}

Install-WindowsExporter
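
The collector list passed through ENABLED_COLLECTORS above can also be kept in a YAML config file. This is only a rough sketch: it assumes the installed release reads a file given with the `--config.file` flag, and the file path mentioned below is hypothetical.

```yaml
# windows_exporter config file (sketch; assumes --config.file is supported by the release in use)
collectors:
  # Same collectors as in the install script. "container" is left out on purpose,
  # because it floods the Event Viewer with errors when Hyper-V is turned off.
  enabled: cpu,memory,cs,logical_disk,net,os,service,system,process,logon
```

If the MSI accepts an EXTRA_FLAGS property, as the project README describes, the file could be wired in with something like `EXTRA_FLAGS="--config.file=C:\temp\windows_exporter.yml"`. I have not verified this against v0.24.0, so check the README for the release you install.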

Prometheus Helm chart changes

 

Registering the machine

    additionalScrapeConfigs:
      - job_name: 'windows'
        static_configs:
          - targets: ['1.2.3.4:9182']
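
For context, here is roughly where that snippet sits in the chart values. This assumes the chart is kube-prometheus-stack, which the post does not state. The second target and the `site` label are made up for illustration.

```yaml
# values.yaml (sketch, assuming the kube-prometheus-stack chart)
prometheus:
  prometheusSpec:
    additionalScrapeConfigs:
      - job_name: 'windows'
        scrape_interval: 30s          # optional; the global interval applies if omitted
        static_configs:
          - targets:
              - '1.2.3.4:9182'        # address from the post
              - '1.2.3.5:9182'        # hypothetical second machine
            labels:
              site: idc               # hypothetical label attached to every target above
```

Keeping every machine on port 9182 matters here, because the health check alert selects targets with `instance=~".*:9182"`.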

 

Useful alert queries

    - name: idc-exporter
      rules:
      - alert: HealthCheckIDC
        expr: sum(
                label_replace(
                  up{instance=~".*:9182"} == 0,
                  "hostname",
                  "$1",
                  "instance",
                  "(.*)"
                ) * on (instance) group_left (hostname)
                last_over_time(windows_cs_hostname[10d])
              ) by (hostname, instance)
        for: 5m
        labels:
          group: idc-exporter
        annotations:
          summary: "Health Check IDC (hostname {{ $labels.hostname }})"
          description: "{{ $labels.hostname }}"

      - alert: LowDiskSpace
        expr: sum(
                label_replace(
                  round(
                    (windows_logical_disk_free_bytes{volume="C:"} / (1024^3) * 100)
                  ) / 100,
                  "hostname", 
                  "$1", 
                  "instance",
                  ".*"
                ) * on(instance) group_left(hostname) windows_cs_hostname
              ) by (hostname) < 20  # alert when less than 20 GB is free
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Low Disk Space on C Drive (hostname {{ $labels.hostname }})"
          description: "{{ $value }}GB"

      - alert: HighMemory
        expr: sum(
                label_replace(
                  round(
                    (windows_cs_physical_memory_bytes - windows_os_physical_memory_free_bytes) / windows_cs_physical_memory_bytes * 100),
                  "hostname", 
                  "$1", 
                  "instance",
                  ".*"
                ) * on(instance) group_left(hostname) windows_cs_hostname
              ) by (hostname) > 90 
        for: 5s
        labels:
          severity: info
        annotations:
          summary: "High memory usage on Windows (hostname {{ $labels.hostname }})"
          description: "{{ $value }}%"
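
All three rules rely on the same trick: join the metric with `windows_cs_hostname` to borrow its `hostname` label. Below is a stripped-down sketch of the health check that shows only the join. It assumes the kube-prometheus-stack chart and its `additionalPrometheusRulesMap` value; the map key `windows-exporter` is hypothetical.

```yaml
# values.yaml (sketch, assuming the kube-prometheus-stack chart)
additionalPrometheusRulesMap:
  windows-exporter:                   # hypothetical key; any name works
    groups:
      - name: idc-exporter
        rules:
          - alert: HealthCheckIDC
            expr: |
              # up == 0 has no hostname label, so copy it from windows_cs_hostname
              # by matching on instance. The machine is down, so there is no current
              # sample; last_over_time picks the last one seen in the past 10 days.
              (up{instance=~".*:9182"} == 0)
                * on (instance) group_left (hostname)
              last_over_time(windows_cs_hostname[10d])
            for: 5m
```

The result keeps the labels of `up` plus `hostname`. An alert fires for every series the expression returns, so the value itself (0 here) does not matter. A machine that has been down for more than 10 days drops out of the join and stops alerting.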

 

Notification settings

    # Receiver entry: goes under receivers
    - name: healthcheck-slack-alert
      slack_configs:
      - send_resolved: true
        icon_url: "https://avatars3.githubusercontent.com/u/3380462"
        api_url: "slack url"
        title: >
          {{ range .Alerts }} 
          {{ if eq .Status "resolved" }}
          Machine is UP: ( {{ .Labels.hostname }} {{ .Labels.instance }} )
          {{ else }} 
          Machine is DOWN: ( {{ .Labels.hostname }} {{ .Labels.instance }} )
          {{ end }} 
          {{ end }}

    # Route entry: goes under route.routes
      - match:
          alertname: HealthCheckIDC
        receiver: healthcheck-slack-alert
        group_by: ['...']
        group_interval: 10m
        repeat_interval: 24h
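
The two fragments above belong to different parts of the Alertmanager config. Put together, the layout looks roughly like this. It again assumes the kube-prometheus-stack chart, and the `'null'` default receiver is an assumption, not something from the original setup.

```yaml
# values.yaml (sketch, assuming the kube-prometheus-stack chart)
alertmanager:
  config:
    route:
      receiver: 'null'                # assumed default receiver for everything else
      routes:
        - match:
            alertname: HealthCheckIDC
          receiver: healthcheck-slack-alert
          group_by: ['...']           # '...' groups by all labels, so each alert is sent separately
          group_interval: 10m
          repeat_interval: 24h
    receivers:
      - name: 'null'
      - name: healthcheck-slack-alert
        slack_configs:
          - send_resolved: true
            api_url: "slack url"      # placeholder, as in the post
```

Only HealthCheckIDC is routed to Slack here. LowDiskSpace and the memory alert fall through to the default receiver unless they get their own routes.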
