Prometheus HTTP redirect check

From uCtrl

Over the years this website has been hosted on many different domains: uctrl.net, thomasjensen.me, blog.uctrl.net, uctrl.dev, and now uctrl.org. Sometimes with www, sometimes without.

To point traffic to the new location, I’ve always set up 301 redirects. But on numerous occasions some of those redirects have broken, either through misconfiguration, expired certificates, or just me forgetting to update them.

I wanted a way to continuously check that all redirects were valid and pointing to the right location. I’m already using Prometheus for everything else, so I added a new module for the Blackbox exporter.

In my blackbox.yml:

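modules:
  http_uctrl_301:
    prober: http
    http:
      # HEAD is enough: only the status code and headers matter.
      method: HEAD
      valid_status_codes: [301]
      # Inspect the redirect itself instead of following it.
      # (Newer Blackbox exporter releases deprecate this in favour of follow_redirects: false.)
      no_follow_redirects: true
      # Fail unless the Location header is exactly the expected URL.
      fail_if_header_not_matches:
        - header: Location
          regexp: '^https:\/\/uctrl\.dev\/$'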
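
To make the module's pass/fail criteria concrete, here is a minimal Python sketch of the same two checks applied to a response that has already been received. The helper name `redirect_ok` is hypothetical, and this only approximates the exporter: it uses Python's `re` module rather than Go's regexp engine, and it does not send the HEAD request itself.

```python
import re
from typing import Mapping

# Same pattern as in blackbox.yml (the escaped slashes are optional in Python).
LOCATION_RE = re.compile(r"^https:\/\/uctrl\.dev\/$")
VALID_STATUS_CODES = {301}


def redirect_ok(status: int, headers: Mapping[str, str]) -> bool:
    """Hypothetical helper mirroring valid_status_codes and
    fail_if_header_not_matches from the http_uctrl_301 module."""
    if status not in VALID_STATUS_CODES:
        return False
    # HTTP header names are case-insensitive, so normalise before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    # A missing Location header becomes "", which cannot match the pattern.
    return LOCATION_RE.search(lowered.get("location", "")) is not None


print(redirect_ok(301, {"Location": "https://uctrl.dev/"}))  # True
print(redirect_ok(302, {"Location": "https://uctrl.dev/"}))  # False: wrong status
print(redirect_ok(301, {"Location": "https://uctrl.dev"}))   # False: no trailing slash
```

Note that the pattern is anchored at both ends, so a redirect to a deep link or to a URL without the trailing slash counts as a failure.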

And for prometheus.yml:

scrape_configs:
  - job_name: 'http-uctrl-301'
    scrape_interval: 5m
    metrics_path: /probe
    params:
      module: [http_uctrl_301]
    static_configs:
      - targets:
          - 'https://uctrl.net'
          - 'https://blog.uctrl.net'
          - 'https://www.uctrl.net'
          - 'https://thomasjensen.me'
          - 'https://www.thomasjensen.me'
          - 'https://uctrl.org'
          - 'https://www.uctrl.org'
          - 'https://www.uctrl.dev'
    relabel_configs:
      # Pass the listed URL to the exporter as the ?target= parameter.
      - source_labels: [__address__]
        target_label: __param_target
      # Keep the probed URL as the instance label.
      - source_labels: [__param_target]
        target_label: instance
      # Scrape the Blackbox exporter rather than the target itself.
      - target_label: __address__
        replacement: localhost:9115
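
The relabeling can be hard to picture, so here is a small Python sketch of the URL that Prometheus ends up scraping for each listed target. The function name `probe_url` is hypothetical, the `http://` scheme is Prometheus's default for a job that does not set `scheme`, and the order of the query parameters in a real scrape may differ.

```python
from urllib.parse import urlencode

EXPORTER = "localhost:9115"   # the replacement written into __address__
MODULE = "http_uctrl_301"     # params.module
METRICS_PATH = "/probe"       # metrics_path


def probe_url(target: str) -> str:
    """Build the exporter URL for one entry in static_configs.targets."""
    query = urlencode({"module": MODULE, "target": target})
    return f"http://{EXPORTER}{METRICS_PATH}?{query}"


print(probe_url("https://uctrl.net"))
# http://localhost:9115/probe?module=http_uctrl_301&target=https%3A%2F%2Fuctrl.net
```

Requesting such a URL by hand is a quick way to see the metrics a probe returns, including `probe_success`, before wiring up any alerts.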

Now we need some alerting rules, which Prometheus evaluates before handing firing alerts to Alertmanager:

groups:
- name: General
  rules:

  - alert: SSLCertExpiringSoon
    expr: probe_ssl_earliest_cert_expiry - time() < 86400 * 30
    for: 5m
    labels:
      severity: warn
    annotations:
      summary: "SSL certificate on {{$labels.instance}} will expire in less than 30 days"
      description: "SSL certificate on {{$labels.instance}} will expire in {{ $value | humanizeDuration }}"

  - alert: SiteIsDown
    expr: probe_success == 0
    for: 5m
    labels:
      severity: alert
    annotations:
      summary: "{{$labels.instance}} is DOWN"
      description: "{{$labels.instance}} has been DOWN for at least 5 minutes"
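
As a rough illustration of what the two expressions compute, here is a Python sketch of the same comparisons. The function names are hypothetical, the timestamps are made up for the example, and the `for: 5m` clause (the condition must hold for five minutes before the alert fires) is not modelled.

```python
import time
from typing import Optional

THRESHOLD_SECONDS = 86400 * 30  # 30 days = 2,592,000 seconds


def cert_expiring_soon(earliest_cert_expiry: float, now: Optional[float] = None) -> bool:
    """Mirror of: probe_ssl_earliest_cert_expiry - time() < 86400 * 30"""
    if now is None:
        now = time.time()
    return earliest_cert_expiry - now < THRESHOLD_SECONDS


def site_is_down(probe_success: float) -> bool:
    """Mirror of: probe_success == 0"""
    return probe_success == 0


now = 1_700_000_000  # arbitrary fixed timestamp for the example
print(cert_expiring_soon(now + 29 * 86400, now))  # True: 29 days left
print(cert_expiring_soon(now + 31 * 86400, now))  # False: 31 days left
print(site_is_down(0))                            # True: the probe failed
```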

If any of the URLs in the scrape configuration fails to return a 301 status code with the Location header set to https://uctrl.dev/, or presents an invalid certificate, the probe fails and SiteIsDown fires. If a certificate is within 30 days of expiring, SSLCertExpiringSoon fires. Either way, I get an alert.