Copyright notice: original article from this site, published by 萌叔
When reposting, please credit 萌叔 | http://vearne.cc

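1. Introduction

Highly available services tend to be monitored at a very fine granularity. If you happen to use Prometheus and need to monitor your Redis and MySQL connection pools at the application layer, this article should come in handy.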

Redis Client

go-redis/redis

MySQL Client

jinzhu/gorm

2. Sample code

go get github.com/vearne/golib

main.go

package main

import (
	"github.com/go-redis/redis"
	"github.com/jinzhu/gorm"
	_ "github.com/jinzhu/gorm/dialects/mysql"
	"github.com/prometheus/client_golang/prometheus/promhttp"
	"github.com/vearne/golib/metric"
	"log"
	"net/http"
	"time"
)


func main() {
	// init redis
	client := redis.NewClient(&redis.Options{
		Addr:     "localhost:6379",
		PoolSize: 100,
	})
	
	// *** monitor the Redis connection pool ***
	metric.AddRedis(client, "car")

	// init mysql
	DSN := "test:xxxx@tcp(localhost:3306)/somebiz?charset=utf8&loc=Asia%2FShanghai&parseTime=true"
	mysqldb, err := gorm.Open("mysql", DSN)
	if err != nil {
		panic(err)
	}

	mysqldb.DB().SetMaxIdleConns(50)
	mysqldb.DB().SetMaxOpenConns(100)
	mysqldb.DB().SetConnMaxLifetime(5 * time.Minute)

	// *** monitor the MySQL connection pool ***
	metric.AddMySQL(mysqldb, "car")

	// generate some load so that both pools see activity
	for i := 0; i < 30; i++ {
		go func() {
			for {
				client.Get("a").String()
				time.Sleep(200 * time.Millisecond)
				mysqldb.Exec("show tables")
			}
		}()
	}

	http.Handle("/metrics", promhttp.Handler())
	log.Println("starting...")
	log.Fatal(http.ListenAndServe(":9090", nil))
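}

The two registration functions have the following signatures:

func AddRedis(client RedisClient, role string)
func AddMySQL(client *gorm.DB, role string)

role is only used to tell instances apart; it is exported as the role label on every metric (role="car" in the output below).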

3. Available metrics

Note: if a connection pool is exhausted (active >= poolsize), operations may block while they wait for a connection to become available.

# TYPE mysql_pool_fetches_total counter
mysql_pool_fetches_total{role="car",state="max_idle_closed"} 0
mysql_pool_fetches_total{role="car",state="max_life_closed"} 0
mysql_pool_fetches_total{role="car",state="wait_count"} 0
mysql_pool_fetches_total{role="car",state="wait_duration"} 0

MySQL connection pool usage

# HELP mysql_pool_state MySQL pool state
# TYPE mysql_pool_state gauge
mysql_pool_state{role="car",state="active"} 16
mysql_pool_state{role="car",state="idle"} 14
mysql_pool_state{role="car",state="poolsize"} 100

# TYPE redis_pool_fetches_total counter
redis_pool_fetches_total{role="car",state="hit"} 18605
redis_pool_fetches_total{role="car",state="miss"} 30
redis_pool_fetches_total{role="car",state="timeout"} 0
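
One way to read the hit and miss counters is as a pool hit ratio: the share of fetches that were served by an already-open idle connection. The sketch below computes it from the sample values above; `hitRatio` is a hypothetical helper, and leaving timeouts out of the denominator is a simplification.

```go
package main

import "fmt"

// hitRatio returns the fraction of connection fetches that found a free
// connection in the pool. It returns 0 when there were no fetches at all.
func hitRatio(hits, misses uint64) float64 {
	total := hits + misses
	if total == 0 {
		return 0
	}
	return float64(hits) / float64(total)
}

func main() {
	// Sample values from the output above: 18605 hits, 30 misses.
	fmt.Printf("pool hit ratio: %.4f\n", hitRatio(18605, 30))
}
```

With 30 worker goroutines in the sample program, the 30 misses are consistent with each goroutine opening one connection on its first request and reusing pooled connections afterwards.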

Redis connection pool usage

# HELP redis_pool_state Redis pool state
# TYPE redis_pool_state gauge
redis_pool_state{role="car",state="active"} 0
redis_pool_state{role="car",state="idle"} 30
redis_pool_state{role="car",state="poolsize"} 100

Redis operation latency statistics (TP90, TP99)

# HELP redis_request_duration_seconds Redis request duration in seconds
# TYPE redis_request_duration_seconds histogram
redis_request_duration_seconds_bucket{role="car",le="0.001"} 18257
redis_request_duration_seconds_bucket{role="car",le="0.0025"} 18510
redis_request_duration_seconds_bucket{role="car",le="0.005"} 18549
...
redis_request_duration_seconds_bucket{role="car",le="+Inf"} 18635
redis_request_duration_seconds_sum{role="car"} 6.7386483039999785
redis_request_duration_seconds_count{role="car"} 18635

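4. Closing thoughts

Services with demanding SLAs tend to be monitored at a very fine granularity. The difference in quality between one service and another shows up in exactly these seemingly trivial details.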

