Telemetry Demo

记录一下一个Telemetry的demo,使用pipeline+influxDB+Grafana,构建一个可视化监控。

  • pipeline: 用于采集设备的telemetry信息。
  • influxDB: 使用数据库的方式存储telemetry采集到的信息
  • Grafana: 将数据库采集的信息图像化输出.

安装与配置

Pipeline:

使用的是“bigmuddy-network-telemetry-pipeline”,bigmuddy-network-telemetry-pipeline 这个项目被存档了, 我fork了下, 所以可以使用以下的命令去get 这个项目:

git clone https://github.com/xuxing3/bigmuddy-network-telemetry-pipeline.git

配置文件如下:

[root@xuxing239 bigmuddy-network-telemetry-pipeline]# grep -v ^# pipeline-influxdb.conf  | grep -v ^$
[default]
id = pipeline
metamonitoring_prometheus_resource = /metrics
metamonitoring_prometheus_server = :8989
[testbed]
stage = xport_input
type = tcp
encap = st
listen = :5432           <<<< 监听5432 端口, 可以更改
[inspector]
stage = xport_output
type = tap
file = /opt/dump-xuxing.txt    <<<<<采集到的 telemetry的数据存储的位置
datachanneldepth = 1000
[metrics_influx]  
stage = xport_output
type = metrics
file = metrics-nms.json        <<<<<<  将采集到telemetry 的信息使用该文件定义的格式, 格式化输出到influxdb
dump = metrics-dump-xuxing.txt    <<<< 格式化输出之前会dump一份数据存储在该文件中
output = influx
influx = http://10.70.80.197:8086       <<<<<<<  influxdb, 默认端口为8086
database = mdt_db                       <<<<<<< 定义infludb 那个数据库用于存储telemetry信息

metrics-nms.json 格式如下, 如何去定义这个json 文件, 我目前能找到方法也只是参考pipline采集到数据自己去修改json文件。metrics文件的意义在于翻译采集到的信息, 并导入数据库。

[root@xuxing bigmuddy-network-telemetry-pipeline]# cat metrics-nms.json 
[
        {
                "basepath" : "Cisco-IOS-XR-asr9k-np-oper:hardware-module-np/nodes/node/nps/np/counters",
                "spec" : {
                        "fields" : [
                                {"name":"node-name", "tag": true},     <<<<<<<   tag 为 true 是为了设置后续可以按照哪个参数去filter 数据
                                {"name":"np-name"},
                                {
                                        "name":"np-counter",
                                        "fields" : [
                                                {"name":"counter-name", "tag": true},    <<<<<<
                                                {"name":"counter-value"},
                                                {"name":"rate"},
                                                {"name":"counter-type"},
                                                {"name":"counter-index"}
                                        ]
                                }
                        ]
                }
        }
]
InfluxDB/Grafana
docker run -d -v grafana-storage:/var/lib/grafana -p 3000:3000 grafana/grafana:5.4.3
docker run -d --volume=/opt/influxdb:/data -p 8080:8083 -p 8086:8086 tutum/influxdb

这里使用docker 的方式,run起来后访问http://<ServerIP>:8080, 去修改一些influxdb的参数:

  • 创建mdt_db database, 并修改database数据存储的时间
CREATE DATABASE "mdt_db"
CREATE RETENTION POLICY "24h" ON "db_name" DURATION 1d REPLICATION 1 DEFAULT  #1 day
CREATE RETENTION POLICY "one_month" ON "mdt_db" DURATION 30d REPLICATION 1 DEFAULT # one_month

其他的一些常用命令:
SHOW MEASUREMENTS  
select * from ""  
SHOW RETENTION POLICIES ON "mdt_db"

InfluxDB 命令参考以下文章

https://blog.csdn.net/daguanjia11/article/details/90666888
  • Grafana连接数据库

Telemetry数据采集

IOX设备配置如下:
RP/0/RSP1/CPU0:ASR9906-B#show run telemetry model-driven 
Wed May 19 07:52:54.048 UTC
telemetry model-driven
 destination-group xuxing
  vrf calo-mgmt
  address-family ipv4 10.70.80.197 port 5432   <<<<<<
   encoding self-describing-gpb
   protocol tcp
  !
 sensor-group xuxing
  sensor-path Cisco-IOS-XR-asr9k-np-oper:hardware-module-np/nodes/node/nps/np/counters/np-counter
 !
 subscription xuxing
  sensor-group-id xuxing strict-timer
  sensor-group-id xuxing sample-interval 5000
  destination-id xuxing
  source-interface MgmtEth0/RSP1/CPU0/0
Server 配置
[root@xuxing bigmuddy-network-telemetry-pipeline]# ./bin/pipeline -log= -debug -config=pipeline-influxdb.conf
INFO[2021-05-19 00:58:24.900661] Conductor says hello, loading config          config=pipeline-influxdb.conf debug=true fluentd= logfile= maxthreads=12 tag=pipeline version="v1.0.0(bigmuddy)"
DEBU[2021-05-19 00:58:24.901561] Conductor processing section...               name=conductor section=inspector tag=pipeline
DEBU[2021-05-19 00:58:24.901576] Conductor processing section, type...         name=conductor section=inspector tag=pipeline type=tap
INFO[2021-05-19 00:58:24.901588] Conductor starting up section                 name=conductor section=inspector stage="xport_output" tag=pipeline
DEBU[2021-05-19 00:58:24.901619] Conductor processing section...               name=conductor section="metrics_influx" tag=pipeline
DEBU[2021-05-19 00:58:24.901655] Conductor processing section, type...         name=conductor section="metrics_influx" tag=pipeline type=metrics
INFO[2021-05-19 00:58:24.901665] Conductor starting up section                 name=conductor section="metrics_influx" stage="xport_output" tag=pipeline
INFO[2021-05-19 00:58:24.901676] Metamonitoring: serving pipeline metrics to prometheus  name=default resource="/metrics" server=":8989" tag=pipeline
INFO[2021-05-19 00:58:24.901767] Starting up tap                               countonly=false filename="/opt/dump-xuxing.txt" name=inspector streamSpec=&{2 <nil>} tag=pipeline
CRYPT Client [metrics_influx],[http://10.70.80.197:8086]
 
 Enter username: admin
 Enter password:
 
INFO[2021-05-19 00:58:28.195240] setup authentication                          authenticator="http://10.70.80.197:8086" name="metrics_influx" pem= tag=pipeline username=admin
INFO[2021-05-19 00:58:28.195357] setup metrics collection                      basepath="Cisco-IOS-XR-asr9k-np-oper:hardware-module-np/nodes/node/nps/np/counters" name="metrics_influx" tag=pipeline
DEBU[2021-05-19 00:58:28.195371] metrics export configured                     file=metrics-nms.json metricSpec={[{Cisco-IOS-XR-asr9k-np-oper:hardware-module-np/nodes/node/nps/np/counters 0xc420344e60}] map[Cisco-IOS-XR-asr9k-np-oper:hardware-module-np/nodes/node/nps/np/counters:0xc420344e60] 0xc420328bd0} name="metrics_influx" output=influx tag=pipeline
DEBU[2021-05-19 00:58:28.195429] Conductor processing section...               name=conductor section=testbed tag=pipeline
DEBU[2021-05-19 00:58:28.195452] Conductor processing section, type...         name=conductor section=testbed tag=pipeline type=tcp
INFO[2021-05-19 00:58:28.195466] Conductor starting up section                 name=conductor section=testbed stage="xport_input" tag=pipeline
Grafana 简单配置如下:

其他优化

由于telemetry的数据过于庞大, 建议将server上的数据定期清空, 否则很容易占满磁盘

[root@xuxing bigmuddy-network-telemetry-pipeline]# crontab -l
*/60 * * * * root echo "" > /opt/dump-xuxing.txt
*/60 * * * * root echo "" > /root/bigmuddy-network-telemetry-pipeline/metrics-dump-xuxing.txt_wkid0
[root@xuxing bigmuddy-network-telemetry-pipeline]# 
           

No comments

Comments feed for this article

Reply

Your email address will not be published.