Configurable Alerts
This tutorial demonstrates how to use the configurable alerts feature in Edge Builder. It covers the following:
- Initialise the Edge Builder server components and CLI
- Add a node
- Configure a set of alert rules
- Add alert definitions
- Deploy alerts to an edge node
- Enable and disable alerts on an edge node
- Viewing triggered alerts
- Adding and removing alert labels
Initialise the Edge Builder server components and CLI
To initialise the Edge Builder server components and CLI, complete the following steps:
- Set up the environment as described in Tutorials Setup
-
SSH into the master node:
vagrant ssh master
-
Start the Edge Builder server components using the following command:
sudo edgebuilder-server up -a 192.168.56.10
When the server components are up, run the following command to list all server components:
vagrant@master:~$ docker ps
CONTAINER ID   IMAGE   COMMAND   CREATED   STATUS   PORTS   NAMES
1df950ae96b7   iotechsys/dev-eb-controller:1.2.0.dev   "./entrypoint.sh"   17 seconds ago   Up 13 seconds   0.0.0.0:8085->8085/tcp, :::8085->8085/tcp, 0.0.0.0:50000-50100->50000-50100/tcp, :::50000-50100->50000-50100/tcp, 0.0.0.0:1022->22/tcp, :::1022->22/tcp   eb-controller
c864744d4e50   chronograf:1.8.8-alpine   "./custom-entrypoint…"   18 seconds ago   Up 16 seconds   0.0.0.0:8888->8888/tcp, :::8888->8888/tcp   eb-chronograf
96b5c9e6b040   iotechsys/dev-eb-webssh:1.2.0.dev   "wssh --address=0.0.…"   21 seconds ago   Up 17 seconds   0.0.0.0:8989->8989/tcp, :::8989->8989/tcp   eb-webssh
56af55f3f97c   iotechsys/dev-eb-redis:1.2.0.dev   "redis-server /etc/r…"   21 seconds ago   Up 17 seconds   6379/tcp   eb-redis
9c3fb1d52651   postgres:alpine   "./entrypoint.sh"   21 seconds ago   Up 17 seconds   5432/tcp   eb-db
0a28a5a21ad8   kapacitor:1.5-alpine   "/entrypoint.sh kapa…"   21 seconds ago   Up 17 seconds   9092/tcp   eb-kapacitor
ac0ac77da753   iotechsys/dev-eb-salt-master:1.2.0.dev   "/bin/sh -c 'sed -i …"   21 seconds ago   Up 17 seconds   0.0.0.0:4505-4506->4505-4506/tcp, :::4505-4506->4505-4506/tcp, 0.0.0.0:8099->8099/tcp, :::8099->8099/tcp   eb-salt-master
44cb2cd6d28e   influxdb:1.8.1-alpine   "./custom-entrypoint…"   21 seconds ago   Up 17 seconds   0.0.0.0:8086->8086/tcp, :::8086->8086/tcp   eb-influxdb
295af012879a   grafana/grafana:7.4.2   "/run.sh"   21 seconds ago   Up 17 seconds   0.0.0.0:3000->3000/tcp, :::3000->3000/tcp   eb-grafana
41ea794246c7   vault:1.7.1   "./entrypoint.sh"   21 seconds ago   Up 17 seconds   8200/tcp   eb-vault
685cecd2a64a   portainer/portainer-ce:2.1.0-alpine   "/portainer -H unix:…"   21 seconds ago   Up 17 seconds   0.0.0.0:8000->8000/tcp, :::8000->8000/tcp, 0.0.0.0:9000->9000/tcp, :::9000->9000/tcp   eb-portainer
-
Log in to Edge Builder with the default user credentials using the following command:
edgebuilder-cli user login -u iotech -p EdgeBuilder123 -c "http://192.168.56.10:8085"
The following message displays:
INFO: User "iotech" logged in successfully
-
Confirm that you have a valid license in the Vagrant project directory ('edgebuilder-vagrant').
- Add a license using the following command:
edgebuilder-cli license add -l DemoLicense -f /vagrant/EdgeBuilder_test_Evaluation.lic
The output is similar to the following:
INFO: License added successfully
+-------------+--------------------------------------+---------------------------------+-----------+-----------+
| NAME        | ID                                   | FILENAME                        | MAX NODES | EXPIRY    |
+-------------+--------------------------------------+---------------------------------+-----------+-----------+
| DemoLicense | b65c2ba0-c78b-4031-aaf4-cc030d3d763d | EdgeBuilder_test_Evaluation.lic | 100       | unlimited |
+-------------+--------------------------------------+---------------------------------+-----------+-----------+
INFO: Viewing 1 result(s)
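The container check from the `docker ps` step above can also be scripted. The following is a minimal Python sketch, not part of Edge Builder: it compares a list of running container names against the eleven `eb-*` names shown in the sample output in this tutorial. The names `EXPECTED_CONTAINERS`, `missing_containers` and `running_container_names` are hypothetical, and the expected set is an assumption that may differ between Edge Builder versions.

```python
import subprocess

# Container names taken from the sample `docker ps` output in this tutorial.
# Assumption: other Edge Builder versions may run a different set.
EXPECTED_CONTAINERS = {
    "eb-controller", "eb-chronograf", "eb-webssh", "eb-redis", "eb-db",
    "eb-kapacitor", "eb-salt-master", "eb-influxdb", "eb-grafana",
    "eb-vault", "eb-portainer",
}


def missing_containers(running_names):
    """Return the expected container names that are not in running_names."""
    return sorted(EXPECTED_CONTAINERS - set(running_names))


def running_container_names():
    """Ask Docker for the names of running containers (requires Docker)."""
    result = subprocess.run(
        ["docker", "ps", "--format", "{{.Names}}"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.split()
```

On the master node, `missing_containers(running_container_names())` would return an empty list when every expected component is running.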
Add a node
In this section, we add a node to Edge Builder. The node is accessible at 192.168.56.11.
To add an edge node, complete the following steps:
-
Confirm that the example node configuration file is available using the cat command:
cat /vagrant/examples/single-node-config.json
The output is similar to the following:
{
  "NodeConfig": [
    {
      "name": "node1",
      "description": "virtual node 1",
      "nodeaddress": "192.168.56.11",
      "username": "vagrant",
      "password": "vagrant",
      "serveraddress": "192.168.56.10",
      "labels": []
    }
  ]
}
-
Add the node to Edge Builder using the following command:
edgebuilder-cli node add -f /vagrant/examples/single-node-config.json
The output is similar to the following:
INFO: SSH Node(s) added successfully:
+-------+--------------------------------------+
| NODE  | ID                                   |
+-------+--------------------------------------+
| node1 | e01c540b-fa87-4721-b38c-5f0109210381 |
+-------+--------------------------------------+
INFO: Viewing 1 result(s)
-
Confirm that the node has been added using the following command:
edgebuilder-cli node view --all
The output is similar to the following:
INFO: Finding all nodes...
INFO: *** Node View Results ***
+-------+----------+------------+--------------+----------------+--------------------------------------+
| NAME  | STATUS   | DEBUG MODE | METRICS MODE | DESCRIPTION    | ID                                   |
+-------+----------+------------+--------------+----------------+--------------------------------------+
| node1 | Deploying| Off        | Local        | virtual node 1 | e01c540b-fa87-4721-b38c-5f0109210381 |
+-------+----------+------------+--------------+----------------+--------------------------------------+
INFO: Viewing 1 result(s)
Note
- The status of the node is initially shown as Deploying. After a few minutes, you can re-issue the command and the status will have changed to Up
- You can also use the watch command with the node view command above to automatically track the status changes
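If you prefer to check the status from a script rather than by eye, the table printed by the CLI can be parsed. The following is a minimal Python sketch under the assumption that the output keeps the `+---+` / `| ... |` layout shown in this tutorial; `parse_cli_table` and `node_statuses` are hypothetical helper names, not part of Edge Builder. Tables whose cells wrap over several rows (such as the LABELS column of a single-node view) would need extra handling.

```python
SAMPLE_OUTPUT = """\
INFO: Finding all nodes...
INFO: *** Node View Results ***
+-------+----------+------------+--------------+----------------+--------------------------------------+
| NAME  | STATUS   | DEBUG MODE | METRICS MODE | DESCRIPTION    | ID                                   |
+-------+----------+------------+--------------+----------------+--------------------------------------+
| node1 | Deploying| Off        | Local        | virtual node 1 | e01c540b-fa87-4721-b38c-5f0109210381 |
+-------+----------+------------+--------------+----------------+--------------------------------------+
INFO: Viewing 1 result(s)
"""


def parse_cli_table(output):
    """Parse an ASCII table from CLI output into a list of row dicts."""
    header = None
    rows = []
    for line in output.splitlines():
        line = line.strip()
        # Only table rows start and end with a pipe; INFO lines and
        # +---+ borders are skipped.
        if not (line.startswith("|") and line.endswith("|")):
            continue
        cells = [cell.strip() for cell in line[1:-1].split("|")]
        if header is None:
            header = cells  # the first table row holds the column names
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def node_statuses(output):
    """Map each node name to its STATUS column."""
    return {row["NAME"]: row["STATUS"]
            for row in parse_cli_table(output) if row.get("NAME")}


print(node_statuses(SAMPLE_OUTPUT))
```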
Configure a set of alert rules
In this section, we demonstrate how to define a set of alert rules in Chronograf that can be added to Edge Builder as alert definitions.
-
Select an edge node as a prototype node. The node is used to configure initial alert parameters before we deploy them to multiple edge nodes.
-
On this prototype node, enable live metrics using the following:
edgebuilder-cli node metrics -n node1 -m live -w
The output is similar to the following:
INFO: Setting metrics mode to 'Live' on 1 node(s)
INFO: Access metrics dashboards here:
INFO: (name = node1, ID = b759337c-669d-4ee4-9e1e-054d15fb5c1b): "http://192.168.56.10:3000/d/4MGFhG1Gz/edge-node-metrics?orgId=1&refresh=5s&var-Node=node1"
-
Open the Chronograf web UI at http://192.168.56.10:8888.
-
In Chronograf, navigate to Alerting -> Manage Tasks. You should see the following dashboard:
-
Select Build Alert Rule on this dashboard. The following alert rule builder screen displays:
-
Name the alert. To do this, enter CPU Alert in the text box at the top of the page
-
Select threshold from the alert types, and configure the idle CPU usage as your monitor value like the following:
-
Set the condition at which the alert will be triggered. This section also shows historical data for the chosen value:
-
Leave the Alert Handlers section blank. We set this at the deployment stage.
-
Use the provided template variables to set the alert message. The alert message is passed to Edge Builder as part of the alert event:
{{.Name}} is {{ index .Fields "value" }}
-
Click Save Rule at the top of the screen. We've now created an alert rule.
-
Repeat this process, but this time create an alert to monitor memory usage. Set up this alert like the following:
-
Navigate to Alerting on the left sidebar, and select "Alert History". This will give you a list of the alerts that have been triggered, allowing you to see the alert behaviour and edit any values if necessary. To edit the configured alerts, return to the previous screen and select one of your alert rules.
Note
When selecting the threshold option for an alert rule, note that by default the alert fires only when the alert state changes, that is, when the provided threshold is crossed, rather than on every data point beyond it. This is because the Kapacitor property method .stateChangesOnly is added to the TICKscript automatically.
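To make the effect of .stateChangesOnly concrete, the following Python sketch simulates a threshold rule over a short series of values. It is a simplified model written for this tutorial, not Kapacitor code: `alert_events` is a hypothetical function, only the OK and CRITICAL levels are modelled, and the threshold of 70 and the "greater than" comparison are taken from the generated TICKscript shown later in this tutorial.

```python
def alert_events(values, crit, state_changes_only=True):
    """Return the (index, level, value) events a threshold rule would emit.

    A value is CRITICAL when it is greater than crit, otherwise OK.
    With state_changes_only=True an event is emitted only when the level
    differs from the previous one. Otherwise every CRITICAL point also
    emits an event.
    """
    events = []
    previous_level = "OK"  # assumption: the rule starts in the OK state
    for index, value in enumerate(values):
        level = "CRITICAL" if value > crit else "OK"
        changed = level != previous_level
        if changed or (not state_changes_only and level != "OK"):
            events.append((index, level, value))
        previous_level = level
    return events


idle_cpu = [50, 75, 80, 90, 60, 72]
print(alert_events(idle_cpu, crit=70))
print(alert_events(idle_cpu, crit=70, state_changes_only=False))
```

With state changes only, the three consecutive values above 70 produce a single CRITICAL event, followed by one OK event on recovery and one more CRITICAL event when the threshold is crossed again.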
Add alert definitions
In this section, we describe how to take the alert rules configured in the previous section and add them as alert definitions in Edge Builder.
-
Add alert definitions to Edge Builder, and automatically associate them with our previously created alert group AlertGroup1, using the following command:
edgebuilder-cli alertDefinition add -r "CPU Alert","Mem Alert" -d "A CPU alert","A Memory alert" -g AlertGroup1
The output is similar to the following:
INFO: AlertDefinition(s) added successfully:
+---------------------+--------------------------------------+----------------+-------------+
| NAME                | ID                                   | DESCRIPTION    | LABELS      |
+---------------------+--------------------------------------+----------------+-------------+
| CPUAlert-definition | c9c7ad5c-ea91-4d36-80b7-ed9b80b08490 | A CPU alert    | AlertGroup1 |
| MemAlert-definition | a4a8500b-73ff-4ce9-8141-fda6d1e2bb5c | A Memory alert | AlertGroup1 |
+---------------------+--------------------------------------+----------------+-------------+
INFO: Viewing 2 result(s)
-
Confirm the alert definitions have been added using the following command:
edgebuilder-cli alertDefinition view -A
The output is similar to the following:
INFO: Finding all alertDefinitions...
INFO: *** AlertDefinition View Results ***
+---------------------+--------------------------------------+----------------+-------------+
| NAME                | ID                                   | DESCRIPTION    | LABELS      |
+---------------------+--------------------------------------+----------------+-------------+
| CPUAlert-definition | c9c7ad5c-ea91-4d36-80b7-ed9b80b08490 | A CPU alert    | AlertGroup1 |
| MemAlert-definition | a4a8500b-73ff-4ce9-8141-fda6d1e2bb5c | A Memory alert | AlertGroup1 |
+---------------------+--------------------------------------+----------------+-------------+
INFO: Viewing 2 result(s)
-
To view more detailed information about an alert definition, including alert group association, run the following command:
edgebuilder-cli alertDefinition inspect -d CPUAlert-definition
The output is similar to the following:
{
  "Alerts": [],
  "CreatedBy": "iotech",
  "Description": "A CPU alert",
  "ID": "8d1f1366-6503-47bb-bd47-ce10fd3964b7",
  "ModifiedBy": "iotech",
  "Name": "CPUAlert-definition",
  "Tickscript": "var db = 'node1'\n\nvar rp = 'autogen'\n\nvar measurement = 'cpu'\n\nvar groupBy = []\n\nvar whereFilter = lambda: (\"cpu\" == 'cpu-total') AND isPresent(\"usage_idle\")\n\nvar name = 'CPU Alert'\n\nvar idVar = name\n\nvar message = '{{.Name}} is {{ index .Fields \"value\" }}'\n\nvar idTag = 'alertID'\n\nvar levelTag = 'level'\n\nvar messageField = 'message'\n\nvar durationField = 'duration'\n\nvar outputDB = 'chronograf'\n\nvar outputRP = 'autogen'\n\nvar outputMeasurement = 'alerts'\n\nvar triggerType = 'threshold'\n\nvar crit = 70\n\nvar data = stream\n |from()\n .database(db)\n .retentionPolicy(rp)\n .measurement(measurement)\n .groupBy(groupBy)\n .where(whereFilter)\n |eval(lambda: \"usage_idle\")\n .as('value')\n\nvar trigger = data\n |alert()\n .crit(lambda: \"value\" \u003e crit)\n .message(message)\n .id(idVar)\n .idTag(idTag)\n .levelTag(levelTag)\n .messageField(messageField)\n .durationField(durationField)\n .stateChangesOnly()\n\ntrigger\n |eval(lambda: float(\"value\"))\n .as('value')\n .keep()\n |influxDBOut()\n .create()\n .database(outputDB)\n .retentionPolicy(outputRP)\n .measurement(outputMeasurement)\n .tag('alertName', name)\n .tag('triggerType', triggerType)\n\ntrigger\n |httpOut('output')\n",
  "TimestampCreate": 1639587026,
  "TimestampModify": 1639587026
}
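The "Tickscript" field in the inspect output holds the rule generated by Chronograf as one escaped string, which makes values such as the threshold hard to spot. The following Python sketch pulls the simple `var name = value` assignments out of that string. It assumes the JSON layout shown in the sample output above; `tickscript_vars` is a hypothetical helper, and only single-quoted strings and plain numbers are extracted (lambdas, lists and pipelines are skipped).

```python
import json
import re


def tickscript_vars(inspect_json):
    """Extract simple string and numeric `var` assignments from a definition."""
    definition = json.loads(inspect_json)
    script = definition["Tickscript"]
    variables = {}
    for match in re.finditer(r"^var (\w+) = (.+)$", script, flags=re.MULTILINE):
        name, raw = match.group(1), match.group(2).strip()
        if len(raw) >= 2 and raw.startswith("'") and raw.endswith("'"):
            variables[name] = raw[1:-1]   # single-quoted string literal
        elif re.fullmatch(r"-?\d+(\.\d+)?", raw):
            variables[name] = float(raw)  # numeric literal, e.g. the threshold
    return variables


# A shortened stand-in for the inspect output, built with json.dumps so the
# newlines inside the script are escaped the same way as in the real output.
sample = json.dumps({
    "Name": "CPUAlert-definition",
    "Tickscript": "var db = 'node1'\n\nvar measurement = 'cpu'\n\n"
                  "var crit = 70\n\nvar data = stream\n    |from()\n",
})
print(tickscript_vars(sample))
```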
Deploy alerts to an edge node
Now that we have created our alert rules and added them to Edge Builder, we can deploy these alerts to an edge node.
-
Deploy all alerts in an alert group to an edge node using the following:
edgebuilder-cli alert deploy -n node1 --alertDefinitionLabel AlertGroup1 -l Alert1
The output is similar to the following:
INFO: Deploying alerts:
+---------------------------+--------------------------------------+-----------+--------------------------------------+--------+
| ALERT NAME                | ALERT ID                             | STATUS    | NODE ID                              | LABELS |
+---------------------------+--------------------------------------+-----------+--------------------------------------+--------+
| MemAlert-definition-node1 | 2b401a1e-c8a0-465f-8fed-6c7cc39e8093 | Deploying | c34f061f-3621-415f-af77-eb9278876e1a | Alert1 |
| CPUAlert-definition-node1 | 34161118-2f45-49ec-822d-6fddb8b374c6 | Deploying | c34f061f-3621-415f-af77-eb9278876e1a | Alert1 |
+---------------------------+--------------------------------------+-----------+--------------------------------------+--------+
INFO: Viewing 2 result(s)
-
Confirm that the alerts have been added by running the following command:
edgebuilder-cli alert view -A
The output is similar to the following:
INFO: Finding all alerts...
INFO: *** Alert View Results ***
+---------------------------+--------------------------------------+---------+-----------+-----------------------+--------+
| ALERT NAME                | ALERT ID                             | STATUS  | NODE NAME | ALERT DEFINITION NAME | LABELS |
+---------------------------+--------------------------------------+---------+-----------+-----------------------+--------+
| MemAlert-definition-node1 | 2b401a1e-c8a0-465f-8fed-6c7cc39e8093 | Enabled | node1     | MemAlert-definition   | Alert1 |
| CPUAlert-definition-node1 | 34161118-2f45-49ec-822d-6fddb8b374c6 | Enabled | node1     | CPUAlert-definition   | Alert1 |
+---------------------------+--------------------------------------+---------+-----------+-----------------------+--------+
INFO: Viewing 2 result(s)
Note
- The status of the alert is initially shown as Deploying. After a few seconds, you can re-issue the command and the status will have changed to Enabled
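Instead of re-issuing the command by hand, a script can poll until the status changes. The following Python sketch shows one way to structure such a wait loop. `wait_for_status` is a hypothetical helper: `fetch_status` stands for any function you supply that returns the current status string (for example by running the alert view command and reading the STATUS column), and the default timeout and interval are arbitrary assumptions.

```python
import time


def wait_for_status(fetch_status, wanted, timeout=300.0, interval=5.0,
                    sleep=time.sleep):
    """Poll fetch_status() until it returns `wanted`.

    Returns True when the wanted status is seen, or False if `timeout`
    seconds pass first. `sleep` is injectable so the loop can be tested
    without real delays.
    """
    deadline = time.monotonic() + timeout
    while True:
        if fetch_status() == wanted:
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval)


# Usage with a stand-in status source that reports Enabled on the third poll.
statuses = iter(["Deploying", "Deploying", "Enabled"])
print(wait_for_status(lambda: next(statuses), "Enabled", sleep=lambda _: None))
```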
Enable and disable alerts
In this section, we demonstrate how to perform basic management of alerts once they have been deployed.
-
Disable one of the alerts using the following command:
edgebuilder-cli alert update --alert CPUAlert-definition-node1 --disable
The output is similar to the following:
INFO: Processing 1 alerts...
-
Confirm that the alert has been disabled by running the following command and checking that the status is Disabled:
edgebuilder-cli alert view --alert CPUAlert-definition-node1
The output is similar to the following:
INFO: Finding alerts: ["CPUAlert-definition-node1"]
INFO: *** Alert View Results ***
+---------------------------+--------------------------------------+----------+-----------+-----------------------+--------+
| ALERT NAME                | ALERT ID                             | STATUS   | NODE NAME | ALERT DEFINITION NAME | LABELS |
+---------------------------+--------------------------------------+----------+-----------+-----------------------+--------+
| CPUAlert-definition-node1 | 34161118-2f45-49ec-822d-6fddb8b374c6 | Disabled | node1     | CPUAlert-definition   | Alert1 |
+---------------------------+--------------------------------------+----------+-----------+-----------------------+--------+
INFO: Viewing 1 result(s)
-
Re-enable the alert using the following command:
edgebuilder-cli alert update --alert CPUAlert-definition-node1 --enable
The output is similar to the following:
INFO: Processing 1 alerts...
-
Confirm that the alert has been re-enabled by running the following command and checking that the status is Enabled:
edgebuilder-cli alert view --alert CPUAlert-definition-node1
The output is similar to the following:
INFO: Finding alerts: [CPUAlert-definition-node1]
INFO: *** Alert View Results ***
+---------------------------+--------------------------------------+--------------------------------------+--------------------------------------+----------+
| NAME                      | ID                                   | NODE ID                              | ALERT DEFINITION ID                  | STATUS   |
+---------------------------+--------------------------------------+--------------------------------------+--------------------------------------+----------+
| CPUAlert-definition-node1 | 32e37a57-a499-4564-b047-0c8578a99309 | 1ba2970d-0a27-42e9-9e95-fed7ecf76e6c | 20ab28c3-7b11-48ee-ba3b-17997cbe1a5c | Enabled  |
+---------------------------+--------------------------------------+--------------------------------------+--------------------------------------+----------+
INFO: Viewing 1 result(s)
Viewing triggered alerts
In this section, we demonstrate how the output of triggered alerts is returned.
-
If a node has an active event present on it, it appears in the Warning state. You can view a node's status by running:
edgebuilder-cli node view -n node1
The output is similar to the following:
INFO: Finding nodes: ["node1"]
INFO: *** Node View Results ***
+-------+--------------------------------------+---------+------------+--------------+------------------------------------------+----------------+
| NAME  | ID                                   | STATUS  | DEBUG MODE | METRICS MODE | LABELS                                   | DESCRIPTION    |
+-------+--------------------------------------+---------+------------+--------------+------------------------------------------+----------------+
| node1 | c34f061f-3621-415f-af77-eb9278876e1a | Warning | Off        | Live         | Kernel:Linux, Virtual:VirtualBox, CPUArc | virtual node 1 |
|       |                                      |         |            |              | h:x86_64, GroupName:root, Language:en_US |                |
|       |                                      |         |            |              | , UserName:root, Hostname:node1, OSArch: |                |
|       |                                      |         |            |              | amd64, OSFinger:Ubuntu-20.04, ProductNam |                |
|       |                                      |         |            |              | e:VirtualBox, label2, CPUModel:Intel(R)C |                |
|       |                                      |         |            |              | ore(TM)i9-9900CPU@3.10GHz, label1, Manuf |                |
|       |                                      |         |            |              | acturer:innotekGmbH, OSFamily:Debian, Ti |                |
|       |                                      |         |            |              | mezone:UTC                               |                |
+-------+--------------------------------------+---------+------------+--------------+------------------------------------------+----------------+
INFO: Viewing 1 result(s)
-
To view active events present on the node, run the following command:
edgebuilder-cli event view -n node1
The output is similar to the following:
INFO: Finding events for nodes: ["node1"]
INFO: *** Event View Results ***
+--------------------------------------+----------+--------------------------------------+--------+-----------------------------------------+-------------------------------+
| ID                                   | TYPE     | NODE ID                              | ACTIVE | SUMMARY                                 | TIMESTAMP CREATE              |
+--------------------------------------+----------+--------------------------------------+--------+-----------------------------------------+-------------------------------+
| aa5d9834-0daf-4cda-8925-34d1029bb32b | Resource | c34f061f-3621-415f-af77-eb9278876e1a | true   | Alert Message: cpu is 98.9733059548264  | 2021-12-15 16:56:19 +0000 UTC |
| 28a4e4fd-20f3-4061-874d-0521f7b1158b | Resource | c34f061f-3621-415f-af77-eb9278876e1a | true   | Alert Message: cpu is 98.9733059548264  | 2021-12-15 16:56:19 +0000 UTC |
+--------------------------------------+----------+--------------------------------------+--------+-----------------------------------------+-------------------------------+
INFO: Viewing 3 result(s)
An event has been triggered, warning us that the monitored CPU value on node1 has crossed the threshold set in our CPU alert rule.
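The SUMMARY column carries the alert message we configured with the template `{{.Name}} is {{ index .Fields "value" }}`, so the measurement name and the value can be recovered from it. The following Python sketch does this with a regular expression. It assumes the "Alert Message: " prefix and the "<name> is <value>" shape seen in the sample output above; `parse_summary` is a hypothetical helper and would need adjusting if you use a different message template.

```python
import re

# Matches summaries such as "Alert Message: cpu is 98.9733059548264".
SUMMARY_PATTERN = re.compile(
    r"^Alert Message: (?P<name>\S+) is (?P<value>-?\d+(?:\.\d+)?)$"
)


def parse_summary(summary):
    """Return (name, value) from an event summary, or None if it does not match."""
    match = SUMMARY_PATTERN.match(summary.strip())
    if match is None:
        return None
    return match.group("name"), float(match.group("value"))


print(parse_summary("Alert Message: cpu is 98.9733059548264"))
```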
Adding and removing alert labels
In the above example we automatically added a label to an alert definition when adding it to Edge Builder. To do this manually, complete the following steps:
-
Remove the AlertGroup1 label from our alert definitions using the following command:
edgebuilder-cli alertDefinition labels -d CPUAlert-definition,MemAlert-definition --remove AlertGroup1
The output is similar to the following:
INFO: Labels updated successfully for alertDefinition "MemAlert-definition"
INFO: Labels updated successfully for alertDefinition "CPUAlert-definition"
-
Confirm that the label has been removed from the alert definition using the following command:
edgebuilder-cli alertDefinition inspect -d CPUAlert-definition
The output is similar to the following:
{
  "Alerts": [
    {
      "AlertDefinitionID": "c9c7ad5c-ea91-4d36-80b7-ed9b80b08490",
      "CreatedBy": "iotech",
      "ID": "34161118-2f45-49ec-822d-6fddb8b374c6",
      "Labels": null,
      "ModifiedBy": "iotech",
      "Name": "CPUAlert-definition-node1",
      "NodeID": "c34f061f-3621-415f-af77-eb9278876e1a",
      "Status": 5,
      "TimestampCreate": 1639587440,
      "TimestampModify": 1639587570
    }
  ],
  "CreatedBy": "iotech",
  "Description": "A CPU alert",
  "ID": "c9c7ad5c-ea91-4d36-80b7-ed9b80b08490",
  "ModifiedBy": "iotech",
  "Name": "CPUAlert-definition",
  "Tickscript": "var db = 'node1'\n\nvar rp = 'autogen'\n\nvar measurement = 'cpu'\n\nvar groupBy = []\n\nvar whereFilter = lambda: (\"cpu\" == 'cpu-total') AND isPresent(\"usage_idle\")\n\nvar name = 'CPU Alert'\n\nvar idVar = name\n\nvar message = '{{.Name}} is {{ index .Fields \"value\" }}'\n\nvar idTag = 'alertID'\n\nvar levelTag = 'level'\n\nvar messageField = 'message'\n\nvar durationField = 'duration'\n\nvar outputDB = 'chronograf'\n\nvar outputRP = 'autogen'\n\nvar outputMeasurement = 'alerts'\n\nvar triggerType = 'threshold'\n\nvar crit = 70\n\nvar data = stream\n |from()\n .database(db)\n .retentionPolicy(rp)\n .measurement(measurement)\n .groupBy(groupBy)\n .where(whereFilter)\n |eval(lambda: \"usage_idle\")\n .as('value')\n\nvar trigger = data\n |alert()\n .crit(lambda: \"value\" \u003e crit)\n .message(message)\n .id(idVar)\n .idTag(idTag)\n .levelTag(levelTag)\n .messageField(messageField)\n .durationField(durationField)\n .stateChangesOnly()\n\ntrigger\n |eval(lambda: float(\"value\"))\n .as('value')\n .keep()\n |influxDBOut()\n .create()\n .database(outputDB)\n .retentionPolicy(outputRP)\n .measurement(outputMeasurement)\n .tag('alertName', name)\n .tag('triggerType', triggerType)\n\ntrigger\n |httpOut('output')\n",
  "TimestampCreate": 1639587312,
  "TimestampModify": 1639587922
}
-
Add the labels to the alert definitions using the following command:
edgebuilder-cli alertDefinition labels -d CPUAlert-definition,MemAlert-definition --add AlertGroup1
The output is similar to the following:
INFO: Labels updated successfully for alertDefinition "MemAlert-definition"
INFO: Labels updated successfully for alertDefinition "CPUAlert-definition"
-
Confirm that the label has been successfully added using the following command:
edgebuilder-cli alertDefinition inspect -d CPUAlert-definition
The output is similar to the following:
{
  "Alerts": [
    {
      "AlertDefinitionID": "c9c7ad5c-ea91-4d36-80b7-ed9b80b08490",
      "CreatedBy": "iotech",
      "ID": "34161118-2f45-49ec-822d-6fddb8b374c6",
      "Labels": null,
      "ModifiedBy": "iotech",
      "Name": "CPUAlert-definition-node1",
      "NodeID": "c34f061f-3621-415f-af77-eb9278876e1a",
      "Status": 5,
      "TimestampCreate": 1639587440,
      "TimestampModify": 1639587570
    }
  ],
  "CreatedBy": "iotech",
  "Description": "A CPU alert",
  "ID": "c9c7ad5c-ea91-4d36-80b7-ed9b80b08490",
  "Labels": [
    "AlertGroup1"
  ],
  "ModifiedBy": "iotech",
  "Name": "CPUAlert-definition",
  "Tickscript": "var db = 'node1'\n\nvar rp = 'autogen'\n\nvar measurement = 'cpu'\n\nvar groupBy = []\n\nvar whereFilter = lambda: (\"cpu\" == 'cpu-total') AND isPresent(\"usage_idle\")\n\nvar name = 'CPU Alert'\n\nvar idVar = name\n\nvar message = '{{.Name}} is {{ index .Fields \"value\" }}'\n\nvar idTag = 'alertID'\n\nvar levelTag = 'level'\n\nvar messageField = 'message'\n\nvar durationField = 'duration'\n\nvar outputDB = 'chronograf'\n\nvar outputRP = 'autogen'\n\nvar outputMeasurement = 'alerts'\n\nvar triggerType = 'threshold'\n\nvar crit = 70\n\nvar data = stream\n |from()\n .database(db)\n .retentionPolicy(rp)\n .measurement(measurement)\n .groupBy(groupBy)\n .where(whereFilter)\n |eval(lambda: \"usage_idle\")\n .as('value')\n\nvar trigger = data\n |alert()\n .crit(lambda: \"value\" \u003e crit)\n .message(message)\n .id(idVar)\n .idTag(idTag)\n .levelTag(levelTag)\n .messageField(messageField)\n .durationField(durationField)\n .stateChangesOnly()\n\ntrigger\n |eval(lambda: float(\"value\"))\n .as('value')\n .keep()\n |influxDBOut()\n .create()\n .database(outputDB)\n .retentionPolicy(outputRP)\n .measurement(outputMeasurement)\n .tag('alertName', name)\n .tag('triggerType', triggerType)\n\ntrigger\n |httpOut('output')\n",
  "TimestampCreate": 1639587312,
  "TimestampModify": 1639588018
}
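In the sample inspect output, a definition without labels has no top-level "Labels" key at all, while deployed alerts report "Labels": null. A script that checks labels therefore has to treat a missing key, null and an empty list alike. The following Python sketch shows one way to do that; `definition_labels` is a hypothetical helper, and the three cases are assumptions drawn only from the sample output in this tutorial.

```python
import json


def definition_labels(inspect_json):
    """Return a definition's labels as a list, empty when none are set.

    Handles a missing "Labels" key, a null value and an empty list alike.
    """
    definition = json.loads(inspect_json)
    return definition.get("Labels") or []


# Shortened stand-ins for the inspect output before and after adding the label.
without_label = '{"Name": "CPUAlert-definition"}'
with_label = '{"Name": "CPUAlert-definition", "Labels": ["AlertGroup1"]}'

print("AlertGroup1" in definition_labels(without_label))
print("AlertGroup1" in definition_labels(with_label))
```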