Monitoring Server Status

Use these checks to monitor your servers in real time

Types of Monitoring

There are two ways to monitor your nodes:

  1. Onscreen Monitoring - requires logging in to the server using tools like gLiveView

  2. Remote Monitoring - view graphs, statistics and alerts remotely using tools like Grafana/Prometheus or Uptime Robot

There are advantages to both methods.

From an operational perspective, it is best to use remote monitoring for basic checks. If everything is working correctly, you will only need to log in to your nodes if you notice something unusual. There are a lot of tutorials on Grafana/Prometheus, so I will not cover the setup here.

Note that although initial monitoring can be done remotely, the advanced checks listed here require you to log in and run commands in a terminal.

Onscreen Monitoring with gLiveView

One of the easiest ways to monitor your node is via gLiveView. This is a tool created by Guild Operators that displays the status of your nodes visually. A link to the full documentation including installation steps can be found here.

Some of the things that you should look out for when you view gLiveView are:

  1. Node Type - this is found at the top of the screen and will indicate Relay for Relays or Core for Block Producer nodes. If your Block Producer is showing that it is a relay, your configuration is most likely wrong. Check your port settings and startup scripts.

  2. Port - this is the port number of your node. Make sure the port number in the "env" file matches the actual port number in your startup script.

  3. Status - will show "starting" if the node is not yet fully up. When the node is up, this label will change to "Tip (diff)".

  4. Processed TX - this shows how many transactions have been processed since the node was started. This number should steadily increase. On testnet, this may take some time.

  5. Mempool TX/Bytes - should show some transactions moving. On testnet, this may take some time.

  6. Peers Out/In (BP nodes) - both Out and In should equal the number of your own relays. For example, if you have 1 relay, Out and In should both be 1.

  7. Peers Out/In (Relay nodes) - These are variable numbers. Out is managed through your topology file. If your node has limited memory, try to keep this number below 15.
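The two Peers Out/In rules can be written down as a small check. The following is only a sketch: the function name `check_peers` and its arguments are made up for illustration, and the values are assumed to be read off the gLiveView screen by hand. The limit of 15 is the memory-related guideline given above, not a value enforced by the node.

```python
from __future__ import annotations

# Guideline from the list above for relays with limited memory.
RELAY_OUT_PEER_LIMIT = 15


def check_peers(node_type: str, peers_out: int, peers_in: int,
                own_relays: int = 0) -> list[str]:
    """Return warnings for the Peers Out/In values shown in gLiveView.

    node_type is "Core" (Block Producer) or "Relay", as shown at the top
    of the gLiveView screen. own_relays is only used for Core nodes.
    """
    warnings = []
    if node_type == "Core":
        # A Block Producer should only talk to your own relays.
        if peers_out != own_relays:
            warnings.append(f"Out is {peers_out}, expected {own_relays}")
        if peers_in != own_relays:
            warnings.append(f"In is {peers_in}, expected {own_relays}")
    elif node_type == "Relay":
        # In is variable on a relay, so only Out is checked here.
        if peers_out >= RELAY_OUT_PEER_LIMIT:
            warnings.append(
                f"Out is {peers_out}, consider keeping it below "
                f"{RELAY_OUT_PEER_LIMIT} on low-memory nodes"
            )
    else:
        raise ValueError(f"unknown node type: {node_type!r}")
    return warnings


# Example: a Block Producer with one relay but three outgoing peers.
print(check_peers("Core", peers_out=3, peers_in=1, own_relays=1))
```

An empty list means the peer counts match the rules above.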

If any of the above values are incorrect, please look back at your node and investigate further. Some things to look into are:

  • Startup scripts - check port number and other settings

  • env file - this is used by gLiveView. Check port number and other settings
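A port mismatch between the env file and the startup script can also be spotted with a short script. The sketch below makes two assumptions that you should verify against your own files: that the env file sets the port on a line of the form `CNODE_PORT=<number>`, and that the startup script passes a literal number to `--port`. If your script uses a shell variable instead (for example `--port ${PORT}`), the second function returns `None` and you need to compare by hand.

```python
import re
from typing import Optional


def port_from_env(env_text: str) -> Optional[int]:
    """Return the port from the first uncommented CNODE_PORT=... line.

    Assumes the variable is named CNODE_PORT; lines starting with '#'
    are ignored because the pattern must match from the line start.
    """
    match = re.search(r"^[ \t]*(?:export[ \t]+)?CNODE_PORT=[\"']?(\d+)",
                      env_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def port_from_startup_script(script_text: str) -> Optional[int]:
    """Return the literal number passed to --port, if there is one."""
    match = re.search(r"--port[ \t=]+[\"']?(\d+)", script_text)
    return int(match.group(1)) if match else None


def ports_match(env_text: str, script_text: str) -> bool:
    """True only when both ports were found and are equal."""
    env_port = port_from_env(env_text)
    script_port = port_from_startup_script(script_text)
    return env_port is not None and env_port == script_port


# Example with inline sample text instead of real files.
sample_env = "#CNODE_PORT=3000\nCNODE_PORT=6000\n"
sample_script = "cardano-node run --port 6000 --config config.json\n"
print(ports_match(sample_env, sample_script))
```

To run it against real files, read them with `open(path).read()` and pass the contents to `ports_match`.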

Remote Monitoring with Prometheus/Grafana

The ideal way of monitoring your nodes is remotely. Of course, if you are already working in front of the terminal (e.g. upgrading your nodes), it is better to use an onscreen tool like gLiveView. However, for all other times when you are away from the terminal, it is better to have remote monitoring set up.

For remote monitoring, the usual go-to tools are Prometheus and Grafana. These tools work hand in hand: Prometheus gathers the data and Grafana visualizes it in dashboards.

The following are useful resources for setting up Prometheus/Grafana:

  1. Tutorial on Prometheus/Grafana Setup by CoinCashew

  2. Setting Telegram Alerts on Grafana by LVLUP Pool

  3. Official Grafana website
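To get a feel for the data Prometheus gathers, it can help to look at the raw metrics text before building dashboards. Prometheus scrapes plain-text lines of the form `name value`, with `#` lines as comments. The sketch below parses that format in a simplified way (it assumes no trailing timestamps). The metric names in the sample and the URL passed to `fetch_metrics` are placeholders, not values taken from this guide; check your node configuration for the real endpoint.

```python
from __future__ import annotations

import urllib.request


def fetch_metrics(url: str, timeout: float = 5.0) -> str:
    """Download the raw metrics text from a Prometheus-style endpoint."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def parse_metrics(text: str) -> dict[str, float]:
    """Parse 'name value' lines, skipping comments and malformed lines.

    Simplification: the last space-separated field is taken as the value,
    so lines carrying an optional timestamp are not handled.
    """
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, value = line.rpartition(" ")
        try:
            metrics[name.strip()] = float(value)
        except ValueError:
            continue  # not a numeric sample, ignore it
    return metrics


# Example with made-up metric names.
sample = """\
# HELP example_node_slot_number Current slot (placeholder name)
# TYPE example_node_slot_number gauge
example_node_slot_number 12345
example_node_peers{direction="out"} 8
"""
print(parse_metrics(sample))
```

On the server itself, something like `parse_metrics(fetch_metrics(url))` with your node's metrics URL would give the same kind of dictionary.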

Remote Monitoring of Uptime

If you are only interested in being notified when your nodes are down or inaccessible, a simpler solution is to use a third-party tool called Uptime Robot. To use this tool, just do the following:

  1. Go to https://uptimerobot.com and create an account

  2. Log in and create monitors for all of your nodes. You can create monitors for the port numbers that you use and/or create monitors by IP address

  3. Set the polling frequency. You may poll as often as every 5 minutes on their free account

  4. You may also install their app on your mobile phone for easier access and to get notifications while you are on the go.

  5. You will get email notifications whenever there is a change of status, for example when your node goes down or, if it was already down, when it comes back up.

Optional Tip: If you want to monitor the status of the external relays that you are connecting to, you can create entries for each of them. Setting this up is time consuming and may be overkill, but it lets you know immediately if an external relay is having issues.

The above tool is simple and easy to use, especially if you want to avoid the complexity and/or additional overhead of Grafana/Prometheus. An added bonus is that it is an external tool, so unlike Grafana/Prometheus it does not add any workload to your nodes.
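For a quick manual test of what a port monitor sees, the idea can be sketched in a few lines: try to open a TCP connection to the node's port, and only report when the result differs from the previous check. This is a rough illustration and not how Uptime Robot is implemented; the function names are invented for this example.

```python
import socket
from typing import Optional


def port_is_open(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def status_change(previous: Optional[bool], current: bool) -> Optional[str]:
    """Return "up" or "down" when the status changed, otherwise None.

    previous is None on the very first check, when there is nothing to
    compare against yet.
    """
    if previous is None or previous == current:
        return None
    return "up" if current else "down"
```

Run this from a machine outside your node's network, so that it tests the same path an external monitor would use. A call such as `port_is_open("your.relay.address", 6000)` (placeholder host and port) returns `False` when the relay cannot be reached, and feeding consecutive results into `status_change` yields a message only on a transition.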

Remote Monitoring Using Telegram

While you can create your own monitors using various tools and send the notifications to social media, there are existing tools that already do that. Some pool explorers have built bots that send notifications via Telegram whenever there are changes to the pool. Some commonly notified changes include:

  • Pool parameter changes (e.g. pledge, margin, fixed cost, etc)

  • Block production (e.g. whenever a new block is produced)

  • Delegation levels (e.g. when stake is added or removed)

  • Epoch summary (e.g. pool statistics for the current and past 2 epochs)

Two pool explorers that offer these services are:

  1. Pooltool - connect to @PoolToolBot in Telegram and enter the name of the pool you wish to monitor

  2. Adapools - you need to create an account and log in to adapools.org, then enable the Telegram bot. The bot’s name in Telegram is @AdaPoolsOrg_bot

I suggest you look into these tools before building a new one yourself. You might just save yourself a lot of time, which you can devote to other pool activities like maintaining your nodes or marketing your pool.
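If you do decide to build your own notifier, the core of a "pool parameter changes" alert is just a comparison between two snapshots of the pool's parameters. The sketch below shows that comparison only. The field names (`pledge`, `margin`, `fixed_cost`) and the sample values are illustrative, and fetching the snapshots and delivering the message to Telegram are left out.

```python
from __future__ import annotations


def describe_changes(old: dict, new: dict) -> list[str]:
    """Return one human-readable line per parameter that changed."""
    messages = []
    for key in sorted(set(old) | set(new)):
        before, after = old.get(key), new.get(key)
        if before != after:
            messages.append(f"{key}: {before} -> {after}")
    return messages


# Example snapshots with made-up values.
previous = {"pledge": 1000, "margin": 0.02, "fixed_cost": 340}
current = {"pledge": 1000, "margin": 0.03, "fixed_cost": 340}
print(describe_changes(previous, current))
```

An empty list means nothing changed, so no notification needs to be sent.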
