Part 8 - Node Operator's Configuration Guide for Dummies - Monitoring & Maintenance




Watching your Validators' Performance

The next thing you'll want to track is the attestation and block proposal performance of your validators. Block explorers like Beaconcha.in watch your validator's attestation performance and income.


Navigate to that site, and enter the public key for your validator in the search box at the top of the screen.

TIP If you forgot your validator's public key, you can easily retrieve it with the command rocketpool minipool status.

If everything is set correctly, you should see something like this:


This is a record of all the Beacon Chain activity for your validator. You can use it to check your validator's balance on the Beacon Chain to watch it grow over time and calculate your APY.


You can also use it to quickly gauge if your validator is alive and running correctly. If it is, all the attestations should say Attested for their Status, and ideally all the Opt. Incl. Dist. should be 0 (though an occasional 1 or 2 is fine).


If numerous attestations say Missed, then your validator is not working properly. You should check the logs of the eth1, eth2, and validator services with rocketpool service logs ...


Pin this tab or bookmark it, so you can quickly jump to it and check the status of your validator.

TIP Beaconcha.in has an iOS / Android app that you can download to provide this same information in a convenient, phone-friendly form. You can also create an account and register for email notifications, which alert you when your validator goes offline and you need to act.

Setting up Grafana

Assuming you selected yes to install the metrics dashboard, we will move on to setting up Grafana. If you selected no, you will need to refer to the official documentation, "Enabling the Metrics Server".


Now that the metrics server is ready, you can access it with any browser on your local network.



Navigate to the following URL, substituting the variables with your setup as necessary:

  http://<your node IP>:<grafana port> 

For example, if your node's IP was 192.168.1.5, and you used the default Grafana port of 3100, then you would go to this URL in your browser:

 http://192.168.1.5:3100 
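If you want to confirm the address before opening a browser, the URL can be assembled and checked from a terminal. This is a minimal sketch: grafana_url is a hypothetical helper name, the IP and port are the example values above, and the commented-out curl line assumes curl is installed on the machine you run it from.

```shell
# Build the dashboard URL from the node's IP address and the Grafana port.
# grafana_url is a hypothetical helper, not part of the Smartnode CLI.
grafana_url() {
  printf 'http://%s:%s\n' "$1" "$2"
}

# Example values from this guide; substitute your own.
url="$(grafana_url 192.168.1.5 3100)"
echo "$url"

# Grafana serves a small health document at /api/health. From a machine on
# the same network, uncomment the next line to check that the server answers:
# curl -s "$url/api/health"
```

Once the address responds, open it in your browser.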

You will see a login screen like this:


The default Grafana credentials are:

 Username: admin 
 Password: admin 

You will then be prompted to change the default password for the admin account. Pick something strong and don't forget it!


Tip If you lose the admin password, you can reset it using the following command on your node:

docker exec -it rocketpool_grafana grafana-cli admin reset-admin-password admin      

You will be able to log into Grafana using the default admin credentials once again, and then you will be prompted to change the password for the admin account.


Thanks to community member tedsteen's work, Grafana will automatically connect to your Prometheus instance, so it has access to the metrics that it collects. All you need to do is grab the dashboard!


Importing the Rocket Pool Dashboard

Now that you have Grafana attached to Prometheus, you can import the standard dashboard (or build your own using the metrics that it provides, if you are familiar with that process).

Start by going to the Create menu (the plus icon in the left sidebar) and click on Import:


When prompted for the URL, select the option from the list below based on which ETH2 client you are using:

Tip If you don't remember which client you have, you can check quickly by running rocketpool service version on your node.

Enter one of the URLs above into the Import via grafana.com box and press the Load button. You will be prompted with some information about the dashboard here, such as its name and where you'd like to store it (the default General folder is fine unless you use a lot of dashboards and want to organize them).


Under the Prometheus drop-down at the bottom, you should only have a single option labeled Prometheus (default). Select this option.


Your screen should look like this (using Lighthouse as an example):


If yours matches, click the Import button, and you will be immediately taken to your new dashboard.


At first glance, you should see lots of information about your node and your validators. Each box comes with a handy tooltip on the top left corner (the i icon) that you can hover over to learn more about it. For example, here is the tooltip for the Your Validator Share box:


However, we aren't done setting things up yet - there is still a little more configuration to do.

NOTE Some boxes (notably the APR ones) calculate their values by comparing today's value with yesterday's value. Until the metrics server has been running for more than a day, these will say N/A or No data.
This is normal! Just wait a day until it has enough data to calculate its values correctly.

Tailoring the Hardware Monitor to your System

Now that the dashboard is up, you might notice that a few boxes are empty, such as SSD Latency and Network Usage. We have to tailor the dashboard to your specific hardware, so it knows how to capture these things.


CPU Temp

To update your CPU temperature gauge, click the title of the CPU Temp box and select Edit from the dropdown. Your screen will now look something like this:


This is Grafana's edit mode, where you can change what is displayed and how it looks. We're interested in the query box highlighted in red, to the right of the Metrics browser button.

By default, that box has this in it:

 node_hwmon_temp_celsius{job="node", chip="", sensor=""} 

There are two fields in this text that are currently blank: chip and sensor. These are unique to each machine, so you'll have to fill them in based on what your machine provides.


To achieve this, follow these steps:

  1. Remove the , sensor="" portion, so it ends with chip=""}. For clarity, the whole thing should now be node_hwmon_temp_celsius{job="node", chip=""}.

  2. Put your cursor in-between the quote marks of chip="" and press Ctrl+Spacebar. This will bring up an auto-complete box with the available options, which looks like this:


  3. Select the option that corresponds to your system's CPU.

  4. Once that's selected, add , sensor="" back into the string. Place your cursor in-between the quote marks of sensor="" and press Ctrl+Spacebar to get another auto-complete menu. Select the sensor you want to monitor.

Tip If you don't know which chip or sensor is correct, you'll have to try each one until you find the one that looks right. To help with this, install the lm-sensors package (for example, sudo apt install lm-sensors on Debian / Ubuntu) and run the sensors -u command to list the sensors your computer has. You can then try to match a chip ID from Grafana's list with that output based on their names and IDs.
For example, this is one of the outputs of our sensors -u command:
  k10temp-pci-00c3 
  Tctl:   
   temp1_input: 33.500 
  Tdie:   
   temp2_input: 33.500 
In our case, the corresponding chip in Grafana is pci0000:00_0000:00:18_3 and the corresponding sensor is temp1.
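To show what the finished query looks like, here is a small sketch that fills the two labels into the default query text. cpu_temp_query is a hypothetical helper used only for illustration, and the chip and sensor values are the ones from our example machine; yours will differ.

```shell
# Fill the chip and sensor labels into the dashboard's default CPU Temp query.
# cpu_temp_query is a hypothetical helper for illustration only.
cpu_temp_query() {
  printf 'node_hwmon_temp_celsius{job="node", chip="%s", sensor="%s"}\n' "$1" "$2"
}

# Values from the example machine above.
cpu_temp_query 'pci0000:00_0000:00:18_3' 'temp1'
```

The printed line is the text that should end up in the query box once both auto-complete selections are made.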

Once you're happy with your selections, click the blue Apply button in the top-right corner of the screen to save the settings.

NOTE Not all systems expose CPU temperature info - notably virtual machines or cloud-based systems. If yours doesn't have anything in the auto-complete field for chip, this is probably the case, and you won't be able to monitor your CPU temperature.

SSD Latency

The SSD Latency chart tracks how long read/write operations take. This is helpful in gauging how fast your SSD is, so you can tell whether it is the bottleneck when your validator suffers from poor performance. To change which SSD the chart tracks, click on the SSD Latency title and select Edit.


This chart has two query fields (two textboxes) with four device="" portions in total. You'll need to update all four of these fields with the device you intend to track.


Simply place your cursor in-between the quote marks and press Ctrl+Spacebar to get Grafana's auto-complete list, and select the correct option from there for each of the device="" portions. You want to start from the leftmost empty setting first, or the auto-complete list may not appear.

Tip If you don't know which device to track, run the following command:
  lsblk 

This will output a tree showing your device and partition list, for example:


NAME        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT 
... 
loop25        7:25   0   132K  1 loop /snap/gtk2-common-themes/9 
loop26        7:26   0  65,1M  1 loop /snap/gtk-common-themes/1515 
nvme0n1     259:0    0 238,5G  0 disk  
├─nvme0n1p1 259:1    0   512M  0 part /boot/efi 
├─nvme0n1p2 259:2    0 150,1G  0 part / 
├─nvme0n1p3 259:3    0  87,4G  0 part  
└─nvme0n1p4 259:4    0   527M  0 part 
If you didn't change Docker's default location to a different drive during your Smartnode installation, then the disk you want to track will be the one that your Operating System is installed on. Look in the MOUNTPOINT column for an entry simply labeled /, then follow that back up to its parent device (the one with disk in the TYPE column).
Typically, this will be sda for SATA drives or nvme0n1 for NVMe drives.
If you did change Docker's default location to a different drive, or if you're running a hybrid / native setup, you should be able to use the same technique of "following the mount point" to determine which device your chain data resides on.
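The "follow the mount point" step can also be done with a short filter. This is a sketch under assumptions: parent_disk_of is a hypothetical helper, it expects rows in the order NAME PKNAME MOUNTPOINT (the columns lsblk prints with -rno NAME,PKNAME,MOUNTPOINT), and it assumes the mount point sits on a plain partition rather than on LVM or an encrypted volume, where the parent would not be the physical disk.

```shell
# Print the parent device of whatever is mounted at the given mount point.
# Reads "NAME PKNAME MOUNTPOINT" rows on stdin; parent_disk_of is hypothetical.
parent_disk_of() {
  awk -v mp="$1" '$3 == mp { print $2 }'
}

# Sample rows modelled on the lsblk tree above.
printf '%s\n' \
  'nvme0n1' \
  'nvme0n1p1 nvme0n1 /boot/efi' \
  'nvme0n1p2 nvme0n1 /' \
  'nvme0n1p3 nvme0n1' \
  | parent_disk_of /

# On a live system (uncomment to run):
# lsblk -rno NAME,PKNAME,MOUNTPOINT | parent_disk_of /
```

For a second drive, pass its mount point instead of /.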

Once you're happy with your selections, click the blue Apply button in the top-right corner of the screen to save the settings.


Network Usage

This chart tracks how much data you're sending and receiving over a particular network connection. As you might expect, the dashboard needs to know which network you want it to track.


To change it, click on the Network Usage title and select Edit.


This chart has two query fields with two device="" portions in total. You'll need to update these with the network you want to track.


Place your cursor in-between the quote marks and press Ctrl+Spacebar to get Grafana's auto-complete list, and select the correct option from there for each of the device="" portions.

Tip If you don't know which device to track, run the following command:
 route
The output will look something like this:
Kernel IP routing table 
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface 
default         192.168.1.1     0.0.0.0         UG    100    0        0 eth0 
192.168.1.0     0.0.0.0         255.255.255.0   U     0      0        0 eth0 
192.168.1.1     0.0.0.0         255.255.255.255 UH    100    0        0 eth0 
Look at the Destination column for the row with the value of default. Follow that row all the way to the Iface column. The device listed there is the one you want to use - in this example, eth0.
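The same lookup can be scripted. A minimal sketch, assuming the route output has the layout shown above, where the default row starts with the word default and the interface is the last column; default_iface is a hypothetical helper name.

```shell
# Print the interface (last column) of the first row whose Destination is "default".
# default_iface is a hypothetical helper that reads `route` output on stdin.
default_iface() {
  awk '$1 == "default" { print $NF; exit }'
}

# Sample taken from the output above.
default_iface <<'EOF'
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
default         192.168.1.1     0.0.0.0         UG    100    0        0 eth0
192.168.1.0     0.0.0.0         255.255.255.0   U     0      0        0 eth0
EOF

# On a live system (uncomment to run):
# route | default_iface
```

If route is run with numeric output, the default row shows 0.0.0.0 instead of the word default, so this filter would need adjusting.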

Once you're happy with your selections, click the blue Apply button in the top-right corner of the screen to save the settings.


Total Net I/O

This tracks the total amount of data you've sent and received. You might find it useful to watch if, for example, your ISP limits you to a certain amount of data per month.


The setup is identical to the Network Usage box above, so simply follow those instructions for this box too.


OS Disk Space Used

This keeps tabs on how full your Operating System disk is getting, so you know when it's time to clean up (and, if your chain data lives on the same drive, when it's time to prune Geth, which is covered further down in this guide).


The steps are the same as the SSD Latency box above, so simply follow those instructions for this box too. As a reminder, you want the drive that houses the partition which has / in the MOUNTPOINT column for this one because that will be your Operating System drive.


Disk 2 Space Used

This is an optional field that tracks how full a second disk on your system is. It is aimed at people who keep their Operating System and chain data on separate drives, such as Raspberry Pi users.


Setting it up is the same as the SSD Latency box above, but instead of looking at which partition has / in the MOUNTPOINT column, you want to look for the one that has whatever your 2nd drive's mount point is. Use the disk associated with that partition.


Grafana SMTP Settings for Sending Emails

To send emails from Grafana, e.g. for alerts or to invite other users, SMTP settings need to be configured in the Rocket Pool Metrics Stack. See the Grafana SMTP configuration page for reference.


Open ~/.rocketpool/docker-compose-metrics.yml in a text editor. Add the GF_SMTP_<KEYNAME> environment variables shown below to the environment section of the grafana service:

version: "3.4"
services:
...
  grafana:
    image: grafana/grafana:8.1.1
    container_name: ${COMPOSE_PROJECT_NAME}_grafana
    restart: unless-stopped
    environment:
      - GF_SERVER_HTTP_PORT=${GRAFANA_PORT:-3100}
      ## SMTP settings start
      - GF_SMTP_ENABLED=true
      - GF_SMTP_HOST=mail.example.com:<port>
      - GF_SMTP_USER=<SMTP username>
      - GF_SMTP_PASSWORD=password
      - GF_SMTP_FROM_ADDRESS=<sender email address>
      # Left unquoted: in this list form, quote marks would become part of the value
      - GF_SMTP_FROM_NAME=Rocketpool Grafana Admin
      ## SMTP server settings end
    ports: 
      - "${GRAFANA_PORT:-3100}:${GRAFANA_PORT:-3100}/tcp"
    volumes:
      - "./grafana-prometheus-datasource.yml:/etc/grafana/provisioning/datasources/prometheus.yml"
      - "grafana-storage:/var/lib/grafana"
    networks:
      - net
...
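Before testing from the Grafana UI, you may want to confirm that the variables actually reached the container. The sketch below is an assumption-laden illustration: smtp_env_masked is a hypothetical helper, the container name rocketpool_grafana is the one used by the password reset command earlier in this guide, and the check only makes sense once the Grafana container has been restarted with the edited file.

```shell
# Keep only the GF_SMTP_* variables and hide the password value.
# smtp_env_masked is a hypothetical helper that reads KEY=VALUE lines on stdin.
smtp_env_masked() {
  grep '^GF_SMTP_' | sed 's/^\(GF_SMTP_PASSWORD=\).*/\1****/'
}

# Sample input resembling a container environment.
printf '%s\n' \
  'GF_SERVER_HTTP_PORT=3100' \
  'GF_SMTP_ENABLED=true' \
  'GF_SMTP_PASSWORD=password' \
  | smtp_env_masked

# On your node (uncomment to run):
# docker exec rocketpool_grafana env | smtp_env_masked
```

If no GF_SMTP_ lines appear, the container is still running with the old settings.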

To test the SMTP settings, go to the Alerting menu and click Notification channels.


Click Add channel and select Email as the type. Enter an email address in the Addresses section and click Test.


Check to see that the test email was received.
