Register and monitor external services with Consul ESM
Consul's service discovery features are a versatile solution for monitoring application health, tracking services and nodes, and keeping a current catalog of healthy service instances. When you register your services to Consul or monitor your application health with distributed checks, a Consul client agent runs on the same node as your service. This client agent performs service and node health checks on behalf of the Consul servers and provides a local interface to the Consul DNS service.
Consul binary releases support a wide range of platforms and architectures, allowing many kinds of services and nodes to join a datacenter. However, there are situations where it may not be possible to install a Consul agent on the node that hosts a service. In these situations, the node must run outside of the Consul datacenter. Examples of these external services include:
- third-party SaaS services, such as Amazon RDS or Azure Database for PostgreSQL
- legacy services that do not permit installing third-party software on the node
- services managed by other teams, where you do not have access to the underlying node
In this tutorial, you will learn how to register an external service to your Consul datacenter and make it discoverable using the Consul DNS interface. You will accomplish these tasks using Consul External Service Monitor (ESM), a tool that helps run health checks and update the status of those health checks in the catalog.
Tutorial scenario
This tutorial uses HashiCups, a demo coffee shop application made up of several microservices running on VMs.
Your Consul datacenter will interact with four HashiCups services:
- NGINX runs on a node named hashicups-nginx-0
- Frontend runs on a node named hashicups-frontend-0
- API runs on a node named hashicups-api-0
- Database runs on two nodes named hashicups-db-0 and hashicups-db-1
At the start of the tutorial, the NGINX, Frontend, and API services are already registered in the Consul catalog and are discoverable by their downstream services with Consul DNS. The two instances of the database service are not registered in the Consul catalog because they run on external nodes that cannot join the datacenter.
As a result, the API service is configured to communicate with one of the database instances using its IP address.
The node consul-esm-0 is also registered with the Consul catalog, but starts with no services running.
The initial scenario in this tutorial includes a Consul datacenter with HashiCups deployed. There are two instances of the Database service running on nodes that are not registered with Consul.
Because HashiCups requires the API service to communicate with the database service, the API service uses the IP address of one of the database nodes. Without the Consul agent, it is not possible to load balance between two different nodes.
Prerequisites
This tutorial assumes you are already familiar with Consul and its core functionalities. If you are new to Consul, refer to the Consul Getting Started tutorials collection.
If you want to follow along with this tutorial and you do not already have the required infrastructure in place, the following steps guide you through the process to deploy a demo application and a configured Consul service mesh on AWS automatically using Terraform.
To create a Consul deployment on AWS using Terraform, you need the following:
Clone GitHub repository
Clone the GitHub repository containing the configuration files and resources.
$ git clone https://github.com/hashicorp-education/learn-consul-external-services-vms
Enter the directory that contains the configuration files for this tutorial.
$ cd learn-consul-external-services-vms/self-managed/infrastructure/aws
Create infrastructure
With these Terraform configuration files, you are ready to deploy your infrastructure.
Issue the terraform init command from your working directory to download the necessary providers and initialize the backend.
$ terraform init
Initializing the backend...
Initializing provider plugins...
...
Terraform has been successfully initialized!
...
Then, deploy the resources. Confirm the run by entering yes.
$ terraform apply -var-file=../../ops/conf/monitor_external_services_with_consul_esm.tfvars
## ...
Do you want to perform these actions?
Terraform will perform the actions described above.
Only 'yes' will be accepted to approve.
Enter a value: yes
## ...
Apply complete! Resources: 50 added, 0 changed, 0 destroyed.
Tip
The Terraform deployment could take up to 15 minutes to complete. Feel free to explore the next sections of this tutorial while waiting for the environment to complete initialization.
After the deployment is complete, Terraform returns a list of outputs you can use to interact with the newly created environment.
Outputs:
connection_string = "ssh -i certs/id_rsa.pem admin@`terraform output -raw ip_bastion`"
ip_bastion = "<redacted-output>"
remote_ops = "export BASTION_HOST=<redacted-output>"
retry_join = "provider=aws tag_key=ConsulJoinTag tag_value=auto-join-hcoc"
ui_consul = "https://<redacted-output>:8443"
ui_grafana = "http://<redacted-output>:3000/d/hashicups/hashicups"
ui_hashicups = "http://<redacted-output>"
The Terraform outputs provide useful information, including the bastion host IP address. The following is a brief description of the Terraform outputs:
- The ip_bastion output provides the IP address of the bastion host you use to run the rest of the commands in this tutorial.
- The remote_ops output lists the bastion host IP, which you can use to access the bastion host.
- The retry_join output lists Consul's retry_join configuration parameter.
- The ui_consul output lists the Consul UI address. The Consul UI is not currently running. You will use the Consul UI in this tutorial to verify services registered in your catalog.
- The ui_grafana output lists the Grafana UI address. You will not use this address in this tutorial.
- The ui_hashicups output lists the HashiCups UI address. You can open this address in a web browser to verify the HashiCups demo application is running properly.
Check initial state in Consul UI
Retrieve the Consul UI address from Terraform.
$ terraform output -raw ui_consul
Open the address in a browser.
After accepting the certificate presented by the Consul server, you will land on the Services page.
Notice that the Services page only shows the Consul service and three services for the HashiCups application. The Database service is not registered in the Consul catalog.
Click on the Nodes tab.
The Database nodes do not appear. The consul-esm-0 node appears because a properly configured Consul client agent is already running on the node, even though there is no service running.
Log in to the bastion host VM
Log in to the bastion host using ssh.
$ ssh -i certs/id_rsa.pem admin@`terraform output -raw ip_bastion`
#...
admin@bastion:~$
Configure CLI to interact with Consul
Configure your bastion host to communicate with your Consul environment using the two dynamically generated environment variable files.
$ source "/home/admin/assets/scenario/env-scenario.env" && \
source "/home/admin/assets/scenario/env-consul.env"
After loading the needed variables, verify you can connect to your Consul datacenter.
$ consul members
Node Address Status Type Build Protocol DC Partition Segment
consul-server-0 172.21.0.7:8301 alive server 1.18.1 2 dc1 default <all>
consul-esm-0 172.21.0.6:8301 alive client 1.18.1 2 dc1 default <default>
hashicups-api-0 172.21.0.2:8301 alive client 1.18.1 2 dc1 default <default>
hashicups-frontend-0 172.21.0.4:8301 alive client 1.18.1 2 dc1 default <default>
hashicups-nginx-0 172.21.0.9:8301 alive client 1.18.1 2 dc1 default <default>
Verify downstream service configuration
The hashicups-db service instances are not part of the Consul catalog. This means the upstream services that need to connect to them must use their IP address. Verify the configuration for the hashicups-api service instance.
Log in to hashicups-api-0 from the bastion host.
$ ssh -i certs/id_rsa hashicups-api-0
#..
admin@hashicups-api-0:~
Verify the hashicups-api application configuration.
$ cat ~/conf.json
{
"db_connection": "host=172.21.0.8 port=5432 user=hashicups password=hashicups_pwd dbname=products sslmode=disable",
"bind_address": ":9090",
"metrics_address": ":9103"
}
The host value for the db_connection is set to the IP address of one of the hashicups-db instances. As a result, the HashiCups application is unable to scale by using multiple instances of the Database service, and it is prone to errors if the IP address of the instance changes.
To continue with the tutorial, exit the SSH session to return to the bastion host.
$ exit
logout
Connection to hashicups-api-0 closed.
admin@bastion:~$
Register external services
In the context of Consul, external services run on nodes where you cannot run a local Consul agent. These nodes might be inside your infrastructure but not directly configurable, such as with a mainframe, virtual appliance, or an unsupported platform. These nodes can also be outside of your infrastructure, such as with a SaaS platform. When registering an external service, it is also necessary to register the node it runs on.
For this reason, the configuration for an external service registered directly with the catalog is slightly different than the one for an internal service registered by an agent. Because this scenario contains two hashicups-db nodes, you will create two configuration files for the two different nodes and services.
Create the service configuration files
Create the folder that will contain the external service definition.
$ mkdir -p ~/assets/scenario/conf/external-services
Retrieve the IP address for the hashicups-db-0 node.
$ export IP_HASHICUPS_DB_0=`getent hosts hashicups-db-0 | awk '{print $1}'`
Then, create the configuration for hashicups-db-0 that includes the networking information, service definition, and health check to run.
$ tee ~/assets/scenario/conf/external-services/hashicups-db-0.json > /dev/null << EOF
{
"Datacenter": "$CONSUL_DATACENTER",
"Node": "hashicups-db-0-ext",
"ID": "`cat /proc/sys/kernel/random/uuid`",
"Address": "${IP_HASHICUPS_DB_0}",
"NodeMeta": {
"external-node": "true",
"external-probe": "true"
},
"Service": {
"ID": "hashicups-db-0",
"Service": "hashicups-db",
"Tags": [
"external",
"inst_0"
],
"Address": "${IP_HASHICUPS_DB_0}",
"Port": 5432
},
"Checks": [{
"CheckID": "hashicups-db-0-check",
"Name": "hashicups-db check",
"Status": "passing",
"ServiceID": "hashicups-db-0",
"Definition": {
"TCP": "${IP_HASHICUPS_DB_0}:5432",
"Interval": "5s",
"Timeout": "1s",
"DeregisterCriticalServiceAfter": "60m"
}
}]
}
EOF
Complete the same process for hashicups-db-1. Retrieve the IP address of the node.
$ export IP_HASHICUPS_DB_1=`getent hosts hashicups-db-1 | awk '{print $1}'`
Finally, create the configuration for the hashicups-db-1 node with its service definition and health checks.
$ tee ~/assets/scenario/conf/external-services/hashicups-db-1.json > /dev/null << EOF
{
"Datacenter": "$CONSUL_DATACENTER",
"Node": "hashicups-db-1-ext",
"ID": "`cat /proc/sys/kernel/random/uuid`",
"Address": "${IP_HASHICUPS_DB_1}",
"NodeMeta": {
"external-node": "true",
"external-probe": "true"
},
"Service": {
"ID": "hashicups-db-1",
"Service": "hashicups-db",
"Tags": [
"external",
"inst_1"
],
"Address": "${IP_HASHICUPS_DB_1}",
"Port": 5432
},
"Checks": [{
"CheckID": "hashicups-db-1-check",
"Name": "hashicups-db check",
"Status": "passing",
"ServiceID": "hashicups-db-1",
"Definition": {
"TCP": "${IP_HASHICUPS_DB_1}:5432",
"Interval": "5s",
"Timeout": "1s",
"DeregisterCriticalServiceAfter": "60m"
}
}]
}
EOF
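The two payloads differ only in the instance index and the node IP address. If you script the registration of many external nodes, a small helper can render the payloads consistently. The following is a sketch: the make_payload function name is illustrative, and it omits the optional node ID UUID for brevity.

```shell
# Sketch of a helper that renders an external-service registration payload
# for one hashicups-db node. The function name is illustrative; the optional
# node "ID" UUID field is omitted for brevity.
make_payload() {
  local idx=$1 ip=$2 dc=${3:-dc1}
  cat << EOF
{
  "Datacenter": "$dc",
  "Node": "hashicups-db-$idx-ext",
  "Address": "$ip",
  "NodeMeta": { "external-node": "true", "external-probe": "true" },
  "Service": {
    "ID": "hashicups-db-$idx",
    "Service": "hashicups-db",
    "Tags": [ "external", "inst_$idx" ],
    "Address": "$ip",
    "Port": 5432
  },
  "Checks": [ {
    "CheckID": "hashicups-db-$idx-check",
    "Name": "hashicups-db check",
    "Status": "passing",
    "ServiceID": "hashicups-db-$idx",
    "Definition": {
      "TCP": "$ip:5432",
      "Interval": "5s",
      "Timeout": "1s",
      "DeregisterCriticalServiceAfter": "60m"
    }
  } ]
}
EOF
}

# Example: render the payload for instance 0.
make_payload 0 "$IP_HASHICUPS_DB_0"
```

You can redirect the output to a file, as the tee commands above do, or pipe it directly to the registration endpoint.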
Observe the following details in this configuration:
- Because you are defining a node that does not have a local Consul agent, you must include the Datacenter value so that Consul knows where to register the node and service.
- The node is identified using the Node, ID, and Address parameters.
- The NodeMeta parameter is necessary for consul-esm to identify the external nodes that will require periodic health checks.
  - For Consul ESM to detect external nodes and health checks, set "external-node": "true" in the node metadata before you register it.
  - The external-probe field determines if Consul ESM performs regular pings to the node to update its health status.
- The Status field inside the Checks section is set to passing. When registering a service, Consul sets the initial status to critical and then updates the status after it performs its first health check. In the case of external services, since no health check is performed until consul-esm is running, the service will remain in the critical state until you start the consul-esm process. For the scope of this tutorial, you will register the service with an initial state of passing so that you can avoid downtime in the application. However, we recommend that you register services with a critical health status in production environments to ensure that service instances are healthy before Consul can route traffic to them.
For a full list of service configuration parameters refer to Services configuration reference and Health check configuration reference.
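As a quick sanity check before registering, you can assert that a payload carries the fields described above. The following sketch uses python3 (assumed to be available) against an inline, trimmed example payload; in practice, redirect one of the files you created under ~/assets/scenario/conf/external-services/ instead.

```shell
# Sanity-check that a registration payload carries the fields Consul ESM
# relies on. The inline JSON is a trimmed example; in practice, pipe in
# one of the real files created above instead.
payload='{
  "Datacenter": "dc1",
  "Node": "hashicups-db-0-ext",
  "Address": "10.0.4.159",
  "NodeMeta": { "external-node": "true", "external-probe": "true" },
  "Service": { "ID": "hashicups-db-0", "Service": "hashicups-db", "Port": 5432 },
  "Checks": [ { "CheckID": "hashicups-db-0-check", "Status": "passing" } ]
}'

echo "$payload" | python3 -c '
import json, sys
doc = json.load(sys.stdin)
# Consul ESM only picks up nodes whose metadata marks them as external.
assert doc["NodeMeta"]["external-node"] == "true"
# The catalog needs to know where to place the node.
assert doc["Datacenter"] and doc["Node"] and doc["Address"]
# Every check should declare an initial status.
assert all(c.get("Status") for c in doc["Checks"])
print("payload OK")
'
```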
Register the external services in the Consul catalog
Register the nodes in the Consul catalog using the /v1/catalog/register endpoint. First, register hashicups-db-0.
$ curl --silent \
--header "X-Consul-Token: $CONSUL_HTTP_TOKEN" \
--connect-to server.${CONSUL_DATACENTER}.${CONSUL_DOMAIN}:8443:consul-server-0:8443 \
--cacert ${CONSUL_CACERT} \
--data @/home/admin/assets/scenario/conf/external-services/hashicups-db-0.json \
--request PUT \
https://server.${CONSUL_DATACENTER}.${CONSUL_DOMAIN}:8443/v1/catalog/register
If the registration is performed correctly, the command outputs true.
Then, register hashicups-db-1.
$ curl --silent \
--header "X-Consul-Token: $CONSUL_HTTP_TOKEN" \
--connect-to server.${CONSUL_DATACENTER}.${CONSUL_DOMAIN}:8443:consul-server-0:8443 \
--cacert ${CONSUL_CACERT} \
--data @/home/admin/assets/scenario/conf/external-services/hashicups-db-1.json \
--request PUT \
https://server.${CONSUL_DATACENTER}.${CONSUL_DOMAIN}:8443/v1/catalog/register
If the registration is performed correctly, the command outputs true.
Verify service and node registration
After you register the services and nodes, you can verify they are present in the Consul catalog.
Get the list of nodes for your Consul datacenter.
$ consul catalog nodes -detailed
Node ID Address DC TaggedAddresses Meta
consul-esm-0 3993c409-ab35-80a9-f4f7-8cb3f3e9085b 172.21.0.4 dc1 lan=172.21.0.4, lan_ipv4=172.21.0.4, wan=172.21.0.4, wan_ipv4=172.21.0.4 consul-network-segment=, consul-version=1.18.1
consul-server-0 885bdc1d-b58f-9cc4-1df1-1c68a0605e41 172.21.0.5 dc1 lan=172.21.0.5, lan_ipv4=172.21.0.5, wan=172.21.0.5, wan_ipv4=172.21.0.5 consul-network-segment=, consul-version=1.18.1
hashicups-api-0 3d15d86e-4e82-0c1b-0210-a0302f439565 172.21.0.8 dc1 lan=172.21.0.8, lan_ipv4=172.21.0.8, wan=172.21.0.8, wan_ipv4=172.21.0.8 consul-network-segment=, consul-version=1.18.1
hashicups-db-0-ext f8d9e662-37da-4810-a374-6a77e8024c4c 172.21.0.6 dc1 external-node=true, external-probe=true
hashicups-db-1-ext 7800e840-d65c-4f4c-b29a-29e7331306df 172.21.0.9 dc1 external-node=true, external-probe=true
hashicups-frontend-0 a6ccbdad-39c2-e132-aef6-9be1271822ad 172.21.0.2 dc1 lan=172.21.0.2, lan_ipv4=172.21.0.2, wan=172.21.0.2, wan_ipv4=172.21.0.2 consul-network-segment=, consul-version=1.18.1
hashicups-nginx-0 a9789d6b-ea74-7f18-67d0-50ee26838ed7 172.21.0.7 dc1 lan=172.21.0.7, lan_ipv4=172.21.0.7, wan=172.21.0.7, wan_ipv4=172.21.0.7 consul-network-segment=, consul-version=1.18.1
The two new nodes, hashicups-db-0-ext and hashicups-db-1-ext, are now present. In the Meta section, you can verify the external-node and external-probe metadata.
Get the list of services for your Consul datacenter.
$ consul catalog services -tags
consul
hashicups-api inst_0
hashicups-db external,inst_0,inst_1
hashicups-frontend inst_0
hashicups-nginx inst_0
Verify domain name resolution and load balancing
Once the service is present in the Consul catalog, the Consul DNS interface can resolve it correctly.
Log in to hashicups-api-0 from the bastion host.
$ ssh -i certs/id_rsa hashicups-api-0
#..
admin@hashicups-api-0:~
Verify that you can resolve all instances of the hashicups-db service using the Consul DNS interface.
$ dig hashicups-db.service.dc1.consul
; <<>> DiG 9.18.24-1-Debian <<>> hashicups-db.service.dc1.consul
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 44982
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;hashicups-db.service.dc1.consul. IN A
;; ANSWER SECTION:
hashicups-db.service.dc1.consul. 0 IN A 172.21.0.5
hashicups-db.service.dc1.consul. 0 IN A 172.21.0.8
;; Query time: 0 msec
;; SERVER: 172.21.0.7#53(172.21.0.7) (UDP)
;; WHEN: Wed May 22 14:11:42 UTC 2024
;; MSG SIZE rcvd: 92
Notice the dig command returns two IPs. Consul will load balance requests across all available instances of the service.
Verify that you can resolve the hashicups-db-0-ext node using Consul.
$ dig hashicups-db-0-ext.node.dc1.consul
; <<>> DiG 9.16.48-Debian <<>> hashicups-db-0-ext.node.dc1.consul
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 60590
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 65494
;; QUESTION SECTION:
;hashicups-db-0-ext.node.dc1.consul. IN A
;; ANSWER SECTION:
hashicups-db-0-ext.node.dc1.consul. 0 IN A 10.0.4.159
;; Query time: 0 msec
;; SERVER: 127.0.0.53#53(127.0.0.53)
;; WHEN: Thu Jun 06 14:35:58 UTC 2024
;; MSG SIZE rcvd: 79
Verify you can connect to the first instance of the hashicups-db service using the Consul FQDN.
$ PGPASSWORD=hashicups_pwd \
psql -P pager=off \
-d products \
-U hashicups \
-h inst_0.hashicups-db.service.dc1.consul \
-c "select * from coffees;"
Your output should appear similar to the following example:
id | name | teaser | collection | origin | color | description | price | image | created_at | updated_at | deleted_at
----+---------------------+--------------------------------------------------------------+-------------+-------------+---------+-------------+-------+----------------+---------------------+---------------------+------------
1 | HCP Aeropress | Automation in a cup | Foundations | Summer 2020 | #444 | | 200 | /hashicorp.png | 2024-05-22 00:00:00 | 2024-05-22 00:00:00 |
2 | Packer Spiced Latte | Packed with goodness to spice up your images | Origins | Summer 2013 | #1FA7EE | | 350 | /packer.png | 2024-05-22 00:00:00 | 2024-05-22 00:00:00 |
3 | Vaulatte | Nothing gives you a safe and secure feeling like a Vaulatte | Foundations | Spring 2015 | #FFD814 | | 200 | /vault.png | 2024-05-22 00:00:00 | 2024-05-22 00:00:00 |
4 | Nomadicano | Drink one today and you will want to schedule another | Foundations | Fall 2015 | #00CA8E | | 150 | /nomad.png | 2024-05-22 00:00:00 | 2024-05-22 00:00:00 |
5 | Terraspresso | Nothing kickstarts your day like a provision of Terraspresso | Origins | Summer 2014 | #894BD1 | | 150 | /terraform.png | 2024-05-22 00:00:00 | 2024-05-22 00:00:00 |
6 | Vagrante espresso | Stdin is not a tty | Origins | 2010 | #0E67ED | | 200 | /vagrant.png | 2024-05-22 00:00:00 | 2024-05-22 00:00:00 |
7 | Connectaccino | Discover the wonders of our meshy service | Origins | Spring 2014 | #F44D8A | | 250 | /consul.png | 2024-05-22 00:00:00 | 2024-05-22 00:00:00 |
8 | Boundary Red Eye | Perk up and watch out for your access management | Discoveries | Fall 2020 | #F24C53 | | 200 | /boundary.png | 2024-05-22 00:00:00 | 2024-05-22 00:00:00 |
9 | Waypointiato | Deploy with a little foam | Discoveries | Fall 2020 | #14C6CB | | 250 | /waypoint.png | 2024-05-22 00:00:00 | 2024-05-22 00:00:00 |
(9 rows)
To continue with the tutorial, exit the SSH session to return to the bastion host.
$ exit
logout
Connection to hashicups-api-0 closed.
admin@bastion:~$
You will now stop one instance of the hashicups-db service and verify Consul's behavior.
Log in to hashicups-db-0 from the bastion host.
$ ssh -i certs/id_rsa hashicups-db-0
#..
admin@hashicups-db-0:~
Stop the first instance of the hashicups-db service.
$ ./start_service.sh stop
Stop pre-existing instances.
Service instance stopped.
To continue with the tutorial, exit the SSH session to return to the bastion host.
$ exit
logout
Connection to hashicups-db-0 closed.
admin@bastion:~$
Log in to hashicups-api-0 from the bastion host.
$ ssh -i certs/id_rsa hashicups-api-0
#..
admin@hashicups-api-0:~
Verify the available instances of the hashicups-db service using Consul.
Notice the dig command still returns two IPs. This is because nothing performs periodic health checks against the external nodes yet, so the catalog still reports both instances as healthy.
$ dig hashicups-db.service.dc1.consul
; <<>> DiG 9.18.24-1-Debian <<>> hashicups-db.service.dc1.consul
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 42128
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;hashicups-db.service.dc1.consul. IN A
;; ANSWER SECTION:
hashicups-db.service.dc1.consul. 0 IN A 172.21.0.5
hashicups-db.service.dc1.consul. 0 IN A 172.21.0.8
;; Query time: 0 msec
;; SERVER: 172.21.0.7#53(172.21.0.7) (UDP)
;; WHEN: Wed May 22 14:11:42 UTC 2024
;; MSG SIZE rcvd: 92
Verify your connection to the hashicups-db-0-ext service instance.
$ PGPASSWORD=hashicups_pwd \
psql -P pager=off \
-d products \
-U hashicups \
-h inst_0.hashicups-db.service.dc1.consul \
-c "select * from coffees;"
This time the command fails. The service instance is down, but because Consul has not performed a health check on it yet, the instance is still marked as healthy in the catalog and Consul DNS continues to resolve its address.
psql: error: connection to server at "inst_0.hashicups-db.service.dc1.consul" (172.21.0.6), port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
To continue with the tutorial, exit the SSH session to return to the bastion host.
$ exit
logout
Connection to hashicups-api-0 closed.
admin@bastion:~$
This completes the first part of the tutorial, where you learned how to register external services with Consul. You also learned how to verify that external services are resolved using the Consul DNS interface. Finally, you learned that you cannot rely on the Consul catalog to return healthy instances of external services.
In the next section of the tutorial you will introduce consul-esm in your datacenter and use it to periodically check the health of your external services.
Create ACL token for Consul ESM
Consul ESM requires access to several Consul resources to operate properly:
- agent:read - to check version compatibility and calculate network coordinates.
- key:write - to store assigned health checks; the path is defined in the configuration file.
- node:write - to update the status of each node that consul-esm monitors.
- node:read - to retrieve the list of nodes that need to be monitored.
- service:write - to register the consul-esm service defined in the configuration file.
- session:write - to acquire the consul-esm cluster leader lock.
When configuring the ACL policy for consul-esm, you can decide whether to assign fine-tuned permissions to the token that grant access only to the specific services and nodes you want to monitor, or to use a more permissive ACL policy that permits consul-esm to monitor newly added services without the need to modify specific ACL permissions.
First, create the rules for the ACL policy.
$ tee ~/assets/scenario/conf/acl-policy-consul-esm-strict.hcl > /dev/null << EOF
# To check version compatibility and calculating network coordinates
# Requires at least read for the agent API for the Consul node
# where consul-esm is registered
agent "consul-esm-0" {
policy = "read"
}
# To store assigned checks
key_prefix "consul-esm/" {
policy = "write"
}
# To update the status of each node monitored by consul-esm
# Requires one acl block per node
node_prefix "hashicups-db" {
policy = "write"
}
# To retrieve nodes that need to be monitored
node_prefix "" {
policy = "read"
}
# To register consul-esm service
service_prefix "consul-esm" {
policy = "write"
}
# To update health status for external service hashicups-db
service "hashicups-db" {
policy = "write"
}
# To acquire consul-esm cluster leader lock when used in HA mode
session "consul-esm-0" {
policy = "write"
}
EOF
Then, create the ACL policy using the rules defined in the previous step.
$ consul acl policy create \
-name 'acl-policy-consul-esm-strict' \
-description 'Policy for consul-esm' \
-rules @/home/admin/assets/scenario/conf/acl-policy-consul-esm-strict.hcl > /dev/null 2>&1
Finally, create the ACL token for the consul-esm instance.
$ consul acl token create \
-description 'consul-esm token' \
-policy-name acl-policy-consul-esm-strict \
--format json > /home/admin/assets/scenario/conf/secrets/acl-token-consul-esm.json 2> /dev/null
After you create the token, export it in an environment variable.
$ export CONSUL_ESM_TOK=`cat /home/admin/assets/scenario/conf/secrets/acl-token-consul-esm.json | jq -r ".SecretID"`
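The command above assumes jq is installed on the bastion host. If it is not, python3 can extract the SecretID the same way. In this sketch, the inline JSON stands in for the token file and uses a placeholder SecretID.

```shell
# Extract the token SecretID from the JSON returned by `consul acl token
# create` without jq. The inline JSON is a placeholder for illustration;
# in practice, redirect the acl-token-consul-esm.json file instead.
token_json='{"SecretID": "00000000-0000-0000-0000-000000000000"}'

CONSUL_ESM_TOK=$(echo "$token_json" | \
  python3 -c 'import json, sys; print(json.load(sys.stdin)["SecretID"])')

echo "$CONSUL_ESM_TOK"
```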
Configure Consul ESM
After you create the ACL token for the consul-esm agent, create the configuration file.
$ tee ~/assets/scenario/conf/consul-esm-0/consul-esm-config.hcl > /dev/null << EOF
log_level = "DEBUG"
log_json = false
instance_id = "`cat /proc/sys/kernel/random/uuid`"
consul_service = "consul-esm"
consul_kv_path = "consul-esm/"
external_node_meta {
"external-node" = "true"
}
node_reconnect_timeout = "72h"
node_probe_interval = "10s"
disable_coordinate_updates = false
http_addr = "localhost:8500"
token = "${CONSUL_ESM_TOK}"
datacenter = "${CONSUL_DATACENTER}"
client_address = "127.0.0.1:8080"
ping_type = "udp"
passing_threshold = 0
critical_threshold = 0
EOF
For a full configuration reference refer to Consul ESM configuration.
Copy the configuration file to the consul-esm-0 node.
$ rsync -av \
-e "ssh -i /home/admin/certs/id_rsa" \
/home/admin/assets/scenario/conf/consul-esm-0/consul-esm-config.hcl \
admin@consul-esm-0:/home/admin/consul-esm-config.hcl
The output should be similar to the following example.
sending incremental file list
consul-esm-config.hcl
sent 626 bytes received 35 bytes 1,322.00 bytes/sec
total size is 504 speedup is 0.76
Start Consul ESM
To start Consul ESM, log in to consul-esm-0 from the bastion host.
$ ssh -i certs/id_rsa consul-esm-0
#..
admin@consul-esm-0:~
Verify that the Consul ESM configuration was copied to the node correctly.
$ cat /home/admin/consul-esm-config.hcl
log_level = "DEBUG"
log_json = false
instance_id = "e6564d2e-2bda-442a-b4e6-5f59dd5e8185"
consul_service = "consul-esm"
consul_kv_path = "consul-esm/"
external_node_meta {
"external-node" = "true"
}
node_reconnect_timeout = "72h"
node_probe_interval = "10s"
disable_coordinate_updates = false
http_addr = "localhost:8500"
token = "14392d0b-e418-e713-fa40-0dc96e961b7d"
datacenter = "dc1"
client_address = "127.0.0.1:8080"
ping_type = "udp"
passing_threshold = 0
critical_threshold = 0
After you confirm that the configuration is correct, start consul-esm to run as a long-lived process.
$ consul-esm -config-file=consul-esm-config.hcl > /tmp/consul-esm.log 2>&1 &
The process starts in the background. You can check the logs for the process in the log file specified in the configuration.
$ cat /tmp/consul-esm*.log
[INFO] consul-esm: Connecting to Consul: address=localhost:8500
[WARN] consul-esm: unable to determine Consul server version, check for compatibility; requires >= 1.4.1
[DEBUG] consul-esm: Consul agent and all servers are running compatible versions with ESM
Consul ESM running!
Datacenter: "dc1"
Service: "consul-esm"
Service Tag: ""
Service ID: "consul-esm:a8c7d327-3cd1-4123-85c9-f25dfe23635a"
Node Reconnect Timeout: "72h0m0s"
Disable coordinates: false
Statsd address: ""
Metrix prefix: ""
Log data will now stream in as it occurs:
[DEBUG] consul-esm: Registered ESM service with Consul
[INFO] consul-esm: Trying to obtain leadership...
[INFO] consul-esm: Obtained leadership
[INFO] consul-esm: Updating external node list: items=2
[INFO] consul-esm: Rebalanced external nodes across ESM instances: nodes=2 instances=1
[INFO] consul-esm: Fetched nodes from catalog: count=2
[DEBUG] consul-esm: Now waiting between node pings: time=5s
[INFO] consul-esm: found check: name="hashicups-db check"
[INFO] consul-esm: found check: name="hashicups-db check"
[DEBUG] consul-esm: Added TCP check: checkHash=hashicups-db-0-ext/hashicups-db-0/hashicups-db-0-check
[DEBUG] consul-esm: Added TCP check: checkHash=hashicups-db-1-ext/hashicups-db-1/hashicups-db-1-check
[INFO] consul-esm: Updated checks: count=2 found=2 added=2 updated=0 removed=0
[INFO] consul-esm: Health check counts changed: healthChecks=2 nodes=2
[DEBUG] consul-esm: Check status updated: check=hashicups-db-1-ext/hashicups-db-1/hashicups-db-1-check status=passing
[WARN] consul-esm: Check socket connection failed: check=hashicups-db-0-ext/hashicups-db-0/hashicups-db-0-check error="dial tcp 172.21.0.6:5432: connect: connection refused"
[WARN] consul-esm: Check is now critical: check=hashicups-db-0-ext/hashicups-db-0/hashicups-db-0-check
##...
The output verifies that Consul ESM is able to retrieve the two nodes that are going to be monitored, alongside their respective checks. Also, notice that the check for hashicups-db-0 is failing, which is expected because you stopped the service earlier in this tutorial.
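The critical check in the log corresponds to the TCP check defined in the registration payload. Conceptually, that probe is just a timed connection attempt. The following sketch reproduces the idea by hand; the probe function is illustrative and not part of Consul ESM.

```shell
# Rough sketch of what a TCP health check does: attempt a connection
# within a timeout and map success or failure to passing or critical.
# The probe function is illustrative, not part of Consul ESM.
probe() {
  host=$1
  port=$2
  if timeout 1 bash -c "exec 3<>/dev/tcp/$host/$port" 2> /dev/null; then
    echo "passing"
  else
    echo "critical"
  fi
}

# Port 1 on localhost is almost certainly closed, so this reports critical,
# just like the failed check against the stopped hashicups-db-0 instance.
probe 127.0.0.1 1
```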
To continue with the tutorial, exit the SSH session to return to the bastion host.
$ exit
logout
Connection to consul-esm-0 closed.
admin@bastion:~$
Verify service state and load balancing
After Consul ESM starts, verify that the UI shows a failing instance of the hashicups-db service.
Also notice that a new service, consul-esm, is now present in the Consul catalog.
When the health check marks the service instance as failing, Consul DNS will stop including the instance in the results.
Query the Consul DNS interface repeatedly from the bastion host to observe the load balancing behavior.
$ for i in `seq 1 100` ; do \
dig @consul-server-0 -p 8600 hashicups-db.service.dc1.consul +short | head -1; \
done | sort | uniq -c
The output should appear similar to the following example. Notice that all requests are now forwarded to the healthy instance.
100 172.21.0.5
To confirm that Consul ESM reacts to changes in service state, you will now restart the hashicups-db instance.
Log in to hashicups-db-0 from the bastion host.
$ ssh -i certs/id_rsa hashicups-db-0
#..
admin@hashicups-db-0:~
Start the hashicups-db instance.
$ ./start_service.sh start --consul
Stop pre-existing instances.
START - Start services on all interfaces.
START CONSUL - Starts the service using Consul service name for upstream services (using LB functionality).
NOT APPLICABLE FOR THIS SERVICE - No Upstreams to define.
Start service instance.
Reloading config to listen on all available interfaces.
To continue with the tutorial, exit the SSH session to return to the bastion host.
$ exit
logout
Connection to hashicups-db-0 closed.
admin@bastion:~$
Log in to hashicups-api-0 from the bastion host.
$ ssh -i certs/id_rsa hashicups-api-0
#..
admin@hashicups-api-0:~
Verify that you can resolve all instances of the hashicups-db service using Consul.
$ dig hashicups-db.service.dc1.consul
; <<>> DiG 9.18.24-1-Debian <<>> hashicups-db.service.dc1.consul
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 50874
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;hashicups-db.service.dc1.consul. IN A
;; ANSWER SECTION:
hashicups-db.service.dc1.consul. 0 IN A 172.21.0.8
hashicups-db.service.dc1.consul. 0 IN A 172.21.0.5
;; Query time: 0 msec
;; SERVER: 172.21.0.7#53(172.21.0.7) (UDP)
;; WHEN: Wed May 22 14:11:58 UTC 2024
;; MSG SIZE rcvd: 92
Notice that the dig command returns two IP addresses. Consul will load balance requests across all available instances of the service.
The same information is available in the Consul UI, where the hashicups-db service now shows two healthy instances.
Now that Consul ESM takes care of the health monitoring for the external services, you can use Consul service discovery to configure your downstream services.
Restart the hashicups-api service to use the Consul FQDN instead of IP addresses.
$ ./start_service.sh start --consul
Stop pre-existing instances.
hashicups-api-payments
hashicups-api-product
hashicups-api-public
START - Start services on all interfaces.
START CONSUL - Starts the service using Consul service name for upstream services (using LB functionality).
Start service instance.
Service started to listen on all available interfaces.
{
"db_connection": "host=hashicups-db.service.dc1.consul port=5432 user=hashicups password=hashicups_pwd dbname=products sslmode=disable",
"bind_address": ":9090",
"metrics_address": ":9103"
}
bc46f6e4c7267bb1a266135cd19504b6167f4bafac0ea583036122ced11c7fa2
35fc5b73a599d807bcb50fe7e26e52f0d7145b0a4c08d9c7065aef98b67f6f3f
170a050675ba637f2db8f8d09de5ecb4c6a525f5085b972448594c5cba438cc4
From the output, you can verify that the host parameter for the DB connection is now set to hashicups-db.service.dc1.consul.
To continue with the tutorial, exit the SSH session to return to the bastion host.
$ exit
logout
Connection to hashicups-api-0 closed.
admin@bastion:~$
Monitor an HTTP endpoint with Consul ESM
Some application deployments rely on external HTTP endpoints to retrieve deployment information. These endpoints are usually considered always-on and are not monitored as part of the deployment workflow. As a result, a failure in the remote endpoint can cause unexpected errors.
Registering these endpoints in the Consul catalog and monitoring them with Consul ESM allows you to standardize your processes and leverage additional Consul features, such as watches and events, to build more resilient workflows.
For example, HashiCorp supports the releases.hashicorp.com and checkpoint.hashicorp.com endpoints. These endpoints are often integrated in the tool installation process when version information is required.
You will now learn how to register those endpoints in Consul and to monitor their health using Consul ESM.
Generate a UUID to use as node ID for the external services.
$ export RANDOM_UUID=`cat /proc/sys/kernel/random/uuid`
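Both endpoint definitions below describe the same external node, so they must share a single node ID. The kernel interface returns a fresh UUID on every read, which is why the tutorial captures the value once in a variable rather than reading the file twice. A quick sketch illustrating this (the SECOND_UUID variable is only for demonstration; uuidgen is a fallback for systems without /proc):

```shell
# Capture one UUID for the node; reading the interface again yields a
# different value, so reuse the variable for every definition file.
RANDOM_UUID=$(cat /proc/sys/kernel/random/uuid 2>/dev/null || uuidgen)
SECOND_UUID=$(cat /proc/sys/kernel/random/uuid 2>/dev/null || uuidgen)
echo "$RANDOM_UUID"
echo "$SECOND_UUID"
```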
Create the service definition for an external node with the relevant services and health checks configured. First, for releases.hashicorp.com
:
$ tee ~/assets/scenario/conf/external-services/hashicorp-releases.json > /dev/null << EOF
{
"Datacenter": "$CONSUL_DATACENTER",
"Node": "hashicorp",
"ID": "${RANDOM_UUID}",
"Address": "hashicorp.com",
"NodeMeta": {
"external-node": "true"
},
"Service": {
"ID": "releases.hashicorp.com",
"Service": "hashicorp-releases",
"Tags": [
"external",
"deploy"
],
"Address": "releases.hashicorp.com",
"Port": 443
},
"Checks": [{
"CheckID": "releases.hashicorp.com",
"Name": "releases.hashicorp.com check",
"Status": "warning",
"ServiceID": "releases.hashicorp.com",
"Definition": {
"http": "https://releases.hashicorp.com",
"Interval": "30s",
"Timeout": "10s"
}
}]
}
EOF
Then, for checkpoint.hashicorp.com
:
$ tee ~/assets/scenario/conf/external-services/hashicorp-checkpoint.json > /dev/null << EOF
{
"Datacenter": "$CONSUL_DATACENTER",
"Node": "hashicorp",
"ID": "${RANDOM_UUID}",
"Address": "hashicorp.com",
"NodeMeta": {
"external-node": "true"
},
"Service": {
"ID": "checkpoint.hashicorp.com",
"Service": "hashicorp-checkpoint",
"Tags": [
"external",
"deploy"
],
"Address": "checkpoint.hashicorp.com",
"Port": 443
},
"Checks": [{
"CheckID": "checkpoint.hashicorp.com",
"Name": "checkpoint.hashicorp.com check",
"Status": "warning",
"ServiceID": "checkpoint.hashicorp.com",
"Definition": {
"http": "https://checkpoint.hashicorp.com",
"Interval": "30s",
"Timeout": "10s"
}
}]
}
EOF
Note
These service definitions do not include "external-probe": "true"
in the NodeMeta
field. As a result, Consul ESM will only perform the health checks detailed in the service definition and will not check node health. Because node checks are performed by pinging the external node, we omit the external-probe
parameter to prevent checks from failing when the underlying URL does not support a ping.
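For comparison, an external node that should also receive node-level health checks (performed with a ping by default) would set the key explicitly. A hypothetical fragment:

```json
"NodeMeta": {
  "external-node": "true",
  "external-probe": "true"
}
```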
Register releases.hashicorp.com
.
$ curl --silent \
--header "X-Consul-Token: $CONSUL_HTTP_TOKEN" \
--connect-to server.${CONSUL_DATACENTER}.${CONSUL_DOMAIN}:8443:consul-server-0:8443 \
--cacert ${CONSUL_CACERT} \
--data @/home/admin/assets/scenario/conf/external-services/hashicorp-releases.json \
--request PUT \
https://server.${CONSUL_DATACENTER}.${CONSUL_DOMAIN}:8443/v1/catalog/register
The command outputs true
when the registration is performed correctly.
Register checkpoint.hashicorp.com
.
$ curl --silent \
--header "X-Consul-Token: $CONSUL_HTTP_TOKEN" \
--connect-to server.${CONSUL_DATACENTER}.${CONSUL_DOMAIN}:8443:consul-server-0:8443 \
--cacert ${CONSUL_CACERT} \
--data @/home/admin/assets/scenario/conf/external-services/hashicorp-checkpoint.json \
--request PUT \
https://server.${CONSUL_DATACENTER}.${CONSUL_DOMAIN}:8443/v1/catalog/register
The command outputs true
when the registration is performed correctly.
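Because the /v1/catalog/register endpoint answers with a bare true, scripts can check the response body to fail loudly when a registration is rejected. A minimal sketch, assuming a CONSUL_HTTP_ADDR variable that points at the HTTPS API endpoint used above (the helper name is hypothetical):

```shell
# Register a catalog entry and verify that Consul answered "true".
register_external() {
  definition=$1
  resp=$(curl --silent \
    --header "X-Consul-Token: $CONSUL_HTTP_TOKEN" \
    --request PUT \
    --data @"$definition" \
    "$CONSUL_HTTP_ADDR/v1/catalog/register")
  if [ "$resp" = "true" ]; then
    echo "registered $definition"
  else
    echo "registration of $definition failed: $resp" >&2
    return 1
  fi
}
```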
Update ACL permissions for Consul ESM
If you assigned strict permissions to the Consul ESM ACL token, you must update the token to include permissions for the new nodes and services.
With strict ACLs, Consul ESM is not able to discover the new nodes and services that represent the releases.hashicorp.com
and checkpoint.hashicorp.com
endpoints.
As a result, Consul marks these services with a warning
status when you register them, and Consul ESM will not be able to perform checks on them or to change their status to passing
.
To allow Consul ESM to monitor these services, you must extend the policy with permissions for the new nodes and services.
$ tee ~/assets/scenario/conf/acl-policy-consul-esm-strict-addendum.hcl > /dev/null << EOF
# To check version compatibility and calculate network coordinates
# Requires at least read for the agent API for the Consul node
# where consul-esm is registered
agent "consul-esm-0" {
policy = "read"
}
# To store assigned checks
key_prefix "consul-esm/" {
policy = "write"
}
# To update the status of each node monitored by consul-esm
# Requires one acl block per node
node_prefix "hashicups-db" {
policy = "write"
}
node_prefix "hashicorp" {
policy = "write"
}
# To retrieve nodes that need to be monitored
node_prefix "" {
policy = "read"
}
# To register consul-esm service
service_prefix "consul-esm" {
policy = "write"
}
service "hashicups-db" {
policy = "write"
}
service_prefix "hashicorp-" {
policy = "write"
}
# To acquire consul-esm cluster leader lock when used in HA mode
session "consul-esm-0" {
policy = "write"
}
EOF
Update the policy associated with the Consul ESM token to use these new rules.
$ consul acl policy update \
-name "acl-policy-consul-esm-strict" \
-rules @/home/admin/assets/scenario/conf/acl-policy-consul-esm-strict-addendum.hcl > /dev/null 2>&1
This command produces no output.
With the right ACL permissions, Consul ESM automatically picks up the new services from the Consul catalog and retrieves the health checks to perform. The Consul ESM logs show information about the update.
To inspect the Consul ESM logs, log in to consul-esm-0
from the bastion host.
$ ssh -i certs/id_rsa consul-esm-0
#..
admin@consul-esm-0:~
Logs for Consul ESM are stored in the /tmp
folder.
$ cat /tmp/consul-esm*.log
## ...
[INFO] consul-esm: Updating external node list: items=3
[INFO] consul-esm: Fetched nodes from catalog: count=3
## ...
[INFO] consul-esm: found check: name="hashicups-db check"
[INFO] consul-esm: found check: name="hashicups-db check"
[INFO] consul-esm: found check: name="checkpoint.hashicorp.com check"
[INFO] consul-esm: found check: name="releases.hashicorp.com check"
[INFO] consul-esm: found check: name="hashicups-db check"
[INFO] consul-esm: found check: name="hashicups-db check"
[DEBUG] consul-esm: Added HTTP check: checkHash=hashicorp/checkpoint.hashicorp.com/checkpoint.hashicorp.com
[DEBUG] consul-esm: Added HTTP check: checkHash=hashicorp/releases.hashicorp.com/releases.hashicorp.com
[INFO] consul-esm: Updated checks: count=4 found=4 added=2 updated=0 removed=0
[INFO] consul-esm: Health check counts changed: healthChecks=4 nodes=3
[INFO] consul-esm: found check: name="checkpoint.hashicorp.com check"
[INFO] consul-esm: found check: name="releases.hashicorp.com check"
[INFO] consul-esm: found check: name="hashicups-db check"
[INFO] consul-esm: found check: name="hashicups-db check"
## ...
[INFO] consul-esm: Fetched nodes from catalog: count=3
## ...
[DEBUG] consul-esm: Check status updated: check=hashicorp/checkpoint.hashicorp.com/checkpoint.hashicorp.com status=passing
[INFO] consul-esm: Updating output and status for: checkID=checkpoint.hashicorp.com
[INFO] consul-esm: found check: name="checkpoint.hashicorp.com check"
[INFO] consul-esm: found check: name="releases.hashicorp.com check"
[INFO] consul-esm: found check: name="hashicups-db check"
[INFO] consul-esm: found check: name="hashicups-db check"
[INFO] consul-esm: found check: name="checkpoint.hashicorp.com check"
[INFO] consul-esm: found check: name="releases.hashicorp.com check"
[INFO] consul-esm: found check: name="hashicups-db check"
[INFO] consul-esm: found check: name="hashicups-db check"
## ...
[DEBUG] consul-esm: Check status updated: check=hashicorp/releases.hashicorp.com/releases.hashicorp.com status=passing
[INFO] consul-esm: Updating output and status for: checkID=releases.hashicorp.com
[INFO] consul-esm: found check: name="checkpoint.hashicorp.com check"
[INFO] consul-esm: found check: name="releases.hashicorp.com check"
[INFO] consul-esm: found check: name="hashicups-db check"
[INFO] consul-esm: found check: name="hashicups-db check"
[DEBUG] consul-esm: Check status updated: check=hashicups-db-1-ext/hashicups-db-1/hashicups-db-1-check status=passing
## ...
Tip
When adding new external nodes to Consul, Consul ESM may take up to 5 minutes to notice them and start performing the health checks. If the services are still in the `warning` state, wait a few minutes and then try again.

After the health checks confirm the health of the new external services, the UI updates to show them as passing.
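If you want to script the wait instead of watching the UI, you can poll the health API until the checks report passing. A sketch under the same assumptions as the earlier curl commands (the CONSUL_HTTP_ADDR variable and the helper name are hypothetical, and a grep-based match stands in for a JSON parser):

```shell
# Poll /v1/health/checks/<service> until a passing status appears or the
# retry budget is exhausted.
wait_passing() {
  svc=$1
  tries=${2:-10}
  while [ "$tries" -gt 0 ]; do
    status=$(curl --silent --header "X-Consul-Token: $CONSUL_HTTP_TOKEN" \
      "$CONSUL_HTTP_ADDR/v1/health/checks/$svc" \
      | grep -o '"Status": *"passing"' | head -1)
    if [ -n "$status" ]; then
      echo "$svc is passing"
      return 0
    fi
    tries=$((tries - 1))
    sleep 30
  done
  echo "$svc did not become passing" >&2
  return 1
}
```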
Destroy the infrastructure
Now that the tutorial is complete, clean up the infrastructure you created.
From the ./self-managed/infrastructure/aws
folder of the repository, use terraform
to destroy the infrastructure.
$ terraform destroy --auto-approve
Next steps
In this tutorial, you learned how to register external services to your Consul datacenter and how to access them using the different interfaces provided by Consul: UI, CLI, API, and DNS. You also learned how to use Consul External Service Monitor (ESM) to run health checks over external services and nodes and to keep the Consul catalog updated with the results of those health checks.
For more information about the topics covered in this tutorial, refer to the following resources:
- Consul External Service Monitor (ESM)
- Service configuration reference
- Health check configuration reference
To learn more about how Consul service discovery can simplify application deployment and load balancing in your datacenter, refer to the following resources: