Fault tolerance with redundancy zones


Enterprise Only

The functionality described in this tutorial is available only in Vault Enterprise.

In this tutorial, you will configure fault resiliency for your Vault cluster using redundancy zones. Redundancy zones is a Vault autopilot feature that makes it possible to run one voter and any number of non-voters in each defined zone.

For this tutorial, you will use one voter and one non-voter in three zones, for a total of six servers. If an entire availability zone is completely lost, both the voter and non-voter will be lost; however, the cluster will remain available. If only the voter is lost in an availability zone, the autopilot will promote the non-voter to a voter automatically, putting the hot standby server into service quickly.
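The failure tolerance figures reported later in this tutorial follow from Raft quorum arithmetic: the cluster needs a majority of its voters, so three voters can tolerate the loss of one. The shell sketch below illustrates that calculation. The function names are made up for this example and are not part of Vault.

```shell
# Quorum size (majority) for a Raft cluster with the given number of voters.
quorum_size() {
  local voters="$1"
  echo $(( voters / 2 + 1 ))
}

# Number of voters that can fail while the remaining voters still form a quorum.
voter_failure_tolerance() {
  local voters="$1"
  echo $(( voters - $(quorum_size "$voters") ))
}

quorum_size 3              # prints 2
voter_failure_tolerance 3  # prints 1
```

With one voter in each of the three zones, the loss of a single zone removes one voter, which is within this tolerance.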

Prerequisites

This tutorial requires Vault Enterprise, sudo access, and additional configuration to create the cluster.

Vault Enterprise 1.11.0 or later

You will also need a text editor, the curl executable to test the API endpoints, and optionally the jq command to format the output for curl.

Configure Vault for redundancy zones

To demonstrate autopilot redundancy zones, you will start a cluster with three nodes, each defined in a separate zone. Then, you will add an additional node to each zone.

You will run a script to start the cluster. The script sets up the following servers:

vault_1 (http://127.0.0.1:8100) is initialized and unsealed. The root token creates a transit key that enables the other Vault servers to auto-unseal. This Vault server is not a part of the cluster.
vault_2 (http://127.0.0.1:8200) is initialized and unsealed. This Vault server starts as the cluster leader and is defined in zone-a.
vault_3 (http://127.0.0.1:8300) is started, automatically joins the cluster via retry_join, and is defined in zone-b.
vault_4 (http://127.0.0.1:8400) is started, automatically joins the cluster via retry_join, and is defined in zone-c.

Disclaimer

For the purpose of demonstration, a script will run Vault servers locally. In practice, the redundancy zone may map to the availability zones or similar.

Set up a cluster

Retrieve the configuration by cloning the hashicorp-education/learn-vault-raft repository from GitHub.
```
$ git clone https://github.com/hashicorp-education/learn-vault-raft
```
This repository contains supporting content for all of the Vault learn tutorials. The content specific to this tutorial can be found within a sub-directory.
Change the working directory to learn-vault-raft/raft-redundancy-zones/local.
```
$ cd learn-vault-raft/raft-redundancy-zones/local
```
Make the setup_1.sh script executable.
```
$ chmod +x setup_1.sh
```

Execute the setup_1.sh script to spin up a Vault cluster.

```
$ ./setup_1.sh

[vault_1] Creating configuration
  - creating /git/learn-vault-raft/raft-autopilot/local/config-vault_1.hcl
[vault_2] Creating configuration
  - creating /git/learn-vault-raft/raft-autopilot/local/config-vault_2.hcl
  - creating /git/learn-vault-raft/raft-autopilot/local/raft-vault_2

...snip...

[vault_3] starting Vault server @ http://127.0.0.1:8300
Using [vault_1] root token (hvs.tqKc9An04pQY5H1uysw02Xn6) to retrieve transit key for auto-unseal

[vault_4] starting Vault server @ http://127.0.0.1:8400
Using [vault_1] root token (hvs.tqKc9An04pQY5H1uysw02Xn6) to retrieve transit key for auto-unseal
```

You can find the server configuration files and the log files in the working directory.

Use your preferred text editor and open the configuration files to examine the generated server configuration for vault_2, vault_3 and vault_4.

In the vault_2 configuration, notice that the autopilot_redundancy_zone parameter is set to zone-a inside the storage stanza. This optional string specifies the redundancy zone of the Vault server. It is reported to autopilot and is used to enhance scaling and resiliency.

storage "raft" {
   path    = "/learn-vault-raft/raft-redundancy-zones/local/raft-vault_2/"
   node_id = "vault_2"
   autopilot_redundancy_zone = "zone-a"
}

listener "tcp" {
   address = "127.0.0.1:8200"
   cluster_address = "127.0.0.1:8201"
   tls_disable = true
}

seal "transit" {
   address            = "http://127.0.0.1:8100"
   # token is read from VAULT_TOKEN env
   # token              = ""
   disable_renewal    = "false"
   key_name           = "unseal_key"
   mount_path         = "transit/"
}

disable_mlock = true
cluster_addr = "http://127.0.0.1:8201"

In the vault_3 configuration, the retry_join block is configured so that the node automatically joins the cluster. The redundancy zone is set to zone-b.

storage "raft" {
   path    = "/learn-vault-raft/raft-redundancy-zones/local/raft-vault_3/"
   node_id = "vault_3"
   autopilot_redundancy_zone = "zone-b"

   retry_join {
      leader_api_addr = "http://127.0.0.1:8200"
   }
}

listener "tcp" {
   address = "127.0.0.1:8300"
   cluster_address = "127.0.0.1:8301"
   tls_disable = true
}

seal "transit" {
   address            = "http://127.0.0.1:8100"
   # token is read from VAULT_TOKEN env
   # token              = ""
   disable_renewal    = "false"
   key_name           = "unseal_key"
   mount_path         = "transit/"
}

disable_mlock = true
cluster_addr = "http://127.0.0.1:8301"

The redundancy zone is set to zone-c for vault_4.

storage "raft" {
   path    = "/learn-vault-raft/raft-redundancy-zones/local/raft-vault_4/"
   node_id = "vault_4"
   autopilot_redundancy_zone = "zone-c"

   retry_join {
      leader_api_addr = "http://127.0.0.1:8200"
   }
}

listener "tcp" {
   address = "127.0.0.1:8400"
   cluster_address = "127.0.0.1:8401"
   tls_disable = true
}

seal "transit" {
   address            = "http://127.0.0.1:8100"
   # token is read from VAULT_TOKEN env
   # token              = ""
   disable_renewal    = "false"
   key_name           = "unseal_key"
   mount_path         = "transit/"
}

disable_mlock = true
cluster_addr = "http://127.0.0.1:8401"

Export an environment variable for the vault CLI to address the vault_2 server.
```
$ export VAULT_ADDR=http://127.0.0.1:8200
```

List the peers. Nodes that have just joined can appear as non-voters until autopilot promotes them.

```
$ vault operator raft list-peers

Node       Address           State       Voter
----       -------           -----       -----
vault_2    127.0.0.1:8201    leader      true
vault_3    127.0.0.1:8301    follower    false
vault_4    127.0.0.1:8401    follower    false
```

Verify the cluster members. You see one node per redundancy zone.

```
$ vault operator members

Host Name            API Address              ...    Redundancy Zone    Last Echo
---------            -----------              ...    ---------------    ---------
C02DVAMJML85         http://127.0.0.1:8200    ...    zone-a             n/a
C02DVAMJML85         http://127.0.0.1:8300    ...    zone-b             2022-06-14T15:40:04-07:00
C02DVAMJML85         http://127.0.0.1:8400    ...    zone-c             2022-06-14T15:40:01-07:00
```

View the autopilot's redundancy zones settings.

$ vault operator raft autopilot state

Output:

The overall failure tolerance is 1; however, the zone-level failure tolerance is 0.

Healthy:                         true
Failure Tolerance:               1
Leader:                          vault_2
Voters:
   vault_2
   vault_3
   vault_4
Optimistic Failure Tolerance:    1

...snip...

Redundancy Zones:
   zone-a
      Servers: vault_2
      Voters: vault_2
      Failure Tolerance: 0
   zone-b
      Servers: vault_3
      Voters: vault_3
      Failure Tolerance: 0
   zone-c
      Servers: vault_4
      Voters: vault_4
      Failure Tolerance: 0
      
...snip...

Alternatively, query the autopilot state through the API with curl.

$ curl -s --header "X-Vault-Token: $(cat root_token-vault_2)" \
    $VAULT_ADDR/v1/sys/storage/raft/autopilot/state | jq -r ".data"

Output:

Each redundancy zone has one voter server, and the cluster-level failure tolerance is 1.

{
  "failure_tolerance": 1,
  "healthy": true,
  "leader": "vault_2",
  "optimistic_failure_tolerance": 1,
  "redundancy_zones": {
    "zone-a": {
      "servers": [
        "vault_2"
      ],
      "voters": [
        "vault_2"
      ]
    },
    "zone-b": {
      "servers": [
        "vault_3"
      ],
      "voters": [
        "vault_3"
      ]
    },
    "zone-c": {
      "servers": [
        "vault_4"
      ],
      "voters": [
        "vault_4"
      ]
    }
  },
  ...snip...

Add an additional node to each zone

Use your preferred text editor and open the configuration files to examine the generated server configuration for vault_5, vault_6 and vault_7.

For vault_5, the autopilot_redundancy_zone parameter is set to zone-a inside the storage stanza, the same zone as vault_2.

storage "raft" {
  path    = "/learn-vault-raft/raft-redundancy-zones/local/raft-vault_5/"
  node_id = "vault_5"
  autopilot_redundancy_zone = "zone-a"

  retry_join {
    leader_api_addr = "http://127.0.0.1:8200"
  }
}
...snip...

For vault_6, the redundancy zone is set to zone-b, the same zone as vault_3.

storage "raft" {
   path    = "/learn-vault-raft/raft-redundancy-zones/local/raft-vault_6/"
   node_id = "vault_6"
   autopilot_redundancy_zone = "zone-b"

   retry_join {
      leader_api_addr = "http://127.0.0.1:8200"
   }
}
...snip...

For vault_7, the redundancy zone is set to zone-c, the same zone as vault_4.

storage "raft" {
   path    = "/learn-vault-raft/raft-redundancy-zones/local/raft-vault_7/"
   node_id = "vault_7"
   autopilot_redundancy_zone = "zone-c"

   retry_join {
      leader_api_addr = "http://127.0.0.1:8200"
   }
}
...snip...

Make the setup_2.sh script executable.
```
$ chmod +x setup_2.sh
```

Execute the setup_2.sh script to add three additional nodes to the cluster.

```
$ ./setup_2.sh

[vault_5] starting Vault server @ http://127.0.0.1:8500
Using [vault_1] root token (hvs.Ks9HlRsyL3CbaetmHL6AJEHi) to retrieve transit key for auto-unseal

[vault_6] starting Vault server @ http://127.0.0.1:8600
Using [vault_1] root token (hvs.Ks9HlRsyL3CbaetmHL6AJEHi) to retrieve transit key for auto-unseal

[vault_7] starting Vault server @ http://127.0.0.1:8700
Using [vault_1] root token (hvs.Ks9HlRsyL3CbaetmHL6AJEHi) to retrieve transit key for auto-unseal
```

Check the redundancy zone membership as the script executes.

$ vault operator raft autopilot state

Output:

Now, each redundancy zone has a failure tolerance of 1, and the cluster-level optimistic failure tolerance is 4 since there are six nodes in the cluster.

Healthy:                         true
Failure Tolerance:               1
Leader:                          vault_2
Voters:
   vault_2
   vault_3
   vault_4
Optimistic Failure Tolerance:    4

...snip...

Redundancy Zones:
   zone-a
      Servers: vault_2, vault_5
      Voters: vault_2
      Failure Tolerance: 1
   zone-b
      Servers: vault_3, vault_6
      Voters: vault_3
      Failure Tolerance: 1
   zone-c
      Servers: vault_4, vault_7
      Voters: vault_4
      Failure Tolerance: 1

...snip...

To retrieve the same state through the API, use curl.

$ curl -s --header "X-Vault-Token: $(cat root_token-vault_2)" \
    $VAULT_ADDR/v1/sys/storage/raft/autopilot/state | jq -r ".data"

Output:

Now, each redundancy zone has a failure tolerance of 1, and the cluster-level optimistic failure tolerance is 4 since there are six nodes in the cluster.

{
  "failure_tolerance": 1,
  "healthy": true,
  "leader": "vault_2",
  "optimistic_failure_tolerance": 4,
  "redundancy_zones": {
    "zone-a": {
      "servers": [
        "vault_2",
        "vault_5"
      ],
      "voters": [
        "vault_2"
      ],
      "failure_tolerance": 1
    },
    "zone-b": {
      "servers": [
        "vault_3",
        "vault_6"
      ],
      "voters": [
        "vault_3"
      ],
      "failure_tolerance": 1
    },
    "zone-c": {
      "servers": [
        "vault_4",
        "vault_7"
      ],
      "voters": [
        "vault_4"
      ],
      "failure_tolerance": 1
    }
  },
  ...snip...
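
To pull only the per-zone failure tolerance out of the CLI output, a small awk filter is one option. This is a sketch that assumes the text layout of the autopilot state output shown in this tutorial, which may differ between Vault versions. zone_failure_tolerance is a hypothetical helper name, not a Vault command.

```shell
# Print "<zone> <failure tolerance>" for each entry in the "Redundancy Zones"
# section of `vault operator raft autopilot state` text output read from stdin.
zone_failure_tolerance() {
  awk '
    /^Redundancy Zones:/ { in_zones = 1; next }
    in_zones && /^[^ ]/  { in_zones = 0 }        # an unindented line ends the section
    in_zones && NF == 1  { zone = $1 }           # a single-word line is a zone name
    in_zones && /Failure Tolerance:/ { print zone, $NF }
  '
}

# Sample abbreviated from the CLI output in this tutorial.
sample_state='Healthy:                         true
Failure Tolerance:               1
Redundancy Zones:
   zone-a
      Servers: vault_2, vault_5
      Voters: vault_2
      Failure Tolerance: 1
   zone-b
      Servers: vault_3, vault_6
      Voters: vault_3
      Failure Tolerance: 1
   zone-c
      Servers: vault_4, vault_7
      Voters: vault_4
      Failure Tolerance: 1'

printf '%s\n' "$sample_state" | zone_failure_tolerance
```

Against a running cluster, you would pipe the command output into the function instead: vault operator raft autopilot state | zone_failure_tolerance.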

List the peers.

```
$ vault operator raft list-peers

Node       Address           State       Voter
----       -------           -----       -----
vault_2    127.0.0.1:8201    leader      true
vault_3    127.0.0.1:8301    follower    true
vault_4    127.0.0.1:8401    follower    true
vault_5    127.0.0.1:8501    follower    false
vault_6    127.0.0.1:8601    follower    false
vault_7    127.0.0.1:8701    follower    false
```

The vault_5, vault_6 and vault_7 nodes have joined the cluster as non-voters.

Note

There is only one voter node per zone.
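
If you want to check the voter and non-voter split from a script rather than by eye, the list-peers table can be filtered with awk. The helpers below are a sketch with hypothetical names, and they assume the four-column layout shown above.

```shell
# Count the rows whose Voter column is "true" (list-peers output on stdin).
count_voters() {
  awk '$4 == "true" { n++ } END { print n + 0 }'
}

# Print the names of the nodes whose Voter column is "false".
non_voters() {
  awk '$4 == "false" { print $1 }'
}

# Sample copied from the list-peers output in this tutorial.
sample_peers='Node       Address           State       Voter
----       -------           -----       -----
vault_2    127.0.0.1:8201    leader      true
vault_3    127.0.0.1:8301    follower    true
vault_4    127.0.0.1:8401    follower    true
vault_5    127.0.0.1:8501    follower    false
vault_6    127.0.0.1:8601    follower    false
vault_7    127.0.0.1:8701    follower    false'

printf '%s\n' "$sample_peers" | count_voters   # prints 3
printf '%s\n' "$sample_peers" | non_voters     # prints vault_5, vault_6, vault_7 on separate lines
```

With a live cluster, pipe vault operator raft list-peers into either function in place of the sample text.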

Verify the cluster members.

```
$ vault operator members

Host Name       API Address              ...    Redundancy Zone    Last Echo
---------       -----------              ...    ---------------    ---------
C02DVAMJML85    http://127.0.0.1:8200    ...    zone-a             n/a
C02DVAMJML85    http://127.0.0.1:8300    ...    zone-b             2022-06-14T15:54:04-07:00
C02DVAMJML85    http://127.0.0.1:8400    ...    zone-c             2022-06-14T15:54:06-07:00
C02DVAMJML85    http://127.0.0.1:8500    ...    zone-a             2022-06-14T15:54:06-07:00
C02DVAMJML85    http://127.0.0.1:8600    ...    zone-b             2022-06-14T15:54:08-07:00
C02DVAMJML85    http://127.0.0.1:8700    ...    zone-c             2022-06-14T15:54:05-07:00
```

Test fault tolerance

Stop vault_3 to simulate a server failure and observe how autopilot behaves.

Stop the vault_3 node.

```
$ ./cluster.sh stop vault_3

Found 1 Vault service(s) matching that name
[vault_3] stopping
```

List the peers to see how autopilot responds to the failure.

$ vault operator raft list-peers

Wait until Vault promotes vault_6 to a voter in the absence of vault_3.

Output:

Since vault_3 is not running, vault_6 became the voter node.

Node       Address           State       Voter
----       -------           -----       -----
vault_2    127.0.0.1:8201    leader      true
vault_3    127.0.0.1:8301    follower    false
vault_4    127.0.0.1:8401    follower    true
vault_5    127.0.0.1:8501    follower    false
vault_6    127.0.0.1:8601    follower    true
vault_7    127.0.0.1:8701    follower    false
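
Because the promotion is not instantaneous, a script may need to poll until it has happened. The following sketch shows one way to do that; is_voter and wait_for_voter are hypothetical helper names, and the retry count and 5-second interval are arbitrary assumptions, not values recommended by Vault.

```shell
# Succeed when the named node has Voter == true in list-peers output on stdin.
is_voter() {
  awk -v node="$1" '$1 == node && $4 == "true" { found = 1 } END { exit !found }'
}

# Poll the cluster until the node is a voter or the attempts run out.
# Requires the vault CLI with VAULT_ADDR and a token already configured.
wait_for_voter() {
  local node="$1"
  local attempts="${2:-24}"
  while [ "$attempts" -gt 0 ]; do
    if vault operator raft list-peers | is_voter "$node"; then
      return 0
    fi
    attempts=$(( attempts - 1 ))
    sleep 5
  done
  return 1
}

# Offline check of the parser against one row from the output above.
printf '%s\n' 'vault_6    127.0.0.1:8601    follower    true' | is_voter vault_6 \
  && echo "vault_6 is a voter"
```

For example, wait_for_voter vault_6 returns successfully once list-peers reports vault_6 as a voter, and fails if that does not happen within the allotted attempts.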

Check the redundancy zone membership.

$ vault operator raft autopilot state

Output:

The cluster's optimistic failure tolerance is down to 3, and zone-b has zero failure tolerance.

Healthy:                         false
Failure Tolerance:               1
Leader:                          vault_2
Voters:
   vault_2
   vault_4
   vault_6
Optimistic Failure Tolerance:    3
Servers:
   vault_2
      Name:              vault_2
      Address:           127.0.0.1:8201
      Status:            leader
      Node Status:       alive
      Healthy:           true
      Last Contact:      0s
      Last Term:         3
      Last Index:        356
      Version:           1.11.0
      Upgrade Version:   1.11.0
      Redundancy Zone:   zone-a
      Node Type:         zone-voter
   vault_3
      Name:              vault_3
      Address:           127.0.0.1:8301
      Status:            non-voter
      Node Status:       alive
      Healthy:           false
      Last Contact:      3m56.800681779s
      Last Term:         3
      Last Index:        256
      Version:           1.11.0
      Upgrade Version:   1.11.0
      Redundancy Zone:   zone-b
      Node Type:         zone-standby 

...snip...

Redundancy Zones:
   zone-a
      Servers: vault_2, vault_5
      Voters: vault_2
      Failure Tolerance: 1
   zone-b
      Servers: vault_3, vault_6
      Voters: vault_6
      Failure Tolerance: 0
   zone-c
      Servers: vault_4, vault_7
      Voters: vault_4
      Failure Tolerance: 1

...snip...

The same state is available from the API.

$ curl -s --header "X-Vault-Token: $(cat root_token-vault_2)" \
    $VAULT_ADDR/v1/sys/storage/raft/autopilot/state | jq -r ".data"

Output:

The optimistic_failure_tolerance is now 3, and zone-b shows no failure tolerance.

{
  "failure_tolerance": 1,
  "healthy": false,
  "leader": "vault_2",
  "optimistic_failure_tolerance": 3,
  "redundancy_zones": {
    "zone-a": {
      "servers": [
        "vault_2",
        "vault_5"
      ],
      "voters": [
        "vault_2"
      ],
      "failure_tolerance": 1
    },
    "zone-b": {
      "servers": [
        "vault_6",
        "vault_3"
      ],
      "voters": [
        "vault_6"
      ]
    },
    "zone-c": {
      "servers": [
        "vault_4",
        "vault_7"
      ],
      "voters": [
        "vault_4"
      ],
      "failure_tolerance": 1
    }
  },
  "servers": {
     
    ...snip...
    
    "vault_3": {
      "id": "vault_3",
      "name": "vault_3",
      "address": "127.0.0.1:8301",
      "node_status": "alive",
      "last_contact": "6m48.800927138s",
      "last_term": 3,
      "last_index": 256,
      "healthy": false,
      "stable_since": "2022-06-15T12:45:51.823431-07:00",
      "status": "non-voter",
      "version": "1.11.0",
      "redundancy_zone": "zone-b",
      "upgrade_version": "1.11.0",
      "node_type": "zone-standby"
    },
    
    ...snip...
    
    "vault_6": {
      "id": "vault_6",
      "name": "vault_6",
      "address": "127.0.0.1:8601",
      "node_status": "alive",
      "last_contact": "1.319527019s",
      "last_term": 3,
      "last_index": 424,
      "healthy": true,
      "stable_since": "2022-06-15T12:46:45.824197-07:00",
      "status": "voter",
      "version": "1.11.0",
      "redundancy_zone": "zone-b",
      "upgrade_version": "1.11.0",
      "node_type": "zone-voter"
    },
    
  ...snip...

Note

If you want to see the cluster behavior when vault_3 becomes operational again, run ./cluster.sh start vault_3 to start the node. This should bring the cluster back to its healthy state. Alternatively, you can stop other nodes with ./cluster.sh stop <node_name> to watch how autopilot behaves.

Post-test discussion

Although you stopped the vault_3 node to mimic a server failure, it is still listed as a peer. In reality, a node failure could be temporary, and the node may become operational again. Therefore, the node remains a cluster member unless you remove it.

If the node is not recoverable, you can do one of the following:

Option 1: Manually remove nodes

Run the remove-peer command to remove the failed server.

```
$ vault operator raft remove-peer vault_3
```
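
Before removing a peer, it can help to confirm which servers autopilot currently reports as unhealthy. The sketch below filters the text output of vault operator raft autopilot state, assuming the per-server layout shown earlier in this tutorial. unhealthy_servers is a hypothetical helper, not a Vault command.

```shell
# Print the name of every server whose per-server "Healthy:" field is false.
# The cluster-level "Healthy:" line is skipped because no "Name:" precedes it.
unhealthy_servers() {
  awk '
    $1 == "Name:" { name = $2 }
    $1 == "Healthy:" && $2 == "false" && name != "" { print name }
  '
}

# Sample abbreviated from the autopilot state output in this tutorial.
sample_servers='Healthy:                         false
Servers:
   vault_2
      Name:              vault_2
      Node Status:       alive
      Healthy:           true
   vault_3
      Name:              vault_3
      Node Status:       alive
      Healthy:           false'

printf '%s\n' "$sample_servers" | unhealthy_servers   # prints vault_3
```

A node can be unhealthy only temporarily, so treat the result as a list of candidates to investigate rather than a list of peers to remove.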

Option 2: Enable dead server cleanup

Configure the dead server cleanup to automatically remove nodes deemed unhealthy. By default, the feature is disabled.

Example: The following command enables dead server cleanup. When a node remains unhealthy for 300 seconds (the default is 24 hours), Vault removes the node from the cluster.

```
$ vault operator raft autopilot set-config \
    -dead-server-last-contact-threshold=300 \
    -server-stabilization-time=10 \
    -cleanup-dead-servers=true \
    -min-quorum=3
```

See the Integrated Storage Autopilot tutorial to learn more.

Clean up

The cluster.sh script provides a clean operation that removes all services, configuration, and modifications to your local system.

Clean up your local workstation.

```
$ ./cluster.sh clean

Found 1 Vault service(s) matching that name
[vault_1] stopping

...snip...

Removing log file /git/learn-vault-raft/raft-autopilot/local/vault_5.log
Removing log file /git/learn-vault-raft/raft-autopilot/local/vault_6.log
Clean complete
```

Help and Reference

For additional information, refer to the following tutorials and documentation.

Integrated storage autopilot

Automate upgrades with Vault Enterprise