In a previous article[1] we discussed Dirk, Attestant’s distributed remote keymanager. Dirk provides users with high levels of security and availability for validator keys; however, it needs to be integrated into a staking infrastructure for them to obtain its benefits. At the same time, as users start to build their own staking infrastructures, they find that switching between beacon nodes (a necessary function to ensure service resiliency and client diversity) is difficult to achieve.
This article introduces Vouch, a multi-node validator client built by Attestant.
Before proceeding, it is worth taking a minute to explain the difference between a validator and a validator client as this is a common cause of confusion. A validator is a virtual entity on the beacon chain. It has an activation time, a balance, and various other properties. A validator client is a piece of software with the ability to act on behalf of a validator, usually through access to the validator’s private key. A single validator client can act on behalf of many validators. Sometimes the terms are conflated: a common statement is that “validator \(x\) proposed a block” when a more accurate statement is “a validator client proposed a block on behalf of validator \(x\)”. In general, the former is used as shorthand for the latter; however, the distinction is important when discussing validators and validator clients explicitly. This article uses each term according to its formal definition.
What does a validator client do?
Validator clients are a critical component of the Ethereum 2 beacon chain. The validator client is responsible for the process of proposing new blocks for the chain, as well as attesting to blocks produced by other validator clients in order to establish their authenticity and accuracy[2].
As can be seen from the diagram, the validator client sits between the beacon node and the signer. Taking the block proposal part of its responsibilities as an example, it:
- obtains information from the beacon node to decide when it should propose new blocks;
- fetches the components of the block from the beacon node at the appropriate time;
- provides details of the block to the signer for signing; and
- presents the signed block to the beacon node for it to broadcast to the network.
It carries out a similar process when attesting to blocks proposed by other validators. Whereas the beacon node knows things, the validator client does things, and the Ethereum 2 chain cannot operate without them.
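The four block proposal steps listed above can be sketched in Go as follows. This is a minimal illustration rather than Vouch's actual code: the `BeaconNode` and `Signer` interfaces, their method names and the `Block` type are all hypothetical, invented here to show the shape of the flow.

```go
package main

import (
	"errors"
	"fmt"
)

// Block is a simplified, hypothetical stand-in for a beacon block.
type Block struct {
	Slot      uint64
	Body      string
	Signature string
}

// BeaconNode lists the calls this sketch assumes a beacon node offers.
type BeaconNode interface {
	ProposerForSlot(slot uint64) (uint64, error)
	BlockForSlot(slot uint64) (*Block, error)
	SubmitBlock(b *Block) error
}

// Signer is an assumed interface for a remote signer such as Dirk.
type Signer interface {
	SignBlock(validator uint64, b *Block) (string, error)
}

// ProposeIfDue walks through the four steps for one validator and slot.
// It reports whether a block was proposed.
func ProposeIfDue(node BeaconNode, signer Signer, validator, slot uint64) (bool, error) {
	// 1. Ask the node whether this validator is due to propose.
	proposer, err := node.ProposerForSlot(slot)
	if err != nil {
		return false, fmt.Errorf("obtain duty: %w", err)
	}
	if proposer != validator {
		return false, nil
	}
	// 2. Fetch the block contents from the node.
	block, err := node.BlockForSlot(slot)
	if err != nil {
		return false, fmt.Errorf("fetch block: %w", err)
	}
	if block == nil {
		return false, errors.New("fetch block: node returned no block")
	}
	// 3. Hand the block to the signer.
	sig, err := signer.SignBlock(validator, block)
	if err != nil {
		return false, fmt.Errorf("sign block: %w", err)
	}
	block.Signature = sig
	// 4. Return the signed block to the node for broadcast.
	if err := node.SubmitBlock(block); err != nil {
		return false, fmt.Errorf("submit block: %w", err)
	}
	return true, nil
}

// fakeNode and fakeSigner are in-memory test doubles for the demonstration.
type fakeNode struct {
	proposer  uint64
	submitted []*Block
}

func (n *fakeNode) ProposerForSlot(slot uint64) (uint64, error) { return n.proposer, nil }
func (n *fakeNode) BlockForSlot(slot uint64) (*Block, error) {
	return &Block{Slot: slot, Body: "attestations"}, nil
}
func (n *fakeNode) SubmitBlock(b *Block) error {
	n.submitted = append(n.submitted, b)
	return nil
}

type fakeSigner struct{}

func (fakeSigner) SignBlock(validator uint64, b *Block) (string, error) {
	return fmt.Sprintf("sig(%d,%d)", validator, b.Slot), nil
}

func main() {
	node := &fakeNode{proposer: 7}
	proposed, err := ProposeIfDue(node, fakeSigner{}, 7, 100)
	fmt.Println(proposed, err, len(node.submitted))
}
```

Note that the private key never appears in the validator client's code path: the sketch only passes block details to the signer and receives a signature back.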
Why use a dedicated validator client?
The various teams that have built beacon nodes (Lighthouse, Teku, Prysm etc.) have also built validator clients. These clients are designed in tandem with the beacon nodes, and are built by people with the deepest understanding of the Ethereum 2 protocol and operations. So why consider a dedicated validator client, such as Vouch, over one provided by the client teams?
As mentioned in the article for Dirk, the validator client and signer functions commonly reside in the same codebase. Attestant provides two separate products, Vouch for the validator client and Dirk for the signer, resulting in lower complexity in each of the products, a cleaner architecture, and the ability to separate security domains.
The Ethereum 2 network has benefited massively from having multiple independent implementations of the beacon node that communicate with each other. However, validator clients built by beacon node teams only need to talk to their respective beacon nodes, with which they often share the same codebase. This closeness of the validator client to the beacon node in many implementations results in inevitable overlap: configuration directories and files are shared, assumptions are made about APIs and their return values, and in the most extreme cases errors can propagate through the beacon node and validator client without checks in place. An example of this was the recent Medalla issue, where both the Prysm beacon node and Prysm validator client were affected by the same clock skew issue. If the codebases had been separate there is every chance that the difference in the times between the two systems would have been caught, resulting in a more robust response to the problem.
Another side-effect of the overlap is that configuration and operation information can be difficult to separate. For example, movement of validators from one validator client to another can involve a combination of copying files, deleting database entries, running commands, and similar. Such processes are time-consuming and error-prone, and make flexible architectures harder to achieve. A separate dedicated validator client will not have such dependencies, and is far more of a self-contained unit.
The closeness of implementation also commonly results in trust assumptions between validator clients and beacon nodes. For example, a beacon node may verify the information it obtains from its peers and so the validator client treats information it receives from the beacon node as “already verified”. Although this is a valid position to take as far as the client teams are concerned, it does mean that a successful attack on the beacon node can “infect” the validator client with bad information that can result in the validator client missing or carrying out incorrect operations.
Some users may consider that these areas are not of major concern, or that they are happy with using a beacon node and validator client from the same team due to the fact that these are the common methods of deployment, but for those with large stakes or who are running a service for others, a multi-node dedicated validator client such as Vouch will allow them to achieve the highest levels of security and availability.
Principles of operation
When designing software it is important to understand its focus: what is this software trying to achieve? In Vouch’s case the following principles of operation apply:
- security: no assumed trust of the data provided by the beacon node
- performance: managing operations in a timely fashion
- availability: allowing multiple instances to run concurrently; accessing multiple beacon nodes for maximum uptime
- visibility: providing high levels of visibility into the result and performance of operations undertaken
Similar to Dirk, these principles guide the inclusion (or not) of the features and functions of Vouch.
What is Vouch?
Vouch is a validator client. It is designed to work in tandem with Dirk to provide the highest levels of security and availability for staking infrastructures.
Vouch needs to be as lightweight as possible: its job should be to work out what to do and when to do it, but to hand off as much of the work as possible to the beacon nodes and signers. By reducing the amount of code in Vouch the potential for failure is reduced, and it also benefits from advances in beacon node performance and stability.
Vouch works with multiple beacon node implementations: it can operate with Prysm, Lighthouse and Teku today, and will support other beacon nodes over time.
Swapping from one beacon node to another simply requires pointing Vouch at a different beacon node. The user need not worry about movement of keys, multiple instances running at the same time, and other manual operations.
No trust assumptions
Vouch plays its part in securing validator funds by providing its own layer of protection. It is strict in terms of validating the information it receives, ensuring that erroneous or anomalous data is caught and rejected early rather than creating incorrect attestations or proposals.
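As an illustration of what validating data early can look like, the sketch below checks a block returned by a beacon node against what was asked for before it is passed on for signing. The specific checks (matching slot, matching proposer, 32-byte parent root), the `ProposedBlock` type and the function name are assumptions chosen for the example; the checks Vouch actually performs are not described here.

```go
package main

import (
	"errors"
	"fmt"
)

// ProposedBlock holds the few fields this sketch inspects (hypothetical type).
type ProposedBlock struct {
	Slot          uint64
	ProposerIndex uint64
	ParentRoot    []byte
}

var (
	ErrNilBlock      = errors.New("no block returned")
	ErrWrongSlot     = errors.New("block is for a different slot")
	ErrWrongProposer = errors.New("block is for a different proposer")
	ErrBadParentRoot = errors.New("parent root is not 32 bytes")
)

// ValidateProposal rejects a block that does not match the request.
// Cases are evaluated in order, so b is only dereferenced once it is known to be non-nil.
func ValidateProposal(b *ProposedBlock, wantSlot, wantProposer uint64) error {
	switch {
	case b == nil:
		return ErrNilBlock
	case b.Slot != wantSlot:
		return fmt.Errorf("%w: got %d, want %d", ErrWrongSlot, b.Slot, wantSlot)
	case b.ProposerIndex != wantProposer:
		return fmt.Errorf("%w: got %d, want %d", ErrWrongProposer, b.ProposerIndex, wantProposer)
	case len(b.ParentRoot) != 32:
		return fmt.Errorf("%w: got %d bytes", ErrBadParentRoot, len(b.ParentRoot))
	}
	return nil
}

func main() {
	stale := &ProposedBlock{Slot: 99, ProposerIndex: 7, ParentRoot: make([]byte, 32)}
	err := ValidateProposal(stale, 100, 7)
	fmt.Println(errors.Is(err, ErrWrongSlot), err)
}
```

Rejecting the block at this point means a bad response from one node results in no signature being requested for it, rather than an incorrect proposal.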
Vouch’s ability to work with multiple nodes overcomes an issue unearthed in another recent article that evaluated the performance of beacon nodes in proposing blocks. Its conclusion was that any of the beacon nodes examined would be a good choice, but there was no single obvious “best” choice. A better answer would be “don’t choose”: use all of the beacon nodes to ensure diversity and increase reliability.
Vouch is designed to work with multiple beacon nodes using strategies. A strategy allows Vouch intelligently to decide which actions to take in which situations, in order to provide the optimum validating service. A simple example of this can be seen if we look at obtaining a block to sign. A traditional infrastructure would look roughly like this:
However, there are multiple potential failure points here because in this example Vouch is linked to only a single beacon node. The node could be out-of-date with respect to the current state of the chain. It may have been shut down for an upgrade, or be carrying out some other task that means it isn’t able to provide the requested block. Perhaps it’s under attack. Regardless, the beacon node is a single point of failure when it comes to requesting information. Even if the beacon node is working perfectly, it may have missed critical information on the network that means the block it will propose is suboptimal in terms of its usefulness to the network (and hence its rewards).
Vouch uses a multi-node design, where it works with multiple nodes at the same time:
Here, Vouch has a connection to three separate beacon nodes. These may be beacon nodes from three different teams, three instances of the same beacon node in different geographical regions, or any mix. When it is time to propose a block, Vouch will talk to all three beacon nodes to request a block from each. The strategy is then to consider each of the blocks, and select one of them based on the strategy’s own internal metrics, weighing up such items as the number of attestations included, their inclusion distance, etc. By carefully selecting the “best” block each time it is asked to propose, Vouch ensures that an individual beacon node providing non-optimal blocks does not result in the validator losing out on rewards.
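A selection of this kind might be sketched as below. The scoring formula is an assumption made for illustration: each included attestation contributes the reciprocal of its inclusion distance, so more attestations and fresher attestations both raise the score. The types and function names are hypothetical, and the metrics a real strategy uses may differ.

```go
package main

import "fmt"

// Attestation records only the slot being attested to (simplified).
type Attestation struct {
	Slot uint64
}

// Candidate is a block offered by one beacon node (hypothetical type).
type Candidate struct {
	Source       string
	Slot         uint64
	Attestations []Attestation
}

// Score sums 1/distance over the attestations in the block, where distance
// is the number of slots between the attestation and the block.
func Score(c Candidate) float64 {
	score := 0.0
	for _, a := range c.Attestations {
		if a.Slot >= c.Slot {
			continue // an attestation cannot be included at or before its own slot
		}
		score += 1.0 / float64(c.Slot-a.Slot)
	}
	return score
}

// Best returns the highest-scoring candidate; ties go to the earliest in the list.
// The boolean is false when there are no candidates.
func Best(candidates []Candidate) (Candidate, bool) {
	if len(candidates) == 0 {
		return Candidate{}, false
	}
	best, bestScore := candidates[0], Score(candidates[0])
	for _, c := range candidates[1:] {
		if s := Score(c); s > bestScore {
			best, bestScore = c, s
		}
	}
	return best, true
}

func main() {
	candidates := []Candidate{
		{Source: "node1", Slot: 10, Attestations: []Attestation{{Slot: 9}, {Slot: 9}}},
		{Source: "node2", Slot: 10, Attestations: []Attestation{{Slot: 9}, {Slot: 8}, {Slot: 7}}},
	}
	best, _ := Best(candidates)
	fmt.Println(best.Source, Score(best))
}
```

In this example the block from node1 wins despite holding fewer attestations, because under the assumed formula its attestations are fresher.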
This strategy also provides much higher levels of availability. If a beacon node becomes unavailable for any reason, Vouch is able to use the information from the remaining available beacon nodes:
Here, although beacon node 1 is unavailable Vouch can obtain blocks from beacon nodes 2 and 3 and proceed as above. As and when beacon node 1 becomes available once more it will again be queried for blocks.
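The availability behaviour described above can be sketched as a concurrent fan-out that simply leaves out any node that fails or does not answer in time. The `Fetcher` signature, the helper names and the 50 ms timeout are assumptions for the example, not Vouch's implementation.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
	"time"
)

// Fetcher requests a block for a slot from one beacon node (hypothetical signature).
// Implementations are expected to return promptly once ctx is cancelled.
type Fetcher func(ctx context.Context, slot uint64) (string, error)

// FetchAll queries every node concurrently and returns the blocks from
// those that answered without error before the timeout.
func FetchAll(parent context.Context, nodes map[string]Fetcher, slot uint64, timeout time.Duration) map[string]string {
	ctx, cancel := context.WithTimeout(parent, timeout)
	defer cancel()

	var (
		mu      sync.Mutex
		wg      sync.WaitGroup
		results = make(map[string]string)
	)
	for name, fetch := range nodes {
		wg.Add(1)
		go func(name string, fetch Fetcher) {
			defer wg.Done()
			block, err := fetch(ctx, slot)
			if err != nil {
				return // unavailable or slow node: carry on without it
			}
			mu.Lock()
			results[name] = block
			mu.Unlock()
		}(name, fetch)
	}
	wg.Wait()
	return results
}

// healthy returns a Fetcher that answers immediately.
func healthy(name string) Fetcher {
	return func(ctx context.Context, slot uint64) (string, error) {
		return fmt.Sprintf("block %d from %s", slot, name), nil
	}
}

// down simulates a node that refuses connections.
func down(ctx context.Context, slot uint64) (string, error) {
	return "", errors.New("connection refused")
}

// stalled simulates a node that never answers; it gives up when ctx ends.
func stalled(ctx context.Context, slot uint64) (string, error) {
	<-ctx.Done()
	return "", ctx.Err()
}

// runDemo queries three nodes, with node1 behaving as supplied.
func runDemo(node1 Fetcher) map[string]string {
	nodes := map[string]Fetcher{
		"node1": node1,
		"node2": healthy("node2"),
		"node3": healthy("node3"),
	}
	return FetchAll(context.Background(), nodes, 100, 50*time.Millisecond)
}

func main() {
	for name, block := range runDemo(down) {
		fmt.Println(name, "->", block)
	}
}
```

Because every request is made afresh for each slot, a node that recovers is picked up again automatically the next time blocks are requested.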
Vouch’s strategies are modular: multiple strategies can be provided for different situations, and the user can select between them with configuration options. Equally, strategies can easily be removed in favor of talking to a single beacon node if the user prefers. Strategies apply to connections to beacon nodes, selection of blocks to propose, submissions of signed blocks and attestations, and more. Strategies allow Vouch to be extensible, evolving with the mainnet.
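One way to picture modular, configurable strategies is an interface with several implementations selected by name. The sketch below assumes two strategy names, "first" and "best", and a simplified `Proposal` type; these are hypothetical and are not Vouch's real configuration values.

```go
package main

import (
	"errors"
	"fmt"
)

// Proposal is a simplified block offered by one beacon node (hypothetical type).
type Proposal struct {
	Node         string
	Attestations int
}

// Strategy picks the proposal to sign from those offered.
type Strategy interface {
	Choose(offers []Proposal) (Proposal, error)
}

var errNoOffers = errors.New("no proposals offered")

// firstStrategy uses whatever the first node returned, as in a single-node setup.
type firstStrategy struct{}

func (firstStrategy) Choose(offers []Proposal) (Proposal, error) {
	if len(offers) == 0 {
		return Proposal{}, errNoOffers
	}
	return offers[0], nil
}

// bestStrategy prefers the proposal containing the most attestations.
type bestStrategy struct{}

func (bestStrategy) Choose(offers []Proposal) (Proposal, error) {
	if len(offers) == 0 {
		return Proposal{}, errNoOffers
	}
	best := offers[0]
	for _, o := range offers[1:] {
		if o.Attestations > best.Attestations {
			best = o
		}
	}
	return best, nil
}

// strategies maps a configuration value to an implementation.
var strategies = map[string]Strategy{
	"first": firstStrategy{},
	"best":  bestStrategy{},
}

// StrategyFor looks up the strategy named in configuration.
func StrategyFor(name string) (Strategy, error) {
	s, ok := strategies[name]
	if !ok {
		return nil, fmt.Errorf("unknown strategy %q", name)
	}
	return s, nil
}

func main() {
	offers := []Proposal{{Node: "node1", Attestations: 40}, {Node: "node2", Attestations: 64}}
	for _, name := range []string{"first", "best"} {
		s, err := StrategyFor(name)
		if err != nil {
			fmt.Println(err)
			continue
		}
		chosen, _ := s.Choose(offers)
		fmt.Println(name, "->", chosen.Node)
	}
}
```

Adding a strategy in this arrangement means adding one type and one map entry, leaving the calling code unchanged.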
Designed for production
Vouch is designed to run in a production environment, and as such it has been designed to make life as easy as possible for those operating it.
All configuration options are available through environment variables, command-line flags or a dedicated configuration file.
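A minimal sketch of resolving one setting from several sources follows. The precedence shown (command-line flag, then environment variable, then configuration file, then default) is an assumption, as are the `log-level` key and the `DEMO_` environment prefix; none of these are taken from Vouch's documentation.

```go
package main

import (
	"flag"
	"fmt"
	"os"
	"strings"
)

// Source looks up a setting by key and reports whether it was set.
type Source func(key string) (string, bool)

// Resolve returns the value from the first source that has the key,
// falling back to the supplied default.
func Resolve(key, fallback string, sources ...Source) string {
	for _, src := range sources {
		if v, ok := src(key); ok {
			return v
		}
	}
	return fallback
}

// FromMap adapts a map (for example a parsed configuration file) to a Source.
func FromMap(m map[string]string) Source {
	return func(key string) (string, bool) {
		v, ok := m[key]
		return v, ok
	}
}

// FromEnv reads PREFIX + KEY, upper-cased with dashes turned into underscores.
func FromEnv(prefix string) Source {
	return func(key string) (string, bool) {
		name := prefix + strings.ToUpper(strings.ReplaceAll(key, "-", "_"))
		return os.LookupEnv(name)
	}
}

// FromFlags exposes only those flags that were explicitly set on the command line.
func FromFlags(fs *flag.FlagSet) Source {
	set := map[string]string{}
	fs.Visit(func(f *flag.Flag) { set[f.Name] = f.Value.String() })
	return FromMap(set)
}

func main() {
	fs := flag.NewFlagSet("demo", flag.ContinueOnError)
	fs.String("log-level", "", "logging level")
	if err := fs.Parse(os.Args[1:]); err != nil {
		os.Exit(2)
	}
	// Stand-in for the contents of a configuration file.
	file := map[string]string{"log-level": "info"}

	level := Resolve("log-level", "warn", FromFlags(fs), FromEnv("DEMO_"), FromMap(file))
	fmt.Println("log-level =", level)
}
```

Running the program with `-log-level debug`, or with `DEMO_LOG_LEVEL` set in the environment, overrides the value from the stand-in file.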
Vouch identifies itself to Dirk through use of a certificate[3]. Because this certificate can be considered confidential, Vouch supports the use of Majordomo to store the certificates remotely, for example in Google Secret Manager or AWS Secrets Manager. Central management of these certificates brings additional security and allows them to be revoked or altered easily and remotely.
Again following in the footsteps of Dirk, Vouch provides metrics focused on providing the information that an operations department needs to know. They provide a focus on activity and performance, tracking each of the activities that Vouch performs and presenting the aggregate metrics in the standard Prometheus format. The metrics have been designed to allow you to see easily how well Vouch is performing at its assigned tasks, resulting in the ability to provide operations dashboards and alerting systems that zero in on the information you need to ensure your validating infrastructure is operating correctly and in a timely fashion.
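To give a feel for activity metrics in the Prometheus text format, the sketch below counts the outcomes of one activity and renders them as a labelled counter. The metric name `demo_block_proposals_total` and the `result` label are invented for the example and are not Vouch's actual metric names; a production system would more likely use an established Prometheus client library than hand-rendered text.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"sync"
)

// Counter counts the outcomes of one kind of activity, keyed by result.
type Counter struct {
	mu     sync.Mutex
	name   string
	help   string
	counts map[string]uint64
}

// NewCounter creates a counter with the given metric name and help text.
func NewCounter(name, help string) *Counter {
	return &Counter{name: name, help: help, counts: make(map[string]uint64)}
}

// Inc records one more occurrence of the given result.
func (c *Counter) Inc(result string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.counts[result]++
}

// Render returns the counter in Prometheus text exposition format,
// with results in alphabetical order so the output is stable.
// Label values are assumed to be simple strings needing no special escaping.
func (c *Counter) Render() string {
	c.mu.Lock()
	defer c.mu.Unlock()

	results := make([]string, 0, len(c.counts))
	for r := range c.counts {
		results = append(results, r)
	}
	sort.Strings(results)

	var b strings.Builder
	fmt.Fprintf(&b, "# HELP %s %s\n", c.name, c.help)
	fmt.Fprintf(&b, "# TYPE %s counter\n", c.name)
	for _, r := range results {
		fmt.Fprintf(&b, "%s{result=%q} %d\n", c.name, r, c.counts[r])
	}
	return b.String()
}

func main() {
	proposals := NewCounter("demo_block_proposals_total", "Block proposals by result.")
	proposals.Inc("succeeded")
	proposals.Inc("failed")
	proposals.Inc("succeeded")
	fmt.Print(proposals.Render())
}
```

A dashboard or alert can then be built on the ratio of failed to succeeded operations rather than on raw log output.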
A strict release methodology is a great help when attempting to run a production infrastructure. Vouch will have a clear separation between patch releases and feature releases, to avoid situations where upgrading to a new release to patch a bug also involves configuration changes, API alterations, and other items that make what should be a quick fix turn into a major operational headache.
This methodology follows semantic versioning, and ensures that users can separate out bug fixes from functional upgrades. It uses the same methodology as Dirk, to ensure consistency across products.
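Under semantic versioning, the kind of release can be read directly from which part of the version number changes. The sketch below classifies an upgrade on that basis; it handles plain MAJOR.MINOR.PATCH strings only (no pre-release or build suffixes), and the version numbers used are made up for illustration.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Version is a parsed MAJOR.MINOR.PATCH version.
type Version struct {
	Major, Minor, Patch int
}

// Parse reads a version such as "1.0.3" or "v1.0.3".
func Parse(s string) (Version, error) {
	parts := strings.Split(strings.TrimPrefix(s, "v"), ".")
	if len(parts) != 3 {
		return Version{}, fmt.Errorf("version %q is not MAJOR.MINOR.PATCH", s)
	}
	var nums [3]int
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil || n < 0 {
			return Version{}, fmt.Errorf("version %q has invalid component %q", s, p)
		}
		nums[i] = n
	}
	return Version{Major: nums[0], Minor: nums[1], Patch: nums[2]}, nil
}

// UpgradeKind classifies a move between two versions:
// "patch" for bug fixes only, "feature" for new functionality,
// "major" for potentially incompatible changes, "none" if identical.
func UpgradeKind(from, to Version) string {
	switch {
	case to.Major != from.Major:
		return "major"
	case to.Minor != from.Minor:
		return "feature"
	case to.Patch != from.Patch:
		return "patch"
	default:
		return "none"
	}
}

func main() {
	current, _ := Parse("1.0.3")
	for _, candidate := range []string{"1.0.4", "1.1.0", "2.0.0"} {
		next, err := Parse(candidate)
		if err != nil {
			fmt.Println(err)
			continue
		}
		fmt.Println(candidate, "is a", UpgradeKind(current, next), "upgrade")
	}
}
```

An operator could use a check like this to allow patch upgrades to roll out quickly while holding feature releases for a scheduled review.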
Vouch is available now on the Attestant GitHub site, where further information is available about its release status as well as instructions on how to install and use it in various configurations. It is available under a permissive open source license, allowing use in a wide variety of environments. The community for Vouch is in the #vouch channel of the Attestant Discord. Vouch is in a less mature state than Dirk, with a number of features under active development; however, it is expected to reach version 1.0 before mainnet launch.
Vouch provides a multi-node validator client with a focus on security and availability. Together with Dirk it forms the basis of Attestant’s validating infrastructure: we will continue to provide updates and support for these products up to and beyond mainnet launch.
[1] Which should ideally be read prior to reading this article.
[2] It is also responsible for aggregating, or verifying the aggregation of, attestations to avoid blocks from increasing significantly in size as the number of validators increases.
[3] See the article on Dirk for more information about certificates.