Feature Gate Spotlight: Timely Vote Credits

Written By

Justin Starry

November 25, 2024

Summary

Timely Vote Credits (TVC) is a new feature designed and approved by Solana validator operators to incentivize validators to submit consensus votes in a timely manner. To accomplish this goal, TVC awards extra vote "credits" to validators whose votes are included in blocks with relatively low latency. Earned vote credits are directly tied to the proportion of inflation rewards that are awarded to validators.

Motivation

Before the activation of this feature, validators earned a full vote credit for each correct vote as long as the vote was received within approximately 2.5 minutes after the voted block was produced. This led some validators to lag their consensus votes, waiting until a supermajority had confirmed a block before voting on it, in order to earn more vote credits. Lagging avoids the opportunity cost of voting on a block that is never confirmed. This behavior hurts consensus because as more validators lag their votes, confirmation latency gets worse. In the worst case, everyone is waiting for everyone else to make the first move and a supermajority can never be reached to confirm blocks.

Implementation

When recording votes, the vote program records the difference between each voted slot and the slot of the block in which the vote was recorded. This difference is the vote's slot latency. For every recorded vote that gets rooted, the amount of awarded vote credits is a function of slot latency. There is a grace period of 2 slots, meaning that votes with a slot latency of 2 slots or less receive the maximum of 16 credits. Each additional slot of latency results in a deduction of 1 vote credit, down to a minimum of 1 credit per rooted vote.
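The credit schedule described above can be sketched in a few lines. This is a minimal illustration built only from the numbers in this post; the function and constant names are hypothetical and are not taken from the vote program source.

```python
# Constants as described in this post; the names are illustrative only.
GRACE_SLOTS = 2    # latency at or below this earns the maximum
MAX_CREDITS = 16   # credits for a vote landed within the grace period
MIN_CREDITS = 1    # floor for any rooted vote


def credits_for_latency(latency_slots: int) -> int:
    """Credits earned by a rooted vote with the given slot latency."""
    # Assumption of this sketch: a vote always lands in a later block
    # than the one it votes on, so latency is at least 1 slot.
    if latency_slots < 1:
        raise ValueError("slot latency must be at least 1")
    slots_past_grace = max(0, latency_slots - GRACE_SLOTS)
    return max(MIN_CREDITS, MAX_CREDITS - slots_past_grace)


for latency in (1, 2, 3, 10, 17, 50):
    print(latency, credits_for_latency(latency))
```

Under this schedule a vote that lands 3 slots after the voted block earns 15 credits, and any vote with a latency of 17 slots or more earns the minimum of 1 credit.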

Aside: How Do Vote Credits Map to Inflation Rewards?

The goal of inflation reward distribution is to deliver rewards to each delegated stake account proportionally to how many credits the delegated vote account earned as well as the amount of actively delegated stake. So at the end of each epoch, the Solana protocol calculates "reward points" for each actively delegated stake account by multiplying the stake delegation amount by the amount of vote credits earned by their delegated vote account. Each epoch, the amount of total distributed inflation rewards is calculated by multiplying the latest inflation rate by the amount of active stake in the network. This amount is then distributed to validators in proportion to each active stake account's earned reward points.

If every validator earned the exact same number of vote credits, inflation rewards would be distributed proportionally to active stake. But even before activating TVC there has been quite a bit of vote credit variance across each cluster due to how accurately each validator votes on confirmed blocks. There are consensus rules against voting on competing forks, so if a validator votes on a fork that isn't confirmed, their incorrect vote causes them to miss the vote credits on the correct fork until they're allowed to switch back to the correct fork. With TVC, an extra dimension of vote latency is taken into consideration and the amount of vote credits earned for each vote depends on how soon after the voted block the vote was included. So for example, let's say Validator A votes slowly and often votes on the wrong fork while Validator B votes quickly and accurately. Validator B will earn proportionally more vote credits than Validator A, and so stake delegated to Validator B will receive a higher reward rate than stake delegated to Validator A.
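To make the Validator A versus Validator B comparison concrete, here is a simplified sketch of the reward point calculation described above. All names and numbers are made up for illustration, and the sketch ignores details such as validator commission and how the epoch's total reward amount is derived.

```python
def distribute_rewards(total_rewards, stake_accounts, vote_credits):
    """Split total_rewards across stake accounts by reward points.

    stake_accounts: list of (account_name, delegated_stake, vote_account)
    vote_credits:   dict of vote_account -> credits earned this epoch
    """
    # Reward points = delegated stake * credits earned by the vote account.
    points = {
        name: stake * vote_credits[vote_account]
        for name, stake, vote_account in stake_accounts
    }
    total_points = sum(points.values())
    if total_points == 0:
        return {name: 0 for name in points}
    # Integer division stands in for whatever rounding the protocol uses.
    return {
        name: total_rewards * account_points // total_points
        for name, account_points in points.items()
    }


# Hypothetical epoch: equal stake, but Validator B earned more credits.
vote_credits = {"validator_a": 1200, "validator_b": 1600}
stake_accounts = [
    ("stake_on_a", 100, "validator_a"),
    ("stake_on_b", 100, "validator_b"),
]
print(distribute_rewards(28, stake_accounts, vote_credits))
```

With equal stake, the account delegated to Validator B receives 16 of the 28 reward units and the account delegated to Validator A receives 12, mirroring the 1600 to 1200 ratio of vote credits.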

Expectations

Faster Confirmation Times

Once a block has been fully produced, validators have less than a second to receive, validate, and vote on the produced block to earn full vote credits.

This has a few implications:

  1. Validators have to validate produced blocks very quickly

Validators running on slow hardware will struggle to validate blocks fast enough to vote with low enough latency to earn full vote credits.

Validators who optimize block validation by improving transaction processing scheduling algorithms will be able to submit their votes sooner.

  2. Validators can no longer lag their votes for multiple slots

Some validators have been using different voting behavior modifications that lag their votes to ensure that they earn optimal vote credits. This strategy will no longer be viable once TVC is enabled if the lagging would cause their votes to have latency that is too high.

  3. Validators need to ensure their votes are delivered

It's not always straightforward to know which leader is currently producing the next block that will be confirmed by consensus, especially during times of forking.

Vote transactions are primarily delivered over UDP to the current leader. Packet loss could therefore impact latency.

Concerns

Geographic Stake Centralization

Some validators located outside of North America and Europe will likely decide to move their servers closer to areas where stake is already concentrated to decrease network latency. This is obviously far from ideal. Core developers are considering reducing this centralization risk by increasing the slot latency grace period beyond the current configuration of 2 slots to give validators in sparse stake weight regions more time to send their votes.

Validator Stake Centralization

Validators are incentivized to include votes for higher staked nodes in their blocks first to increase the stake weight of their fork. This means that higher staked nodes will tend to have lower vote latency than other nodes. Again, this is the reason we have a grace period for vote inclusion.

Voting Censorship

Once TVC is enabled, block producers may be tempted to censor or delay votes to decrease the vote credits earned by their competitors. This temptation shouldn't be a big issue for several reasons. First, any leader who completely censors votes from being included in their blocks is forgoing the transaction fees for those vote transactions. Second, any leader who consistently delays votes from particular validators can be publicly shamed or retaliated against because this behavior is publicly observable onchain. Lastly, the actual impact of targeted censorship is fairly minimal. Votes can earn up to 16 credits for low latency inclusion and there is a grace period of 2 slots. So even if a leader censors a validator's votes for a few slots, the censored validator loses only a small portion of the total possible credits for those votes.
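As a rough worked example of that last point, the sketch below estimates the share of credits a single vote loses when its inclusion is delayed. The helper names are hypothetical, and the scenario (a vote that would otherwise land with a latency of 1 slot being held back for 4 slots, the length of one leader's consecutive slots) is an assumption chosen for illustration.

```python
def credits_for_latency(latency_slots, grace=2, max_credits=16):
    """Credit schedule as described in this post (illustrative helper)."""
    return max(1, max_credits - max(0, latency_slots - grace))


def fraction_lost(baseline_latency, delay_slots):
    """Share of credits lost on one vote delayed by delay_slots."""
    best = credits_for_latency(baseline_latency)
    earned = credits_for_latency(baseline_latency + delay_slots)
    return (best - earned) / best


# A vote held back for 4 slots lands with latency 5 and earns 13 of 16.
print(fraction_lost(baseline_latency=1, delay_slots=4))
# A 1-slot delay is fully absorbed by the grace period.
print(fraction_lost(baseline_latency=1, delay_slots=1))
```

In this scenario the delayed vote still earns 13 of 16 credits, a loss of under 19% on that one vote, and only for the slots during which the censoring leader is producing blocks.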

How Can Validators Track Vote Latency?

The Vortex dashboard has features for analyzing the timely vote credits earned by each validator.

How Can Validators Improve Vote Latency?

Improve Replay Times
  1. Experiment with block verification methods.

For example, the new "unified scheduler" is available for use on Agave v2 and can be enabled via the following CLI parameter:

agave-validator --block-verification-method unified-scheduler

  2. Upgrade and configure hardware

Solana validator operators should ensure that they are running sufficiently performant hardware and that their validators are configured according to the latest recommendations.

  3. Use the default full snapshot interval

The Agave validator client has degraded block production and validation during full snapshot creation. With the default interval this affects all validators at the same time. If a validator uses a custom interval, their voting could be degraded while the rest of the cluster is validating and voting on blocks at normal speeds. Validators with a custom full snapshot interval should consider removing the following CLI parameter:

agave-validator --full-snapshot-interval-slots XXXX

Improve Vote Delivery
  1. Votes are sent over UDP and therefore do not receive acks

Validators may want to consider resending votes every 100ms to ensure they weren't dropped. There is no Agave configuration parameter for this yet. PRs welcome!

  2. Send votes to the current leader

Votes are ideally sent to the current leader but that's not always straightforward. Agave currently delivers votes by fanning them out over the next few expected leader slots. This behavior might not be sufficient. If validators find a more effective strategy or can collect some metrics on this, PRs are welcome!
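The two delivery ideas above, fanning a vote out to upcoming leaders and resending it on a fixed interval, can be sketched together. This is a hypothetical illustration and not how Agave is implemented: the function names, the leader schedule shape, and the retry parameters are all assumptions. It also assumes that delivering the same vote transaction more than once is harmless.

```python
import time
from typing import Callable, Dict, List


def upcoming_leaders(schedule: Dict[int, str], current_slot: int,
                     lookahead_slots: int) -> List[str]:
    """Unique leaders for the next lookahead_slots slots, in order."""
    leaders: List[str] = []
    for slot in range(current_slot, current_slot + lookahead_slots):
        leader = schedule.get(slot)
        if leader is not None and leader not in leaders:
            leaders.append(leader)
    return leaders


def deliver_vote(send: Callable[[bytes, str], None], vote_bytes: bytes,
                 targets: List[str], attempts: int = 3,
                 interval_s: float = 0.1,
                 sleep: Callable[[float], None] = time.sleep) -> int:
    """Send vote_bytes to every target, repeating every interval_s.

    UDP gives no acks, so the sender cannot tell whether a datagram
    arrived; resending trades bandwidth for a better delivery chance.
    Returns the number of datagrams handed to send().
    """
    sent = 0
    for attempt in range(attempts):
        for target in targets:
            send(vote_bytes, target)
            sent += 1
        if attempt + 1 < attempts:
            sleep(interval_s)  # wait before the next round of resends
    return sent
```

With a real UDP socket, `send` could be something like `lambda data, addr: sock.sendto(data, addr)` with `(host, port)` tuples as targets. Passing `send` and `sleep` in as parameters keeps the sketch easy to exercise without touching the network.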

Aside: How Do Vote Mods Earn More Vote Credits?

Several different categories of modified voting behaviors have been observed on mainnet. But before discussing those modifications, the Anza core dev team would like to warn validator operators against running unaudited code on mainnet. This has the potential to cause unrecoverable failures in their validators or lead to slashable actions.

Vote Lagging

The first category is vote lagging. Rather than using the default behavior of attempting to vote on the most-likely-to-be-confirmed fork, some validators have implemented mods that simply delay voting until the rest of the cluster has already confirmed a block. The benefit of this strategy is obvious: such validators never vote on an incorrect block and so never miss out on vote credits for that reason. But the drawback is clear too: if enough validators lag their votes, eventually everyone will be waiting to see what everyone else will vote on. Confirmation times then get much worse, which can even result in the cluster failing to reach consensus on any new blocks. This modification strategy is clearly bad for the Solana protocol and is why TVC was implemented. TVC is designed to penalize vote laggers by awarding proportionally fewer vote credits to votes that were received with a delay.

Vote Backfilling

The second category of modifications is vote backfilling. If for some reason a validator stops voting for a while or falls behind the cluster, they may skip voting on quite a few blocks until they catch up. If a validator modifies their voting behavior, they can attempt to vote on past blocks that were already confirmed by the cluster in order to earn more credits. The stock Agave validator behavior is not optimized to do this sort of backfilling because it doesn't really improve the health of consensus on a cluster. By default, the Agave validator creates vote transactions with the "recent blockhash" set to the blockhash of the block they are voting on. Even if they are voting on a block which is recent enough to earn vote credits, the blockhash could be old enough that the transaction is dropped. In this type of situation, modified validators that fetch a recent blockhash have an opportunity to earn some vote credits that they otherwise would have missed. After TVC is activated, these extra vote credits will be pretty negligible but still offer a slight benefit to validators that implement this mod. This category of modification has a roughly neutral effect on cluster consensus health because block finalization doesn't require validators to vote on each block in a fork: a vote on one block of a fork is effectively a vote on all ancestor blocks as well.

Vote Lockout Adjustments

The third category of modifications is vote lockout adjustments. To understand this class of modification, one must first understand how the "tower" data structure in Solana's Tower BFT algorithm works. Every time a vote is recorded in a validator's local tower structure, any "expired" votes are removed from the tower before pushing the new vote and these expired votes do not earn any credits. This behavior comes from the mechanism used to force validators to wait until votes on an incorrect fork expire before they're allowed to start voting on a competing fork. Vote recording on-chain is done essentially the same way as tower vote updates despite it being impossible to process a vote for a competing fork in a block produced on a separate fork. In order to avoid votes expiring, validators can choose to increase the lockout expiration time of their previous votes. By doing so, they are increasing their commitment to that fork in order to avoid missing out on credits.
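The expiry behavior described above can be illustrated with a simplified model of the tower. This sketch assumes the commonly described Tower BFT scheme in which a vote's lockout doubles with each confirmation; the class and function names are illustrative, and rooting of the oldest vote once the tower is full is left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TowerVote:
    slot: int
    confirmation_count: int = 1

    def last_locked_out_slot(self) -> int:
        # Assumed scheme: lockout of 2**confirmation_count slots.
        return self.slot + 2 ** self.confirmation_count


def record_vote(tower: List[TowerVote], slot: int) -> List[TowerVote]:
    """Record a vote on `slot`; return the votes that expired."""
    expired = []
    # Pop votes whose lockout ended before the new slot. In the model
    # described in this post, these votes earn no credits.
    while tower and tower[-1].last_locked_out_slot() < slot:
        expired.append(tower.pop())
    tower.append(TowerVote(slot))
    # Deepen the commitment of older votes now buried under newer ones.
    depth = len(tower)
    for index, vote in enumerate(tower):
        if depth > index + vote.confirmation_count:
            vote.confirmation_count += 1
    return expired


tower: List[TowerVote] = []
for slot in (1, 2, 3):
    record_vote(tower, slot)
# Skipping ahead to slot 9 expires the votes on slots 3 and 2.
expired = record_vote(tower, 9)
print([vote.slot for vote in expired])
```

In this model, the votes on slots 2 and 3 are dropped because their lockouts ended before slot 9, while the vote on slot 1 survives because it had accumulated a longer lockout. A longer lockout on the previous votes would have kept them in the tower, which is the effect the lockout adjustment mods aim for.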

Vote Accuracy

Lastly, we have modifications for vote accuracy. There are likely multiple strategies in use by different competing validators, but the basic idea is that validators attempt to make a more informed decision about how likely a block is to be confirmed. They could assign scores to different leaders and predict which slot leaders are likely to have their blocks skipped or confirmed based on the scores of the current and following block leaders. These types of modifications are likely (but not necessarily) helpful rather than harmful to consensus health as long as they don't involve lagging. Ideally some of these strategies are eventually upstreamed into the default voting heuristics to improve vote accuracy across the cluster.

Upcoming Protocol Improvements

The Anza core dev team is also actively working on protocol improvements to help validators reduce their vote latency. Here are a few of the upcoming improvements that will be applied soon or are being considered for the future:

  1. Partitioned epoch rewards

Epoch boundaries are notoriously slow but this affects every validator right now. Partitioned epoch rewards will help alleviate this slowness and should be activated soon.

  2. Faster snapshot creation

Full snapshot creation is very compute and memory intensive and noticeably slows down block validation and confirmation times. Solana core developers are actively working on transitioning to a faster approach which can be followed on SIMD-0125.

  3. Vote ingestion via QUIC

This change is still in the proposal review process and can be followed on SIMD-0195. The motivation for migrating to QUIC for vote ingestion is to improve vote delivery guarantees.

  4. Voting grace period increase

This change has been proposed by a few folks in the Solana validator community. Because nodes located outside of North America and Europe face higher network latency, the current grace period of 2 slots is seen as unfair to them.

  5. Vote credit algorithm improvements

Due to the risk of validators running unaudited code to modify voting behavior, the Anza core dev team aims to improve the vote credit algorithm to eliminate the advantage of certain voting mods such as backfilling and lockout adjustments.

Credits

Thanks to everyone who took the time to give feedback on this post: