Precompile Alignment Bug: Root Cause Analysis

Written By

Andrew Fitzgerald, Anza & Cavey Cool, Temporal

April 24, 2025

Summary

A bug in agave’s implementation of the ed25519 and secp256r1 precompile programs was exposed in validators running with the new --transaction-structure view option introduced in v2.2. The precompile programs assumed that instruction data, taken directly from the transaction type, was always 2-byte aligned. This assumption does not hold for transaction-view, which makes no alignment guarantees for an individual instruction’s data. As a result, block-producing nodes using transaction-view returned an error for the precompile instruction while nodes validating the block returned success, and the leader process eventually exited due to a bankhash mismatch. The initial mismatch was reported to Anza on April 9th, and a fix was released on April 11th. This was a loss-of-availability bug that could be used to take down validators running with the --transaction-structure view option. Had use of the option been widespread, network availability would have been at risk. The bug did not pose any risk to funds.

Bug Details

The implementation of the ed25519 and secp256r1 precompile programs in agave used the bytemuck crate to cast raw instruction data into a reference to an Ed25519SignatureOffsets or Secp256r1SignatureOffsets structure:

// bytemuck requires the byte slice to be aligned for the target structure
let offsets: &Ed25519SignatureOffsets = bytemuck::try_from_bytes(&data[start..end])
    .map_err(|_| PrecompileError::InvalidDataOffsets)?;

These structures have 2-byte alignment – meaning that if the address of the structure within the instruction data is not at a multiple of 2, bytemuck returns an error instead of a reference. The error propagates up the stack and ultimately results in a Custom(3) instruction error. In the solana-sdk transaction type SanitizedTransaction, each instruction’s data was stored in a separate Vec<u8>, which the allocator (jemalloc) guaranteed to have at least 8-byte alignment. As a result, the assumptions in the precompile implementations were always met.
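The alignment condition can be illustrated without bytemuck. The following is a minimal std-only sketch, not agave's code: `DemoOffsets`, `AlignedBuf`, `CastError`, and `checked_cast` are hypothetical names, and the check only approximates what an alignment-checked byte cast does.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical stand-in for an offsets structure. Every field is a u16,
/// so the type has 2-byte alignment and no padding.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct DemoOffsets {
    pub signature_offset: u16,
    pub public_key_offset: u16,
    pub message_data_offset: u16,
    pub message_data_size: u16,
}

/// Hypothetical buffer whose first byte is guaranteed to be 8-byte aligned,
/// so the alignment of any sub-slice is determined by its offset.
#[repr(align(8))]
pub struct AlignedBuf(pub [u8; 16]);

#[derive(Debug, PartialEq)]
pub enum CastError {
    SizeMismatch,
    Misaligned,
}

/// Reinterprets `bytes` as `&DemoOffsets`, refusing slices that have the
/// wrong length or that start at an address that is not a multiple of 2.
pub fn checked_cast(bytes: &[u8]) -> Result<&DemoOffsets, CastError> {
    if bytes.len() != size_of::<DemoOffsets>() {
        return Err(CastError::SizeMismatch);
    }
    if (bytes.as_ptr() as usize) % align_of::<DemoOffsets>() != 0 {
        return Err(CastError::Misaligned);
    }
    // SAFETY: length and alignment were checked above, and DemoOffsets
    // consists only of u16 fields, so every bit pattern is a valid value.
    Ok(unsafe { &*(bytes.as_ptr() as *const DemoOffsets) })
}
```

With an `AlignedBuf`, the slice `&buf.0[0..8]` casts successfully, while `&buf.0[1..9]` holds bytes that are just as valid but is rejected purely because of where it starts in memory.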

In v2.2, agave introduced a new zero-copy transaction parser in block-production: transaction-view. This structure does not deserialize the packet; instead, it retains the original packet data and caches the relevant offsets for accessing individual fields. As a result, each instruction’s data is no longer held in its own Vec<u8> with alignment guarantees, but is accessed through an offset into the serialized packet. If that offset is not 2-byte aligned, a valid precompile signature will fail with a Custom(3) error on the leader node.
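The effect of offset-based access can be sketched as follows. This is an illustrative model only: `Packet`, `InstructionSpan`, `instruction_data`, and `is_two_byte_aligned` are hypothetical names and do not reflect the actual transaction-view types.

```rust
/// Hypothetical serialized packet. The buffer itself is 8-byte aligned, so
/// the alignment of a sub-slice depends only on its offset.
#[repr(align(8))]
pub struct Packet(pub [u8; 32]);

/// Hypothetical cached location of one instruction's data in the packet.
pub struct InstructionSpan {
    pub offset: usize,
    pub len: usize,
}

/// Zero-copy access: returns a slice borrowed from the packet itself rather
/// than a freshly allocated copy of the instruction data.
pub fn instruction_data<'a>(packet: &'a Packet, span: &InstructionSpan) -> &'a [u8] {
    &packet.0[span.offset..span.offset + span.len]
}

/// True if the slice starts at an address that is a multiple of 2.
pub fn is_two_byte_aligned(data: &[u8]) -> bool {
    (data.as_ptr() as usize) % 2 == 0
}
```

In this model, an instruction whose data begins at offset 6 is 2-byte aligned, while one beginning at offset 5 is not, even though both are well-formed. Whoever constructs the packet decides which case occurs.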

Until recently, failures in precompile instructions were treated as protocol violations. However, with the activation of feature 9ypxGLzkMxi89eDerRKXWDXe44UY2z4hBig4mDhNq5Dp (SIMD-0159) in epoch 760, such failures became regular transaction errors. Prior to this feature’s activation, the misalignment issue in transaction-view would have caused the leader to drop otherwise valid transactions. After activation, leaders began including transactions with failed precompile signature verification in their blocks, and the network accepted them as intended. However, the leader would compute a different bankhash than the rest of the cluster, because a transaction could fail on the leader but succeed during replay on other nodes.

This represented a critical loss-of-availability bug. If agave v2.2 had seen widespread adoption of the --transaction-structure view, an attacker could have submitted specially crafted transactions to crash validators remotely and repeatedly.

To fix this vulnerability, the precompile implementations of ed25519 and secp256r1 were updated to not assume any alignment:

// SAFETY:
// - data[start..] is guaranteed to be >= size of Ed25519SignatureOffsets
// - Ed25519SignatureOffsets is a POD type, so we can safely read it as an unaligned struct
let offsets = unsafe {
    core::ptr::read_unaligned(data.as_ptr().add(start) as *const Ed25519SignatureOffsets)
};
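A self-contained version of the same idea is shown below as a sketch. `PairOffsets` and `read_offsets` are hypothetical names; the bounds check is included here so that the safety condition is visible in one place.

```rust
use std::mem::size_of;

/// Hypothetical two-field offsets structure with 2-byte alignment.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct PairOffsets {
    pub first: u16,
    pub second: u16,
}

/// Copies a `PairOffsets` out of `data` starting at `start`, regardless of
/// how that address is aligned. Returns `None` if the range is out of bounds.
pub fn read_offsets(data: &[u8], start: usize) -> Option<PairOffsets> {
    let end = start.checked_add(size_of::<PairOffsets>())?;
    if end > data.len() {
        return None;
    }
    // SAFETY:
    // - data[start..end] is in bounds, so the read covers valid memory
    // - PairOffsets consists only of u16 fields, so any bit pattern is valid
    // - read_unaligned places no alignment requirement on the source pointer
    Some(unsafe {
        std::ptr::read_unaligned(data.as_ptr().add(start) as *const PairOffsets)
    })
}
```

Because the value is copied out rather than borrowed, the result is an owned structure in properly aligned storage, and the fields are interpreted in the host's native byte order, just as with the reference cast.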

The secp256k1 precompile instruction uses bincode to deserialize its analogous offsets structure, and thus did not exhibit the same vulnerability.
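Decoding field by field from bytes, which is the general approach a deserializer takes, never forms a reference to a multi-byte type and so has no alignment requirement. The sketch below is not bincode itself: `decode_u16_pair` is a hypothetical helper, and the little-endian byte order is an assumption chosen for illustration.

```rust
/// Decodes two consecutive little-endian u16 values starting at `start`.
/// Only individual bytes are read, so the address of `data[start]` may be
/// odd or even. Returns `None` if fewer than 4 bytes are available.
pub fn decode_u16_pair(data: &[u8], start: usize) -> Option<(u16, u16)> {
    let end = start.checked_add(4)?;
    let bytes = data.get(start..end)?;
    let first = u16::from_le_bytes([bytes[0], bytes[1]]);
    let second = u16::from_le_bytes([bytes[2], bytes[3]]);
    Some((first, second))
}
```

For example, with `data = [0xFF, 0x34, 0x12, 0x78, 0x56]`, calling `decode_u16_pair(&data, 1)` yields `Some((0x1234, 0x5678))` even though the read starts at an odd index.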

Timeline

  • 6:00pm April 9th, 2025 - Temporal alerted Anza to bankhash mismatches on a validator that had recently been upgraded to v2.2.7. A mismatch indicates that the node executed the block differently than the rest of the cluster.

  • 9:00am April 10th, 2025 - Anza engineers assisted Temporal engineers in uncovering the bug and identifying a transaction whose status differed between leader execution and replay. The difference was due to an Ed25519 SigVerify Precompile instruction which succeeded during replay but failed on the leader node with Err(InstructionError(0, Custom(3))). At this point, the root cause was identified.

  • 3:40pm April 10th, 2025 - Announcements were made in cluster discord channels advising that nodes running v2.2 should not use the --transaction-structure view CLI flag

  • 10:30am April 11th, 2025 - #5765 merged to master

  • 11:30am April 11th, 2025 - #5768 merged to v2.2

  • 3:30pm April 11th, 2025 - v2.2.8 released with fix