Conference Talk Breakpoint 25

Compass: Ensuring SPL-Token Stays Safe on Pinocchio Runtime Verification

By Solana 🧭 Compass Dec 12, 2024 · 00:00 UTC 9 min read

Learn how Runtime Verification is using Compass to mathematically prove the Solana Token Program's Pinocchio upgrade won't break anything

The notes below are AI generated and may not be 100% accurate. Watch the video to be sure!

Note: these notes were generated by AI to help surface more Solana content

When Solana's Token Program upgrades its backend from SPL Token to P Token (powered by the Pinocchio library), billions of dollars in assets will depend on the new code behaving exactly like the old. Runtime Verification is using mathematical proofs—not just testing—to guarantee nothing breaks.

Keep up to date with the Solana eco

Summary

Daniel Cumming from Runtime Verification presented at Breakpoint 2024 how his team is tackling one of blockchain's most critical challenges: proving that a major infrastructure upgrade won't introduce unexpected behavior. The Solana Token Program, which handles all token operations on the network, is transitioning from using the Solana SDK to the more efficient Pinocchio library. While this promises lower compute unit costs, the stakes couldn't be higher—any behavioral difference could affect every token transaction on Solana.

The team is employing a technique called formal verification, which uses mathematical proofs to guarantee that two different codebases will produce identical results under all possible conditions. Unlike traditional testing, which only checks specific inputs, formal verification explores every possible state the program could encounter. This exhaustive approach provides a level of confidence that manual reviews and even extensive fuzzing cannot match.

The work involves proving 27 instruction proofs and 20 multi-signature proofs—and that's for each implementation, effectively doubling the workload. Initially, each proof took over five hours to run, but the team has optimized this to 10-20 minutes, with further improvements expected. This represents a massive engineering effort to ensure the Solana ecosystem can upgrade with confidence.

Key Points:

The Pinocchio Upgrade: What's Changing and Why

The Solana Token Program is the foundational code that handles all token operations—minting, transferring, burning, and more. Currently, it uses SPL Token built on the Solana SDK. The upgrade will switch to P Token, which uses the Pinocchio library instead. The primary motivation is efficiency: Pinocchio is designed to be more computationally efficient, meaning lower costs for users in the form of reduced compute units (CUs).

However, efficiency gains are worthless if they come at the cost of reliability. The critical requirement is that while the new implementation should be faster, it must behave identically in every other way. The same inputs should produce the same outputs, errors should occur under the same conditions, and no new restrictions or unexpected behaviors should emerge.

Understanding Formal Verification

Formal verification represents the gold standard in software correctness. Cumming explained that there's a spectrum of approaches to checking if two programs are equivalent, arranged by increasing confidence. Manual review by experts is useful but limited by human oversight. Differential testing runs both programs through identical test suites but can only verify the specific cases tested. Formal verification, by contrast, mathematically proves equivalence across all possible inputs.

The process works through what Cumming called a "proof harness." The system takes an implementation (the actual code), a precondition (the valid starting state), and a postcondition (what must be true after execution). The proof engine then verifies that for every possible valid input, the implementation will satisfy the postcondition. If it passes, the code is proven correct against that specification.

The Dual-Proof Strategy for Equivalence

Runtime Verification's approach is elegant: prove both SPL Token and P Token correct against the same specification. If both implementations satisfy identical preconditions and postconditions, they must be equivalent in behavior. This creates what Cumming described as a "cross product" of proofs—both must pass for the equivalence claim to hold.

This strategy requires writing a complete specification of what the Solana Token Program should do. This specification then serves as the benchmark against which both implementations are measured. The specification is written by Runtime Verification, while the implementations come from Anza (P Token) and the existing SPL Token codebase.

Symbolic Execution: Testing All Paths

The breakthrough enabling this verification is symbolic execution. Rather than testing with concrete values like a traditional unit test would, the system represents inputs as symbols that can take any valid value. When the proof engine encounters a branching condition—like checking if a mint is initialized—it explores both paths simultaneously.

Cumming illustrated this with a Mint2 instruction example. The verification considers all possible states: Is the account a signer? Is it writable? Is the mint initialized or uninitialized? Is the destination account frozen? Each of these creates branches in an exploration tree, with the final leaves representing all possible execution paths. At each endpoint, the system verifies the correct outcome occurred.

Scale of the Verification Effort

The numbers involved are substantial. The team must complete 27 instruction proofs covering all token program operations, plus 20 multi-signature proofs. Each of these must be done for both P Token and SPL Token, resulting in nearly 100 individual proofs. Additionally, the team wrote sound semantics for Rust to enable the Compass prover to reason about the code correctly.

Performance optimization has been critical. Initial proof runs exceeded five hours each, creating an impractical feedback loop for engineering work. The team has reduced this to 10-20 minutes per proof for most cases, with expectations of further improvements. Even at current speeds, the verification represents a significant computational and engineering investment.

Facts + Figures

The Solana Token Program is upgrading from SPL Token (using Solana SDK) to P Token (using Pinocchio library)
The primary benefit of the upgrade is increased efficiency through lower compute unit (CU) costs
Manual security review of the code was performed by Neodym and Zellic
Differential fuzzing was completed by Neodym as part of their security engagement
The verification requires proving 27 instruction proofs and 20 multi-signature proofs
Each proof must be done for both implementations, effectively doubling the workload to nearly 100 proofs
Initial proof run times exceeded 5 hours per proof
Current proof run times have been optimized to 10-20 minutes per proof
Further optimization improvements are expected
Runtime Verification wrote sound semantics of Rust to enable the Compass prover
The specification (preconditions and postconditions) was written by Runtime Verification
The P Token implementation was written by Anza

Top quotes

"While we want the change to be an increase in efficiency, we do not want there to be a change in behavior. We still want to have the Solana token program that we have come to know and love."

"Using the power of mathematics, we can prove that they're going to do the same thing."

"When you're performing a test, you will instantiate these with concrete values. And you will run that test and you will get some information about those specific values that you tested. We want to represent these values symbolically so we can express all paths and have a higher degree of confidence."

"Originally for the first card, these proofs were running on the order of over five hours each. So this is not a great feedback time to get for anything in engineering. We've reduced that down to about 10 to 20 minutes to run for most of these proofs."

"By proving both, we end up in... the same specification is going to prove both implementations. So we do two proofs and then we use the cross product of the proofs."

Questions Answered

What is the difference between SPL Token and P Token?

SPL Token is the current implementation of the Solana Token Program, built using the Solana SDK. P Token is the new implementation that uses the Pinocchio library instead. The key difference is efficiency—Pinocchio is designed to be more computationally efficient, resulting in lower compute unit costs for token operations. Both implementations are meant to provide identical functionality for minting, transferring, burning tokens, and other operations; only the underlying code structure differs.

Why isn't regular testing enough to verify the upgrade is safe?

Traditional testing, even extensive fuzzing, can only verify specific inputs that testers think to check. It's impossible to test every possible combination of inputs and states that a program might encounter. Formal verification solves this by using symbolic execution to explore all possible paths through the code simultaneously. Instead of testing "what happens when I mint 100 tokens," it proves "for any amount of tokens that could ever be minted, the program behaves correctly."

How does Runtime Verification prove two different programs are equivalent?

Rather than directly comparing the two codebases, Runtime Verification proves both programs correct against the same specification. The specification defines what the Solana Token Program should do—the valid inputs, the expected outputs, and the correct error conditions. If both SPL Token and P Token satisfy this identical specification, they must behave equivalently. This approach is more tractable than direct comparison and provides clear documentation of expected behavior.

What operations does the verification cover?

The verification covers all 27 instructions in the token program, including minting, transferring, burning, freezing, and other token operations. Additionally, 20 multi-signature proofs ensure that multi-sig functionality works correctly. Each of these proofs is performed for both the SPL Token and P Token implementations, ensuring complete coverage of the token program's functionality.

How long does the verification process take?

Initially, individual proofs took over five hours each to run, which created an impractical development workflow. Through optimization efforts, the team has reduced most proof run times to 10-20 minutes. Given nearly 100 total proofs needed (47 proof types times two implementations), even at optimized speeds the complete verification represents a substantial computational effort. The team expects further improvements to proof performance.

What happens if a proof fails?

A failed proof indicates either a bug in the implementation or an issue with the specification. If P Token fails a proof that SPL Token passes, it would reveal a behavioral difference that needs to be addressed before the upgrade can safely proceed. This is precisely the value of formal verification—catching such differences before they affect users' funds and transactions on the live network.

Back to Learn

Solana 🧭 Compass

@SolanaCompass

Solana Compass is an independent Solana analytics and staking platform, operating a validator on Solana mainnet since September 2021. Its network statistics and...