Scale or Die 2025: Solving The Pains of Indexing On Solana
Revolutionary indexing solution for Solana: The Graph's Substreams tackles high TPS, historical data, reorgs, and more
In a groundbreaking presentation at Scale or Die 2025, Giuliano Francescangeli, Substreams Product Manager at The Graph, unveiled a revolutionary solution to the most pressing challenges faced by developers when indexing data on the Solana blockchain. This new technology promises to dramatically improve the efficiency and ease of building applications on Solana, potentially unlocking a new wave of innovation in the ecosystem.
Summary
The Graph's Substreams technology addresses five major pain points in Solana indexing: real-time indexing, historical data processing, handling blockchain reorganizations (reorgs), accessing account state, and decoding Interface Description Language (IDL) data. By tackling these issues, Substreams offers a comprehensive solution that could significantly reduce development time and complexity for Solana projects.
Francescangeli highlighted the unique challenges posed by Solana's high-speed, high-throughput architecture, which requires near-instantaneous data processing to keep up with the blockchain's rapid pace. Traditional indexing methods often fall short, leading to delays and missed data. Substreams' approach involves parallel processing, efficient data compression, and innovative handling of blockchain-specific issues like reorgs and account state changes.
The presentation revealed impressive performance metrics, including an average drift of just 1.5 seconds from the blockchain's head, dropping as low as 600 milliseconds in some cases. For historical data processing, Substreams boasts a staggering 10,000% performance boost compared to traditional methods, potentially reducing processing times from weeks to days.
Key Points:
Real-time Indexing Solutions
Solana's high transaction throughput and short block times present a significant challenge for real-time indexing. Traditional approaches using RPCs or streaming services like Yellowstone often fall behind quickly, leading to missed data and inconsistencies.
Substreams tackles this issue head-on with a high-performance, deterministic approach. With an average drift of just 1.5 seconds from the blockchain's head, and as little as 600 milliseconds in some cases, it ensures that indexers can keep pace with Solana's rapid block production. The use of gRPC with Protocol Buffers for message encoding further optimizes data transmission, reducing bandwidth requirements and improving overall efficiency.
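To make "drift from the head" concrete, here is a minimal sketch of how an indexer operator might measure it. The function name and the sample timestamps are hypothetical and chosen only to mirror the figures quoted in the talk; they are not measurements from Substreams.

```python
from statistics import mean

def head_drift_ms(samples):
    """Return the drift for each block, in milliseconds.

    samples: iterable of (produced_ms, indexed_ms) pairs, where produced_ms is
    when the block was produced and indexed_ms is when the indexer handled it.
    """
    return [indexed_ms - produced_ms for produced_ms, indexed_ms in samples]

# Hypothetical timestamps (milliseconds), for illustration only.
samples = [(100_000, 100_600), (100_400, 101_900), (100_800, 103_200)]
drifts = head_drift_ms(samples)

print(f"average drift: {mean(drifts)} ms, best case: {min(drifts)} ms")
```

Tracking both the average and the best case matters because a low minimum with a high average suggests the indexer falls behind in bursts.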
Historical Data Processing
Processing historical blockchain data on Solana has been a major bottleneck for many projects. The sheer volume of block data (100 blocks can total 40 megabytes or more) and the complexity of data structures like inner instructions make this task particularly challenging.
Substreams introduces a revolutionary approach to historical data processing. By grouping blocks into 1,000-block flat files and allocating workers to process these files in parallel, Substreams achieves a remarkable 10,000% performance boost compared to traditional linear processing methods. This improvement can reduce processing times from weeks to mere days, significantly accelerating development and data analysis tasks.
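The idea of splitting history into 1,000-block bundles and handing them to parallel workers can be sketched as follows. This is an illustration under assumptions, not the Substreams implementation: `process_bundle` is a hypothetical stand-in that only counts blocks, and a thread pool stands in for whatever worker fleet a real backfill would use.

```python
from concurrent.futures import ThreadPoolExecutor

BUNDLE_SIZE = 1_000  # blocks per flat file, the figure given in the talk

def bundle_ranges(start_block, end_block, size=BUNDLE_SIZE):
    """Split [start_block, end_block) into bundle-aligned half-open ranges."""
    ranges = []
    cursor = start_block
    while cursor < end_block:
        boundary = (cursor // size + 1) * size  # next multiple of size
        stop = min(boundary, end_block)
        ranges.append((cursor, stop))
        cursor = stop
    return ranges

def process_bundle(block_range):
    """Hypothetical per-file work: here it simply counts the blocks."""
    first, stop = block_range
    return stop - first

def backfill(start_block, end_block, workers=4):
    """Process every bundle in parallel and return results in block order."""
    ranges = bundle_ranges(start_block, end_block)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in input order, so they can be merged linearly.
        return list(pool.map(process_bundle, ranges))

print(bundle_ranges(2_500, 5_200))
print(sum(backfill(2_500, 5_200)))
```

Because each bundle is independent, wall-clock time is bounded by the slowest bundle rather than the sum of all of them, which is where the gain over linear processing comes from.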
Reorg Handling
Blockchain reorganizations (reorgs) are a common occurrence on Solana, presenting a significant challenge for data consistency and reliability. Traditional approaches often involve waiting for finalized blocks or implementing complex custom signals and database management systems.
Substreams simplifies reorg handling with an in-memory replication of the blockchain and its branches. This approach allows for flexible, safe, and easy management of reorgs without imposing additional burden on developers. Each unique Substreams request maintains its own chain representation, eliminating conflicts between different users and ensuring data consistency.
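A rough sketch of what keeping the chain and its branches in memory can look like is shown below. Everything here is an assumption made for illustration: the `ForkTracker` class, the rule that the most recently received block becomes the head, and the choice to report a reorg as a list of blocks to undo plus a list of blocks to apply. It does not describe Substreams internals.

```python
class ForkTracker:
    """Remembers every block seen, so any branch can become the head."""

    def __init__(self):
        self.parents = {}  # block id -> parent block id (None for the root)
        self.head = None

    def add_block(self, block_id, parent_id):
        """Record a block, make it the head, and return (undo, new_blocks).

        undo lists blocks to roll back, newest first; new_blocks lists blocks
        to apply, oldest first.
        """
        self.parents[block_id] = parent_id
        undo, new_blocks = self._path(self.head, block_id)
        self.head = block_id
        return undo, new_blocks

    def _ancestors(self, block_id):
        """Return block_id and its known ancestors, newest first."""
        chain = []
        while block_id is not None:
            chain.append(block_id)
            block_id = self.parents.get(block_id)
        return chain

    def _path(self, old_head, new_head):
        old_chain = self._ancestors(old_head)
        new_chain = self._ancestors(new_head)
        on_new_branch = set(new_chain)
        undo = []
        for block_id in old_chain:
            if block_id in on_new_branch:
                break  # reached the common ancestor
            undo.append(block_id)
        common = old_chain[len(undo)] if len(undo) < len(old_chain) else None
        new_blocks = []
        for block_id in new_chain:
            if block_id == common:
                break
            new_blocks.append(block_id)
        new_blocks.reverse()
        return undo, new_blocks


tracker = ForkTracker()
tracker.add_block("a", None)
tracker.add_block("b", "a")
tracker.add_block("c", "b")
print(tracker.add_block("b2", "a"))  # switches from branch a-b-c to a-b2
```

With this shape, a consumer never has to wait for finality: it applies blocks optimistically and rolls back exactly the blocks named in `undo` when the head moves to another branch.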
Account State Access
Solana's account-based model offers advantages in terms of parallelization but presents challenges when tracking historical state changes. Existing solutions often lack access to historical data and may face issues with large or frequent requests.
Substreams addresses these limitations by providing a three-month moving window of historical account state data. This feature allows developers to back-process state changes, track ownership transfers, and access the most recent account changes efficiently. By rounding account changes to the block level, Substreams reduces data overhead and simplifies state tracking for developers.
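One way to read "rounding account changes to the block level" is to keep only the final write to each account within a block and drop the intermediate ones. The sketch below assumes that reading; the tuple layout and the function name are hypothetical.

```python
def round_to_block(writes):
    """Collapse per-transaction writes to one final value per (slot, account).

    writes: iterable of (slot, tx_index, account, data) tuples in any order.
    Returns a dict mapping (slot, account) to the data of the last write.
    """
    latest = {}
    for slot, tx_index, account, data in writes:
        key = (slot, account)
        # Keep the write with the highest transaction index in the block.
        if key not in latest or tx_index >= latest[key][0]:
            latest[key] = (tx_index, data)
    return {key: data for key, (_, data) in latest.items()}

# Hypothetical writes: account "A" changes twice in slot 10.
writes = [
    (10, 0, "A", "v1"),
    (10, 3, "A", "v2"),
    (10, 1, "B", "x"),
    (11, 0, "A", "v3"),
]
print(round_to_block(writes))
```

An account touched many times in one block then costs a single record, which is the data-overhead reduction the talk refers to.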
IDL Compatibility and Decoding
The lack of standardization in Interface Description Language (IDL) formats and frequent breaking changes in Solana's ecosystem pose significant challenges for developers. Manual importing and mapping of instructions and events can be time-consuming and error-prone.
Substreams aims to automate IDL handling and decoding, addressing edge cases and streamlining the development process. By decoding data at the point of extraction and offering community support for unsupported IDLs, Substreams significantly reduces the burden on developers in managing versioning and compatibility issues.
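To show the kind of manual mapping that automated decoding replaces, here is a minimal sketch based on the Anchor framework's default convention, in which an instruction is identified by the first 8 bytes of `sha256("global:<instruction_name>")`. The talk does not name Anchor, so applying it here is an assumption. The IDL dictionary is a hypothetical, stripped-down stand-in for a real IDL file, names are assumed to already be in snake_case, and programs that use custom discriminators would not match this scheme.

```python
import hashlib

def anchor_discriminator(ix_name):
    """Default Anchor instruction discriminator: sha256("global:<name>")[:8]."""
    return hashlib.sha256(f"global:{ix_name}".encode()).digest()[:8]

def build_decoder(idl):
    """Map discriminator bytes to instruction names from a minimal IDL dict."""
    return {
        anchor_discriminator(ix["name"]): ix["name"]
        for ix in idl["instructions"]
    }

def decode_name(decoder, ix_data):
    """Return the instruction name, or None when the IDL does not cover it."""
    return decoder.get(bytes(ix_data[:8]))

# Hypothetical IDL fragment, for illustration only.
idl = {"instructions": [{"name": "initialize"}, {"name": "swap"}]}
decoder = build_decoder(idl)

raw = anchor_discriminator("swap") + b"\x01\x02"  # discriminator + arguments
print(decode_name(decoder, raw))
```

Every IDL version change can rename or reshape instructions, so a table like `decoder` has to be rebuilt and re-versioned by hand, which is the maintenance burden the talk describes.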
Facts + Figures
- Substreams achieves an average of 1.5 seconds drift from Solana's blockchain head
- In some cases, drift from the head drops as low as 600 milliseconds
- Historical data processing sees a 10,000% performance boost with Substreams
- 100 Solana blocks can be up to or above 40 megabytes in size
- Substreams processes historical data in groups of 1,000-block flat files
- A three-month moving window of account state history is provided by Substreams
- Substreams is natively Rust-based, enhancing compatibility with Solana's ecosystem
- The solution can reduce historical data processing times from weeks to days
- Substreams uses gRPC with Protocol Buffers for efficient data transmission
- The technology addresses five major pain points in Solana indexing: real-time indexing, historical data processing, reorg handling, account state access, and IDL compatibility
Top quotes
- "If you're not sitting extraction next to execution, you're missing out right away."
- "We really tried to push performance while maintaining determinism."
- "What we've seen in recent history is that 100 blocks may be up to or above 40 megabytes."
- "We're seeing 10,000% performance boost. You have to think weeks to days. It's crazy."
- "Reorgs will vary in size. The legacy approach to solving this problem is either you're going to wait for finalized blocks, but this again puts you quite behind head, or you implement custom signals."
- "We have an in-memory replication of the chain and its branches."
- "The account state model in Solana is great because you don't have this monolithic state structure like you have on other ecosystems like EVM."
- "We're really aiming for ease of mind here."
- "The core crux of the problem is the versioning itself. They impose breaking changes on and breaking changes are imposed on developers every week."
- "The most important thing is the user experience. The user experience is affected. It's very important. And especially on a chain like Solana, that's not acceptable."
Questions Answered
What is Substreams and how does it help with Solana indexing?
Substreams is a technology developed by The Graph to address major challenges in indexing Solana blockchain data. It offers solutions for real-time indexing, historical data processing, handling blockchain reorganizations, accessing account state, and decoding IDL data. By tackling these issues, Substreams significantly improves the speed and efficiency of data indexing on Solana, allowing developers to build applications more easily and quickly.
How does Substreams improve real-time indexing on Solana?
Substreams achieves near real-time indexing on Solana by maintaining an average drift of just 1.5 seconds from the blockchain's head, dropping as low as 600 milliseconds in some cases. It uses a high-performance, deterministic approach and employs gRPC with Protocol Buffers for efficient data transmission. This ensures that indexers can keep up with Solana's high transaction throughput and short block times, which is crucial for maintaining data consistency and providing a smooth user experience.
What performance improvements does Substreams offer for historical data processing?
Substreams provides a remarkable 10,000% performance boost for historical data processing compared to traditional methods. It achieves this by grouping blocks into 1,000-block flat files and allocating workers to process these files in parallel. This approach can reduce processing times from weeks to mere days, significantly accelerating development and data analysis tasks on the Solana blockchain.
How does Substreams handle blockchain reorganizations (reorgs)?
Substreams simplifies reorg handling with an in-memory replication of the blockchain and its branches. This approach allows for flexible, safe, and easy management of reorgs without imposing additional burden on developers. Each unique Substreams request maintains its own chain representation, eliminating conflicts between different users and ensuring data consistency. This is a significant improvement over traditional methods that often require waiting for finalized blocks or implementing complex custom signals.
What solutions does Substreams offer for accessing account state on Solana?
Substreams provides a three-month moving window of historical account state data, allowing developers to back-process state changes and track ownership transfers efficiently. It also rounds account changes to the block level, reducing data overhead and simplifying state tracking. This approach addresses the limitations of existing solutions that often lack access to historical data and may face issues with large or frequent requests.
How does Substreams address IDL compatibility and decoding issues?
Substreams aims to automate IDL handling and decoding, addressing edge cases and streamlining the development process. It decodes data at the point of extraction and offers community support for unsupported IDLs. This significantly reduces the burden on developers in managing versioning and compatibility issues, which are particularly challenging in the Solana ecosystem due to frequent breaking changes and lack of standardization in IDL formats.
Why is Substreams important for developers building on Solana?
Substreams is crucial for Solana developers because it addresses major pain points in data indexing, which is a fundamental requirement for many blockchain applications. By simplifying and accelerating indexing processes, Substreams allows developers to focus more on building innovative features and improving user experience, rather than dealing with complex infrastructure issues. This can potentially lead to faster development cycles, more robust applications, and ultimately, a more vibrant Solana ecosystem.