Scale or Die 2025: Solving The Pains of Indexing On Solana
Discover how The Graph's Substreams revolutionizes Solana data indexing, solving real-time processing, historical data, and more
At Scale or Die 2025, Giuliano, the Substreams Product Manager at The Graph, presented how Substreams addresses the most pressing challenges developers face when indexing data on the Solana blockchain. The approach aims to improve performance and ease of use, and to open up new possibilities for decentralized applications on Solana.
Summary
The Graph's Substreams technology addresses five major pain points in Solana data indexing: real-time processing, historical data backfilling, reorganization handling, account state access, and IDL compatibility. According to the talk, Substreams delivers a 10,000% performance boost in historical data processing, keeps drift from the blockchain head to an average of 1.5 seconds (and as low as 600 milliseconds), and provides a three-month moving window of account state history.
These improvements are set to revolutionize the way developers interact with Solana data, enabling more efficient and responsive decentralized applications. The solution's focus on parallelization, efficient data encoding, and automated handling of complex blockchain structures promises to significantly reduce the burden on developers, allowing them to focus on creating value for users rather than managing infrastructure.
Substreams' approach to solving these challenges opens up new possibilities for analytics, AI agents, DeFi projects, and trading applications on Solana. By providing near real-time data access, efficient historical processing, and simplified data models, The Graph is paving the way for the next generation of Solana-based applications and services.
Key Points:
Real-time Indexing
Solana's high transaction throughput and short block times present a significant challenge for real-time data indexing. Traditional approaches often fall behind the blockchain's head rapidly, leading to outdated information. Substreams addresses this by optimizing for performance while maintaining determinism. The solution achieves an average drift of just 1.5 seconds from the blockchain head, with instances as low as 600 milliseconds.
Substreams uses the gRPC protocol with Protocol Buffers to encode messages in a compact binary format, which makes data transmission significantly more efficient. This reduces bandwidth usage and lets machines parse the data quickly, enabling faster processing and lower latency.
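To make the size difference concrete, here is a minimal sketch that hand-encodes a tiny message using the Protocol Buffers wire format (varints and length-delimited fields) and compares it with a JSON rendering of the same data. The message layout (slot, parent slot, block hash) and the field numbers are assumptions chosen for illustration; they are not the actual Substreams block schema.

```python
import json


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint (protobuf wire format)."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_uint_field(field_number: int, value: int) -> bytes:
    """Field key with wire type 0 (varint), followed by the value."""
    return encode_varint((field_number << 3) | 0) + encode_varint(value)


def encode_bytes_field(field_number: int, payload: bytes) -> bytes:
    """Field key with wire type 2 (length-delimited), then length, then payload."""
    return encode_varint((field_number << 3) | 2) + encode_varint(len(payload)) + payload


# Hypothetical block header, for illustration only.
block = {"slot": 300_000_000, "parent_slot": 299_999_999, "blockhash": bytes(range(32))}

binary = (
    encode_uint_field(1, block["slot"])
    + encode_uint_field(2, block["parent_slot"])
    + encode_bytes_field(3, block["blockhash"])
)

as_json = json.dumps(
    {
        "slot": block["slot"],
        "parent_slot": block["parent_slot"],
        "blockhash": block["blockhash"].hex(),  # JSON cannot carry raw bytes
    }
).encode()

print(len(binary), "bytes binary vs", len(as_json), "bytes JSON")
```

The binary form avoids repeating field names and stores the hash as raw bytes rather than text, which is where most of the saving comes from in this example.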
Historical Data Processing
Processing historical data on Solana has been a major bottleneck due to the complexity of its data structures and the sheer volume of information. Substreams tackles this challenge through innovative parallelization techniques. By grouping blocks into flat files of 1,000 blocks each and allocating workers to process these files concurrently, Substreams achieves a staggering 10,000% performance improvement over traditional methods.
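The segmentation idea can be sketched as follows: split the requested block range at 1,000-block boundaries and hand each segment to a pool of workers. The function names, the worker count, and the placeholder work done per segment are assumptions for illustration, not the Substreams scheduler itself.

```python
from concurrent.futures import ThreadPoolExecutor

SEGMENT_SIZE = 1_000  # blocks per flat file, per the figure quoted in the talk


def segments(start_block: int, end_block: int, size: int = SEGMENT_SIZE):
    """Split [start_block, end_block) into ranges aligned to `size` boundaries."""
    ranges = []
    cursor = start_block
    while cursor < end_block:
        next_boundary = (cursor // size + 1) * size
        stop = min(next_boundary, end_block)
        ranges.append((cursor, stop))
        cursor = stop
    return ranges


def process_segment(block_range):
    """Placeholder for real work (read one flat file, run the module over it)."""
    start, stop = block_range
    return {"range": block_range, "blocks": stop - start}


def backfill(start_block: int, end_block: int, workers: int = 8):
    """Process every segment concurrently and return results in block order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in input order even if workers finish out of order.
        return list(pool.map(process_segment, segments(start_block, end_block)))


if __name__ == "__main__":
    for result in backfill(1_500, 4_200):
        print(result)
```

Threads keep the sketch short; a CPU-bound workload would more likely use separate processes or machines, since each segment can be processed independently of the others.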
This approach reduces processing times from weeks to days, dramatically enhancing the ability to analyze historical trends and patterns. Furthermore, Substreams implements a caching system that allocates resources based on unique module hashes, ensuring that competing requests don't interfere with each other and maintaining consistent performance for all users.
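One way to picture caching by module hash is a store keyed on a fingerprint of the module plus the block segment, so identical requests share results and different modules never collide. The inputs fed into the hash below (name, code bytes, start block) are an assumption; the source does not describe how Substreams derives its module hashes.

```python
import hashlib


def module_hash(name: str, code: bytes, start_block: int) -> str:
    """Fingerprint a module; the chosen inputs are hypothetical."""
    h = hashlib.sha256()
    h.update(name.encode())
    h.update(b"\0")  # separator so name and code bytes cannot run together
    h.update(code)
    h.update(start_block.to_bytes(8, "big"))
    return h.hexdigest()


class SegmentCache:
    """Cache of per-segment outputs, keyed by (module hash, block range)."""

    def __init__(self):
        self._store = {}
        self.computed = 0  # counts cache misses, for demonstration

    def get_or_compute(self, mod_hash: str, block_range, compute):
        key = (mod_hash, block_range)
        if key not in self._store:
            self._store[key] = compute(block_range)
            self.computed += 1
        return self._store[key]


cache = SegmentCache()
count_blocks = lambda r: r[1] - r[0]  # stand-in for running a module

h1 = module_hash("map_transfers", b"wasm-bytes-v1", 0)
h2 = module_hash("map_transfers", b"wasm-bytes-v2", 0)  # changed code, new hash

cache.get_or_compute(h1, (0, 1000), count_blocks)  # miss: computed
cache.get_or_compute(h1, (0, 1000), count_blocks)  # hit: reused
cache.get_or_compute(h2, (0, 1000), count_blocks)  # miss: different module
print(cache.computed)
```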
Reorganization Handling
Blockchain reorganizations (reorgs) are a common occurrence on Solana, presenting a significant challenge for data indexers. Substreams addresses this issue by implementing an in-memory replication of the blockchain and its branches for each unique request. This approach allows for flexible and safe handling of reorgs without imposing the burden of management on developers.
By automating the detection and handling of reorgs, Substreams eliminates the need for custom signals, manual database management, and complex pruning and backfilling operations. This not only improves the reliability of data but also significantly reduces the complexity of building applications that depend on accurate blockchain state.
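A minimal sketch of the idea of keeping the chain and its branches in memory: every block is stored with its parent, and when a competing branch overtakes the current head, the buffer emits "undo" events for the abandoned blocks followed by "new" events for the replacement blocks. The class name, the event names, and the "longest branch wins" rule are all simplifying assumptions; Solana's real fork choice is stake-weighted, and pruning of finalized blocks is omitted here.

```python
class ForkBuffer:
    """In-memory tree of blocks that reports undo/new events on head changes."""

    def __init__(self, root_id: str):
        self.parent = {root_id: None}
        self.height = {root_id: 0}
        self.head = root_id

    def _path_to_root(self, block_id):
        """Return block ids from `block_id` back to the root (newest first)."""
        path = []
        while block_id is not None:
            path.append(block_id)
            block_id = self.parent[block_id]
        return path

    def add_block(self, block_id: str, parent_id: str):
        if parent_id not in self.parent:
            raise KeyError(f"unknown parent {parent_id!r}")
        self.parent[block_id] = parent_id
        self.height[block_id] = self.height[parent_id] + 1

        # Simplified fork choice: only switch when the new branch is longer.
        if self.height[block_id] <= self.height[self.head]:
            return []

        old_path = self._path_to_root(self.head)
        new_path = self._path_to_root(block_id)
        on_new, on_old = set(new_path), set(old_path)

        undo = [b for b in old_path if b not in on_new]             # newest first
        apply = [b for b in reversed(new_path) if b not in on_old]  # oldest first

        self.head = block_id
        return [("undo", b) for b in undo] + [("new", b) for b in apply]


buf = ForkBuffer("root")
print(buf.add_block("a1", "root"))  # extends the head
print(buf.add_block("a2", "a1"))    # extends the head
print(buf.add_block("b1", "root"))  # competing branch, still shorter
print(buf.add_block("b2", "b1"))    # same length as the head, no switch
print(buf.add_block("b3", "b2"))    # longer branch: undo a2, a1, then apply b1..b3
```

A consumer that applies these events in order never has to detect the reorg itself; it only needs to know how to revert and how to apply a block.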
Account State Access
Solana's account-based model, while efficient for transaction processing, presents challenges for accessing historical state changes. Substreams introduces a three-month moving window of account state history, allowing developers to access past states even after disconnections or for back-processing needs.
The solution also provides built-in tracking of changes prior to account deletion, addressing complex scenarios such as program ownership transfers. By rounding account changes to the block level and only providing the most recent changes, Substreams significantly reduces the amount of data developers need to process and store, streamlining application development and improving efficiency.
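Rounding account changes to the block level can be illustrated with a small reducer that keeps only the last write to each account within a block. The record shape below (slot, transaction index, account, lamports, deleted flag) is a hypothetical stand-in, not the Substreams account-change schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccountUpdate:
    slot: int
    tx_index: int   # position of the writing transaction inside the block
    account: str
    lamports: int
    deleted: bool = False


def round_to_block(updates):
    """Keep only the final update per (slot, account); earlier writes are dropped."""
    latest = {}
    for update in sorted(updates, key=lambda u: (u.slot, u.tx_index)):
        latest[(update.slot, update.account)] = update  # later writes overwrite
    return sorted(latest.values(), key=lambda u: (u.slot, u.account))


updates = [
    AccountUpdate(slot=10, tx_index=0, account="A", lamports=5),
    AccountUpdate(slot=10, tx_index=3, account="A", lamports=7),
    AccountUpdate(slot=10, tx_index=1, account="B", lamports=1),
    AccountUpdate(slot=10, tx_index=5, account="A", lamports=9),
    AccountUpdate(slot=11, tx_index=0, account="A", lamports=0, deleted=True),
]

for update in round_to_block(updates):
    print(update)
```

Here five raw writes collapse to three records: one per touched account per block, with the deletion in slot 11 preserved as its own entry.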
IDL Compatibility
The lack of standardization and frequent breaking changes in Interface Description Language (IDL) versions have been a persistent headache for Solana developers. Substreams aims to automate the handling of these changes and edge cases, reducing the manual effort required to adapt to new IDL versions.
By decoding data at the point of extraction and providing community support for unsupported IDLs, Substreams simplifies the development process and improves compatibility across different versions of Solana programs. This approach allows developers to focus on building features rather than managing complex data structures and version incompatibilities.
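As a rough illustration of IDL-driven decoding, the sketch below builds a lookup table from a trimmed-down, hypothetical IDL and decodes raw instruction data by its 8-byte discriminator. It assumes the default Anchor convention (the first 8 bytes of SHA-256 over "global:<instruction_name>") and little-endian integer arguments; an IDL that lists an explicit discriminator takes precedence, which is one of the version differences a decoder has to absorb. It is not the decoder Substreams ships.

```python
import hashlib
import struct


def anchor_discriminator(instruction_name: str) -> bytes:
    """Default Anchor instruction discriminator (assumed convention)."""
    return hashlib.sha256(f"global:{instruction_name}".encode()).digest()[:8]


# Hypothetical IDL fragment, for illustration only.
IDL = {
    "instructions": [
        {"name": "deposit", "args": [{"name": "amount", "type": "u64"}]},
        {"name": "withdraw", "args": [{"name": "amount", "type": "u64"}]},
    ]
}

# Only a few fixed-width integer types are handled in this sketch.
FORMATS = {"u8": "<B", "u32": "<I", "u64": "<Q"}


def build_decoder(idl):
    """Map each 8-byte discriminator to its instruction definition."""
    table = {}
    for ix in idl["instructions"]:
        if "discriminator" in ix:  # explicit value listed in the IDL
            disc = bytes(ix["discriminator"])
        else:                      # fall back to the derived default
            disc = anchor_discriminator(ix["name"])
        table[disc] = ix
    return table


def decode(table, data: bytes):
    """Return {'name', 'args'} for known instructions, or None if unrecognized."""
    ix = table.get(data[:8])
    if ix is None:
        return None
    offset, args = 8, {}
    for arg in ix["args"]:
        fmt = FORMATS[arg["type"]]
        (args[arg["name"]],) = struct.unpack_from(fmt, data, offset)
        offset += struct.calcsize(fmt)
    return {"name": ix["name"], "args": args}


table = build_decoder(IDL)
raw = anchor_discriminator("deposit") + struct.pack("<Q", 1_500)
print(decode(table, raw))
```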
Facts + Figures
- Substreams achieves an average of 1.5 second drift from the Solana blockchain head
- In some cases, Substreams can process new blocks as quickly as 600 milliseconds after they're created
- Historical data processing sees a 10,000% performance boost, reducing processing times from weeks to days
- Substreams groups blocks into flat files of 1,000 blocks for efficient parallel processing
- The solution provides a three-month moving window of account state history
- Solana blocks can be massive, with 100 blocks potentially exceeding 40 megabytes in size
- Substreams uses gRPC protocol with protocol buffers for efficient binary encoding of messages
- The technology is natively Rust-based, enhancing compatibility with Solana's ecosystem
- Substreams automatically handles reorganizations (reorgs), which occur frequently, even daily
- The solution rounds account changes to the block level to reduce data processing requirements
Top quotes
- "If you're not sitting extraction next to execution, you're missing out right away."
- "We really tried to push performance while keeping maintaining determinism."
- "Substreams from the ground up was designed with parallelization in mind."
- "We're seeing 10,000% performance boost. You have to think weeks to days. It's crazy."
- "Reorgs are part of the game with Solana a bit, but they shouldn't be imposed or managed by the user."
- "We have an in-memory replication of the chain and its branches."
- "The account state model in Solana is great because you don't have this monolithic state structure like you have on other ecosystems like EVM."
- "We're really aiming for ease of mind here."
- "The most important thing is the user experience. The user experience is affected. It's very important. And especially on a chain like Solana, that's not acceptable."
- "You do not want to deal with the pains of managing the different versions of these Anchor IDLs."
Questions Answered
What are the main challenges of indexing data on Solana?
The main challenges of indexing data on Solana include real-time processing due to high transaction throughput, handling massive amounts of historical data, managing frequent blockchain reorganizations, accessing account state information, and dealing with compatibility issues across different IDL versions. These challenges stem from Solana's unique architecture and high-performance design, which can make it difficult for developers to build efficient and responsive applications.
How does Substreams improve real-time data indexing on Solana?
Substreams improves real-time data indexing on Solana by optimizing for performance while maintaining determinism. It achieves an average drift of just 1.5 seconds from the blockchain head, with instances as low as 600 milliseconds. This is accomplished through the use of efficient data encoding techniques, such as gRPC protocol with protocol buffers, which allows for quick binary encoding and transmission of blockchain data. This approach ensures that applications can stay up-to-date with the latest blockchain state, enabling more responsive and accurate user experiences.
What performance improvements does Substreams offer for processing historical Solana data?
Substreams offers a staggering 10,000% performance improvement for processing historical Solana data. This is achieved through innovative parallelization techniques, where blocks are grouped into flat files of 1,000 blocks each and processed concurrently by allocated workers. This approach reduces processing times from weeks to days, dramatically enhancing the ability to analyze historical trends and patterns. Additionally, Substreams implements a caching system that allocates resources based on unique module hashes, ensuring consistent performance for all users even when dealing with competing requests.
How does Substreams handle blockchain reorganizations (reorgs) on Solana?
Substreams handles blockchain reorganizations (reorgs) on Solana by implementing an in-memory replication of the blockchain and its branches for each unique request. This approach allows for flexible and safe handling of reorgs without imposing the burden of management on developers. By automating the detection and handling of reorgs, Substreams eliminates the need for custom signals, manual database management, and complex pruning and backfilling operations. This not only improves the reliability of data but also significantly reduces the complexity of building applications that depend on accurate blockchain state.
What solutions does Substreams provide for accessing account state on Solana?
Substreams provides several solutions for accessing account state on Solana. It introduces a three-month moving window of account state history, allowing developers to access past states even after disconnections or for back-processing needs. The solution also offers built-in tracking of changes prior to account deletion, addressing complex scenarios such as program ownership transfers. By rounding account changes to the block level and only providing the most recent changes, Substreams significantly reduces the amount of data developers need to process and store, streamlining application development and improving efficiency.
How does Substreams address IDL compatibility issues on Solana?
Substreams addresses IDL compatibility issues on Solana by automating the handling of changes and edge cases in Interface Description Language (IDL) versions. It aims to reduce the manual effort required to adapt to new IDL versions by decoding data at the point of extraction. The solution also provides community support for unsupported IDLs, simplifying the development process and improving compatibility across different versions of Solana programs. This approach allows developers to focus on building features rather than managing complex data structures and version incompatibilities.
What new possibilities does Substreams unlock for Solana developers?
Substreams unlocks new possibilities for Solana developers by providing near real-time data access, efficient historical processing, and simplified data models. This enables the creation of more sophisticated analytics tools, AI agents, DeFi projects, and trading applications on Solana. By reducing the burden of infrastructure management and data processing, developers can focus on creating value for users and building innovative features. The improved performance and ease of use offered by Substreams pave the way for the next generation of Solana-based applications and services, potentially leading to more complex and responsive decentralized systems.
How does Substreams improve the overall developer experience on Solana?
Substreams improves the overall developer experience on Solana by automating many of the complex and time-consuming tasks associated with blockchain data indexing and processing. It handles real-time data streaming, efficient historical data processing, automatic reorg management, simplified account state access, and IDL compatibility issues. This comprehensive approach allows developers to focus on building application logic and user-facing features, rather than worrying about the intricacies of data management and infrastructure. By providing a more streamlined and efficient development process, Substreams enables faster iteration and innovation in the Solana ecosystem.
