In-Depth Analysis of Subsquid: What Determines Competitiveness in the Web3 Industry?

This report, written by Tiger Research, analyzes Subsquid's decentralized data infrastructure, which aims to bridge the gap between blockchain data transparency and accessibility.

Key Points Summary

  • Subsquid (hereinafter referred to as SQD) simplifies blockchain data access through decentralized infrastructure. It supports over 200 blockchains and distributes data across multiple nodes.
  • The SQD network adopts a modular structure, allowing developers to freely configure data processing and storage methods. This enables users to efficiently utilize data through a unified structure in a multi-chain environment.
  • Subsquid aims to become the data backbone of Web3, setting a standard similar to Snowflake's "one platform, multiple workloads." Through its recent acquisition by Rezolve AI, it is expanding its business into the AI and payments sectors. SQD is expected to become the core infrastructure connecting Web3 and the agent economy.

1. Is blockchain data really open to everyone?

One of the defining characteristics of blockchain technology is that all its data is open to everyone. Traditional industries store data in closed databases that are inaccessible externally. Blockchain operates differently. All records are transparently published on the chain.

However, data transparency does not guarantee accessibility or ease of use. Blockchains are optimized to execute transactions securely and reach network consensus. They are not designed as infrastructure for data analysis. The capabilities for verifying and storing data have advanced, but the infrastructure for efficiently querying and using that data still lags behind. The methods for querying on-chain data have changed little over the past ten years.

Source: Tiger Research

Consider an analogy. There is a town called "Tiger Town" that has a huge river named "Ethereum." This river is a public good. Anyone can take water from it. However, retrieving water is difficult and inefficient. Everyone must bring a bucket to the riverbank to fetch water directly. To use it as drinking water, they must go through a purification process by boiling or filtering.

The current blockchain development environment operates like this. Abundant data is readily available, but there is a lack of infrastructure to utilize it. For example, suppose a developer wants to build a dApp using the trading data from the decentralized exchange Uniswap. The developer must request data through Ethereum's RPC nodes, process it, and store it. However, RPC nodes have limitations for large-scale data analysis or executing complex queries. The problem grows more complex in a multi-chain environment, where the same work must be repeated for every blockchain involved.
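To make the "bucket to the riverbank" step concrete, the following minimal sketch builds the requests a developer would send to an Ethereum RPC node to collect a contract's event logs. `eth_getLogs` is a standard Ethereum JSON-RPC method. The helper name, the chunk size, and the placeholder address are assumptions made for illustration, and no network call is made.

```python
import json

def build_get_logs_requests(address, from_block, to_block, chunk_size=2_000):
    """Build eth_getLogs JSON-RPC payloads covering [from_block, to_block].

    Providers commonly cap the block range per request, so the range is
    split into chunks. chunk_size is an illustrative assumption.
    """
    requests = []
    start = from_block
    request_id = 1
    while start <= to_block:
        end = min(start + chunk_size - 1, to_block)
        requests.append({
            "jsonrpc": "2.0",
            "id": request_id,
            "method": "eth_getLogs",
            "params": [{
                "address": address,
                "fromBlock": hex(start),  # block numbers are hex-encoded
                "toBlock": hex(end),
            }],
        })
        start = end + 1
        request_id += 1
    return requests

# Placeholder address, not a real contract.
POOL_ADDRESS = "0x" + "00" * 20
payloads = build_get_logs_requests(POOL_ADDRESS, 18_000_000, 18_009_999)
print(len(payloads))
print(json.dumps(payloads[0]["params"][0]))
```

Each of these requests returns raw logs that still have to be decoded, transformed, and stored by the developer, and the whole exercise starts over for each additional chain.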

Developers can use centralized services like Alchemy or Infura to address these limitations. However, this approach undermines decentralization, the core value of blockchain technology. Even if smart contracts are decentralized, centralized data access can introduce censorship risks and single points of failure. The blockchain ecosystem needs fundamental innovation in data access methods to achieve true accessibility.

2. Subsquid: A New Paradigm for Blockchain Data Infrastructure

Source: SQD

Subsquid (SQD) is a decentralized data infrastructure project aimed at addressing the complexity and inefficiency of accessing blockchain data. The goal of SQD is to enable anyone to easily leverage blockchain data.

Source: Tiger Research

Return to the previous analogy. In the past, everyone had to carry a bucket to the river to fetch water directly. Now, distributed water purification plants draw water from the river and purify it. The people in the town no longer need to go to the riverbank. They can access clean water whenever they need it. The SQD team provides this infrastructure through the "SQD Network."

SQD Network operates as a distributed query engine and data lake. It currently supports processing data from over 200 blockchain networks. Since the mainnet launch in June 2024, its scale has grown to handle hundreds of millions of queries each month. This growth stems from three core features. These features elevate SQD beyond a simple data indexing platform and showcase the evolution of blockchain data infrastructure.

2.1. Decentralized architecture for high availability

A large part of the existing blockchain data infrastructure relies on centralized providers like Alchemy. This approach has advantages in terms of initial accessibility and management efficiency. However, it limits users to only the chains supported by the provider and can incur high costs as usage increases. It is also susceptible to single points of failure. This centralized structure contradicts decentralization, the core value of blockchain.

The SQD network addresses these limitations through a decentralized architecture. Data providers collect raw data from multiple blockchains such as Ethereum and Solana. They segment the data into chunks, compress it, and upload it to the network along with metadata. Worker nodes take the data that providers have placed in permanent storage and divide it among themselves in chunks for distributed storage. When a query arrives, the workers holding the relevant data process it and respond quickly. Each worker node acts like a mini API, providing its stored data. The entire network operates like thousands of distributed API servers. Gateway operators serve as the interface between end users and the network. They receive user queries and forward them to the appropriate worker nodes for processing.

Source: SQD

Anyone can participate as a worker node or gateway operator. This allows network capacity and processing performance to scale horizontally. Data is stored redundantly across multiple worker nodes. Even if some nodes fail, overall data access is not affected. This ensures high availability and resilience.
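The interplay of gateways, worker nodes, and redundant storage can be illustrated with a toy model. This is a simplified sketch under stated assumptions, not SQD's actual routing logic: the `Worker` and `Gateway` classes, the chunk ids, and the "try the next replica" policy are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Worker:
    name: str
    chunks: set = field(default_factory=set)  # chunk ids held by this worker
    online: bool = True

    def query(self, chunk_id):
        if not self.online:
            raise ConnectionError(f"{self.name} is offline")
        return f"data for chunk {chunk_id} from {self.name}"

class Gateway:
    """Routes a chunk request to any online worker that holds the chunk."""

    def __init__(self, workers):
        self.workers = workers

    def fetch(self, chunk_id):
        for worker in self.workers:
            if chunk_id in worker.chunks:
                try:
                    return worker.query(chunk_id)
                except ConnectionError:
                    continue  # this replica is down, try the next one
        raise LookupError(f"no online worker holds chunk {chunk_id}")

# Every chunk is stored on two of the three workers.
workers = [
    Worker("worker-a", {0, 1}),
    Worker("worker-b", {1, 2}),
    Worker("worker-c", {0, 2}),
]
gateway = Gateway(workers)
workers[0].online = False      # simulate a node failure
result = gateway.fetch(0)      # still served, by the surviving replica
print(result)
```

Because each chunk lives on more than one worker, losing a single node changes which worker answers but not whether the request succeeds.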

During the current bootstrapping phase, data providers are managed by the SQD team. This strategy ensures initial data quality and stability. As the network matures, external providers will be able to participate through token governance. This will make the data procurement phase completely decentralized.

2.2. Token Economics for Ensuring Network Sustainability

For a distributed network to function properly, participants need an incentive to act voluntarily. SQD addresses this issue through an economic incentive structure centered around the native token $SQD. Each participant stakes or delegates tokens based on their roles and responsibilities. This collectively builds the stability and reliability of the network.

The worker nodes are the core operators managing blockchain data. To participate, each must stake 100,000 $SQD as collateral against malicious behavior or the provision of incorrect data. If such issues arise, the network confiscates the staked collateral. Nodes that continuously provide stable and accurate data earn $SQD token rewards. This naturally incentivizes responsible operation.

Gateway operators must lock $SQD tokens to process user requests. The amount of locked tokens determines their bandwidth, which is the number of requests they can handle. A longer locking period allows them to handle more requests.

Token holders can participate in the network indirectly without running a node themselves. They can delegate their stake to trusted worker nodes. Nodes that receive more delegation gain the ability to process more queries and earn more rewards. Delegators share a portion of these rewards. Currently, there are no minimum delegation requirements or lock-up period restrictions. This creates a permissionless curation system where the community can select nodes in real time. The entire community participates in network quality management through this structure.
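The report does not specify how rewards are divided between a worker node and its delegators, so the sketch below should be read as one plausible arrangement rather than SQD's actual formula. It assumes a hypothetical `delegator_share` fraction and a pro-rata split among delegators by delegated amount.

```python
def split_rewards(epoch_reward, delegations, delegator_share=0.5):
    """Split one period's reward between a worker and its delegators.

    delegator_share is a hypothetical parameter: the report only says
    that delegators receive "a portion" of the rewards.
    """
    if not 0 <= delegator_share <= 1:
        raise ValueError("delegator_share must be between 0 and 1")
    total_delegated = sum(delegations.values())
    if total_delegated == 0:
        return epoch_reward, {}  # nothing delegated, worker keeps everything
    pool = epoch_reward * delegator_share
    payouts = {
        holder: pool * amount / total_delegated  # pro rata by delegated amount
        for holder, amount in delegations.items()
    }
    return epoch_reward - pool, payouts

worker_cut, payouts = split_rewards(
    epoch_reward=1_000,
    delegations={"alice": 30_000, "bob": 10_000},
)
print(worker_cut, payouts)
```

Under these assumptions, a delegator's payout depends only on their fraction of the total delegation, which is what lets token holders steer rewards toward the nodes they trust.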

2.3. Modular structure for achieving flexibility

Another notable feature of the SQD network is its modular structure. Existing indexing solutions adopt a monolithic structure, handling all aspects such as data collection, processing, storage, and querying within a single system. This simplifies the initial setup but limits developers' freedom to choose data processing methods or storage locations.

SQD completely separates the data access layer from the processing layer. The SQD network only handles the E (Extract) part of the ETL (Extract-Transform-Load) process. It only serves as a "data feed", quickly and reliably extracting raw data from the blockchain. Developers can freely choose how to transform and store data using the SQD SDK.

This structure provides practical flexibility. Developers can store data in PostgreSQL and serve it through a GraphQL API. They can export it as CSV or Parquet files. They can load it directly into cloud data warehouses like Google BigQuery. Future plans include Snowflake support for large-scale data analysis environments and Kafka integration, which would stream data directly to real-time analysis and monitoring platforms without separate storage.
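The separation between extraction and the developer-owned transform and load steps can be sketched as follows. This is an illustration of the pattern only and does not use the SQD SDK: the event fields, the `CsvSink` and `ListSink` classes, and the sample data are assumptions, with the `raw_events` list standing in for the feed the network would supply.

```python
import csv
import io

def transform(raw_event):
    """Transform step: keep only the fields the application needs."""
    return {
        "block": raw_event["block_number"],
        "token": raw_event["token"].upper(),
        "amount": raw_event["amount_wei"] / 10**18,
    }

class CsvSink:
    """Load option 1: write rows as CSV into a text buffer."""
    def __init__(self):
        self.buffer = io.StringIO()
        self.writer = csv.DictWriter(
            self.buffer, fieldnames=["block", "token", "amount"]
        )
        self.writer.writeheader()

    def write(self, row):
        self.writer.writerow(row)

class ListSink:
    """Load option 2: an in-memory stand-in for a database table."""
    def __init__(self):
        self.rows = []

    def write(self, row):
        self.rows.append(row)

def run_pipeline(extracted, sinks):
    """Apply the transform once and fan each row out to every sink."""
    for raw in extracted:
        row = transform(raw)
        for sink in sinks:
            sink.write(row)

# Stand-in for the extracted feed (illustrative values).
raw_events = [
    {"block_number": 100, "token": "weth", "amount_wei": 2 * 10**18},
    {"block_number": 101, "token": "dai", "amount_wei": 5 * 10**17},
]
csv_sink, list_sink = CsvSink(), ListSink()
run_pipeline(raw_events, [csv_sink, list_sink])
print(csv_sink.buffer.getvalue())
```

Swapping the storage target means adding another sink class with a `write` method; the extraction and transform code stay the same.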

The co-founder of SQD, Dmitry Zhelezov, likened this to "providing Lego blocks". SQD does not provide finished products, but hands over the highest-performing, most reliable raw materials to developers. Developers can assemble these materials according to their needs to complete their data infrastructure. Traditional enterprises and crypto projects can use familiar tools and languages to process blockchain data. They can flexibly build data pipelines optimized for their specific industries and use cases.

3. The Next Step for Subsquid: Moving Towards Better Data Infrastructure

The SQD team has reduced the complexity and inefficiency of accessing blockchain data through the SQD network, laying the foundation for a decentralized data infrastructure. However, as the scale and scope of blockchain data usage expand rapidly, simple accessibility is no longer sufficient. The ecosystem now requires faster processing speeds and more flexible utilization environments.

The SQD team is advancing the network architecture to meet these needs. The team focuses on improving data processing speed and creating structures capable of processing data without relying on servers. To achieve this goal, SQD is developing two components in phases: 1) SQD Portal and 2) Light Squid.

3.1. SQD Portal: Decentralized Parallel Processing and Real-Time Data

In the existing SQD network, the gateway acts as an intermediary connecting end users and worker nodes. When a user submits a query, the gateway forwards it to the appropriate worker node and returns the response to the end user. This process is stable, but queries can only be processed sequentially, one at a time. Large-scale queries take a considerable amount of time. Even with thousands of worker nodes available, the system could not fully utilize their processing capabilities.

Source: SQD

The SQD team aims to solve this issue through the SQD Portal. The core of the Portal is decentralization and parallel processing. It splits a single query into multiple parts and sends requests to 3,000 or more worker nodes simultaneously. Each worker node processes the part assigned to it in parallel. The Portal then collects these responses in real time and delivers them via streaming.
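The split, fan-out, and streaming steps described here can be sketched in a few lines. This is a minimal illustration of the general technique, not Portal's implementation: the functions are hypothetical, local threads stand in for requests to remote worker nodes, and the query is assumed to be a simple block range.

```python
from concurrent.futures import ThreadPoolExecutor

def split_range(first_block, last_block, parts):
    """Split an inclusive block range into up to `parts` contiguous sub-ranges."""
    total = last_block - first_block + 1
    size = -(-total // parts)  # ceiling division
    return [
        (start, min(start + size - 1, last_block))
        for start in range(first_block, last_block + 1, size)
    ]

def worker_query(block_range):
    """Stand-in for a worker node: returns one record per block in its range."""
    start, end = block_range
    return [{"block": n} for n in range(start, end + 1)]

def parallel_query(first_block, last_block, parts=4):
    """Fan sub-queries out in parallel and stream results back in block order."""
    ranges = split_range(first_block, last_block, parts)
    with ThreadPoolExecutor(max_workers=parts) as pool:
        # map() runs the calls concurrently but yields results in input order.
        for records in pool.map(worker_query, ranges):
            yield from records

blocks = [record["block"] for record in parallel_query(1, 10, parts=4)]
print(blocks)
```

Because `parallel_query` is a generator, the consumer can start reading the first sub-range while later ones are still being processed, which mirrors the streaming delivery described above.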

Portal also pre-fetches data into a buffer. This ensures uninterrupted delivery even in the event of network delays or temporary failures. Just as YouTube buffers videos for seamless playback, users can receive data without waiting. The team has also rewritten the original Python-based query engine in Rust. This has significantly improved parallel processing performance. Overall processing is now dozens of times faster than before.

Portal goes a step further to address real-time data. No matter how fast data processing becomes, worker nodes only store confirmed historical blocks. They cannot serve the latest transactions or blocks that were just produced. Previously, users had to rely on external RPC nodes to obtain this information. Portal addresses this issue through a real-time distributed stream called "Hotblocks." Hotblocks collects newly produced, unconfirmed blocks in real time from blockchain RPC nodes or dedicated streaming services and stores them within Portal. Portal merges the confirmed historical data from worker nodes with the latest block data from Hotblocks. Users can receive data from the past to the present in a single request, without needing separate RPC connections.
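The merge of confirmed history with the most recent blocks can be pictured as joining two ordered streams at the point where the historical data ends. The sketch below is an assumption about how such a merge could work, not a description of Portal internals; the `merge_streams` function and the `finalized` flag are hypothetical.

```python
def merge_streams(finalized, hot):
    """Yield blocks in ascending order: confirmed history, then recent blocks.

    Both inputs are lists of dicts sorted by "number". A recent block whose
    number is already covered by confirmed data is skipped, so the consumer
    never sees the same height twice.
    """
    last_finalized = None
    for block in finalized:
        last_finalized = block["number"]
        yield {**block, "finalized": True}
    for block in hot:
        if last_finalized is not None and block["number"] <= last_finalized:
            continue  # already delivered from confirmed history
        yield {**block, "finalized": False}

history = [{"number": n} for n in (98, 99, 100)]    # from worker nodes
hotblocks = [{"number": n} for n in (100, 101, 102)]  # recent, unconfirmed
merged = list(merge_streams(history, hotblocks))
print([(b["number"], b["finalized"]) for b in merged])
```

Marking each block as confirmed or not lets a consumer treat the tail of the stream as provisional, since unconfirmed blocks may still change.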

The SQD team plans to fully transition the existing gateway to Portal. Portal is currently in a closed testing phase. In the future, anyone will be able to run Portal nodes directly and perform the role of a gateway in the network. Existing gateway operators will naturally transition to Portal operators. (The SQD network architecture can be found at this link.)

3.2. Light Squid: Indexing in Local Environment

The SQD network reliably provides data, but developers still bear the burden of operating their own servers. Even when retrieving data from worker nodes through the Portal, large database servers like PostgreSQL are needed to process the data and deliver it to users. This requires significant infrastructure build-out and maintenance costs. Data also still depends on a single provider (the developer's server), which is far from a truly decentralized structure.

Light Squid simplifies this intermediate step. The original structure resembles a wholesaler (developer) operating a large warehouse (server) to distribute data to consumers. Light Squid transforms it into a D2C (Direct-to-Consumer) approach, delivering data directly from the source (SQD Network) to the end user. Users receive the necessary data through the Portal and store it in their local environment. They can query it directly in their browser or personal devices. Developers do not need to maintain separate servers. Even if the user's network connection is interrupted, they can view the locally stored data.

For example, an application that displays NFT transaction history can run directly in the user's browser without a central server. This is similar to how Instagram can display a feed offline in Web2. The aim is to provide a smooth user experience for dApps in a local environment. However, Light Squid is designed as an option for reproducing the same indexing environment locally. It does not completely replace a server-centric structure. Data is still supplied through a distributed network. As utilization expands to the user level, the SQD ecosystem is expected to evolve into a more accessible form.
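The idea of storing data locally and querying it without a server can be illustrated with an embedded database. The sketch below uses Python's built-in `sqlite3` module purely as a stand-in for on-device storage; a browser application would use a different storage engine, and the table layout, function names, and sample transfers are assumptions.

```python
import sqlite3

def open_local_store():
    """Create an in-memory SQLite database as a stand-in for on-device storage."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE nft_transfers ("
        "block INTEGER, token_id INTEGER, sender TEXT, recipient TEXT)"
    )
    return db

def store_transfers(db, transfers):
    """Persist transfers received from the network into the local store."""
    db.executemany(
        "INSERT INTO nft_transfers VALUES (:block, :token_id, :sender, :recipient)",
        transfers,
    )
    db.commit()

def history_for_token(db, token_id):
    """Query locally; no server round trip is needed once the data is stored."""
    rows = db.execute(
        "SELECT block, sender, recipient FROM nft_transfers "
        "WHERE token_id = ? ORDER BY block",
        (token_id,),
    )
    return rows.fetchall()

db = open_local_store()
store_transfers(db, [
    {"block": 12, "token_id": 7, "sender": "0xaaa", "recipient": "0xbbb"},
    {"block": 10, "token_id": 7, "sender": "0xccc", "recipient": "0xaaa"},
    {"block": 11, "token_id": 9, "sender": "0xddd", "recipient": "0xeee"},
])
token_history = history_for_token(db, 7)
print(token_history)
```

Once the transfers are stored, the history query works even with no network connection, which is the offline behavior the section describes.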

4. How Subsquid Works in Practice

The SQD network is merely an infrastructure that provides data, but its application scope is limitless. Just as all IT-based industries begin with data, improvements in data infrastructure expand the possibilities of all services built upon it. SQD is already changing the way blockchain data is utilized across various fields and has delivered concrete results.

4.1. DApp Developers: Unified Multi-Chain Data Management

The decentralized exchange PancakeSwap is a representative case. In a multi-chain environment, the exchange must summarize the transaction volume, liquidity pool data, and token pair information of each chain in real time. In the past, developers had to connect RPC nodes for each chain, parse event logs, and align different data structures individually. This process would be repeated every time a new chain was added. With each protocol upgrade, the maintenance burden would increase.

After adopting SQD, PancakeSwap can manage data from multiple chains through a unified pipeline. SQD provides data from each chain in a standardized format, so a single indexer can handle all chains simultaneously. Adding a new chain only requires a configuration change. The data processing logic is managed consistently in one place. The development team has reduced the time spent on managing data infrastructure and can focus more on improving core services.
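What "adding a chain is only a configuration change" means in practice can be shown with a small sketch. It is not PancakeSwap's or SQD's code: the `CHAINS` table, the record fields, and the sample swaps are hypothetical, and only the chain ids (1, 56, 42161) are real network identifiers.

```python
CHAINS = {
    # chain name -> per-chain settings (decimals are illustrative)
    "ethereum": {"chain_id": 1, "native_decimals": 18},
    "bsc": {"chain_id": 56, "native_decimals": 18},
}

def normalize(chain, raw_swap):
    """Map a chain-specific swap record onto one shared schema."""
    settings = CHAINS[chain]
    return {
        "chain_id": settings["chain_id"],
        "pair": raw_swap["pair"],
        "volume": raw_swap["amount"] / 10 ** settings["native_decimals"],
    }

def index_all(raw_by_chain):
    """One indexer loop handles every configured chain."""
    rows = []
    for chain in CHAINS:
        for raw_swap in raw_by_chain.get(chain, []):
            rows.append(normalize(chain, raw_swap))
    return rows

raw = {
    "ethereum": [{"pair": "WETH/USDC", "amount": 3 * 10**18}],
    "bsc": [{"pair": "WBNB/CAKE", "amount": 10**18}],
    "arbitrum": [{"pair": "WETH/ARB", "amount": 2 * 10**18}],
}
before = index_all(raw)  # arbitrum is not configured yet, so it is skipped

# Adding a chain is a configuration change only; the logic is untouched.
CHAINS["arbitrum"] = {"chain_id": 42161, "native_decimals": 18}
after = index_all(raw)
print(len(before), len(after))
```

The normalization and indexing functions never mention a specific chain, which is why the maintenance burden no longer grows with each chain added.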

4.2. Data Analysts: Flexible Data Processing and Integration Analysis

On-chain analysis platforms like Dune and Artemis offer high accessibility and convenience by allowing data queries to be performed quickly and easily using SQL. However, their limitations lie in the fact that work can only be done within the chains and data structures supported by the platform. When combining external data or executing complex transformations, additional processes are required.

SQD enhances this environment, allowing data analysts to handle data more freely. Users can directly extract the blockchain data they need, convert it into the required format, and load it into their own databases or repositories. For example, analysts can retrieve trading data from specific decentralized exchanges, aggregate it over time periods, combine it with existing financial data, and apply it to their own analysis models. SQD does not replace the convenience of existing platforms. It increases the freedom and scalability of data processing. Analysts can expand the depth and application range of on-chain data analysis through a broader range of data and customized processing methods.
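The analyst workflow in this paragraph (aggregate trades over time, then combine them with existing financial data) can be sketched with the standard library alone. The trade records, the reference price, and all function names below are made up for illustration.

```python
from collections import defaultdict
from datetime import datetime, timezone

def hour_key(timestamp):
    """Truncate a Unix timestamp to the start of its UTC hour, as ISO 8601 text."""
    moment = datetime.fromtimestamp(timestamp, tz=timezone.utc)
    return moment.replace(minute=0, second=0, microsecond=0).isoformat()

def hourly_volume(trades):
    """Sum trade volume per UTC hour."""
    buckets = defaultdict(float)
    for trade in trades:
        buckets[hour_key(trade["timestamp"])] += trade["volume"]
    return dict(buckets)

def join_with_reference(volumes, reference_prices):
    """Combine on-chain volume with an off-chain series keyed by the same hour."""
    return [
        {"hour": hour, "volume": volume, "ref_price": reference_prices[hour]}
        for hour, volume in sorted(volumes.items())
        if hour in reference_prices
    ]

# Illustrative trades: two in the same hour, one in the following hour.
trades = [
    {"timestamp": 1_700_000_000, "volume": 1.5},
    {"timestamp": 1_700_000_100, "volume": 2.5},
    {"timestamp": 1_700_003_600, "volume": 4.0},
]
volumes = hourly_volume(trades)

# Hypothetical off-chain data covering only the first hour.
reference_prices = {hour_key(1_700_000_000): 2000.0}
joined = join_with_reference(volumes, reference_prices)
print(joined)
```

The join keeps only hours present in both series, so gaps in the off-chain data show up as missing rows rather than as silently invented values.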

4.3. AI Agents: The Core Infrastructure of the Agent Economy

In order for AI agents to make autonomous decisions and execute trades, they need a reliable and transparent infrastructure. Blockchain provides a suitable foundation for autonomous agents. All transaction records are transparent and difficult to tamper with. Cryptocurrency payments enable automated execution.

However, AI agents currently find it difficult to access blockchain infrastructure directly. Each developer must build and integrate data sources individually. Network structures vary, which hinders standardized access. Even centralized API services require multiple steps, including account registration, key issuance, and payment setup. These processes presuppose human intervention, which is not suitable for autonomous environments.

SQD Network bridges this gap. Based on a permissionless architecture, agents automate data requests and payments through the $SQD token. They receive necessary information in real-time and process it independently. This establishes an operational foundation for autonomous AI that connects directly to the data network without human intervention.

Source: Rezolve.Ai

On October 9, 2025, Rezolve AI announced the acquisition of SQD, further clarifying this direction. Rezolve is a Nasdaq-listed AI-based business solutions provider. Through this acquisition, Rezolve is building the core infrastructure of the AI agent economy. Rezolve plans to combine the digital asset payment infrastructure of the previously acquired Smartpay with the distributed data layer of SQD. This will create an integrated infrastructure allowing AI to process data, intelligence, and payments in a single workflow. Once Rezolve completes this integration, AI agents will be able to analyze blockchain data in real time and execute transactions independently. This marks an important turning point for SQD as data infrastructure for the AI agent economy.

4.4. Institutional Investors: Real-time Data Infrastructure for the Institutional Market

With the expansion of real-world asset (RWA) tokenization, institutional investors are actively participating on the blockchain. Institutions need a data infrastructure that ensures accuracy and transparency to utilize on-chain data for trading, settlement, and risk management.

Source: OceanStream

SQD has launched OceanStream to meet this demand. OceanStream is a decentralized data lakehouse platform that can stream data from over 200 blockchains in real time. The platform is designed to provide institutional-level data quality and stability. It combines sub-second latency streaming with over 3 PB of indexed historical data to improve the backtesting, market analysis, and risk assessment environment for financial institutions. This enables institutions to monitor more chains and asset classes in real time at a lower cost. They can carry out regulatory reporting and market monitoring within a single integrated system.

OceanStream participated in a roundtable meeting of the cryptocurrency working group hosted by the U.S. Securities and Exchange Commission, discussing how the transparency and verifiability of on-chain data affect market stability and investor protection. This indicates that SQD is establishing itself as a data-driven structure that connects tokenized financial markets with institutional capital, rather than merely developing infrastructure.

5. Vision of SQD: Building the Data Pillar of Web3

The competitiveness of the Web3 industry depends on its ability to utilize data. However, because blockchains differ in structure, data remains fragmented. The infrastructure to effectively address this issue is still in its early stages. SQD bridges this gap by building a standardized data layer that processes all blockchain data within a single structure. In addition to on-chain data, SQD also plans to integrate off-chain data, including financial transactions, social media, and enterprise operations, to create an analytical environment that spans both worlds.

This vision is similar to how Snowflake sets the standard for data integration in traditional industries with "one platform, multiple workloads." SQD aims to establish itself as the data pillar of Web3 by integrating blockchain data and connecting off-chain data sources.

However, SQD needs time to develop into a fully decentralized infrastructure. The project is currently in the bootstrapping stage, and the SQD team still plays an important role. The developer community remains limited in scale and the ecosystem in diversity. Nevertheless, the growth demonstrated in just over a year since the mainnet launch, along with the strategic expansion through the acquisition by Rezolve AI, shows a clear direction. SQD is paving the way for blockchain data infrastructure and evolving into the data foundation that supports the entire Web3 ecosystem, from dApp development to institutional investment to the AI agent economy. Its potential is expected to grow significantly.
