IOSG: Where is the way out for homogeneous Web3 + AI infrastructure?

Author: IOSG

TL;DR

As the combination of Web3 and AI becomes a hot topic in the crypto world, AI infrastructure is flourishing. Yet few applications actually use AI or are built for it, and the homogenization of AI infrastructure is becoming apparent. Our recent participation in RedPill's first financing round prompted some deeper reflections.
The main toolkits for building AI Dapps are decentralized OpenAI access, GPU networks, inference networks, and agent networks.
GPU networks are even hotter than in the Bitcoin-mining era because the AI market is larger and growing quickly and steadily; AI supports millions of applications every day; AI requires a diverse range of GPU models and server locations; the technology is more mature than before; and the target customer base is wider.
Inference networks and agent networks share similar infrastructure but differ in focus. Inference networks mainly serve experienced developers deploying their own models, and running non-LLM models does not necessarily require a GPU. Agent networks focus on LLMs: developers do not bring their own models but instead concentrate on prompt engineering and on wiring agents together. Agent networks always require high-performance GPUs.
AI infrastructure projects make enormous promises and are still rolling out new features.
Most crypto-native projects are still at the testnet stage, with poor stability, complex configuration, and limited functionality, and they need time to prove their security and privacy.
Assuming AI Dapps become a major trend, many areas remain underdeveloped: monitoring, RAG-related infrastructure, Web3-native models, decentralized agents with built-in crypto-native APIs and data, evaluation networks, and more.
Vertical integration is a notable trend: infrastructure projects try to offer one-stop services that simplify the work of AI Dapp developers.
The future will be hybrid: some inference happens on the frontend and some on-chain, balancing cost against verifiability.
Source: IOSG
Introduction
The combination of Web3 and AI is one of the most eye-catching topics in crypto. Talented developers are building AI infrastructure for the crypto world, aiming to bring intelligence into smart contracts. Building an AI Dapp is extremely complex: developers must handle data, models, computing power, operations, deployment, and integration with the blockchain.
To meet these needs, Web3 founders have developed many early solutions, such as GPU networks, community data annotation, community-trained models, verifiable AI inference and training, and agent stores. Yet against this prosperous infrastructure backdrop, few applications actually use AI or are built for it.
Developers searching for AI Dapp development tutorials discover that few relate to crypto-native AI infrastructure; most simply call the OpenAI API from the frontend.
Source: IOSG Ventures
Current applications have not fully exploited the decentralization and verifiability of blockchains, but this will soon change. Most crypto-focused AI infrastructure has now launched testnets and plans to go live within the next six months. This study introduces the main tools available in crypto AI infrastructure in detail. Let's get ready for the crypto world's GPT-3.5 moment!
RedPill: Providing Decentralized Access to OpenAI
RedPill, mentioned above, is a good entry point. OpenAI offers several world-class models, such as GPT-4-vision, GPT-4-turbo, and GPT-4o, which are the preferred choices for building advanced AI Dapps. Developers can integrate them into Dapps by calling the OpenAI API through an oracle or a frontend interface.
RedPill aggregates OpenAI API access contributed by different developers behind a single interface, providing fast, economical, and verifiable AI services to users worldwide and democratizing access to top AI models. RedPill's routing algorithm directs a developer's request to a single contributor, and API requests are executed through its distribution network, bypassing potential restrictions from OpenAI and addressing common developer pain points such as:
• TPM (tokens per minute) limits: new accounts have limited token quotas that cannot meet the needs of popular, AI-dependent Dapps.
• Access restrictions: some models restrict access for new accounts or for certain countries.
By keeping the request code identical and changing only the hostname, developers can access OpenAI models at low cost, with high scalability and no hard caps, as sketched below.
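To make the "change the hostname" pattern concrete, here is a minimal sketch using the official openai Python client. The base URL and API key are placeholders, not RedPill's real endpoint; consult RedPill's documentation for the actual host and authentication scheme.

```python
# Minimal sketch: the request body is a normal OpenAI call; only the host differs.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.redpill.example/v1",  # hypothetical RedPill endpoint
    api_key="YOUR_REDPILL_KEY",                 # issued by the router, not OpenAI
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Summarize this DAO proposal."}],
)
print(response.choices[0].message.content)
```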
GPU Network
Besides using OpenAI's API, many developers choose to host models themselves. They can rely on decentralized GPU networks such as io.net, Aethir, and Akash to build GPU clusters and to deploy and run powerful internal or open-source models.
Such decentralized GPU networks pool the computing power of individuals and small data centers, offering flexible configurations, a wider choice of server locations, and lower costs, so developers can run AI experiments on a limited budget. Because of their decentralized nature, however, these networks still have limitations in functionality, availability, and data privacy.
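Once a GPU is rented from such a network, "running an open-source model" is usually just standard tooling on the provisioned machine. A hedged sketch with Hugging Face transformers; the model choice is illustrative, and large models need a card with sufficient VRAM.

```python
# Sketch: serve an open-source LLM on a rented GPU with standard tooling.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="mistralai/Mistral-7B-Instruct-v0.2",  # any open model you can host
    device_map="auto",                           # place weights on available GPUs
)

out = generator("Explain restaking in one sentence.", max_new_tokens=60)
print(out[0]["generated_text"])
```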
Demand for GPUs over the past few months has been intense, outstripping even the earlier Bitcoin-mining craze. Reasons include:
• A larger target customer base: GPU networks now serve AI developers, who are both numerous and more loyal, and who are not swayed by cryptocurrency price volatility.
• More variety than mining-specific hardware: decentralized GPU networks offer a wider range of models and specifications, matching real workloads better. Large models need high-VRAM cards, while smaller tasks can use cheaper GPUs. Decentralized GPUs can also serve end users nearby, reducing latency.
• Maturing technology: GPU networks now lean on high-speed blockchains such as Solana for settlement, Docker virtualization, and Ray computing clusters (see the Ray sketch after this list).
• Investment returns: the AI market is expanding, with many opportunities to build new applications and models. The expected return on an H100 is 60-70%, whereas Bitcoin mining is harder, winner-takes-all, and has capped output.
• Bitcoin-mining companies such as Iris Energy, Core Scientific, and Bitdeer have begun supporting GPU networks, offering AI services, and actively buying AI-oriented GPUs such as the H100.
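The Ray pattern these networks build on is worth seeing once: a task decorated with a GPU requirement gets scheduled onto whichever worker in the cluster has a free GPU. A minimal sketch; in a real deployment `ray.init` would join an existing multi-node cluster, and the task body would load an actual model.

```python
import ray

ray.init()  # locally for the sketch; a GPU network would join a running cluster

@ray.remote(num_gpus=1)  # Ray schedules this onto a worker with a free GPU
def run_inference(prompt: str) -> str:
    # In a real deployment this would load a model onto the allocated GPU.
    return f"completion for: {prompt}"

futures = [run_inference.remote(p) for p in ["hello", "world"]]
print(ray.get(futures))
```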
Recommendation: for Web2 developers who do not need strict SLAs, io.net offers a simple, user-friendly experience and is a cost-effective choice.
Inference Network
This is the core of crypto-native AI infrastructure, and it will support billions of AI inference operations in the future. Many AI Layer 1s and Layer 2s give developers the ability to call AI inference natively on-chain. Market leaders include Ritual, Valence, and Fetch.ai.
These networks differ in performance (latency, computation time), verifiability of supported models, cost (on-chain consumption, inference cost), and developer experience.
3.1 Goals
Ideally, developers could access custom AI inference services anywhere, with any form of proof, and with almost no friction during integration. An inference network provides all the basic support developers need: on-demand generation and validation of proofs, inference computation, relaying and verification of inference data, Web2 and Web3 interfaces, one-click model deployment, system monitoring, cross-chain operations, synchronous integration, scheduled execution, and so on.
With these features, developers can seamlessly integrate inference into their existing smart contracts. For example, a DeFi trading bot might use machine learning models to find buy and sell opportunities for specific trading pairs and execute the corresponding strategies on the underlying trading platform.
In the fully ideal state, all infrastructure is cloud-hosted. Developers simply upload their trading-strategy model in a common format such as torch (PyTorch), and the inference network stores it and serves both Web2 and Web3 queries.
Once model deployment is complete, developers can invoke inference directly through a Web3 API or from smart contracts. The inference network keeps executing the trading strategies and feeds results back to the underlying smart contract. If the developer manages a large amount of community funds, verification of the inference results is also required. Once results are received, the smart contract executes trades accordingly.
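As a sketch of what "uploading in torch format" involves, TorchScript serializes a model so a hosted network could load it without the original Python code. The strategy model here is a toy, and the upload step itself is hypothetical since each network defines its own deployment API.

```python
import torch
import torch.nn as nn

class Strategy(nn.Module):
    def forward(self, features: torch.Tensor) -> torch.Tensor:
        # Toy signal: positive score means buy, negative means sell.
        return features.mean(dim=-1)

scripted = torch.jit.script(Strategy())  # serialize model + logic together
scripted.save("strategy.pt")             # this artifact is what gets uploaded
```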
3.1.1 Asynchronous vs. Synchronous
In theory, asynchronous inference yields better performance, but it can hurt the development experience. In asynchronous mode, developers first submit a task to the inference network's smart contract; when the task completes, the network's contract returns the result. This programming model splits the logic into two parts: the inference call and the handling of the inference result.
The situation worsens when developers have nested inference calls and heavy control logic. Asynchronous programming also makes integration with existing smart contracts difficult: it forces developers to write a lot of extra code, handle errors, and manage dependencies. Synchronous programming is more intuitive, but it creates problems for response time and blockchain design. For instance, if the input is fast-moving data such as block time or price, the data may be stale by the time inference finishes, which in some cases forces the smart contract's execution to be rolled back. Imagine trading at an outdated price.
Most AI infrastructure adopts asynchronous processing, but Valence is trying to address these issues.
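To make the two-part structure concrete, here is a minimal off-chain sketch using web3.py against a hypothetical inference-network contract. The RPC URL, ABI, and the requestInference/InferenceCompleted names are illustrative, not any real network's API, and web3.py filter keyword arguments vary slightly across library versions.

```python
from web3 import Web3
import time

w3 = Web3(Web3.HTTPProvider("https://rpc.example.org"))  # placeholder RPC
INFERENCE_ABI = [...]  # placeholder: ABI of the hypothetical inference contract
inference = w3.eth.contract(
    address="0x0000000000000000000000000000000000000000", abi=INFERENCE_ABI
)

# Part 1: submit the task. The transaction returns before any inference runs.
tx = inference.functions.requestInference(b"model-id", b"input").transact()
w3.eth.wait_for_transaction_receipt(tx)

# Part 2: handle the result separately, once the network emits it as an event.
# In a Solidity consumer, this half would be a dedicated callback function.
flt = inference.events.InferenceCompleted.create_filter(fromBlock="latest")
while True:
    entries = flt.get_new_entries()
    if entries:
        print("inference result:", entries[0]["args"]["output"])
        break
    time.sleep(2)
```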
3.2 Reality
In fact, many new inference networks are still in testing, Ritual among them. According to its public documentation, current functionality is limited (features such as verification and proofs are not yet live). Rather than providing cloud infrastructure for on-chain AI computation, they offer a framework for self-hosting the computation and delivering results on-chain. One example is an architecture for AIGC NFTs: a diffusion model generates the artwork and uploads it to Arweave, and the inference network then uses that Arweave address to mint the NFT on-chain.
This process is very complex: developers must deploy and maintain most of the infrastructure themselves, including a Ritual node with custom service logic, a Stable Diffusion node, and the NFT smart contract.
Recommendation: current inference networks are cumbersome for integrating and deploying custom models, and at this stage most do not yet support verification. Calling AI from the frontend remains the simpler option for developers. If you genuinely need verifiability, the ZKML provider Giza is a good choice.
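ZKML pipelines generally start from a standard model export; Giza's public tooling, for instance, describes working from ONNX models, though treat the exact workflow as an assumption and check current documentation. A minimal sketch of that first step:

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))
dummy_input = torch.randn(1, 8)  # example input fixing the model's shape

torch.onnx.export(model, dummy_input, "classifier.onnx",
                  input_names=["features"], output_names=["score"])
# The .onnx file is then handed to the ZKML toolchain, which compiles it into
# a provable circuit so inference results can be verified on-chain.
```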
Agent Network
Agent networks let users easily build custom agents. Such a network consists of entities or smart contracts that autonomously perform tasks and interact with each other and with the blockchain, without direct human intervention. It mainly targets LLM technology. For example, it could provide a GPT chatbot with deep knowledge of Ethereum. Today's tooling for such chatbots is limited, so developers cannot yet build complex applications on top of them.
In the future, however, agent networks will give agents more tools: not just knowledge but also the ability to call external APIs and perform specific tasks. Developers will be able to chain multiple agents into workflows. For example, writing a Solidity smart contract might involve several specialized agents: a protocol design agent, a Solidity development agent, a code security review agent, and a Solidity deployment agent.
We coordinate these agents' collaboration through prompts and scenarios. Examples of agent networks include Flock.ai, Myshell, and Theoriq.
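A minimal sketch of that chaining idea: each "agent" is an LLM call with a specialized system prompt, and the output of one feeds the next. The openai client stands in for any LLM backend; real agent networks layer routing, tools, and on-chain settlement on top of this pattern.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def agent(role: str, task: str) -> str:
    # One "agent" = one specialized system prompt around a shared LLM.
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "system", "content": role},
                  {"role": "user", "content": task}],
    )
    return resp.choices[0].message.content

design = agent("You are a protocol design agent.", "Design a simple escrow contract.")
code = agent("You are a Solidity development agent.", f"Implement this design:\n{design}")
review = agent("You are a code security review agent.", f"Audit this contract:\n{code}")
print(review)
```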
Recommendation: most current agents have fairly limited functionality. For specific use cases, Web2 agents serve better and come with mature orchestration tools such as LangChain and LlamaIndex.
The difference between agent networks and inference networks
Agent networks focus on LLMs and provide tools such as LangChain to compose multiple agents. In most cases, developers do not build machine learning models themselves; the agent network has simplified model development and deployment, so they simply wire together the necessary agents and tools. Most end users interact with these agents directly.
Inference networks are the infrastructure underneath agent networks, giving developers lower-level access. Normally, end users do not use inference networks directly. Developers must deploy their own models, which are not limited to LLMs, and can access them through off-chain or on-chain endpoints. Agent networks and inference networks are not entirely independent products: we are beginning to see vertically integrated offerings that provide both agent and inference capabilities, since the two functions rely on similar infrastructure.
New Opportunities
Beyond model inference, training, and agent networks, many new areas in Web3 are worth exploring:
Datasets: how do we turn blockchain data into ML-ready datasets? Machine learning developers need more specific and specialized data. Giza, for instance, provides high-quality datasets for training DeFi machine learning models. An ideal dataset includes not just simple tabular data but also graph data describing interactions in the blockchain world. We still fall short here. Some projects, such as Bagel and Sahara, tackle this by rewarding individuals who create new datasets while promising to protect personal data privacy. A sketch of the extraction step follows.
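A minimal sketch of that extraction: pull a few blocks over JSON-RPC with web3.py and flatten transactions into a DataFrame. The RPC URL is a placeholder; real datasets add event decoding, labels, and the graph features mentioned above.

```python
from web3 import Web3
import pandas as pd

w3 = Web3(Web3.HTTPProvider("https://rpc.example.org"))  # placeholder RPC URL

rows = []
latest = w3.eth.block_number
for n in range(latest - 10, latest):
    block = w3.eth.get_block(n, full_transactions=True)
    for tx in block.transactions:
        rows.append({
            "block": n,
            "from": tx["from"],
            "to": tx["to"],
            "value_wei": tx["value"],
            "gas": tx["gas"],
        })

df = pd.DataFrame(rows)  # tabular features, ready for a model
print(df.head())
```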
Model storage: storing, distributing, and version-controlling large models is crucial for on-chain machine learning, since it affects both performance and cost. Pioneers here include Filecoin (FIL), Arweave (AR), and 0g.
Model training: distributed, verifiable model training is a challenge. Gensyn, Bittensor, Flock, Allora, and others have made significant progress.
Monitoring: since model inference happens both on-chain and off-chain, new infrastructure is needed to help Web3 developers track model usage and detect potential issues and biases. With the right monitoring tools, Web3 ML developers can adjust in time and keep improving model accuracy.
RAG infrastructure: distributed RAG demands a brand-new infrastructure environment, with heavy requirements on storage, embedding computation, and vector databases, all while keeping data private and secure. This differs greatly from today's Web3 AI infrastructure, most of which relies on third parties such as Firstbatch and Bagel for RAG.
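For readers unfamiliar with the core loop being distributed here, a minimal sketch: embed documents, embed the query, return the nearest document by cosine similarity. A distributed Web3 deployment replaces the in-memory arrays with a vector database and adds privacy guarantees; the embedding model is an illustrative choice.

```python
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")
docs = [
    "EIP-4844 introduces blob-carrying transactions.",
    "Uniswap v3 uses concentrated liquidity.",
]

doc_vecs = model.encode(docs, normalize_embeddings=True)
query_vec = model.encode(["How does Uniswap manage liquidity?"],
                         normalize_embeddings=True)

scores = doc_vecs @ query_vec.T        # cosine similarity on unit vectors
print(docs[int(np.argmax(scores))])    # retrieved context, handed to the LLM
```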
Models tailored for Web3: not every model suits Web3 scenarios. In most cases, models must be retrained for specific applications such as price prediction or recommendation. As AI infrastructure flourishes, we expect more Web3-native models serving AI applications. For example, Pond is building a blockchain GNN for scenarios such as price prediction, recommendation, fraud detection, and anti-money-laundering.
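A brief sketch of why GNNs fit blockchain data: addresses are nodes and transfers are edges, so the ledger is naturally a graph. networkx stands in for a real GNN framework here, and Pond's actual pipeline is not public; this only illustrates the data shape.

```python
import networkx as nx

transfers = [  # (from, to, value) rows, e.g. extracted as in the dataset sketch
    ("0xA", "0xB", 1.0),
    ("0xB", "0xC", 0.5),
    ("0xA", "0xC", 2.0),
]

g = nx.DiGraph()
for src, dst, value in transfers:
    g.add_edge(src, dst, weight=value)

# Simple graph features (degree, PageRank); a GNN learns richer versions of
# these for tasks like fraud detection or anti-money-laundering.
print(dict(g.out_degree()))
print(nx.pagerank(g, weight="weight"))
```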
Evaluation networks: evaluating agents without human feedback is hard. As agent-creation tools spread, there will be countless agents on the market, so a system is needed to showcase their capabilities and help users decide which agent performs best in a given situation. Neuronets, for example, is a player in this field.
Consensus mechanisms: PoS may not be the best choice for AI tasks; its main challenges are computational complexity, difficulty of verification, and lack of determinism. Bittensor has created a new intelligent consensus mechanism that rewards nodes for contributing machine learning models and outputs to the network.
Future Outlook
We observe a trend toward vertical integration: by building a foundational compute layer, a network can support a range of machine learning tasks, including training, inference, and agent-network services, aiming to be a one-stop solution for Web3 machine learning developers. On-chain inference today, despite being expensive and slow, offers excellent verifiability and seamless integration with backend systems such as smart contracts. I believe the future lies in hybrid applications: some inference will run on the frontend or off-chain, while critical, decision-making inference will run on-chain. This pattern is already common on mobile devices, which exploit local hardware to run small models quickly and offload more complex tasks to larger LLMs in the cloud.