The Shoal framework significantly reduces Bullshark's latency on Aptos, and its pipelining and leader-reputation mechanisms greatly enhance performance.
Shoal Framework: Improved Bullshark Latency on Aptos
Aptos Labs recently solved two significant open problems in DAG BFT, significantly reducing latency and, for the first time, eliminating the need for timeouts in a deterministic practical protocol. Overall, Bullshark's latency improved by 40% in the fault-free case and by 80% in the faulty case.
Shoal is a framework that enhances any Narwhal-based consensus protocol (such as DAG-Rider, Tusk, or Bullshark) through pipelining and leader reputation. Pipelining reduces DAG ordering latency by introducing an anchor in every round, while leader reputation further improves latency by ensuring that anchors are associated with the fastest validators. Moreover, leader reputation lets Shoal exploit asynchronous DAG construction to eliminate timeouts in all scenarios. This allows Shoal to provide a property we call universal responsiveness, which subsumes the optimistic responsiveness usually required.
Technically, Shoal runs multiple instances of the underlying protocol in order. So when instantiating Bullshark, we get a group of “sharks” participating in a relay race.
Motivation
In the pursuit of high performance in blockchain networks, the focus has long been on reducing communication complexity. However, this approach has not led to significant throughput gains. For example, the Hotstuff implementation in early versions of Diem achieved only 3,500 TPS, far below the 100k+ TPS target.
The recent breakthrough stems from the realization that data dissemination is the main bottleneck of leader-based protocols, and that it can benefit from parallelization. The Narwhal system separates data dissemination from the core consensus logic, proposing an architecture in which all validators disseminate data simultaneously while the consensus component orders only a small amount of metadata. The Narwhal paper reports a throughput of 160,000 TPS.
Previously, we introduced Quorum Store, our Narwhal implementation that separates data propagation from consensus, and showed how to use it to scale the current consensus protocol, Jolteon. Jolteon is a leader-based protocol that combines Tendermint's linear fast path with PBFT-style view changes, reducing Hotstuff's latency by 33%. However, leader-based consensus protocols clearly cannot fully exploit Narwhal's throughput potential: even with data propagation separated from consensus, the leader in Hotstuff/Jolteon remains a bottleneck as throughput grows.
Therefore, we decided to deploy Bullshark, a consensus protocol with zero communication overhead, on top of the Narwhal DAG. Unfortunately, compared to Jolteon, the DAG structure that gives Bullshark its high throughput carries a 50% latency cost.
This article introduces how Shoal significantly reduces Bullshark latency.
DAG-BFT Background
Each vertex in the Narwhal DAG is associated with a round. To enter round r, a validator must first obtain n-f vertices belonging to round r-1. Each validator can broadcast one vertex per round, and each vertex must reference at least n-f vertices from the previous round. Due to network asynchrony, different validators may observe different local views of the DAG at any given time.
A key property of the DAG is non-equivocation: if two validators have the same vertex v in their local views of the DAG, then they have exactly the same causal history of v.
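The round rules above can be sketched in a few lines. This is an illustrative toy model, not the Aptos codebase: the `Vertex` and `LocalDAG` names and the exact checks are assumptions made for the sketch.

```python
from dataclasses import dataclass

# Toy model of the Narwhal DAG round rules described above.
# Hypothetical names; not the real Aptos/Narwhal API.

@dataclass(frozen=True)
class Vertex:
    author: str          # validator that broadcast this vertex
    round: int
    parents: frozenset   # referenced vertices from round - 1

class LocalDAG:
    def __init__(self, n: int, f: int):
        self.n, self.f = n, f
        self.by_round: dict[int, set] = {}

    def add(self, v: Vertex) -> None:
        # each vertex in round r > 1 must reference at least
        # n - f vertices from round r - 1
        if v.round > 1 and len(v.parents) < self.n - self.f:
            raise ValueError("vertex must reference >= n - f parents")
        self.by_round.setdefault(v.round, set()).add(v)

    def can_enter_round(self, r: int) -> bool:
        # a validator advances to round r only after obtaining
        # n - f vertices belonging to round r - 1
        return len(self.by_round.get(r - 1, ())) >= self.n - self.f

    def causal_history(self, v: Vertex) -> set:
        # deterministic walk: any validator holding v computes
        # exactly the same set (the non-equivocation property)
        seen, stack = set(), [v]
        while stack:
            u = stack.pop()
            if u not in seen:
                seen.add(u)
                stack.extend(u.parents)
        return seen
```

With n = 4 and f = 1, a validator needs three round-1 vertices before it can enter round 2, and the causal history of a round-2 vertex contains the vertex itself plus its three parents.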
Interpreting the DAG as a Consensus Protocol
Consensus on the total order of all vertices in the DAG can be achieved without additional communication overhead. To this end, the validators in DAG-Rider, Tusk, and Bullshark interpret the DAG structure as a consensus protocol, where vertices represent proposals and edges represent votes.
Although the quorum-intersection logic differs across the DAG-based protocols, all existing Narwhal-based consensus protocols share the following structure:
Predetermined anchors: every few rounds (e.g., every two rounds in Bullshark) there is a predetermined leader, and the leader's vertex is called the anchor;
Ordering anchors: validators independently but deterministically decide which anchors to order and which to skip;
Ordering causal histories: validators process the ordered anchor list one by one; for each anchor, they order all previously unordered vertices in its causal history using a deterministic rule.
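The three steps above can be sketched as a single ordering pass over a local DAG view. The commit rule used here (an anchor is ordered if at least f + 1 next-round vertices link to it) is a deliberately simplified stand-in; Bullshark's actual rule is more involved.

```python
# Simplified sketch of the anchor-ordering structure described above.
# dag: {round: {author: set_of_parent_authors_from_round-1}}
# leaders: {anchor_round: leader_author}; f: fault threshold.

def order_dag(dag, leaders, rounds, f):
    ordered, delivered = [], set()
    for r in rounds:
        if r not in leaders:
            continue                      # only anchor rounds matter here
        anchor = leaders[r]
        if anchor not in dag.get(r, {}):
            continue                      # anchor never arrived: skipped
        # step 2 (toy rule): order the anchor if >= f + 1 vertices
        # of round r + 1 reference it
        votes = sum(1 for parents in dag.get(r + 1, {}).values()
                    if anchor in parents)
        if votes < f + 1:
            continue                      # skip this anchor
        # step 3: deterministically order the anchor's causal history
        stack, hist = [(r, anchor)], []
        while stack:
            rd, a = stack.pop()
            if (rd, a) in delivered:
                continue                  # already ordered earlier
            delivered.add((rd, a))
            hist.append((rd, a))
            for p in sorted(dag.get(rd, {}).get(a, ())):
                stack.append((rd - 1, p))
        ordered.extend(sorted(hist))      # deterministic tie-break
    return ordered
```

Because every step is deterministic given the same prefix of ordered anchors, two honest validators running this pass over views that agree on the anchors produce the same total order.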
The key to safety is ensuring that in step (2), the ordered anchor lists created by all honest validators share the same prefix. In Shoal, we make the following observation about all the protocols mentioned above:
All validators agree on the first ordered anchor point.
![Detailed Explanation of the Shoal Framework: How to Reduce Bullshark Latency on Aptos?](https://img-cdn.gateio.im/webp-social/moments-b7ed8888da112bae8d34c0fdb338b138.webp)
Bullshark Latency
Bullshark's latency depends on the number of rounds between ordered anchors in the DAG. While the partially synchronous version of Bullshark is more practical and has better latency than the asynchronous version, it is still far from optimal.
Problem 1: average block latency. In Bullshark, every even round has an anchor, and the vertices of every odd round are interpreted as votes. In the common case, two rounds of the DAG are needed to order an anchor; however, the vertices in the anchor's causal history need more rounds to wait for an anchor to be ordered. In the common case, vertices in odd rounds need three rounds, while non-anchor vertices in even rounds need four.
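The round counts above can be made concrete with a tiny helper. This is just the arithmetic from the analysis, restated; the function name is illustrative.

```python
# Common-case rounds-to-order for a vertex in Bullshark
# (anchors live on even rounds), per the analysis above.

def rounds_to_order(round_parity: str, is_anchor: bool) -> int:
    if is_anchor:
        return 2          # anchor round + one round of votes
    if round_parity == "odd":
        return 3          # vote vertex waits for the next anchor
    return 4              # even-round non-anchor waits one round more
```

So an anchor commits in 2 rounds, an odd-round (vote) vertex in 3, and an even-round non-anchor vertex in 4; pipelining, described below, targets exactly this gap.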
Problem 2: latency in faulty cases. The latency analysis above applies to the fault-free case; on the other hand, if a round's leader fails to broadcast the anchor quickly enough, the anchor cannot be ordered (and is thus skipped), so all unordered vertices from earlier rounds must wait for the next anchor to be ordered. This significantly hurts the performance of geo-replicated networks, especially since Bullshark uses timeouts to wait for leaders.
![In-depth Explanation of the Shoal Framework: How to Reduce Bullshark Latency on Aptos?](https://img-cdn.gateio.im/webp-social/moments-46d37add0d9e81b2f295edf8eddd907f.webp)
Shoal Framework
Shoal resolves these two latency issues by enhancing Bullshark (or any other Narwhal-based BFT protocol) with pipelining, allowing an anchor in every round and reducing the latency of all non-anchor vertices in the DAG to three rounds. Shoal also introduces a zero-overhead leader-reputation mechanism into the DAG, which biases leader selection toward fast leaders.
Challenge
In the context of DAG protocols, pipelining and leader reputation were considered difficult problems, for the following reasons:
Previous pipelining attempts tried to modify the core Bullshark logic, but this seems fundamentally impossible.
Leader reputation, introduced in DiemBFT and formalized in Carousel, dynamically selects future leaders based on validators' past performance (the anchors play this role in Bullshark). While disagreement on leader identity does not violate safety in those protocols, in Bullshark it can lead to completely different orderings. This gets to the heart of the problem: dynamically and deterministically selecting round anchors is necessary for solving consensus, yet validators need to agree on an ordered history in order to choose future anchors.
As evidence of the problem's difficulty, we note that Bullshark implementations, including the one currently in production, do not support these features.
Protocol
Despite the challenges described above, the solution turns out to be hidden in plain sight.
In Shoal, we rely on the ability to perform local computation over the DAG, which lets us preserve and reinterpret information from earlier rounds. Building on the core insight that all validators agree on the first ordered anchor, Shoal sequentially composes multiple Bullshark instances for pipelining, making (1) the first ordered anchor the switching point between instances, and (2) the causal history of anchors the input for computing leader reputation.
Pipelining
Bullshark assumes a predefined mapping F from rounds to leaders (validators). Shoal runs instances of Bullshark one after another, so that for each instance the anchors are predetermined by the mapping F. Each instance orders one anchor, which triggers the switch to the next instance.
Initially, Shoal launches the first instance of Bullshark in the first round of the DAG and runs it until the first ordered anchor is determined, say in round r. All validators agree on this anchor. Therefore, all of them can confidently reinterpret the DAG starting from round r+1, and Shoal simply launches a new Bullshark instance at round r+1.
In the best case, this allows Shoal to order one anchor in every round. The anchor of the first round is ordered by the first instance. Shoal then starts a new instance in the second round, which has its own anchor that the instance orders; another new instance then orders the anchor of the third round, and the process continues.
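The relay-race structure can be sketched as a loop, assuming a Bullshark-like instance that, when started at some round, eventually reports the round of its first ordered anchor. The `run_instance` callback and function name are illustrative assumptions, not Shoal's actual interface.

```python
# Sketch of Shoal's pipelining loop. run_instance(start_round) is a
# hypothetical Bullshark-like instance interpreting the DAG from
# start_round onward; it returns the round of its first ordered anchor.

def shoal_pipeline(run_instance, start_round, num_instances):
    anchors = []
    r = start_round
    for _ in range(num_instances):
        anchor_round = run_instance(r)   # run until one anchor is ordered
        anchors.append(anchor_round)
        r = anchor_round + 1             # switch point: all validators agree
    return anchors                       # rounds whose anchors were ordered
```

In the best case each instance orders its very first anchor, so starting at round 1 the pipeline yields one ordered anchor per round (rounds 1, 2, 3, ...), matching the relay-race picture above.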
Leader Reputation
During Bullshark ordering, skipping an anchor increases latency, and pipelining cannot help here, since a new instance cannot start before the previous one orders its anchor. Shoal therefore assigns each validator a score based on its recent activity and ensures that leaders who cause anchors to be skipped are less likely to be selected in the future. Validators that respond and participate in the protocol receive high scores; otherwise, a validator is assigned a low score, since it may have crashed, be slow, or be acting maliciously.
The idea is to deterministically recompute the predefined mapping F from rounds to leaders whenever the scores are updated, biasing toward leaders with higher scores. For validators to agree on the new mapping, they must agree on the scores, and thus on the history used to derive them.
In Shoal, pipelining and leader reputation combine naturally, since they both use the same core technique: reinterpreting the DAG after reaching consensus on the first ordered anchor.
In fact, the only difference is that after ordering the anchor of round r, validators compute a new mapping F', starting from round r+1, based on the causal history of the ordered anchor in round r. They then execute a new Bullshark instance with the updated anchor-selection function F' starting at round r+1.
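A minimal sketch of that recomputation follows. The scoring rule used here (counting how many vertices each validator contributed to the ordered causal history) is an assumption standing in for whatever score function a real deployment uses; what matters is that the input is the agreed-upon causal history, so every validator derives the same F'.

```python
# Sketch of recomputing the round-to-leader mapping F' from the causal
# history of the ordered anchor. Hypothetical scoring rule: one point
# per vertex a validator authored in that history.

def recompute_mapping(validators, causal_history, from_round, num_rounds):
    """causal_history: iterable of (round, author) pairs that all
    validators agree on; returns F' for rounds >= from_round."""
    scores = {v: 0 for v in validators}
    for _, author in causal_history:
        scores[author] += 1
    # deterministic ranking: higher score first, name as tie-break,
    # so every validator computes the identical mapping
    ranked = sorted(validators, key=lambda v: (-scores[v], v))
    return {from_round + i: ranked[i % len(ranked)]
            for i in range(num_rounds)}
```

Validators that crashed or stalled contribute few vertices to the ordered history, so they sink in the ranking and are chosen as anchors less often, which is exactly the bias the reputation mechanism is after.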
No more timeouts
Timeouts play a critical role in all leader-based deterministic partially synchronous BFT implementations. However, the complexity they introduce increases the number of internal states that must be managed and monitored, which complicates debugging and demands more observability techniques.
Timeouts can also significantly increase latency, because configuring them well is important and they often need dynamic adjustment, as they depend heavily on the environment (the network). Before moving on to the next leader, a protocol pays the full timeout penalty for a faulty leader, so timeouts cannot be set too conservatively; but if they are too short, the protocol may skip good leaders. For example, we observed that under high load, leaders in Jolteon/Hotstuff became overwhelmed and timeouts expired before they could drive progress.
Unfortunately, based on leadership