@DaMandal0rian DaMandal0rian commented Aug 23, 2025

Summary

Adds weighted RPC load balancing with dynamic health-aware traffic distribution across Ethereum RPC providers.

Key changes

  • Weighted adapter selection: Probabilistic selection via WeightedIndex using configurable provider weights (0.0–1.0)
  • Real health checks: Periodic eth_blockNumber calls (5s timeout, 10s interval) replace the previous stub; health scores adjust weights dynamically
  • Atomics over RwLock: Health metrics use AtomicU64/AtomicU32 to avoid lock poisoning
  • Per-chain health checkers: Each chain gets its own set of health checkers instead of a shared global list
  • Cancellation support: Health check task accepts a CancellationToken for graceful shutdown
  • O(1) health lookup: HashMap<String, Arc<Health>> replaces linear scan
  • Encapsulated adapter field: EthereumNetworkAdapter.adapter is now private with a getter
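
As a rough illustration of the "atomics over RwLock" point, here is a minimal sketch of lock-free health metrics. The field names, the `last_latency()` getter, and the scoring rule (score halves per consecutive failure) are assumptions made for this example; they are not taken from `health.rs`.

```rust
use std::sync::atomic::{AtomicU32, AtomicU64, Ordering};
use std::time::Duration;

/// Lock-free health metrics for one provider (illustrative field names).
#[derive(Debug, Default)]
pub struct Health {
    consecutive_failures: AtomicU32,
    last_latency_nanos: AtomicU64,
}

impl Health {
    pub fn new() -> Self {
        Self::default()
    }

    /// Record a successful probe: resets the failure streak, stores latency.
    pub fn record_success(&self, latency: Duration) {
        let nanos = u64::try_from(latency.as_nanos()).unwrap_or(u64::MAX);
        self.consecutive_failures.store(0, Ordering::Relaxed);
        self.last_latency_nanos.store(nanos, Ordering::Relaxed);
    }

    /// Record a failed probe: extends the failure streak.
    pub fn record_failure(&self) {
        self.consecutive_failures.fetch_add(1, Ordering::Relaxed);
    }

    pub fn last_latency(&self) -> Duration {
        Duration::from_nanos(self.last_latency_nanos.load(Ordering::Relaxed))
    }

    /// Assumed scoring rule: 1.0 when healthy, halved per consecutive
    /// failure (capped at 10 failures so the score never reaches zero).
    pub fn score(&self) -> f64 {
        let failures = self.consecutive_failures.load(Ordering::Relaxed).min(10);
        0.5_f64.powi(failures as i32)
    }
}
```

Because every method takes `&self` and touches only atomics, a panicking writer cannot poison anything, and readers on the selection path never block.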

Configuration

weighted_rpc_steering = true

[chains.mainnet]
provider = [
  { label = "primary", url = "http://rpc1.io", weight = 0.7 },
  { label = "backup",  url = "http://rpc2.io", weight = 0.3 },
]

  • Weights must be between 0.0 and 1.0; they don't need to sum to 1.0
  • Weight of 0.0 disables a provider from weighted selection while keeping it available for error-retesting
  • Weight is only meaningful for RPC providers (ignored for firehose)
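
The weight rules can be sketched as a small validation function. This is an illustration only: `validate_weights` is a hypothetical helper, not the PR's `Provider::validate()`, and the all-zero check reflects the chain-level validation mentioned in the commit messages.

```rust
/// Checks one chain's provider weights: each must lie in 0.0..=1.0,
/// and at least one must be non-zero. Hypothetical helper for illustration.
pub fn validate_weights(weights: &[f64]) -> Result<(), String> {
    for (i, w) in weights.iter().enumerate() {
        // `contains` is false for NaN, so NaN is rejected as well.
        if !(0.0..=1.0).contains(w) {
            return Err(format!("provider {i}: weight {w} is outside 0.0..=1.0"));
        }
    }
    if !weights.is_empty() && weights.iter().all(|w| *w == 0.0) {
        return Err("all provider weights are 0.0".to_string());
    }
    Ok(())
}
```

Under these rules `[0.7, 0.3]`, `[1.0, 1.0]`, and `[0.0, 0.5]` are all accepted, while `[1.5]` and `[0.0, 0.0]` are rejected.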

Closes OPS-727

@DaMandal0rian DaMandal0rian marked this pull request as draft August 23, 2025 11:51
@DaMandal0rian DaMandal0rian force-pushed the feature/add-weighted-random-steering-load-balancing branch from 53221d2 to 1962635 Compare August 23, 2025 17:42
@DaMandal0rian DaMandal0rian marked this pull request as ready for review August 23, 2025 17:49
@DaMandal0rian DaMandal0rian changed the title feat: implement weighted RPC load balancing with comprehensive improv… feat: implement weighted RPC load balancing Aug 23, 2025
@DaMandal0rian DaMandal0rian changed the title feat: implement weighted RPC load balancing feat: implement weighted RPC load balancing with traffic distribution Aug 23, 2025
DaMandal0rian added a commit that referenced this pull request Aug 24, 2025
…lience

This commit introduces dynamic weight adjustment for RPC providers, improving failover and resilience by adapting to real-time provider health.

Key changes include:
- Introduced a `Health` module (`chain/ethereum/src/health.rs`) to monitor RPC provider latency, error rates, and consecutive failures.
- Integrated health metrics into the RPC provider selection logic in `chain/ethereum/src/network.rs`.
- Dynamically adjusts provider weights based on their health scores, ensuring traffic is steered away from underperforming endpoints.
- Updated `node/src/network_setup.rs` to initialize and manage health checkers for Ethereum RPC adapters.
- Added `tokio` dependency to `chain/ethereum/Cargo.toml` and `node/Cargo.toml` for asynchronous health checks.
- Refactored test cases in `chain/ethereum/src/network.rs` to accommodate dynamic weighting.

This enhancement builds upon the existing static weighted RPC steering, allowing for more adaptive and robust RPC management.
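
One plausible way to combine static weights with health scores is to multiply them and renormalize. The sketch below assumes exactly that; the multiplication rule and the `traffic_shares` helper are assumptions for illustration, since the commit message does not spell out the formula.

```rust
/// Scales each configured weight by its health score (assumed to lie in
/// 0.0..=1.0) and returns each provider's resulting share of traffic.
/// Returns None when the inputs cannot produce a valid distribution.
pub fn traffic_shares(weights: &[f64], scores: &[f64]) -> Option<Vec<f64>> {
    if weights.len() != scores.len() {
        return None;
    }
    let effective: Vec<f64> = weights.iter().zip(scores).map(|(w, s)| w * s).collect();
    let total: f64 = effective.iter().sum();
    if !total.is_finite() || total <= 0.0 {
        return None;
    }
    Some(effective.iter().map(|e| e / total).collect())
}
```

For example, with weights 0.7 and 0.3 and the first provider's score at 0.5, the effective weights are 0.35 and 0.30, so the split moves from 70/30 to roughly 54/46.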

Fixes #6126
@DaMandal0rian DaMandal0rian force-pushed the feature/add-weighted-random-steering-load-balancing branch 2 times, most recently from 8c89755 to 1962635 Compare August 24, 2025 17:33
@github-actions
This pull request hasn't had any activity for the last 90 days. If there's no more activity over the course of the next 14 days, it will automatically be closed.

@github-actions github-actions bot added the Stale label Dec 20, 2025
@github-actions github-actions bot closed this Jan 6, 2026
@DaMandal0rian DaMandal0rian reopened this Jan 21, 2026
@DaMandal0rian DaMandal0rian requested a review from lutter January 21, 2026 19:28
@linear
linear bot commented Jan 21, 2026

@github-actions github-actions bot removed the Stale label Jan 22, 2026
DaMandal0rian added a commit that referenced this pull request Jan 22, 2026
…lience (#6128)

* feat: Implement dynamic weighted RPC load balancing for enhanced resilience

This commit introduces dynamic weight adjustment for RPC providers, improving failover and resilience by adapting to real-time provider health.

Key changes include:
- Introduced a `Health` module (`chain/ethereum/src/health.rs`) to monitor RPC provider latency, error rates, and consecutive failures.
- Integrated health metrics into the RPC provider selection logic in `chain/ethereum/src/network.rs`.
- Dynamically adjusts provider weights based on their health scores, ensuring traffic is steered away from underperforming endpoints.
- Updated `node/src/network_setup.rs` to initialize and manage health checkers for Ethereum RPC adapters.
- Added `tokio` dependency to `chain/ethereum/Cargo.toml` and `node/Cargo.toml` for asynchronous health checks.
- Refactored test cases in `chain/ethereum/src/network.rs` to accommodate dynamic weighting.

This enhancement builds upon the existing static weighted RPC steering, allowing for more adaptive and robust RPC management.

Fixes #6126

* bump: tokio
Copilot AI review requested due to automatic review settings January 22, 2026 22:31
Copilot AI left a comment

Pull request overview

This PR implements weighted load balancing for RPC endpoints in graph-node, allowing operators to configure traffic distribution across providers using configurable weights (0.0-1.0). The implementation includes a health checking system that monitors provider performance and adjusts routing weights dynamically.

Changes:

  • Added weighted RPC steering feature flag and configuration validation for provider weights
  • Implemented probabilistic adapter selection using WeightedIndex with health score integration
  • Created health checking system that monitors RPC providers and calculates performance scores

Reviewed changes

Copilot reviewed 10 out of 11 changed files in this pull request and generated 9 comments.

| File | Description |
| --- | --- |
| node/src/opt.rs | Adds CLI flag for enabling weighted RPC steering |
| node/src/config.rs | Adds weight field to Provider struct with validation (0.0-1.0 range) |
| node/src/network_setup.rs | Integrates weighted flag and health checker initialization into network setup |
| node/src/chain.rs | Passes provider weight values to Ethereum adapters |
| chain/ethereum/src/network.rs | Implements weighted adapter selection algorithm with health score integration |
| chain/ethereum/src/health.rs | New health checking module with provider monitoring and scoring |
| chain/ethereum/src/lib.rs | Exports new health module |
| node/resources/tests/full_config.toml | Adds documentation and examples for weight configuration |
| Cargo.toml files | Adds tokio dependency for async health checking |

DaMandal0rian and others added 5 commits January 31, 2026 12:04
…ements (#6090)

This commit introduces a complete weighted load balancing system for RPC endpoints
with traffic distribution based on configurable provider weights (0.0-1.0).

- Implements probabilistic selection using WeightedIndex from rand crate
- Supports decimal weights (0.0-1.0) for precise traffic distribution
- Weights are relative and don't need to sum to 1.0 (normalized internally)
- Graceful fallback to random selection if weights are invalid
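
To make the selection rule concrete, here is a dependency-free sketch of relative-weight selection with a uniform fallback. The PR itself uses `WeightedIndex` from the rand crate; this hand-rolled cumulative-sum version is a stand-in, and `pick_weighted` is a hypothetical name. The uniform draw `u` is passed in so the behaviour is deterministic.

```rust
/// Returns the index selected for a uniform draw `u` in [0, 1).
/// Weights are relative: they are normalized by their sum, so they
/// need not add up to 1.0. Zero-weight entries are never selected
/// unless every weight is unusable, in which case the choice is uniform.
pub fn pick_weighted(weights: &[f64], u: f64) -> Option<usize> {
    if weights.is_empty() {
        return None;
    }
    let valid = weights.iter().all(|w| w.is_finite() && *w >= 0.0);
    let total: f64 = weights.iter().sum();
    if !valid || total <= 0.0 {
        // Fallback: plain uniform choice over all entries.
        let idx = (u * weights.len() as f64) as usize;
        return Some(idx.min(weights.len() - 1));
    }
    let target = u * total;
    let mut cumulative = 0.0;
    let mut last_positive = 0;
    for (i, w) in weights.iter().enumerate() {
        if *w > 0.0 {
            cumulative += w;
            last_positive = i;
            if target < cumulative {
                return Some(i);
            }
        }
    }
    // Guards against floating-point rounding at the top of the range.
    Some(last_positive)
}
```

With weights `[0.7, 0.3]`, draws below about 0.7 select the first provider and the rest select the second; with `[0.0, 1.0]` the first provider is never selected.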

- Improved error retesting logic that preserves weight distribution
- Error retesting now occurs AFTER weight-based selection to minimize skew
- Maintains existing failover capabilities while respecting configured weights
- Robust handling of edge cases (all zero weights, invalid configurations)

- Added `weighted_rpc_steering` flag to enable/disable weighted selection
- Provider weight validation ensures values are between 0.0 and 1.0
- Validation prevents all-zero weight configurations
- Comprehensive configuration documentation with usage examples

- Refactored adapter selection into modular, well-documented functions:
  - `select_best_adapter()`: Chooses between weighted/random strategies
  - `select_weighted_adapter()`: Implements WeightedIndex-based selection
  - `select_random_adapter()`: Enhanced random selection with error consideration
- Added comprehensive inline documentation explaining algorithms
- Maintains thread safety with proper Arc usage and thread-safe RNG
- Added test coverage for weighted selection with statistical validation

- Extended Provider struct with f64 weight field (default: 1.0)
- Added weight validation in Provider::validate() method
- Added Chain-level validation to prevent all-zero weight configurations
- Integrated with existing configuration validation pipeline

- Added --weighted-rpc-steering command line flag (node/src/opt.rs)
- Integrated weighted flag through network setup pipeline (node/src/network_setup.rs)
- Updated chain configuration to pass weight values to adapters (node/src/chain.rs)

- Added comprehensive configuration documentation in full_config.toml
- Includes weight range explanation, distribution examples, and usage guidelines
- Clear examples showing relative weight calculations and traffic distribution

- Updated rand dependency to use appropriate version with WeightedIndex support
- Proper import paths for rand 0.9 distribution modules
- Fixed compilation issues with correct trait imports (Distribution)

- Comprehensive inline documentation for all weight-related methods
- Clear separation of concerns with single-responsibility functions
- Maintained backward compatibility with existing random selection
- Added statistical test validation for weight distribution accuracy

- Comprehensive test suite validates weight distribution over 1000 iterations
- Statistical validation with 10% tolerance for weight accuracy
- All existing tests continue to pass, ensuring no regression
- Build verification across all affected packages
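
The statistical check described above can be sketched as follows. This is not the PR's test: it swaps in a small xorshift64* generator so the example is deterministic and has no dependencies, and it reads "10% tolerance" as an absolute difference of 0.10 between the observed and expected share, which is an assumption.

```rust
/// Small deterministic generator (xorshift64*), used only to keep the
/// sketch dependency-free; the PR relies on the rand crate instead.
pub struct XorShift(u64);

impl XorShift {
    pub fn new(seed: u64) -> Self {
        XorShift(seed.max(1)) // the state must never be zero
    }

    /// Next value, uniformly distributed in [0, 1).
    pub fn next_f64(&mut self) -> f64 {
        self.0 ^= self.0 >> 12;
        self.0 ^= self.0 << 25;
        self.0 ^= self.0 >> 27;
        let bits = self.0.wrapping_mul(0x2545_F491_4F6C_DD1D);
        (bits >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Draws `iterations` weighted samples and returns the observed share
/// of selections per provider.
pub fn observed_shares(weights: &[f64], iterations: u32, rng: &mut XorShift) -> Vec<f64> {
    if weights.is_empty() {
        return Vec::new();
    }
    let total: f64 = weights.iter().sum();
    let mut counts = vec![0u32; weights.len()];
    for _ in 0..iterations {
        let target = rng.next_f64() * total;
        let mut cumulative = 0.0;
        let mut chosen = weights.len() - 1;
        for (i, w) in weights.iter().enumerate() {
            cumulative += w;
            if target < cumulative {
                chosen = i;
                break;
            }
        }
        counts[chosen] += 1;
    }
    counts
        .iter()
        .map(|c| f64::from(*c) / f64::from(iterations))
        .collect()
}

/// True when every observed share is within `tolerance` (absolute)
/// of the share implied by the normalized weights.
pub fn within_tolerance(observed: &[f64], weights: &[f64], tolerance: f64) -> bool {
    let total: f64 = weights.iter().sum();
    observed
        .iter()
        .zip(weights)
        .all(|(o, w)| (o - w / total).abs() <= tolerance)
}
```

For a 70/30 split over 1000 draws the standard deviation of the observed share is about 0.0145, so a 0.10 tolerance leaves a wide margin against flaky failures.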

```toml
weighted_rpc_steering = true

[chains.mainnet]
provider = [
  { label = "primary", url = "http://rpc1.io", weight = 0.7 },   # 70% traffic
  { label = "backup", url = "http://rpc2.io", weight = 0.3 },    # 30% traffic
]
```

This implementation provides production-ready weighted load balancing with
robust error handling, comprehensive validation, and excellent maintainability.

🤖 Generated with Claude Code
- Remove unused one_f64() function that was causing CI warnings
- Remove unused serde default attribute from Provider.weight field
- Add missing weighted_rpc_steering field to test fixtures
- Apply cargo fmt formatting fixes

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude
…lience (#6128)

* feat: Implement dynamic weighted RPC load balancing for enhanced resilience

This commit introduces dynamic weight adjustment for RPC providers, improving failover and resilience by adapting to real-time provider health.

Key changes include:
- Introduced a `Health` module (`chain/ethereum/src/health.rs`) to monitor RPC provider latency, error rates, and consecutive failures.
- Integrated health metrics into the RPC provider selection logic in `chain/ethereum/src/network.rs`.
- Dynamically adjusts provider weights based on their health scores, ensuring traffic is steered away from underperforming endpoints.
- Updated `node/src/network_setup.rs` to initialize and manage health checkers for Ethereum RPC adapters.
- Added `tokio` dependency to `chain/ethereum/Cargo.toml` and `node/Cargo.toml` for asynchronous health checks.
- Refactored test cases in `chain/ethereum/src/network.rs` to accommodate dynamic weighting.

This enhancement builds upon the existing static weighted RPC steering, allowing for more adaptive and robust RPC management.

Fixes #6126

* bump: tokio
- Add `health_check()` method to EthereumAdapter using `eth_blockNumber`
  with a fixed 5s timeout independent of json_rpc_timeout
- Replace RwLock with atomics (AtomicU64/AtomicU32) in Health struct,
  following the EndpointMetrics pattern to avoid lock poisoning
- Add CancellationToken support to health_check_task for graceful shutdown
- Add tokio-util dependency for CancellationToken
- Make `adapter` field private on EthereumNetworkAdapter, add getter
- Replace Vec-based health checker lookup with HashMap<String, Arc<Health>>
  for O(1) lookups instead of O(n*m)
- Remove redundant empty check in select_weighted_adapter; WeightedIndex
  already returns Err for empty input, falling through to random selection
- Replace struct literal construction in tests with ::new() calls
- Add explicit assertions that health scores start at 1.0
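
The cancellable periodic check can be illustrated without an async runtime. The PR uses a tokio task with a `CancellationToken`; the sketch below deliberately swaps that for a plain thread and an `mpsc` channel, where a timeout on the stop channel doubles as the interval timer. `spawn_health_loop` is a hypothetical name, and the probe itself is left to the caller.

```rust
use std::sync::mpsc::{self, RecvTimeoutError, Sender};
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Runs `check` once per `interval` until a stop message is sent
/// or the returned sender is dropped.
pub fn spawn_health_loop<F>(interval: Duration, mut check: F) -> (Sender<()>, JoinHandle<()>)
where
    F: FnMut() + Send + 'static,
{
    let (stop_tx, stop_rx) = mpsc::channel::<()>();
    let handle = thread::spawn(move || loop {
        match stop_rx.recv_timeout(interval) {
            // No stop signal arrived within the interval: run the probe.
            Err(RecvTimeoutError::Timeout) => check(),
            // Explicit stop, or the controlling side went away: exit cleanly.
            Ok(()) | Err(RecvTimeoutError::Disconnected) => break,
        }
    });
    (stop_tx, handle)
}
```

Sending `()` on the returned sender and then joining the handle shuts the loop down without waiting for the next full interval.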

Previously all health checkers were stored in a single Vec and passed
to every chain's EthereumNetworkAdapters. Now they are grouped by
ChainName so each chain only receives its own health checkers.

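
A minimal sketch of the per-chain grouping, assuming a nested map keyed first by chain and then by provider label. Here `Health` is a stand-in struct, the chain key is a plain `String` rather than `ChainName`, and `group_by_chain` and `lookup` are hypothetical helpers.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Stand-in for the PR's `Health` type; the real one holds atomic metrics.
#[derive(Debug)]
pub struct Health {
    pub provider: String,
}

/// chain name -> (provider label -> health checker)
pub type HealthByChain = HashMap<String, HashMap<String, Arc<Health>>>;

/// Groups a flat list of (chain, checker) pairs so that each chain
/// only sees its own checkers.
pub fn group_by_chain(checkers: Vec<(String, Arc<Health>)>) -> HealthByChain {
    let mut grouped = HealthByChain::new();
    for (chain, health) in checkers {
        grouped
            .entry(chain)
            .or_default()
            .insert(health.provider.clone(), health);
    }
    grouped
}

/// O(1) average-case lookup by chain and provider label.
pub fn lookup<'a>(map: &'a HealthByChain, chain: &str, provider: &str) -> Option<&'a Arc<Health>> {
    map.get(chain)?.get(provider)
}
```

With this layout a provider registered under one chain is simply absent from every other chain's map, so no filtering is needed at selection time.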
- Document that weight 0.0 is intentional (disables from weighted
  selection while keeping the provider for error-retesting)
- Fix contradictory example in full_config.toml that showed weights >1.0
  despite validation rejecting them
- Remove weight from firehose provider config since it is only used
  for RPC providers
@DaMandal0rian DaMandal0rian force-pushed the feature/add-weighted-random-steering-load-balancing branch from ea11181 to 6e826c7 Compare January 31, 2026 17:16
@DaMandal0rian
Contributor Author

@lutter I have done the rebase for this branch. PR is ready.
