ADR-0004: Async Runtime Selection¶

Status¶

Accepted

Date¶

2025-02-27

Context¶

The floxide framework is designed around asynchronous operations to efficiently handle workflow execution. Rust's standard library defines futures but ships no executor, so an explicit runtime is required to drive them to completion.

There are several viable async runtimes available in the Rust ecosystem, each with different trade-offs:

Tokio: Full-featured, production-ready, widely adopted
async-std: Similar to the standard library, focused on ergonomics
smol: Small and simple runtime
Custom runtimes: Roll-our-own or specialized solutions

We need to select an appropriate async runtime that will support the framework's execution model, particularly for parallel node execution and orchestration.

Decision¶

We will use Tokio as the primary async runtime for the floxide framework implementation, with the full feature set enabled. Additionally, we will use the async_trait crate for trait methods that return futures.

Implementation Details¶

Core Runtime Usage:

// In floxide-transform crate
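pub fn run_flow<S, F>(flow: F, shared_state: &mut S) -> Result<(), FloxideError>
where
    F: BaseNode<S>,
{
    // Synchronous entry point: builds a runtime and blocks until the flow finishes.
    // Must not be called from async code; `block_on` panics inside an existing runtime.
    // The `?` assumes `FloxideError: From<std::io::Error>`.
    let rt = tokio::runtime::Runtime::new()?;
    rt.block_on(async {
        flow.run(shared_state).await
    })
}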

Async Trait Implementation:

use async_trait::async_trait;

#[async_trait]
pub trait Node<Context, A = DefaultAction>
where
    A: ActionType,
{
    type Output;

    async fn process(&self, ctx: &mut Context) -> Result<NodeOutcome<Self::Output, A>, FloxideError>;
}
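
To make the cost of the macro concrete, the following sketch writes by hand roughly what #[async_trait] generates for a trait like the one above: the async method becomes an ordinary method that returns a boxed, Send future. The names NodeError, BoxFuture, and Doubler are hypothetical stand-ins rather than floxide types, and the real expansion uses separate lifetime parameters instead of the single 'a shown here.

```rust
use std::future::Future;
use std::pin::Pin;

/// Hypothetical stand-in for the framework's error type.
#[derive(Debug, PartialEq)]
pub struct NodeError(pub String);

/// The boxed, `Send` future shape that `#[async_trait]` produces by default.
pub type BoxFuture<'a, T> = Pin<Box<dyn Future<Output = T> + Send + 'a>>;

/// Hand-written approximation of an `#[async_trait]` trait.
pub trait Node<Ctx> {
    type Output;

    fn process<'a>(&'a self, ctx: &'a mut Ctx) -> BoxFuture<'a, Result<Self::Output, NodeError>>;
}

/// Hypothetical node that doubles an integer context.
pub struct Doubler;

impl Node<i32> for Doubler {
    type Output = i32;

    fn process<'a>(&'a self, ctx: &'a mut i32) -> BoxFuture<'a, Result<i32, NodeError>> {
        // One heap allocation per call: this is the overhead the macro hides.
        Box::pin(async move {
            *ctx *= 2;
            Ok(*ctx)
        })
    }
}
```

Because the method returns a concrete boxed type, a trait of this shape can be used as a trait object such as Box<dyn Node<i32, Output = i32>>, which is the usual reason to accept the allocation.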

Parallelism for BatchFlow:

// In BatchFlow implementation
async fn exec_core(&self, prep_results: Vec<I>) -> Result<Vec<Result<(), FloxideError>>, FloxideError> {
    let mut handles = Vec::with_capacity(prep_results.len());

    for item in prep_results {
        // Create a cloned shared state for each parallel execution.
        // The binding must be `mut` because the task borrows it mutably.
        let mut state_clone = /* clone shared state */;
        let start_node = self.flow.get_start_node()?;

        let handle = tokio::spawn(async move {
            start_node.run(&mut state_clone).await
        });

        handles.push(handle);
    }

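    // Await the tasks in submission order. A JoinError means the task panicked
    // or was cancelled; it is reported per item instead of failing the batch.
    let mut results = Vec::with_capacity(handles.len());
    for handle in handles {
        results.push(handle.await.unwrap_or_else(|e| Err(FloxideError::JoinError(e.to_string()))));
    }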

    Ok(results)
}
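
The fan-out and join shape above can be tried without any dependencies by substituting OS threads for Tokio tasks. The sketch below is that substitution, not the BatchFlow implementation: std::thread::spawn stands in for tokio::spawn, a panicked thread stands in for a JoinError, and BatchError and run_batch are hypothetical names. It also returns each cloned state to the caller, since a clone that is moved into a task is otherwise dropped when the task ends.

```rust
use std::thread;

/// Hypothetical stand-in for the framework's error type.
#[derive(Debug, PartialEq)]
pub enum BatchError {
    /// The worker itself failed (analogous to a join error).
    Join(String),
    /// The node logic returned an error.
    Node(String),
}

/// Runs `work` once per item, each on its own thread with its own clone of
/// `state`, and returns the per-item outcomes in submission order.
pub fn run_batch<S, I, F>(state: &S, items: Vec<I>, work: F) -> Vec<Result<S, BatchError>>
where
    S: Clone + Send + 'static,
    I: Send + 'static,
    F: Fn(&mut S, I) -> Result<(), BatchError> + Clone + Send + 'static,
{
    // Fan out: every item gets an isolated copy of the shared state.
    let handles: Vec<thread::JoinHandle<Result<S, BatchError>>> = items
        .into_iter()
        .map(|item| {
            let mut state_clone = state.clone();
            let work = work.clone();
            thread::spawn(move || {
                work(&mut state_clone, item)?;
                Ok(state_clone)
            })
        })
        .collect();

    // Join: a panicked worker becomes a per-item error, not a batch failure.
    handles
        .into_iter()
        .map(|handle| {
            handle
                .join()
                .unwrap_or_else(|_| Err(BatchError::Join("worker thread panicked".to_string())))
        })
        .collect()
}
```

Calling run_batch(&10, vec![1, 2, 3], |s, i| { *s += i; Ok(()) }) yields one independent state per item, which shows that no mutation leaks between executions.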

Configurable Runtime Options:

pub struct FloxideRuntimeConfig {
    worker_threads: Option<usize>,
    thread_name_prefix: String,
    thread_stack_size: Option<usize>,
}

impl Default for FloxideRuntimeConfig {
    fn default() -> Self {
        Self {
            worker_threads: None, // Use Tokio default
            thread_name_prefix: "floxide-worker-".to_string(),
            thread_stack_size: None, // Use Tokio default
        }
    }
}

pub fn create_runtime(config: FloxideRuntimeConfig) -> Result<tokio::runtime::Runtime, FloxideError> {
    let mut rt_builder = tokio::runtime::Builder::new_multi_thread();

    if let Some(threads) = config.worker_threads {
        rt_builder.worker_threads(threads);
    }

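    // `thread_name` gives every worker this same name; `thread_name_fn` would
    // be needed to append a per-thread suffix to the prefix.
    rt_builder.thread_name(config.thread_name_prefix);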

    if let Some(stack_size) = config.thread_stack_size {
        rt_builder.thread_stack_size(stack_size);
    }

    rt_builder.enable_all()
        .build()
        .map_err(|e| FloxideError::RuntimeCreationError(e.to_string()))
}
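
Since the fields of FloxideRuntimeConfig are private, code outside the module can only obtain the defaults. One possible way to expose overrides is a set of builder-style setters plus a validation step, sketched below. The method names (with_worker_threads, with_thread_name_prefix, with_thread_stack_size, validate) are assumptions and not part of the decision; the struct is repeated only so the example is self-contained. The zero check exists because Tokio's Builder::worker_threads panics when given 0.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct FloxideRuntimeConfig {
    worker_threads: Option<usize>,
    thread_name_prefix: String,
    thread_stack_size: Option<usize>,
}

impl Default for FloxideRuntimeConfig {
    fn default() -> Self {
        Self {
            worker_threads: None,
            thread_name_prefix: "floxide-worker-".to_string(),
            thread_stack_size: None,
        }
    }
}

impl FloxideRuntimeConfig {
    /// Hypothetical setter: override the number of worker threads.
    pub fn with_worker_threads(mut self, threads: usize) -> Self {
        self.worker_threads = Some(threads);
        self
    }

    /// Hypothetical setter: override the worker thread name.
    pub fn with_thread_name_prefix(mut self, prefix: impl Into<String>) -> Self {
        self.thread_name_prefix = prefix.into();
        self
    }

    /// Hypothetical setter: override the worker stack size in bytes.
    pub fn with_thread_stack_size(mut self, bytes: usize) -> Self {
        self.thread_stack_size = Some(bytes);
        self
    }

    /// Read access for callers and tests.
    pub fn worker_threads(&self) -> Option<usize> {
        self.worker_threads
    }

    /// Rejects values the runtime builder would panic on.
    pub fn validate(&self) -> Result<(), String> {
        if self.worker_threads == Some(0) {
            return Err("worker_threads must be greater than zero".to_string());
        }
        Ok(())
    }
}
```

A caller would then write FloxideRuntimeConfig::default().with_worker_threads(4) and run validate() before handing the value to create_runtime.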

Abstraction Layer:
We will create a runtime abstraction layer in the floxide-transform crate
This will allow for potential future runtime switching
The public API will remain stable even if the underlying runtime changes
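
As an illustration of what such a layer could look like, the sketch below defines a small trait that captures the one capability the public entry point needs (blocking on a future) and writes the entry point against that trait. FlowRuntime, ParkRuntime, and run_with are hypothetical names. To keep the example dependency-free, the backend shown is a minimal thread-parking executor built on the standard library; a Tokio-backed implementation would instead wrap tokio::runtime::Runtime and forward to its block_on method.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// Hypothetical seam: the only runtime capability the public API relies on.
pub trait FlowRuntime {
    fn block_on<F: Future>(&self, fut: F) -> F::Output;
}

/// Minimal std-only backend, used here so the sketch runs on its own.
pub struct ParkRuntime;

/// Wakes the blocked thread when the future signals progress.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

impl FlowRuntime for ParkRuntime {
    fn block_on<F: Future>(&self, fut: F) -> F::Output {
        let mut fut = pin!(fut);
        let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
        let mut cx = Context::from_waker(&waker);
        loop {
            match fut.as_mut().poll(&mut cx) {
                Poll::Ready(out) => return out,
                // Sleep until the waker unparks this thread, then poll again.
                Poll::Pending => thread::park(),
            }
        }
    }
}

/// Entry point written against the trait rather than a concrete runtime.
pub fn run_with<R: FlowRuntime, F: Future>(rt: &R, flow: F) -> F::Output {
    rt.block_on(flow)
}
```

Swapping runtimes then means providing another FlowRuntime implementation while run_with and its callers stay unchanged. The generic method keeps the call statically dispatched, at the price that the trait cannot be used as a trait object.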

Feature Flags¶

We will use feature flags to allow custom runtime configuration:

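# In floxide-transform/Cargo.toml
[features]
default = ["tokio-full"]
tokio-full = ["tokio/full"]
# `rt-multi-thread` is required by `Builder::new_multi_thread`
tokio-minimal = ["tokio/rt-multi-thread", "tokio/sync"]
custom-runtime = []

[dependencies]
# Tokio features are selected through the flags above; listing "full" here
# would enable it unconditionally and make the flags ineffective
tokio = { version = "1.36" }
async-trait = "0.1.77"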

Consequences¶

Positive¶

Ecosystem Compatibility: Tokio is the most widely used async runtime in Rust, providing compatibility with a large ecosystem of libraries
Production-Ready: Tokio is battle-tested and used in many production environments
Feature-Rich: Includes timers, I/O utilities, and synchronization primitives that will be useful for the framework
Active Maintenance: Actively developed and maintained
Scalability: Well-suited for high-performance, concurrent workloads
Async Trait Support: Using async_trait simplifies writing async methods in traits

Negative¶

Runtime Dependency: Creates a dependency on a specific runtime
Binary Size: Tokio with full features adds to the binary size of applications using the framework
Learning Curve: Tokio has its own patterns and concepts to learn
Opinionated: Some design decisions in Tokio may not align perfectly with all use cases
Macro Overhead: async_trait boxes each returned future, which costs a heap allocation and dynamic dispatch per call that native async fn in traits avoids

Alternatives Considered¶

1. async-std¶

Pros:
Familiar API that mirrors the standard library
Good documentation
Focuses on ergonomics
Cons:
Less widely adopted than Tokio
Smaller ecosystem of compatible libraries
Some performance differences compared to Tokio

2. smol¶

Pros:
Minimal footprint
Simple API
Lightweight
Cons:
Less feature-rich
Smaller ecosystem
Less battle-tested in large-scale production environments

3. Runtime Agnostic Design¶

Pros:
Maximum flexibility for consumers
No runtime dependency
Cons:
Significantly more complex implementation
Would require extensive abstraction layers
Would limit the use of runtime-specific features

4. Allow Pluggable Runtimes¶

Pros:
Flexibility for different environments
Could adapt to special requirements
Cons:
Increased maintenance burden
More complex API
Test matrix grows with every supported runtime and feature combination

5. Wait for native async traits¶

Pros:
No need for async_trait macro
Better performance
More idiomatic Rust
Cons:
Native async fn in traits (stable since Rust 1.75) still cannot be used through dyn Trait
Waiting for that support would delay development
Migration cost once that support lands

We chose Tokio with full features and async_trait because it provides the best balance of features, ecosystem compatibility, and production readiness. The abstraction layer will help mitigate some of the downsides by allowing for potential future changes without disrupting the API.