8000 GitHub - Ark-Stack/docs: ArkStack docs
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Ark-Stack/docs

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

10 Commits
 
 
 
 

Repository files navigation

arkstack

CA: FZ5gbR353Dy6FQhky8MsUq8xsbNG2pARcTbTFtcApump

1. Introduction

1.1 Purpose

This RFC outlines the design and implementation of ArkStack that integrates Chidori, a reactive runtime for orchestrating agentic goals, with smolagents, a lightweight library for building auto generating code-driven AI agents. The goal of this is to provide developers with a flexible, efficient, and transparent framework for creating AI agents that can interact with the real world, solve complex tasks, and adapt..

By combining Chidori’s advanced runtime features—such as pause/resume, time-travel debugging, and execution graph visualization—with smolagents’ code generation and tool integration capabilities, we adress key challenges in agent development and unlock new powerful features for multi-step, multi-agent systems. This integration allows devs to build understandable, secure, and transparent agents that can handle real-world complex use-case with ease.


1.2 What is Chidori?

Chidori is an open-source orchestrator, runtime, and IDE designed for building software in symbiosis with modern AI tools. It is specifically tailored for AI agents and provides solutions to the following challenges:

  1. Understanding Agent Behavior:

    • Chidori enables developers to understand what an agent is doing and how it arrived at a given state through time-travel debugging and visual execution graphs.
  2. Pause and Resume Execution:

    • Developers can pause agent execution, interact with the system (e.g., inject human feedback), and resume execution seamlessly.
  3. Handling Complex State-Space Exploration:

    • Chidori supports branching and reverting execution, allowing developers to explore different outcomes from a given state.
  4. Monitoring and Observability:

    • Chidori logs all state transitions and actions, providing a comprehensive audit trail for debugging and analysis.
  5. Code Interpreter Environments:

    • Chidori supports secure execution of Python and JavaScript code, with restrictions on imports and operations to prevent malicious behavior.

Chidori’s reactive runtime and visual debugging interface make it an ideal framework for building and debugging complex agents, especially in enterprise environments where reliability and observability are critical.


1.3 What is smolagents?

smolagents is a lightweight library for building code-driven AI agents. It focuses on simplicity and efficiency, enabling developers to create powerful agents in just a few lines of code. Key features of smolagents include:

  1. Code-Driven Agents:

    • smolagents’ CodeAgent generates and executes Python code to perform actions, offering greater flexibility and efficiency compared to traditional JSON-based tool calls.
  2. Tool Integration:

    • smolagents supports integration with a wide range of tools, including web search APIs, image generation APIs, and custom APIs.
  3. Multi-Agent Systems:

    • smolagents enables hierarchical multi-agent systems through the ManagedAgent class, allowing agents to collaborate on complex tasks.
  4. Security and Sandboxing:

    • smolagents provides secure execution environments, such as the local Python interpreter and E2B sandbox, to ensure safe execution of generated code.

smolagents’ simplicity and code-first approach make it a powerful tool for building AI agents, but it lacks some of the advanced orchestration and debugging features provided by Chidori.


1.4 Problem Statement

While both Chidori and smolagents are powerful tools for building AI agents, they address different aspects of the development process:

  1. Chidori’s Limitations:

    • Chidori excels at orchestration and debugging but does not provide native support for code generation or tool integration.
  2. smolagents’ Limitations:

    • smolagents simplifies code generation and tool integration but lacks advanced features for state management, debugging, and observability.

This creates a gap in development:

  • Developers using smolagents struggle with complex debugging and state management.
  • Developers using Chidori lack a streamlined way to generate code and integrate tools.

By integrating Chidori and smolagents, we can bridge this gap and provide a unified frameowrk that combines the strengths of both tools. Eg Chidori allows devs to peek behind the curtain and understand what code smolagents generate.


1.5 Why Merge Chidori and smolagents?

The integration of Chidori and smolagents makes sense for several reasons:

  1. Complementary Features:

    • Chidori’s reactive runtime and debugging capabilities complement smolagents’ code generation and tool integration, creating a more comprehensive platform.
  2. Improved Developer Experience:

    • Developers can use a single platform to build, orchestrate, and debug AI agents, reducing the complexity of the development process.
  3. Enhanced Flexibility:

    • The integration enables developers to create code-driven agents that can be paused, resumed, and debugged, providing greater flexibility in designing agent workflows.
  4. Scalability:

    • The platform supports multi-agent systems, enabling developers to build scalable solutions for complex tasks.
  5. Security and Observability:

    • The combination of Chidori’s secure execution environments and smolagents’ sandboxing ensures safe execution of generated code, while Chidori’s observability features provide insights into agent behavior.

1.6 Key Concepts

To understand the platform, it’s important to familiarize yourself with the following key concepts:

  1. Reactive Runtime:

    • A runtime that orchestrates agent workflows by reacting to changes in state and enabling features like pause/resume and time-travel debugging.
  2. Code Agents:

    • Agents that generate and execute Python code to perform actions, offering greater flexibility and efficiency compared to traditional JSON-based tool calls.
  3. Multi-Agent Systems:

    • Systems where multiple agents collaborate to solve tasks, with each agent specializing in a specific sub-task.
  4. Sandboxed Execution:

    • Execution of generated code in isolated environments (e.g., E2B) to prevent unintended side effects and enhance security.

1.7 Scope

This document focuses on the integration of Chidori and smolagents, exploring how their complementary features can be combined to create the ArkStack agent platform. Key areas of focus include:

  • Runtime Design: Leveraging Chidori’s reactive runtime for managing agent workflows, including state management, branching, and time-travel debugging.
  • Code Generation: Utilizing smolagents’ CodeAgent to generate and execute Python code for agent actions, enabling more efficient and expressive workflows.
  • Agent Coordination: Designing communication patterns and protocols for multi-agent systems, where agents can collaborate to solve complex tasks.
  • Security and Sandboxing: Ensuring safe execution of generated code through sandboxed environments (e.g., E2B) and secure Python interpreters.

While the platform is designed to be extensible and support broader AI agent concepts, the primary focus is on the integration of Chidori and smolagents.


2. Architectural Overview

2.1 High-Level Design

The platform is built around three core components:

  1. ArkStack Runtime: Manages the execution of agent workflows, including state management, branching, and time-travel debugging.
  2. smolagents Library: Provides tools for building code-driven agents, including CodeAgent for generating Python code and ToolCallingAgent for traditional JSON-based tool calls.
  3. Execution Environment: Ensures secure execution of generated code, either through a local Python interpreter with restricted imports or a sandboxed environment like E2B.

These components work together to create a modular and extensible platform for building AI agents.


2.2 Core Components

ArkStack Runtime

  • Reactive Execution: Agents react to changes in state, enabling dynamic workflows.
  • Time-Travel Debugging: Developers can revert to previous states to debug and analyze agent behavior.
  • Visual Debugging: A graphical interface for visualizing and manipulating the execution graph.

smolagents Library

  • CodeAgent: Generates and executes Python code for agent actions, offering greater flexibility and efficiency.
  • Tool Integration: Supports a wide range of tools, including web search, image generation, and custom APIs.
  • Multi-Agent Systems: Enables hierarchical agent coordination through the ManagedAgent class.

Execution Environment

  • Local Python Interpreter: Executes generated code with restricted imports and capped operations for security.
  • E2B Sandbox: Runs code in an isolated container for maximum security.

2.3 Workflow

The platform follows a multi-step workflow for executing agent tasks:

  1. Task Initialization: The user defines a task, which is passed to the agent.
  2. Code Generation: The agent generates Python code to perform the task, leveraging available tools.
  3. Code Execution: The code is executed in a secure environment (local or sandboxed), and the results are observed.
  4. State Management: The runtime updates the agent’s state and memory based on the results.
  5. Iteration: The agent repeats the process until the task is completed or a stopping condition is met.
  6. Debugging and Analysis: Developers can use ArkStack’s time-travel debugging and visual interface to analyze the agent’s behavior and optimize its performance.

3. Runtime Design (Chidori Integration)

3.1 Reactive Runtime

The reactive runtime is the core of ArkStack, enabling agents to dynamically respond to changes in state. It uses a graph-based execution model, where each node represents a state or action, and edges represent transitions between states. This model allows for pause/resume, time-travel debugging, and branching of execution.

Execution Graph

The execution graph is a directed acyclic graph (DAG) that captures the flow of agent execution. Each node in the graph represents a state or action, and edges represent tr 57A6 ansitions triggered by changes in state.

+-------------------+       +-------------------+       +-------------------+
|   Initial State   | ----> |   Action Node 1   | ----> |   Action Node 2   |
+-------------------+       +-------------------+       +-------------------+
        |                         |                         |
        v                         v                         v
+-------------------+       +-------------------+       +-------------------+
|   State Update    |       |   State Update    |       |   State Update    |
+-------------------+       +-------------------+       +-------------------+

State Transitions

State transitions are triggered by changes in the system, such as the completion of an action or the arrival of new data. The runtime subscribes to these changes and updates the execution graph accordingly.

Reactive Subscriptions

Nodes in the execution graph can subscribe to changes in other nodes, enabling reactive behavior. For example, a node representing a data processing step can subscribe to changes in a node representing a data source.


3.2 Pause and Resume

ArkStack supports pausing and resuming agent execution, allowing developers to inspect and modify the agent’s state at any point in time. This is particularly useful for debugging and human-in-the-loop workflows.

Pause Mechanism

When execution is paused, the runtime captures the current state of the execution graph and stores it. Developers can inspect the state, modify it, or inject new data before resuming execution.

Resume Mechanism

When execution is resumed, the runtime restores the captured state and continues execution from the point of interruption. Developers can also modify the execution graph before resuming.

Example: Human-in-the-Loop Workflow

+-------------------+       +-------------------+       +-------------------+
|   Running State   | ----> |   Paused State    | ----> |   Resumed State   |
+-------------------+       +-------------------+       +-------------------+
        |                         |                         |
        v                         v                         v
+-------------------+       +-------------------+       +-------------------+
|   Capture State   |       |   User Input      |       |   Restore State   |
+-------------------+       +-------------------+       +-------------------+

3.3 Monitoring and Observability

ArkStack provides comprehensive monitoring and observability by logging all state transitions and actions. This data can be visualized in ArkStack's debugging interface, enabling developers to understand and debug agent behavior.

Logging Mechanism

The runtime logs the following data for each state transition:

  • Timestamp: The time of the transition.
  • Node ID: The ID of the node triggering the transition.
  • Event: The event causing the transition (e.g., "action_completed").
  • State Data: The state of the system before and after the transition.

Visual Debugging

ArkStack’s debugging interface visualizes the execution graph and logs, allowing developers to inspect the flow of execution and identify issues.

Example:

+-------------------+       +-------------------+       +-------------------+
|   Initial State   | ----> |   Action Node 1   | ----> |   Action Node 2   |
+-------------------+       +-------------------+       +-------------------+
        |                         |                         |
        v                         v                         v
+-------------------+       +-------------------+       +-------------------+
|   Log Entry 1     |       |   Log Entry 2     |       |   Log Entry 3     |
+-------------------+       +-------------------+       +-------------------+

3.5 Branching and Time-Travel Debugging

ArkStack supports branching and time-travel debugging, allowing developers to explore different execution paths and revert to previous states.

Branching

Branching creates a copy of the execution graph at a specific state, enabling developers to explore alternative workflows.

Example:

+-------------------+       +-------------------+
|   Initial State   | ----> |   Branch Point    |
+-------------------+       +-------------------+
        |                         |
        v                         v
+-------------------+       +-------------------+
|   Path A          |       |   Path B          |
+-------------------+       +-------------------+

Time-Travel Debugging

Time-travel debugging allows developers to revert the execution graph to a previous state, inspect it, and modify it before continuing execution.

Example:

+-------------------+       +-------------------+       +-------------------+
|   State 1         | ----> |   State 2         | ----> |   State 3         |
+-------------------+       +-------------------+       +-------------------+
        |                         |                         |
        v                         v                         v
+-------------------+       +-------------------+       +-------------------+
|   Revert to State 1       |   Modify State 1  |       |   Continue Execution |
+-------------------+       +-------------------+       +-------------------+

4. Code Generation (smolagents Integration)

4.1 Code Agents

Code Agents are a core feature of smolagents, enabling agents to generate and execute Python code to perform actions. This approach offers greater flexibility and efficiency compared to traditional JSON-based tool calls, reducing the number of steps required to complete tasks and improving performance on complex benchmarks.

Code Generation Process

The code generation process involves the following steps:

  1. Task Analysis: The agent analyzes the task and identifies the tools and operations required to solve it.
  2. Code Generation: The agent generates Python code to perform the identified operations.
  3. Code Execution: The generated code is executed in a secure environment (local or sandboxed).
  4. Result Handling: The results of the code execution are returned to the agent for further processing.

Example: Code Agent Workflow

+-------------------+       +-------------------+       +-------------------+
|   Task Input      | ----> |   Code Generation | ----> |   Code Execution  |
+-------------------+       +-------------------+       +-------------------+
        |                         |                         |
        v                         v                         v
+-------------------+       +-------------------+       +-------------------+
|   LLM Engine      |       |   Code Validation |       |   Result Output   |
+-------------------+       +-------------------+       +-------------------+

4.2 Tool Integration

smolagents supports integration with a wide range of tools, including web search APIs, image generation APIs, and custom APIs. These tools are exposed to the agent as Python functions, enabling seamless integration into generated code.

+-------------------+       +-------------------+       +-------------------+
|   Task Input      | ----> |   Tool Selection  | ----> |   Code Generation |
+-------------------+       +-------------------+       +-------------------+
        |                         |                         |
        v                         v                         v
+-------------------+       +-------------------+       +-------------------+
|   Tool Registry   |       |   Tool Execution  |       |   Result Output   |
+-------------------+       +-------------------+       +-------------------+

4.3 Security and Sandboxing

To ensure safe execution of generated code, smolagents provides security mechanisms such as restricted imports and sandboxed environments. This prevents malicious behavior and protects the host system.

Local Python Interpreter

The local interpreter executes code with restricted imports and capped operations to prevent unauthorized access to system resources.

Example: Restricted Imports

# Allowed imports
import math
import json

# Blocked imports
import os  # Raises SecurityError

E2B Sandbox

The E2B sandbox runs code in an isolated container, providing maximum security. This is ideal for executing untrusted code or code that requires access to external resources.

4.4 Managed Agents

smolagents supports multi-agent systems through the ManagedAgent class, enabling hierarchical coordination of agents. A manager agent can coordinate the actions of subordinate agents, each specializing in a specific sub-task.

Managed Agent Workflow

  1. Task Delegation: The manager agent delegates sub-tasks to subordinate agents.
  2. Sub-Task Execution: Each subordinate agent generates and executes code to perform its assigned sub-task.
  3. Result Aggregation: The manager agent aggregates the results from subordinate agents and returns the final output.

Example: Multi-Agent Workflow

+-------------------+       +-------------------+       +-------------------+
|   Task Input      | ----> |   Manager Agent   | ----> |   Sub-Agent 1     |
+-------------------+       +-------------------+       +-------------------+
        |                         |                         |
        v                         v                         v
+-------------------+       +-------------------+       +-------------------+
|   Result Output   | <---- |   Sub-Agent 2     | <---- |   Sub-Agent 3     |
+-------------------+       +-------------------+       +-------------------+

5. Agent Coordination and Communication

5.1 Multi-Agent Systems

Multi-agent systems involve multiple agents collaborating to solve complex tasks. Each agent specializes in a specific sub-task, improving efficiency and reducing the cognitive load on individual agents. The integration of smolagents and Chidori provides a robust framework for building and managing such systems.

Key Concepts

  • Specialization: Each agent is designed to perform a specific function (e.g., web search, data processing).
  • Coordination: A manager agent coordinates the actions of subordinate agents, ensuring that sub-tasks are executed in the correct order.
  • Communication: Agents communicate through message passing or shared memory, enabling them to exchange data and results.

5.2 smolagents Multi-Agent Workflow

smolagents supports multi-agent systems through the ManagedAgent class, which enables hierarchical coordination of agents. A manager agent delegates sub-tasks to subordinate agents and aggregates their results.

Managed Agent Workflow

  1. Task Delegation: The manager agent analyzes the task and delegates sub-tasks to subordinate agents.
  2. Sub-Task Execution: Each subordinate agent generates and executes code to perform its assigned sub-task.
  3. Result Aggregation: The manager agent collects the results from subordinate agents and returns the final output.

Example: Multi-Agent Workflow

+-------------------+       +-------------------+       +-------------------+
|   Task Input      | ----> |   Manager Agent   | ----> |   Sub-Agent 1     |
+-------------------+       +-------------------+       +-------------------+
        |                         |                         |
        v                         v                         v
+-------------------+       +-------------------+       +-------------------+
|   Result Output   | <---- |   Sub-Agent 2     | <---- |   Sub-Agent 3     |
+-------------------+       +-------------------+       +-------------------+

Example Code Snippet:

from smolagents import ManagedAgent, CodeAgent, DuckDuckGoSearchTool, HfApiModel

# Define subordinate agents
web_agent = CodeAgent(tools=[DuckDuckGoSearchTool()], model=HfApiModel())
data_agent = CodeAgent(tools=[], model=HfApiModel())

# Define manager agent
manager_agent = ManagedAgent(
    agents=[web_agent, data_agent],
    description="Coordinates web search and data processing agents"
)

# Run the multi-agent system
result = manager_agent.run("Analyze the latest AI research papers")

5.3 ArkStack Support for Multi-Agent Execution

ArkStack provides execution tracking and observability for multi-agent systems, enabling developers to monitor and debug the behavior of individual agents and their interactions.

Execution Graph for Multi-Agent Systems

ArkStack’s execution graph captures the flow of execution across multiple agents, including state transitions, task delegations, and result aggregations. Each agent is represented as a node in the graph, and edges represent communication or data flow between agents.

Example: Execution Graph for Multi-Agent System

+-------------------+       +-------------------+       +-------------------+
|   Manager Agent   | ----> |   Sub-Agent 1     | ----> |   Sub-Agent 2     |
+-------------------+       +-------------------+       +-------------------+
        |                         |                         |
        v                         v                         v
+-------------------+       +-------------------+       +-------------------+
|   Task Delegation |       |   Sub-Task Exec   |       |   Sub-Task Exec   |
+-------------------+       +-------------------+       +-------------------+
        |                         |                         |
        v                         v                         v
+-------------------+       +-------------------+       +-------------------+
|   Result Aggregation <---- |   Result Output   | <---- |   Result Output   |
+-------------------+       +-------------------+       +-------------------+

Monitoring and Debugging

ArkStack logs all state transitions and interactions between agents, providing a comprehensive audit trail for debugging. Developers can use ArkStack's time-travel debugging and visual interface to inspect the execution graph and identify issues.


5.4 Communication Protocols

Agents communicate using message passing or shared memory, depending on the requirements of the task. ArkStack ensures that communication is tracked and logged for observability.

Message Passing

In message passing, agents exchange data through explicit messages. This approach is ideal for loosely coupled systems where agents operate independently.

Example: Message Passing

Agent A -> Message -> Agent B
Agent B -> Result -> Agent A

Shared Memory

In shared memory, agents access a common data structure to exchange information. This approach is ideal for tightly coupled systems where agents need to collaborate closely.

Example: Shared Memory

Agent A -> Writes Data -> Shared Memory
Agent B -> Reads Data -> Shared Memory

5.5 Example: Multi-Agent System for Data Analysis

This example demonstrates a multi-agent system where:

  1. A manager agent coordinates the workflow.
  2. A web search agent retrieves data from the web.
  3. A data processing agent analyzes the retrieved data.

Workflow:

+-------------------+       +-------------------+       +-------------------+
|   Manager Agent   | ----> |   Web Search Agent| ----> |   Data Processing |
+-------------------+       +-------------------+       +-------------------+
        |                         |                         |
        v                         v                         v
+-------------------+       +-------------------+       +-------------------+
|   Task Delegation |       |   Retrieve Data   |       |   Analyze Data    |
+-------------------+       +-------------------+       +-------------------+
        |                         |                         |
        v                         v                         v
+-------------------+       +-------------------+       +-------------------+
|   Result Aggregation <---- |   Return Data     | <---- |   Return Analysis |
+-------------------+       +-------------------+       +-------------------+

Code Implementation:

from smolagents import ManagedAgent, CodeAgent, DuckDuckGoSearchTool, HfApiModel

# Define subordinate agents
web_agent = CodeAgent(tools=[DuckDuckGoSearchTool()], model=HfApiModel())
data_agent = CodeAgent(tools=[], model=HfApiModel())

# Define manager agent
manager_agent = ManagedAgent(
    agents=[web_agent, data_agent],
    description="Coordinates web search and data processing agents"
)

# Run the multi-agent system
result = manager_agent.run("Analyze the latest AI research papers")

6. Vector Store Integration

6.1 Overview

Vector stores are essential for tasks like retrieval-augmented generation (RAG), semantic search, and memory management in AI agents. This section explores how ArkStack integrates vector stores to enhance agent capabilities, enabling efficient storage, retrieval, and management of embeddings.

6.2 Built-in Vector Store

ArkStack includes a simple, local vector store for development and prototyping. This built-in store is optimized for efficiency and ease of use, making it ideal for small-scale applications.

Features:

  • Efficient Similarity Search:
    • Use algorithms like FAISS or Annoy to perform fast similarity searches on embeddings.
  • Lightweight and Portable:
    • The local vector store is lightweight and requires no external dependencies, making it easy to set up and use.
  • Memory Management:
    • Store and retrieve embeddings to manage agent memory, enabling long-term context retention.

6.3 Support for External Vector Stores

ArkStack supports integration with external vector databases, enabling scalable and production-ready solutions for managing embeddings.

Supported Databases:

  • Pinecone: A fully managed vector database for large-scale applications.
  • Weaviate: An open-source vector search engine with built-in ML models.
  • Milvus: A scalable vector database designed for AI applications.
  • Qdrant: A high-performance vector search engine with a focus on simplicity.

Example:

# Connect to an external vector store (e.g., Pinecone)
vector_store = PineconeVectorStore(api_key="your_api_key", index_name="your_index")

# Store and retrieve embeddings
vector_store.store("key1", embedding1)
results = vector_store.search(query_embedding, top_k=5)

6.4 Use Cases

  1. Retrieval-Augmented Generation (RAG):
    • Use vector stores to retrieve relevant documents or data for generating context-aware responses.
  2. Semantic Search:
    • Enable agents to perform semantic searches on large datasets, such as product catalogs or knowledge bases.
  3. Memory Management:
    • Store conversation history or task-specific data in a vector store for future reference.

7. Remote Tools and Functions

7.1 Overview

In ArkStack, tools are remote functions that agents can call to perform specific tasks, such as web searches, data processing, or API interactions. These tools are stored remotely and can be dynamically discovered and integrated into agent workflows. This section explores how ArkStack leverages smolagents’ support for remote tools and provides a marketplace and hub for sharing and using these tools.


7.2 Tools in smolagents

In smolagents, tools are Python functions that are exposed to agents for performing actions. These tools can be local (defined within the agent’s code) or remote (stored and accessed from external sources). Remote tools are particularly powerful because they allow developers to share, reuse, and scale functionality across multiple agents and projects.

Key Features of Remote Tools:

  • Dynamic Discovery:
    • Agents can discover and load remote tools at runtime, making them adaptable to new tasks.
  • Centralized Management:
    • Tools are stored in a central repository (e.g., Hugging Face Hub), enabling easy access and version control.
  • Security and Sandboxing:
    • Remote tools are executed in secure environments, ensuring safe and reliable operation.

Example:

from smolagents import load_tool, CodeAgent

model_download_tool = load_tool(
    "{your_username}/hf-model-downloads",
    trust_remote_code=True
)

7.3 Marketplace and Hub

ArkStack medium term will provide a marketplace and hub for uploading, sharing, and using remote tools. This hub would serve as a centralized platform for developers to discover and contribute tools.

Features of the Marketplace:

  • Tool Discovery:
    • Browse and search for tools by category, functionality, or popularity.
  • Tool Uploading:
    • Developers can upload their own tools to the hub, complete with documentation and usage examples.
  • Version Control:
    • Tools are versioned, allowing developers to specify which version to use in their workflows.
  • Community Contributions:
    • The hub encourages community contributions, enabling developers to share and collaborate on tools.

Example:

# Upload a tool to the ArkStack Hub
model_downloads_tool.push_to_hub("{your_username}/hf-model-downloads", token="")

7.4 Use Cases

  1. Rapid Prototyping:
    • Use pre-built tools from the marketplace to quickly prototype agent workflows.
  2. Dynamic Workflows:
    • Enable agents to adapt to new tasks by dynamically discovering and integrating remote tools.
  3. Collaboration and Sharing:
    • Share tools with the community and leverage contributions from other developers.
  4. Scalability:
    • Use remote tools to scale agent functionality across multiple projects and environments.

8. Enhanced Debugging and Observability

8.1 Overview

ArkStack builds on Chidori’s debugging features to provide advanced tools for monitoring and analyzing agent behavior. This chapter explores execution comparison, interactive debugging, and metrics dashboards.

8.2 Execution Comparison

ArkStack enables developers to compare different executions of the same task, helping them identify performance bottlenecks or errors.

Features:

  • Visual Comparison:
    • Visualize differences in execution graphs, state transitions, or tool calls.
  • Performance Metrics:
    • Compare metrics like execution time, memory usage, and error rates across executions.

Example:

# Compare two executions of the same task
comparison = agent.compare_executions(execution_id_1, execution_id_2)
comparison.visualize()

8.3 Interactive Debugging

ArkStack supports interactive debugging, allowing developers to modify agent behavior during execution.

Features:

  • Real-Time Modifications:
    • Inject new data, modify tool calls, or adjust parameters in real-time.
  • State Inspection:
    • Inspect and modify the agent’s state at any point during execution.

8.4 Metrics and Dashboards

ArkStack provides metrics and dashboards for monitoring agent performance and behavior.

Features:

  • Task Completion Time:
    • Track how long it takes for agents to complete tasks.
  • Error Rates:
    • Monitor the frequency and types of errors encountered during execution.
  • Tool Usage:
    • Analyze which tools are being used and how often.

Example:

# View a dashboard of agent metrics
dashboard = agent.get_metrics_dashboard()
dashboard.show()

8.5 Use Cases

  1. Performance Optimization:
    • Use execution comparison and metrics dashboards to identify and resolve performance bottlenecks.
  2. Error Debugging:
    • Leverage interactive debugging to diagnose and fix errors in real-time.
  3. Operational Monitoring:
    • Monitor agent performance in production to ensure reliability and efficiency.

9. 🛣️ Roadmap

Short term

  • Reactive subscriptions between nodes
  • Branching and time travel debugging, reverting execution of a graph
  • Node.js, Python, and Rust support for building and executing graphs
  • Simple local vector db for development
  • Integrate smolagents library for code generation execution (in-progress)
  • Combine Chidori’s time-travel debugging with smolagents’ execution logs for a seamless debugging experience.
  • Expand the tool registry to support dynamic tool discovery and registration.

Medium term

  • Analysis tools for comparing executions
  • Adding support for more vector databases
  • Adding support for other LLM sources
  • Adding support for more code interpreter environments
  • Port smolagents to TypeScript
  • Launch tool marketplace
  • Agent re-evaluation with feedback
  • Add metrics, dashboards, and alerts for monitoring agent performance and behavior.
  • Enable agents to dynamically adapt to changing contexts or environments during execution.

10. Conclusion

10.1 Summary

This RFC proposes ArkStack, a feature rich AI agent platform that integrates Chidori, a reactive runtime for orchestrating agentic workflows, with smolagents, a lightweight library for building code-driven AI agents. Beyond the core integration, ArkStack extends its capabilities with Vector Store Integration, Smolagents Remote Tools and Functions, Enhanced Debugging and Observability, and future smolagent support for TypeScript, making it a unified, extensible, and developer-friendly framework for building, orchestrating, and debugging AI agents.

ArkStack addresses key challenges in AI agent development, including:

  • State Management: Chidori’s reactive runtime ensures efficient state management, dynamic workflows, and advanced debugging features like time-travel debugging and execution graph visualization.
  • Code Generation: smolagents’ CodeAgent enables agents to generate and execute Python code, reducing the number of steps required to complete tasks and improving efficiency.
  • Multi-Agent Coordination: The platform supports hierarchical multi-agent systems, enabling agents to collaborate on complex tasks through managed agents and communication protocols.
  • Vector Store Integration: Built-in and external vector store support enables retrieval-augmented generation (RAG), semantic search, and memory management for agents.
  • smolagent Remote Tools and Functions: A marketplace and hub for remote tools allows developers to dynamically discover, share, and integrate tools into agent workflows.
  • Enhanced Debugging and Observability: Advanced features like execution comparison, interactive debugging, and metrics dashboards provide deep insights into agent behavior and performance.
  • Security and Sandboxing: Secure execution environments, such as the local Python interpreter and E2B sandbox, ensure safe execution of generated code.
  • Language Flexibility: Support for Python and TypeScript (upcoming) provides developers with flexibility and improved tooling for building agents.

10.2 Benefits of ArkStack

ArkStack offers several key benefits for developers, making it a powerful platform for building and managing AI agents:

  1. Unified Framework:

    • Combines Chidori’s orchestration and debugging capabilities with smolagents’ code generation and tool integration into a single platform.
    • Extends functionality with vector stores, remote tools, and enhanced observability, providing a comprehensive solution for AI agent development.
  2. Flexibility and Extensibility:

    • Supports multiple programming languages (Python and TypeScript) and integrates with external tools and databases (e.g., Pinecone, Weaviate).
    • Enables developers to dynamically discover and integrate remote tools through the marketplace and hub.
  3. Efficiency and Performance:

    • Code-driven agents reduce the number of steps required to complete tasks, improving performance on complex benchmarks.
    • Vector store integration enables efficient retrieval-augmented generation (RAG) and semantic search, enhancing agent capabilities.
  4. Debugging and Observability:

    • Advanced debugging features like time-travel debugging, execution comparison, and interactive debugging provide deep insights into agent behavior.
    • Metrics dashboards and logs enable real-time monitoring and optimization of agent performance.
  5. Scalability:

    • Supports multi-agent systems, enabling developers to build scalable solutions for complex tasks.
    • Integration with external vector databases and remote tools ensures scalability for production environments.
  6. Security and Reliability:

    • Secure execution environments (e.g., local Python interpreter, E2B sandbox) ensure safe execution of generated code.
    • Comprehensive logging and monitoring features provide a robust audit trail for debugging and analysis.

10.3 Future dev directions

Future developments include:

  • TypeScript smolagent Port: One of the primary goals for future development is to also port smolagents to TypeScript. Chidori and E2B natively support JavaScript/TypeScript. The port will provide language flexibility and also improve the developer experience by providing type safety, better tooling, and enhanced performance.
  • Tool Marketplace: A centralized hub for sharing and discovering remote tools, enabling rapid prototyping and collaboration.
  • Advanced Debugging Features: Continued enhancements to debugging and observability tools, including real-time metrics and automated error detection.

10.4 Final Thoughts

ArkStack represents an interersting experiment in AI agent development, combining the strengths of Chidori and smolagents with a suite of advanced features to address the challenges of modern AI workflows. By providing a flexible, efficient, and secure platform, ArkStack empowers developers to build scalable, interpretable, and reliable AI agents that can tackle real-world problems with ease.

About

ArkStack docs

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published
0