[pull] main from mcp-use:main by pull[bot] · Pull Request #11 · yzkee/mcp-use · GitHub


Merged: 2 commits on Jun 13, 2025
4 changes: 4 additions & 0 deletions .gitignore
@@ -122,3 +122,7 @@ dmypy.json

# macOS
.DS_Store

# AI
.cursor
.claude
170 changes: 170 additions & 0 deletions docs/development/telemetry.mdx
@@ -0,0 +1,170 @@
---
title: "Telemetry"
description: "Understanding MCP-use's telemetry system"
---

## Overview

MCP-use includes an **opt-out telemetry system** that helps us understand how the library is being used in practice. This data enables us to:

- **Prioritize development** based on real usage patterns
- **Optimize performance** for common workflows
- **Improve compatibility** with popular model providers
- **Focus on the most valuable features**

<Warning>
**Privacy First**: All telemetry is **anonymized** and can be **completely disabled** with a single environment variable.
</Warning>

## What We Collect

### Agent Execution Data
When you use `MCPAgent.run()` or `MCPAgent.astream()`, we collect:

- **Query and response content** (to understand use cases)
- **Model provider and name** (e.g., "openai", "gpt-4")
- **MCP servers connected** (types and count, not specific URLs/paths)
- **Tools used** (which MCP tools are popular)
- **Performance metrics** (execution time, steps taken)
- **Configuration settings** (memory enabled, max steps, etc.)

### System Information
- **Package version** (for version adoption tracking)
- **Error types** (for debugging and improvement)

### What We DON'T Collect
- **Personal information** (no names, emails, or identifiers)
- **Server URLs or file paths** (only connection types)
- **API keys or credentials** (never transmitted)
- **IP addresses** (PostHog is configured with `disable_geoip=True`, so no GeoIP lookup is performed)

## How to Disable Telemetry

### Environment Variable (Recommended)
```bash
export MCP_USE_ANONYMIZED_TELEMETRY=false
```

### In Your Code
```python
import os
os.environ["MCP_USE_ANONYMIZED_TELEMETRY"] = "false"

# Now use MCP-use normally - no telemetry will be collected
from mcp_use import MCPAgent
```

### Verification
When telemetry is disabled, you'll see this debug message:
```
DEBUG: Telemetry disabled
```

When enabled, you'll see:
```
INFO: Anonymized telemetry enabled. Set MCP_USE_ANONYMIZED_TELEMETRY=false to disable.
```

## Data Storage and Privacy

### Anonymous User IDs
- A random UUID is generated and stored locally in your cache directory
- **Linux/Unix**: `~/.cache/mcp_use/telemetry_user_id`
- **macOS**: `~/Library/Caches/mcp_use/telemetry_user_id`
- **Windows**: `%LOCALAPPDATA%\mcp_use\telemetry_user_id`
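The ID-generation logic can be sketched as follows. This is a minimal illustration, not MCP-use's actual implementation; the file name `telemetry_user_id` matches the paths above, but the helper name is hypothetical:

```python
# Hypothetical sketch: generate a random UUID once and cache it on disk,
# so the same anonymous identifier is reused across runs.
import uuid
from pathlib import Path

def get_or_create_user_id(cache_dir: Path) -> str:
    """Return the cached anonymous UUID, creating one on first run."""
    id_file = cache_dir / "telemetry_user_id"
    if id_file.exists():
        return id_file.read_text().strip()
    cache_dir.mkdir(parents=True, exist_ok=True)
    new_id = str(uuid.uuid4())
    id_file.write_text(new_id)
    return new_id
```

Deleting the cached file simply causes a fresh UUID to be generated on the next run.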

### Data Transmission
- All data is sent to PostHog (EU servers: `https://eu.i.posthog.com`)
- **No personal information** is ever transmitted
- Data is used only for **aggregate analysis**

## Example Telemetry Event

Here's what a typical telemetry event looks like:

```json
{
"event": "mcp_agent_execution",
"distinct_id": "550e8400-e29b-41d4-a716-446655440000",
"properties": {
"mcp_use_version": "1.3.0",
"execution_method": "run",
"query": "Help me analyze sales data from the CSV file",
"response": "I'll help you analyze the sales data...",
"model_provider": "openai",
"model_name": "gpt-4",
"server_count": 2,
"server_identifiers": [
{"type": "stdio", "command": "python -m server"},
{"type": "http", "base_url": "localhost:8080"}
],
"tools_used_names": ["read_file", "analyze_data"],
"execution_time_ms": 2500,
"success": true
}
}
```

## Benefits to the Community

### For Users
- **Better library** through data-driven improvements
- **Faster issue resolution** via error pattern detection
- **Feature prioritization** based on actual usage

### For Developers
- **Compatibility insights** for new model providers
- **Performance optimization** targets
- **Usage pattern understanding** for better APIs

## Technical Implementation

### Clean Architecture
The telemetry system uses a **decorator pattern** that ensures:
- **Zero overhead** when disabled
- **No exceptions** if PostHog is unavailable
- **Graceful degradation** in all failure scenarios

### Code Example
```python
# This is how telemetry works internally:
@requires_telemetry
def track_agent_execution(self, ...):
# This method only executes if telemetry is enabled
# If disabled, it returns None immediately
pass
```
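A minimal sketch of what such a decorator could look like is below. The real implementation lives in `mcp_use.telemetry`; the `enabled` attribute and the simplified `Telemetry` class here are assumptions for illustration only:

```python
# Hypothetical sketch of a guard decorator: when telemetry is disabled,
# the wrapped method returns None without executing its body.
import functools

def requires_telemetry(method):
    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        # Skip the body entirely when telemetry is turned off.
        if not getattr(self, "enabled", False):
            return None
        return method(self, *args, **kwargs)
    return wrapper

class Telemetry:
    def __init__(self, enabled: bool):
        self.enabled = enabled

    @requires_telemetry
    def track_agent_execution(self, **properties):
        # In the real library this would send the event to PostHog;
        # here we just return the properties for demonstration.
        return properties
```

Because the guard is a single attribute check, the disabled path adds effectively zero overhead, and no PostHog code is ever reached.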

## Frequently Asked Questions

### Can I see what data is being sent?
Yes! Set your logging level to DEBUG to see telemetry events:
```python
import logging
logging.basicConfig(level=logging.DEBUG)
```

### Does telemetry affect performance?
- **When disabled**: Zero performance impact
- **When enabled**: Minimal impact (async data transmission)

### Can I opt out after using the library?
Yes! Set the environment variable and restart your application. You can also delete the user ID file to reset your anonymous identifier.
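Resetting the identifier amounts to deleting one file. A small sketch (the helper name is illustrative; the file locations are listed under "Anonymous User IDs"):

```python
# Hypothetical helper: delete the cached anonymous ID so a new one
# is generated the next time the library runs.
from pathlib import Path

def reset_telemetry_id(id_file: Path) -> bool:
    """Delete the cached anonymous ID; return True if a file was removed."""
    if id_file.exists():
        id_file.unlink()
        return True
    return False

# Linux/Unix location (adjust for macOS/Windows as listed above):
# reset_telemetry_id(Path.home() / ".cache" / "mcp_use" / "telemetry_user_id")
```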

### Is this GDPR compliant?
Yes. The telemetry system:
- Collects no personal data
- Uses anonymous identifiers
- Provides easy opt-out
- Processes data under the legitimate-interests basis (software improvement)

## Support

If you have questions about telemetry:
- **Disable it**: Use `MCP_USE_ANONYMIZED_TELEMETRY=false`
- **Report issues**: [GitHub Issues](https://github.com/anthropics/mcp-use/issues)
- **Check logs**: Enable DEBUG logging to see telemetry activity

<Note>
Remember: Telemetry helps us build a better library for everyone, but **your privacy comes first**. We've designed the system to be transparent, minimal, and easily disabled.
</Note>
3 changes: 2 additions & 1 deletion docs/docs.json
@@ -40,7 +40,8 @@
{
"group": "Development",
"pages": [
"development"
"development",
"development/telemetry"
]
}
]
119 changes: 109 additions & 10 deletions mcp_use/agents/mcpagent.py
@@ -6,6 +6,7 @@
"""

import logging
import time
from collections.abc import AsyncIterator

from langchain.agents import AgentExecutor, create_tool_calling_agent
@@ -22,6 +23,8 @@

from mcp_use.client import MCPClient
from mcp_use.connectors.base import BaseConnector
from mcp_use.telemetry.posthog import Telemetry
from mcp_use.telemetry.utils import extract_model_info

from ..adapters.langchain_adapter import LangChainAdapter
from ..logging import logger
@@ -93,6 +96,9 @@ def __init__(
# Create the adapter for tool conversion
self.adapter = LangChainAdapter(disallowed_tools=self.disallowed_tools)

# Initialize telemetry
self.telemetry = Telemetry()

# Initialize server manager if requested
self.server_manager = None
if self.use_server_manager:
@@ -104,6 +110,9 @@
self._agent_executor: AgentExecutor | None = None
self._system_message: SystemMessage | None = None

# Track model info for telemetry
self._model_provider, self._model_name = extract_model_info(self.llm)

async def initialize(self) -> None:
"""Initialize the MCP client and agent."""
logger.info("🚀 Initializing MCP agent and connecting to services...")
@@ -130,6 +139,7 @@ async def initialize(self) -> None:
if not self._sessions:
logger.info("🔄 No active sessions found, creating new ones...")
self._sessions = await self.client.create_all_sessions()
self.connectors = [session.connector for session in self._sessions.values()]
logger.info(f"✅ Created {len(self._sessions)} new sessions")

# Create LangChain tools directly from the client using the adapter
@@ -141,7 +151,7 @@
connectors_to_use = self.connectors
logger.info(f"🔗 Connecting to {len(connectors_to_use)} direct connectors...")
for connector in connectors_to_use:
if not hasattr(connector, "client") or connector.client_session is None:
if not hasattr(connector, "client_session") or connector.client_session is None:
await connector.connect()

# Create LangChain tools using the adapter with connectors
@@ -374,13 +384,58 @@ async def astream(
async for chunk in agent.astream("hello"):
print(chunk, end="|", flush=True)
"""
async for chunk in self._generate_response_chunks_async(
query=query,
max_steps=max_steps,
manage_connector=manage_connector,
external_history=external_history,
):
yield chunk
start_time = time.time()
success = False
chunk_count = 0
total_response_length = 0

try:
async for chunk in self._generate_response_chunks_async(
query=query,
max_steps=max_steps,
manage_connector=manage_connector,
external_history=external_history,
):
chunk_count += 1
if isinstance(chunk, str):
total_response_length += len(chunk)
yield chunk
success = True
finally:
# Track comprehensive execution data for streaming
execution_time_ms = int((time.time() - start_time) * 1000)

server_count = 0
if self.client:
server_count = len(self.client.get_all_active_sessions())
elif self.connectors:
server_count = len(self.connectors)

conversation_history_length = (
len(self._conversation_history) if self.memory_enabled else 0
)

self.telemetry.track_agent_execution(
execution_method="astream",
query=query,
success=success,
model_provider=self._model_provider,
model_name=self._model_name,
server_count=server_count,
server_identifiers=[connector.public_identifier for connector in self.connectors],
total_tools_available=len(self._tools) if self._tools else 0,
tools_available_names=[tool.name for tool in self._tools],
max_steps_configured=self.max_steps,
memory_enabled=self.memory_enabled,
use_server_manager=self.use_server_manager,
max_steps_used=max_steps,
manage_connector=manage_connector,
external_history_used=external_history is not None,
response=f"[STREAMED RESPONSE - {total_response_length} chars]",
execution_time_ms=execution_time_ms,
error_type=None if success else "streaming_error",
conversation_history_length=conversation_history_length,
)

async def run(
self,
@@ -409,6 +464,10 @@ async def run(
"""
result = ""
initialized_here = False
start_time = time.time()
tools_used_names = []
steps_taken = 0
success = False

try:
# Initialize if needed
@@ -464,6 +523,7 @@
logger.info(f"🏁 Starting agent execution with max_steps={steps}")

for step_num in range(steps):
steps_taken = step_num + 1
# --- Check for tool updates if using server manager ---
if self.use_server_manager and self.server_manager:
current_tools = self.server_manager.tools
@@ -510,9 +570,10 @@
# If it's actions/steps, add to intermediate steps
intermediate_steps.extend(next_step_output)

# Log tool calls
# Log tool calls and track tool usage
for action, output in next_step_output:
tool_name = action.tool
tools_used_names.append(tool_name)
tool_input_str = str(action.tool_input)
# Truncate long inputs for readability
if len(tool_input_str) > 100:
@@ -555,7 +616,8 @@
if self.memory_enabled:
self.add_to_history(AIMessage(content=result))

logger.info("🎉 Agent execution complete")
logger.info(f"🎉 Agent execution complete in {time.time() - start_time} seconds")
success = True
return result

except Exception as e:
@@ -566,6 +628,43 @@
raise

finally:
# Track comprehensive execution data
execution_time_ms = int((time.time() - start_time) * 1000)

server_count = 0
if self.client:
server_count = len(self.client.get_all_active_sessions())
elif self.connectors:
server_count = len(self.connectors)

conversation_history_length = (
len(self._conversation_history) if self.memory_enabled else 0
)
self.telemetry.track_agent_execution(
execution_method="run",
query=query,
success=success,
model_provider=self._model_provider,
model_name=self._model_name,
server_count=server_count,
server_identifiers=[connector.public_identifier for connector in self.connectors],
total_tools_available=len(self._tools) if self._tools else 0,
tools_available_names=[tool.name for tool in self._tools],
max_steps_configured=self.max_steps,
memory_enabled=self.memory_enabled,
use_server_manager=self.use_server_manager,
max_steps_used=max_steps,
manage_connector=manage_connector,
external_history_used=external_history is not None,
steps_taken=steps_taken,
tools_used_count=len(tools_used_names),
tools_used_names=tools_used_names,
response=result,
execution_time_ms=execution_time_ms,
error_type=None if success else "execution_error",
conversation_history_length=conversation_history_length,
)

# Clean up if necessary (e.g., if not using client-managed sessions)
if manage_connector and not self.client and not initialized_here:
logger.info("🧹 Closing agent after query completion")