Code Execution

🐍 Python Code Execution Sandbox

Code Execution enables your AI agent to write and run Python code in a secure, isolated sandbox environment. This powerful capability allows agents to perform complex data analysis, mathematical computations, file processing, and system operations that go beyond simple text generation.

Prerequisites

Before enabling code execution:

  1. PixelML Connection Required: You must add a PixelML connection in your workspace connections

  2. Credit-Based Usage: Code execution consumes credits based on session runtime

  3. Usage Tracking: Charges are calculated per second of active code session time

To set up:

  • Navigate to Workspace Settings → Connections

  • Add a new PixelML connection with your API key

  • Credits are automatically deducted during code execution sessions


🔧 What Is Code Execution?

Code Execution provides your AI agent with four specialized tools:

  • Execute Python Code: Run Python scripts in a persistent sandbox environment

  • Upload Files to Sandbox: Transfer files from Drive storage to the code execution environment

  • Download Files from Sandbox: Save generated files back to Drive storage

  • Execute Shell Commands: Run system commands for package installation and file operations


💡 Key Capabilities

Data Analysis & Processing

# Analyze CSV data
import pandas as pd
import matplotlib.pyplot as plt

# Load and analyze data
df = pd.read_csv('sales_data.csv')
summary = df.describe()
df.groupby('region')['revenue'].sum().plot(kind='bar')
plt.savefig('revenue_by_region.png')

Mathematical Computations

# Complex calculations
import numpy as np
from scipy import stats

# Statistical analysis
data = [23, 45, 67, 89, 12, 34, 56, 78]
mean = np.mean(data)
std_dev = np.std(data)
confidence_interval = stats.t.interval(0.95, len(data)-1,
                                       loc=mean,
                                       scale=stats.sem(data))

Image Processing

# Image manipulation
from PIL import Image

# Resize and optimize images
img = Image.open('photo.jpg')
img_resized = img.resize((800, 600))
img_resized.save('photo_optimized.jpg', quality=85, optimize=True)

File Format Conversions

# Convert between formats
import json
import csv

# JSON to CSV conversion
with open('data.json', 'r') as f:
    data = json.load(f)

with open('output.csv', 'w', newline='') as f:
    writer = csv.DictWriter(f, fieldnames=data[0].keys())
    writer.writeheader()
    writer.writerows(data)

Web Data Fetching

# Download data from the internet
import urllib.request

# Fetch data directly in sandbox
url = 'https://api.example.com/data.json'
urllib.request.urlretrieve(url, 'downloaded_data.json')

🏗️ How Code Execution Works

Session Lifecycle Architecture

Each agent turn (response to a user message) gets its own isolated sandbox session:

  1. User sends message

  2. Agent starts processing

  3. Code Session Created (when the first code tool is needed)

  4. Code Execution 1 (variables stored)

  5. Code Execution 2 (can access previous variables)

  6. Code Execution 3 (state persists within this turn)

  7. Agent completes response → Session Terminated

  8. Next user message → New Session Created

Key Characteristics:

  • Turn-Based Sessions: Each agent response gets a fresh sandbox session

  • Intra-Turn Persistence: Variables, imports, and files persist during a single agent turn

  • Session Isolation: Each turn gets its own independent sandbox environment

  • Automatic Cleanup: Sessions are terminated when the agent completes its response

  • State Reset: The next agent turn starts with a completely fresh environment

  • On-Demand Creation: Sessions are only created when the agent needs to execute code

Important: Unlike a persistent conversation-wide session, you cannot reference variables or files from previous agent responses. Each turn is completely isolated.

Example: Multi-Step Analysis Within a Single Turn

# All within ONE agent response (same session):

# Step 1: Load data
import pandas as pd
df = pd.read_csv('sales_data.csv')

# Step 2: Process data (df is still available in same turn)
monthly_sales = df.groupby('month')['revenue'].sum()

# Step 3: Generate visualization (both variables still accessible)
import matplotlib.pyplot as plt
monthly_sales.plot(kind='line')
plt.savefig('sales_trend.png')

Important Note: All three steps must happen in the same agent response. If the agent completes its response after Step 1, the next turn will have a fresh session where df is no longer available.


📁 File Transfer System

Upload Files to Sandbox

Transfer files from Drive storage to the sandbox for processing:

# Agent uses the upload tool
_code_upload_to_session([
    {"source": "/data/files/sales_2024.csv", "dest": "sales.csv"},
    {"source": "/data/images/logo.png", "dest": "logo.png"}
])

Supported File Types:

  • Text Files: .txt, .md, .py, .js, .json, .csv, .xml, .html, .css, .yaml, .sql, etc.

  • Images: .jpg, .jpeg, .png, .gif, .bmp, .webp, .svg

  • Documents: .pdf

  • Media: .mp4, .mp3, .wav, .flac, .ogg, .avi, .mov

Download Files from Sandbox

Save generated files back to Drive storage:

# Agent uses the download tool
_code_download_from_session([
    {"source": "analysis_report.pdf", "dest": "/data/reports/analysis_report.pdf"},
    {"source": "chart.png", "dest": "/data/images/chart.png", "replace": True}
])

Important Notes:

  • Files in the sandbox are temporary and deleted when the agent turn ends

  • Always download important results to Drive for persistence across turns

  • Set "replace": True to overwrite existing files in Drive

  • Default behavior prevents accidental overwrites

  • Files uploaded in one turn will NOT be available in the next turn's session


Shell Command Execution

Package Installation

Install Python packages as needed:

pip install pandas numpy matplotlib seaborn scikit-learn

Important: Packages are lost when the session ends. Since each agent turn starts a fresh session, packages must be reinstalled if needed in subsequent turns.
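Because every session starts without previously installed packages, the agent can guard its imports with an install-on-demand pattern. A minimal sketch (the `ensure_package` helper is hypothetical, not a built-in tool):

```python
import importlib
import subprocess
import sys

def ensure_package(name: str) -> None:
    """Import a package by module name, installing it via pip if missing."""
    try:
        importlib.import_module(name)
    except ImportError:
        # Fresh sandbox session: install the package, then import it again
        subprocess.check_call([sys.executable, "-m", "pip", "install", name])
        importlib.import_module(name)

ensure_package("json")  # stdlib module, already present: no install triggered
```

Calling this at the top of each turn keeps the rest of the code free of installation concerns.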

System Operations

Perform file and system operations:

# File operations
ls -la                           # List files
mkdir output                     # Create directory
cat data.txt                     # View file contents
find . -name "*.csv"             # Find files

# Download from internet
curl -o dataset.csv https://example.com/data.csv
wget https://example.com/large_file.zip

# Archive operations
zip -r archive.zip data/
tar -czf backup.tar.gz files/

🔐 Security & Isolation

Sandbox Isolation Features

Security Boundaries:
✅ Isolated execution environment per agent turn
✅ No access to host system
✅ Separate file system per session
✅ Network restrictions on outbound connections
✅ Resource limits (CPU, memory, execution time)
✅ Automatic session cleanup after each turn
✅ Zero state carry-over between sessions

File Type Restrictions

Only safe file types are allowed for upload/download:

Allowed File Extensions:
- Text & Code: 30+ extensions
- Images: 7 formats
- Documents: PDF only
- Media: Video and audio formats

Blocked for Security:
- Executables (.exe, .dll, .so)
- Scripts (.sh, .bat) - except within sandbox
- Archives (.zip, .tar) - except within sandbox
- System files (.sys, .ini, .conf) - except within sandbox

Resource Quotas

Execution Limits:
- Timeout: 30 seconds per code execution
- Memory: Limited per session
- Storage: Temporary file storage only (deleted after turn)
- Network: Controlled internet access
- Billing: Credit consumption based on session runtime

Cost & Credit Management

Billing Model:
- Credits charged per second of active code session time
- Session starts when first code tool is called in a turn
- Session ends when agent completes its response
- Only active execution time is billed (not idle time)
- Usage tracked in real-time in workspace dashboard

Cost Optimization Tips:
✅ Complete operations efficiently within single turns
✅ Minimize unnecessary package installations
✅ Use efficient algorithms and vectorized operations
✅ Cache results in Drive to avoid recomputation
✅ Monitor usage patterns and optimize workflows

Requirement: A valid PixelML connection must be configured in Workspace Settings → Connections before code execution can be used.


📊 Common Use Cases

1. Data Analysis Agent

Workflow:
1. User uploads CSV file to Drive
2. Agent uploads file to sandbox
3. Agent analyzes data with pandas
4. Agent generates visualizations with matplotlib
5. Agent downloads charts back to Drive
6. Agent presents findings to user

Example Conversation:

User: "Analyze the sales data in /data/files/q4_sales.csv"

Agent:
1. Uploads file to sandbox
2. Executes: pd.read_csv('q4_sales.csv').describe()
3. Identifies trends and insights
4. Creates visualization
5. Downloads chart to Drive
6. Presents analysis with visual

2. Image Processing Agent

Capabilities:
- Batch resize images
- Convert formats (PNG to JPG, etc.)
- Add watermarks
- Optimize file sizes
- Generate thumbnails
- Extract EXIF data
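The thumbnail case can be sketched as follows, assuming Pillow is installed in the session. The sample image is generated in code so the snippet is self-contained; in practice the photo would be uploaded from Drive:

```python
from PIL import Image

# Create a sample 1600x1200 image in the sandbox (stand-in for an uploaded photo)
Image.new("RGB", (1600, 1200), color=(70, 130, 180)).save("photo.jpg")

# Image.thumbnail resizes in place and preserves the aspect ratio
thumb = Image.open("photo.jpg")
thumb.thumbnail((200, 200))
thumb.save("photo_thumb.jpg", quality=85, optimize=True)

print(thumb.size)  # (200, 150): width capped at 200, height scaled to match
```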

3. Report Generation Agent

Workflow:
1. Fetch data from multiple sources
2. Process and analyze in sandbox
3. Generate charts and visualizations
4. Create PDF report with reportlab
5. Download final report to Drive

4. Data Transformation Agent

Transformations:
- CSV to JSON conversion
- Excel to database format
- Log file parsing
- Text extraction from PDFs
- Format standardization
- Data cleaning and validation
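The CSV-to-JSON case is the reverse of the JSON-to-CSV snippet shown earlier, sketched here with the standard library (filenames and sample rows are illustrative):

```python
import csv
import json

# Sample CSV (stand-in for a file uploaded from Drive)
with open("records.csv", "w", newline="") as f:
    csv.writer(f).writerows(
        [["name", "region"], ["Acme", "West"], ["Bolt", "East"]]
    )

# CSV -> JSON: DictReader turns each row into a dict keyed by the header
with open("records.csv", newline="") as f:
    rows = list(csv.DictReader(f))

with open("records.json", "w") as f:
    json.dump(rows, f, indent=2)

print(rows[0])  # {'name': 'Acme', 'region': 'West'}
```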

5. Scientific Computing Agent

Capabilities:
- Statistical analysis with scipy
- Machine learning with scikit-learn
- Numerical computations with numpy
- Symbolic math with sympy
- Optimization problems
- Simulation modeling
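Simulation modeling fits the same single-turn pattern. A dependency-free sketch that estimates π by Monte Carlo sampling:

```python
import random

# Monte Carlo estimate of pi: sample points in the unit square and count
# how many land inside the quarter circle of radius 1
random.seed(42)  # fixed seed so reruns within a turn are reproducible
n = 100_000
inside = sum(
    1 for _ in range(n)
    if random.random() ** 2 + random.random() ** 2 <= 1.0
)
pi_estimate = 4 * inside / n
print(pi_estimate)  # close to 3.14159 at this sample size
```

At 100,000 samples this completes well inside the 30-second execution limit.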

🎯 Best Practices

Session Management

DO:
✅ Complete all related operations within a single agent response
✅ Install required packages at the start of the turn
✅ Download important results to Drive within the same turn
✅ Use meaningful variable names for complex analyses
✅ Chain multiple code executions efficiently in one turn

DON'T:
❌ Assume variables from previous turns are available
❌ Expect packages installed in previous turns to persist
❌ Store critical data only in sandbox
❌ Run extremely long computations (>30 seconds)
❌ Rely on session state across different agent turns
❌ Expect files to persist between turns without downloading to Drive

File Management

Best Practices:
1. Upload only necessary files to sandbox
2. Use clear, descriptive filenames
3. Organize files in directories when needed
4. Download results promptly after generation
5. Clean up large temporary files to save space

Error Handling

# Robust code with error handling
try:
    import pandas as pd
    df = pd.read_csv('data.csv')
    result = df.groupby('category')['value'].sum()
    print(result)
except FileNotFoundError:
    print("Error: data.csv not found. Please upload the file first.")
except Exception as e:
    print(f"Error during analysis: {e}")

Performance Optimization

Optimization Tips:
1. Use vectorized operations (numpy/pandas) instead of loops
2. Filter data early to reduce processing time
3. Use appropriate data types (int32 vs int64)
4. Cache intermediate results in variables
5. Limit plot complexity for faster rendering
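Tip 1 in action, assuming numpy is available in the session: summing a million integers with a Python-level loop versus one vectorized call. Both produce the same result, but the vectorized form is typically orders of magnitude faster:

```python
import time
import numpy as np

data = np.arange(1_000_000, dtype=np.int64)

# Python-level loop: one interpreter iteration per element
start = time.perf_counter()
total_loop = sum(int(x) for x in data)
loop_time = time.perf_counter() - start

# Vectorized: a single call executed in compiled code
start = time.perf_counter()
total_vec = int(data.sum())
vec_time = time.perf_counter() - start

print(total_loop == total_vec)  # True: same result, very different runtimes
```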

🚨 Common Issues & Solutions

Package Not Found

Problem: ModuleNotFoundError: No module named 'pandas'
Solution: Run shell command: pip install pandas
Note: Reinstall packages at the start of each new agent turn (each turn is a fresh session)

File Not Found

Problem: FileNotFoundError: [Errno 2] No such file or directory: 'data.csv'
Solution: Use _code_upload_to_session to transfer file from Drive first

Session State Lost

Problem: Variable 'df' not found in subsequent execution
Cause: Previous agent turn completed (session was terminated)
Solution: Each agent turn is isolated - reload data and reinstall packages
Tip: Save intermediate results to Drive if multi-turn processing is needed

Timeout Errors

Problem: Code execution exceeded 30-second timeout
Solution:
- Break computation into smaller chunks
- Optimize algorithm efficiency
- Reduce data size being processed
- Consider pre-processing data in workflow nodes

File Download Conflicts

Problem: Error: File already exists in Drive
Solution: Set "replace": True in download request, or use different filename

📈 Advanced Patterns

Iterative Data Processing

# Process data in chunks for large datasets
chunk_size = 10000
chunks = []

for chunk in pd.read_csv('large_file.csv', chunksize=chunk_size):
    processed = chunk.groupby('category')['value'].sum()
    chunks.append(processed)

final_result = pd.concat(chunks).groupby(level=0).sum()

Multi-Format Output

# Generate results in multiple formats
import json

# Create results dictionary
results = {
    'total_sales': 150000,
    'top_products': ['Product A', 'Product B'],
    'trend': 'increasing'
}

# Save as JSON
with open('results.json', 'w') as f:
    json.dump(results, f, indent=2)

# Save as text report
with open('results.txt', 'w') as f:
    f.write(f"Total Sales: ${results['total_sales']:,}\n")
    f.write(f"Top Products: {', '.join(results['top_products'])}\n")
    f.write(f"Trend: {results['trend']}\n")

# Save as CSV
import csv
with open('results.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(['Metric', 'Value'])
    for key, value in results.items():
        writer.writerow([key, str(value)])

Combining Internet Data with Drive Files

# Download data from web, combine with Drive data
import urllib.request
import pandas as pd

# Fetch external data
urllib.request.urlretrieve('https://api.example.com/data.csv', 'external_data.csv')

# Load external and Drive data
external_df = pd.read_csv('external_data.csv')
local_df = pd.read_csv('local_data.csv')  # Uploaded from Drive

# Combine and analyze
combined = pd.concat([external_df, local_df])
analysis = combined.groupby('region')['sales'].sum()

🎓 Learning Resources

Essential Python Libraries

Data Analysis:
- pandas: Data manipulation and analysis
- numpy: Numerical computing
- scipy: Scientific computing

Visualization:
- matplotlib: Static plots and charts
- seaborn: Statistical data visualization
- plotly: Interactive visualizations

File Processing:
- PIL/Pillow: Image processing
- PyPDF2: PDF manipulation
- openpyxl: Excel file handling

Machine Learning:
- scikit-learn: ML algorithms
- statsmodels: Statistical modeling
- xgboost: Gradient boosting

Code Execution Examples

# Statistical Analysis Example
import pandas as pd
import numpy as np
from scipy import stats

data = pd.read_csv('experiment_results.csv')
group_a = data[data['group'] == 'A']['score']
group_b = data[data['group'] == 'B']['score']

# Perform t-test
t_stat, p_value = stats.ttest_ind(group_a, group_b)
print(f"T-statistic: {t_stat:.3f}")
print(f"P-value: {p_value:.3f}")
print(f"Significant: {'Yes' if p_value < 0.05 else 'No'}")

Code Execution Checklist

Before deploying your agent with code execution, confirm:

  • A PixelML connection is configured in Workspace Settings → Connections

  • Sufficient credits are available for expected session runtime

  • Agent instructions account for per-turn session isolation (no state across turns)

  • Important results are downloaded to Drive before each turn ends

  • Long computations are broken into chunks under the 30-second timeout

Code execution transforms your agent from a conversational assistant into a powerful computational engine capable of data analysis, file processing, and complex problem-solving.
