Building Custom Agent Integrations

The Stagewise Agent Interface allows you to connect any AI agent or service to the Stagewise toolbar. Your custom agent can receive user prompts, maintain conversation state, call tools, and provide rich responses back to users in their browser.

1. Getting Started

Installation

First, install the agent interface package:

npm install @stagewise/agent-interface

Basic Agent Setup

Create a simple agent server that connects to the Stagewise toolbar:

import { createAgentServer, AgentStateType } from '@stagewise/agent-interface/agent';

async function startAgent() {
  // Create the agent server - this handles WebSocket connections and HTTP endpoints
  const server = await createAgentServer();
  
  // Configure your agent
  server.setAgentName('My Custom Agent');
  server.setAgentDescription('A powerful AI assistant');
  
  // Set the agent as available
  server.interface.availability.set(true);
  
  // Listen for user messages
  server.interface.messaging.addUserMessageListener(async (message) => {
    // Set state to working
    server.interface.state.set(AgentStateType.WORKING);
    
    // Process the message and respond
    await processUserMessage(message, server.interface);
    
    // Set state to completed
    server.interface.state.set(AgentStateType.COMPLETED);
  });
  
  console.log(`Agent server running on port ${server.port}`);
}

async function processUserMessage(message, agentInterface) {
  // Extract the user's text
  const userText = message.contentItems
    .filter(item => item.type === 'text')
    .map(item => item.text)
    .join('\n');
  
  // Send a simple response
  agentInterface.messaging.set([{
    type: 'text',
    text: `You said: "${userText}". I'm a simple echo agent!`
  }]);
}

startAgent().catch(console.error);

This creates a basic agent that:

Listens on an available port (starting from 5746)
Exposes a /stagewise/info endpoint for discovery
Accepts WebSocket connections for real-time communication
Echoes back user messages

2. Agent Interface Overview

The Agent Interface provides several key capabilities through a clean, type-safe API:

Core Capabilities

Availability Management: Control when your agent is ready to receive requests
State Management: Communicate current agent status (idle, working, thinking, etc.)
Message Handling: Receive user messages and send rich responses
Tool Calling: Execute tools and functions on behalf of users (optional)
Cleanup: Properly manage resources and prevent memory leaks

Architecture

export type AgentInterface = {
  availability: { get, set };    // Agent availability status
  state: { get, set };          // Current agent state  
  messaging: {                  // Message sending/receiving
    get, set, addPart, updatePart, clear,
    addUserMessageListener, removeUserMessageListener
  };
  toolCalling?: {               // Optional tool calling support
    setToolCallSupport, getAvailableTools, requestToolCall
  };
  cleanup: { clearAllListeners }; // Resource cleanup
};

3. Managing Availability

Control when your agent can accept requests:

import { AgentAvailabilityError } from '@stagewise/agent-interface/agent';

// Set agent as available
agentInterface.availability.set(true);

// Set agent as unavailable with error reason
agentInterface.availability.set(
  false, 
  AgentAvailabilityError.NO_AUTHENTICATION,
  'Please log in to use this agent'
);

// Check current availability
const availability = agentInterface.availability.get();
if (availability.isAvailable) {
  console.log('Agent is ready!');
} else {
  console.log('Agent unavailable:', availability.error, availability.errorMessage);
}

// Available error types:
// - NO_CONNECTION: Cannot connect to service
// - NO_AUTHENTICATION: Authentication required
// - INCOMPATIBLE_VERSION: Version mismatch
// - OTHER: Custom error condition

4. Managing Agent State

Agent state helps users understand what your agent is currently doing. Use appropriate states to provide clear feedback:

import { AgentStateType } from '@stagewise/agent-interface/agent';

// Available states:
// - IDLE: Ready for new requests
// - THINKING: Processing/analyzing 
// - WORKING: Actively executing task
// - CALLING_TOOL: Executing a tool call
// - WAITING_FOR_USER_RESPONSE: Needs user input
// - FAILED: Encountered an error
// - COMPLETED: Successfully finished task

async function handleUserMessage(message, agentInterface) {
  // Set state with optional description
  agentInterface.state.set(AgentStateType.THINKING, 'Analyzing your request...');
  
  // Simulate processing time
  await new Promise(resolve => setTimeout(resolve, 1000));
  
  agentInterface.state.set(AgentStateType.WORKING, 'Generating response...');
  
  // Do the actual work
  const response = await generateResponse(message);
  
  agentInterface.state.set(AgentStateType.COMPLETED, 'Response ready!');
  
  // Return to idle after a brief delay
  setTimeout(() => {
    agentInterface.state.set(AgentStateType.IDLE);
  }, 2000);
}

5. Handling Messages

Receiving User Messages

User messages contain rich context about the user's environment:

server.interface.messaging.addUserMessageListener((message) => {
  // Message structure:
  console.log('User text:', message.contentItems);
  console.log('Current URL:', message.metadata.currentUrl);
  console.log('Page title:', message.metadata.currentTitle);
  console.log('Selected elements:', message.metadata.selectedElements);
  console.log('Plugin content:', message.pluginContent);
  console.log('User agent:', message.metadata.userAgent);
  console.log('Viewport:', message.metadata.viewportResolution);
});

Sending Agent Responses

You can send various types of content back to users:

// Send text response
agentInterface.messaging.set([{
  type: 'text',
  text: 'Here is my response to your question...'
}]);

// Send text with images
agentInterface.messaging.set([
  {
    type: 'text', 
    text: 'Here\'s the analysis you requested:'
  },
  {
    type: 'image',
    mimeType: 'image/png',
    data: base64ImageData, // Base64 encoded image
    replacing: false
  }
]);

// Stream responses by updating parts
agentInterface.messaging.addPart({
  type: 'text',
  text: 'Starting analysis...'
});

// Update existing part with more content
agentInterface.messaging.updatePart({
  type: 'text', 
  text: '\n\nProgress: 50% complete...'
}, 0, 'append');

Working with Selected Elements

When users select elements on a webpage, you get detailed context:

function analyzeSelectedElements(selectedElements) {
  selectedElements.forEach((element, index) => {
    console.log(`Element ${index}:`);
    console.log('- Node type:', element.nodeType);
    console.log('- XPath:', element.xpath);
    console.log('- Text content:', element.textContent);
    console.log('- Attributes:', element.attributes);
    console.log('- Bounding box:', element.boundingClientRect);
    console.log('- Plugin annotations:', element.pluginInfo);
    
    // Check parent element if available
    if (element.parent) {
      console.log('- Parent:', element.parent.nodeType);
    }
  });
}

6. Tool Calling (Advanced)

Tool calling enables your agent to call tools that the toolbar or it's plugins provide in order to get more information.

** This functionality is right now not supported by the toolbar itself. **

Setting Up Tool Support

// Enable tool calling support
server.interface.toolCalling?.setToolCallSupport(true);

// Listen for available tools
server.interface.toolCalling?.onToolListUpdate((tools) => {
  console.log('Available tools:', tools.map(t => t.toolName));
});

Calling Tools

async function handleToolCallingExample(agentInterface) {
  // Get available tools
  const tools = agentInterface.toolCalling?.getAvailableTools() || [];
  
  // Find a specific tool
  const webSearchTool = tools.find(tool => tool.toolName === 'web_search');
  
  if (webSearchTool) {
    agentInterface.state.set(AgentStateType.CALLING_TOOL, 'Searching the web...');
    
    try {
      // Call the tool
      const result = await agentInterface.toolCalling?.requestToolCall(
        'web_search',
        { query: 'Stagewise AI documentation', limit: 5 }
      );
      
      // Use the result
      agentInterface.messaging.addPart({
        type: 'text',
        text: `Search results: ${JSON.stringify(result?.result, null, 2)}`
      });
      
    } catch (error) {
      agentInterface.state.set(AgentStateType.FAILED, 'Tool call failed');
      agentInterface.messaging.addPart({
        type: 'text',
        text: `Error calling web search: ${error.message}`
      });
    }
  }
}

7. Best Practices

State Management

Always set appropriate states to give users feedback
Stay in COMPLETED state for at least 1 second before returning to IDLE
Use descriptive state messages when helpful
Handle errors gracefully with FAILED state

Message Handling

Process user context thoughtfully (selected elements, page info, plugins)
Use streaming responses for long-running operations
Provide rich responses with appropriate markdown formatting
Clear messages when starting new conversations

Resource Management

Always clean up listeners when shutting down
Handle WebSocket disconnections gracefully
Implement proper error handling and recovery
Use appropriate timeouts for tool calls

Tool Integration

Only enable tool calling if your agent actually uses tools
Handle tool call timeouts and errors
Provide clear feedback when calling tools
Document your agent's tool requirements

8. Testing Your Agent

Start your agent: Run your agent server
Add toolbar to your app: Follow the Quickstart guide to add the Stagewise toolbar
Test the connection: The toolbar should detect your agent automatically
Send test messages: Try various prompts and element selections

Note: Your agent runs independently and doesn't require any VS Code extensions to be installed.

Debugging Tips

Check the agent server console for connection logs
Use browser dev tools to inspect WebSocket messages
Monitor the /stagewise/info endpoint
Test availability state changes
Verify tool calling integration

Next Steps

Explore the Plugins documentation to extend toolbar functionality
Check out example agents in the GitHub repository
Join our community for questions and discussions
Consider contributing your agent integration back to the community

The Agent Interface provides a powerful foundation for building sophisticated AI integrations. Start with a simple echo agent and gradually add more advanced features as you explore the possibilities!

Building Custom Agent Integrations

On this page