Error Handling Documentation
Overview
Zhineng-bridge handles errors with a hierarchy of custom exception classes and a structured error response format shared by all of them.
Architecture
Exception Hierarchy
All exceptions inherit from ZhinengBridgeException, which provides:
- Consistent error messages
- HTTP-style error codes
- Detailed error context in details field
- to_dict() method for JSON serialization
ZhinengBridgeException
├── ValidationError
│ ├── InvalidMessageTypeError
│ ├── InvalidToolNameError
│ ├── InvalidSessionIdError
│ ├── InvalidJSONError
│ └── MissingFieldError
├── AuthenticationError
├── AuthorizationError
├── RateLimitError
├── SessionNotFoundError
├── SessionAlreadyRunningError
├── MaxConnectionsError
├── MaxSessionsError
├── ServerException
│ ├── SessionManagerError
│ ├── ToolExecutionError
│ ├── ConnectionError
│ └── TimeoutError
└── ConfigurationError
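The properties and hierarchy above can be sketched roughly as follows. This is a minimal illustration, not the contents of relay-server/exceptions.py: the constructor signature, the default_code attribute, and the way details are merged into the dictionary are assumptions, and only a few subclasses are shown.

```python
from typing import Any, Dict, Optional


class ZhinengBridgeException(Exception):
    """Base class carrying a message, an HTTP-style code, and extra context."""

    default_code = 500  # assumed default, not taken from the real module

    def __init__(self, message: str, code: Optional[int] = None,
                 details: Optional[Dict[str, Any]] = None) -> None:
        super().__init__(message)
        self.message = message
        self.code = self.default_code if code is None else code
        self.details = dict(details or {})

    def to_dict(self) -> Dict[str, Any]:
        # Fixed keys are written last so context fields cannot overwrite them.
        return {**self.details, "type": "error",
                "message": self.message, "code": self.code}


class ValidationError(ZhinengBridgeException):
    default_code = 400


class InvalidSessionIdError(ValidationError):
    def __init__(self, session_id: str) -> None:
        # Hypothetical message text and context field name.
        super().__init__(f"Invalid session ID: {session_id}",
                         details={"session_id": session_id})


class ServerException(ZhinengBridgeException):
    default_code = 500


err = InvalidSessionIdError("not-a-uuid")
print(err.to_dict())
```

Because every class derives from the same base, a caller can catch ZhinengBridgeException once and still get the specific code and context from to_dict().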
Error Response Format
All errors follow this consistent structure:
{
  "type": "error",
  "message": "Human-readable error message",
  "code": 400,
  "field_name": "value", // Optional: additional context
  ...
}
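To make the format concrete, the sketch below builds such a response and serializes it with json.dumps. The helper name build_error_response and the tool_name context field are hypothetical and chosen only for illustration.

```python
import json
from typing import Any, Dict


def build_error_response(message: str, code: int, **context: Any) -> Dict[str, Any]:
    """Return the three fixed keys plus any optional context fields."""
    if "type" in context:
        # "type" is reserved for the literal value "error".
        raise ValueError("context may not override the reserved key 'type'")
    return {"type": "error", "message": message, "code": code, **context}


response = build_error_response("Invalid tool name: foo", 400, tool_name="foo")
payload = json.dumps(response)  # JSON text ready to send to the client
print(payload)
```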
Exception Classes
Validation Errors (4xx)
InvalidMessageTypeError
- Code: 400
- Usage: Unknown message type from client
- Example:
InvalidToolNameError
- Code: 400
- Usage: Invalid AI tool name
- Example:
InvalidSessionIdError
- Code: 400
- Usage: Invalid session ID format
- Example:
InvalidJSONError
- Code: 400
- Usage: Malformed JSON in message
- Example:
MissingFieldError
- Code: 400
- Usage: Required field missing
- Example:
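As an illustration for the validation classes in this group, the following sketch shows how a message validator might raise three of them. The class bodies, constructor arguments, message texts, and the VALID_TYPES set are assumptions, not the real definitions.

```python
import json
from typing import Any, Dict


class ValidationError(Exception):
    """Stand-in base class; every validation error reports code 400."""

    code = 400

    def __init__(self, message: str, **details: Any) -> None:
        super().__init__(message)
        self.message = message
        self.details = details

    def to_dict(self) -> Dict[str, Any]:
        return {**self.details, "type": "error",
                "message": self.message, "code": self.code}


class InvalidJSONError(ValidationError):
    pass


class MissingFieldError(ValidationError):
    pass


class InvalidMessageTypeError(ValidationError):
    pass


VALID_TYPES = {"start_session", "stop_session"}  # hypothetical message types


def validate_message(raw: str) -> Dict[str, Any]:
    """Parse a raw message and raise the most specific validation error."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise InvalidJSONError(f"Malformed JSON: {exc.msg}") from exc
    if not isinstance(data, dict) or "type" not in data:
        raise MissingFieldError("Missing required field: type", field="type")
    if data["type"] not in VALID_TYPES:
        raise InvalidMessageTypeError(
            f"Unknown message type: {data['type']}", message_type=data["type"])
    return data
```

Checking in this order (syntax, then required fields, then allowed values) means the client always receives the earliest problem in the message rather than a generic failure.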
Authentication/Authorization Errors (4xx)
AuthenticationError
- Code: 401
- Usage: Authentication failed
- Example:
AuthorizationError
- Code: 403
- Usage: User not authorized for action
- Example:
RateLimitError
- Code: 429
- Usage: Rate limit exceeded
- Example:
Resource Errors (4xx)
SessionNotFoundError
- Code: 404
- Usage: Session not found
- Example:
SessionAlreadyRunningError
- Code: 409
- Usage: Attempt to start already running session
- Example:
MaxConnectionsError
- Code: 429
- Usage: Maximum connections exceeded
- Example:
MaxSessionsError
- Code: 429
- Usage: Maximum sessions exceeded
- Example:
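For the limit-related errors in this group, a connection limit could be enforced along these lines. ConnectionRegistry is a hypothetical helper, and the MaxConnectionsError constructor and its max_connections context field are assumptions.

```python
from typing import Any, Dict, Set


class MaxConnectionsError(Exception):
    """Stand-in for the documented class; constructor arguments are assumed."""

    code = 429

    def __init__(self, limit: int) -> None:
        self.message = f"Maximum connections exceeded (limit: {limit})"
        self.limit = limit
        super().__init__(self.message)

    def to_dict(self) -> Dict[str, Any]:
        return {"type": "error", "message": self.message,
                "code": self.code, "max_connections": self.limit}


class ConnectionRegistry:
    """Hypothetical helper that tracks connected client IDs."""

    def __init__(self, max_connections: int) -> None:
        self.max_connections = max_connections
        self._clients: Set[str] = set()

    def add(self, client_id: str) -> None:
        # Reject new clients at the limit; re-adding a known client is a no-op.
        is_new = client_id not in self._clients
        if is_new and len(self._clients) >= self.max_connections:
            raise MaxConnectionsError(self.max_connections)
        self._clients.add(client_id)

    def remove(self, client_id: str) -> None:
        self._clients.discard(client_id)


registry = ConnectionRegistry(max_connections=1)
registry.add("client-a")
try:
    registry.add("client-b")
except MaxConnectionsError as err:
    rejected = err.to_dict()
```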
Server Errors (5xx)
ServerException
- Code: 500
- Usage: Generic server error
- Example:
SessionManagerError
- Code: 500
- Usage: Session manager error
- Example:
ToolExecutionError
- Code: 500
- Usage: AI tool execution failed
- Example:
ConnectionError
- Code: 503
- Usage: Connection error
- Example:
TimeoutError
- Code: 504
- Usage: Operation timeout
- Example:
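As an example for the server-side classes, the sketch below converts an asyncio timeout into a 504 error. The class is named BridgeTimeoutError here only to avoid shadowing Python's built-in TimeoutError; it stands in for the TimeoutError listed above, and run_with_timeout and slow_tool are hypothetical helpers.

```python
import asyncio
from typing import Any, Awaitable, Dict


class ServerException(Exception):
    """Stand-in for the documented 5xx base class."""

    code = 500

    def __init__(self, message: str) -> None:
        super().__init__(message)
        self.message = message

    def to_dict(self) -> Dict[str, Any]:
        return {"type": "error", "message": self.message, "code": self.code}


class BridgeTimeoutError(ServerException):
    code = 504


async def run_with_timeout(operation: Awaitable[Any], seconds: float) -> Any:
    """Await the operation, translating a timeout into the custom 504 error."""
    try:
        return await asyncio.wait_for(operation, timeout=seconds)
    except asyncio.TimeoutError as exc:
        raise BridgeTimeoutError(f"Operation timed out after {seconds}s") from exc


async def slow_tool() -> str:
    await asyncio.sleep(1)  # simulates a tool that takes too long
    return "done"


def demo() -> Dict[str, Any]:
    try:
        return {"result": asyncio.run(run_with_timeout(slow_tool(), 0.01))}
    except BridgeTimeoutError as err:
        return err.to_dict()


result = demo()
```

Chaining with `from exc` keeps the original timeout in the traceback for logging while the client only sees the structured error.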
Configuration Errors (5xx)
ConfigurationError
- Code: 500
- Usage: Invalid configuration
- Example:
Usage Patterns
Raising Exceptions
from exceptions import InvalidToolNameError, SessionNotFoundError
def create_session(tool_name: str):
    if tool_name not in VALID_TOOLS:
        raise InvalidToolNameError(tool_name, list(VALID_TOOLS))
    # ... session creation logic
Catching Exceptions in Server
from exceptions import (
    InvalidToolNameError,
    SessionNotFoundError,
    SessionManagerError,
    exception_to_dict
)
async def handle_start_session(self, message):
    try:
        session_id = self.manager.create_session(tool_name, args)
        return SessionStartedResponse(...).model_dump()
    except (InvalidToolNameError, SessionManagerError) as e:
        track_error("session_creation_error", "error")
        self.logger.error("Failed to create session", error=str(e), exc_info=True)
        return e.to_dict()
Converting Generic Exceptions
from exceptions import exception_to_dict
try:
    # Some operation that might raise generic exceptions
    result = perform_operation()
except Exception as e:
    error_dict = exception_to_dict(e)
    # error_dict is now in the standard format
    await send_error_to_client(error_dict)
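One plausible shape for such a converter is sketched below. The real exception_to_dict may behave differently (for instance, it may include the original message); the generic 500 fallback and the stand-in base class are assumptions.

```python
from typing import Any, Dict


class ZhinengBridgeException(Exception):
    """Stand-in for the real base class, reduced to what the converter needs."""

    def __init__(self, message: str, code: int = 500) -> None:
        super().__init__(message)
        self.message = message
        self.code = code

    def to_dict(self) -> Dict[str, Any]:
        return {"type": "error", "message": self.message, "code": self.code}


def exception_to_dict(exc: Exception) -> Dict[str, Any]:
    """Convert any exception into the standard error response format."""
    if isinstance(exc, ZhinengBridgeException):
        # Custom exceptions already know their own code and message.
        return exc.to_dict()
    # Unknown exceptions become a generic 500 so internals are not exposed.
    return {"type": "error", "message": "Internal server error", "code": 500}


known = exception_to_dict(ZhinengBridgeException("Session not found", code=404))
unknown = exception_to_dict(KeyError("secret_table"))
```

Returning a fixed message for unknown exceptions keeps details such as key names or file paths in the server log rather than in the client response.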
Error Handling in Session Manager
The Session Manager has been updated to use custom exceptions:
# Before:
raise ValueError(f"工具不存在: {tool_name}")  # "Tool does not exist: ..."
# After:
raise InvalidToolNameError(tool_name, list(self.tools.keys()))
# Before:
raise ValueError(f"会话不存在: {session_id}")  # "Session does not exist: ..."
# After:
raise SessionNotFoundError(session_id)
Session Manager Methods
| Method | Raises | Description |
|---|---|---|
| create_session() | InvalidToolNameError | Tool not in registry |
| stop_session() | SessionNotFoundError, ToolExecutionError | Session not found or process failed |
| delete_session() | SessionNotFoundError, ToolExecutionError | Session not found or stop failed |
| set_active_session() | SessionNotFoundError | Session not found |
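The contract in this table can be illustrated with a small in-memory manager. This is a sketch under stated assumptions, not the code in session_manager.py: the BridgeError stand-in, the message texts, and the storage layout are hypothetical, and process handling (the source of ToolExecutionError) is left out.

```python
import uuid
from typing import Any, Dict, List, Optional


class BridgeError(Exception):
    """Minimal stand-in for ZhinengBridgeException."""

    code = 500

    def __init__(self, message: str, **details: Any) -> None:
        super().__init__(message)
        self.message = message
        self.details = details

    def to_dict(self) -> Dict[str, Any]:
        return {**self.details, "type": "error",
                "message": self.message, "code": self.code}


class InvalidToolNameError(BridgeError):
    code = 400

    def __init__(self, tool_name: str, valid_tools: List[str]) -> None:
        super().__init__(f"Invalid tool name: {tool_name}",
                         tool_name=tool_name, valid_tools=valid_tools)


class SessionNotFoundError(BridgeError):
    code = 404

    def __init__(self, session_id: str) -> None:
        super().__init__(f"Session not found: {session_id}",
                         session_id=session_id)


class SessionManager:
    def __init__(self, tools: Dict[str, str]) -> None:
        self.tools = tools
        self.sessions: Dict[str, str] = {}  # session_id -> tool name
        self.active_session_id: Optional[str] = None

    def create_session(self, tool_name: str) -> str:
        if tool_name not in self.tools:
            raise InvalidToolNameError(tool_name, list(self.tools.keys()))
        session_id = uuid.uuid4().hex
        self.sessions[session_id] = tool_name
        return session_id

    def set_active_session(self, session_id: str) -> None:
        if session_id not in self.sessions:
            raise SessionNotFoundError(session_id)
        self.active_session_id = session_id

    def delete_session(self, session_id: str) -> None:
        if session_id not in self.sessions:
            raise SessionNotFoundError(session_id)
        del self.sessions[session_id]
        if self.active_session_id == session_id:
            self.active_session_id = None
```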
Error Handling in WebSocket Server
The WebSocket server handles exceptions at multiple levels:
Connection Level
async def handle_connection(self, websocket):
    try:
        # Authentication
        await authenticate_connection(websocket)
        # Message loop
        async for message in websocket:
            await self.handle_message(client_id, message)
    except websockets.exceptions.ConnectionClosed:
        self.logger.info("Client disconnected")
    except Exception as e:
        self.logger.error("Connection handling error", exc_info=True)
Message Level
async def handle_message(self, client_id, message):
    try:
        validated_message = validate_message(message)
        response = await self.route_message(validated_message)
        await self.send_to_client(client_id, response)
    except json.JSONDecodeError as e:
        error = InvalidJSONError(str(e))
        await self.send_error_dict(client_id, error.to_dict())
    except ValueError as e:
        error = ValidationError(str(e))
        await self.send_error_dict(client_id, error.to_dict())
    except Exception as e:
        await self.send_error_dict(client_id, exception_to_dict(e))
Handler Level
async def handle_start_session(self, message):
    try:
        session_id = self.manager.create_session(tool_name, args)
        return SessionStartedResponse(...).model_dump()
    except (InvalidToolNameError, SessionManagerError) as e:
        track_error("session_creation_error", "error")
        self.logger.error("Failed to create session", error=str(e), exc_info=True)
        return e.to_dict()
Error Metrics
All errors are tracked using Prometheus metrics:
from metrics import track_error
track_error("session_creation_error", "error")
track_error("json_decode_error", "error")
track_error("validation_error", "error")
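To show what these calls might record, here is a dependency-free stand-in for track_error that counts errors per label pair. The real relay-server/metrics.py presumably wraps a Prometheus counter; the parameter names error_type and severity and the in-memory Counter are assumptions made so the sketch runs without extra packages.

```python
from collections import Counter
from typing import Tuple

# In-memory replacement for a labelled Prometheus counter (assumption).
_error_counts: "Counter[Tuple[str, str]]" = Counter()


def track_error(error_type: str, severity: str = "error") -> None:
    """Increment the counter for one (error_type, severity) label pair."""
    _error_counts[(error_type, severity)] += 1


def error_count(error_type: str, severity: str = "error") -> int:
    """Read the current count for one label pair."""
    return _error_counts[(error_type, severity)]


track_error("session_creation_error", "error")
track_error("json_decode_error", "error")
track_error("json_decode_error", "error")
```

Keeping the error type as a label rather than a separate metric per error lets a dashboard aggregate or filter the same counter by type.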
Error Code Reference
| Code | Name | Description |
|---|---|---|
| 400 | Bad Request | Invalid request from client |
| 401 | Unauthorized | Authentication required or failed |
| 403 | Forbidden | User not authorized for action |
| 404 | Not Found | Resource not found |
| 409 | Conflict | Resource conflict (e.g., already running) |
| 422 | Unprocessable Entity | Validation error |
| 429 | Too Many Requests | Rate limit exceeded |
| 500 | Internal Server Error | Server error |
| 503 | Service Unavailable | Service temporarily unavailable |
| 504 | Gateway Timeout | Operation timeout |
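Because the codes follow HTTP conventions, their standard names can also be looked up with Python's http.HTTPStatus instead of being hard-coded. The helper describe_code is hypothetical, and the standard-library phrases are not guaranteed to match this table word for word on every Python version.

```python
from http import HTTPStatus


def describe_code(code: int) -> str:
    """Return the standard reason phrase for a code, or 'Unknown'."""
    try:
        return HTTPStatus(code).phrase
    except ValueError:
        # Raised for integers that are not registered HTTP status codes.
        return "Unknown"


for code in (404, 429, 504):
    print(code, describe_code(code))
```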
Best Practices
- Use specific exceptions: Always use the most specific exception class
- Include context: Add relevant details in the details field
- Log errors: Always log errors with context and stack trace
- Track metrics: Use track_error() for monitoring
- Return consistent format: All errors should return to_dict() result
- Don't expose sensitive info: Error messages should be user-friendly, not expose internals
Testing Error Handling
Testing Exception Raising
import pytest
from exceptions import InvalidToolNameError
def test_invalid_tool_name():
    with pytest.raises(InvalidToolNameError) as exc_info:
        # Replace with the code under test; raised directly here for illustration
        raise InvalidToolNameError("invalid", ["crush", "claude"])
    assert exc_info.value.code == 400
Testing Error Response
def test_invalid_tool_name_response():
    # No awaits are needed here, so a plain (non-async) test is sufficient
    error = InvalidToolNameError("invalid", ["crush", "claude"])
    response = error.to_dict()
    assert response["type"] == "error"
    assert response["code"] == 400
    assert "tool_name" in response
Migration Guide
If you're updating code that uses generic exceptions:
Before
try:
    session_id = manager.create_session(tool_name)
except ValueError as e:
    return {"type": "error", "message": str(e), "code": 400}
After
try:
    session_id = manager.create_session(tool_name)
except InvalidToolNameError as e:
    track_error("invalid_tool", "error")
    return e.to_dict()
Future Enhancements
Potential improvements to the error handling system:
- Error categories: Group related errors for better organization
- Internationalization: Support for multiple languages in error messages
- Error recovery: Automatic retry logic for transient errors
- Error aggregation: Group related errors for analysis
- Custom error handlers: Allow clients to register custom error handlers
Related Files
- relay-server/exceptions.py: Exception definitions
- relay-server/server.py: WebSocket server error handling
- phase1/session_manager/session_manager.py: Session manager error handling
- relay-server/metrics.py: Error tracking metrics
- relay-server/logger.py: Error logging
Last Updated: 2026-03-25
Version: 1.0.0