# Testing Skills

Complete guide to testing Skill Engine skills.
## Overview
Testing skills ensures they work correctly across different:

- **Runtimes**: WASM, Docker, Native
- **Environments**: Development, staging, production
- **Inputs**: Valid, invalid, edge cases
- **Integrations**: External APIs and services
## Testing Strategies

1. **Manual Testing**: Quick validation during development
2. **Unit Testing**: Test individual tool functions
3. **Integration Testing**: Test full skill execution
4. **End-to-End Testing**: Test with real AI agents (Claude Code)
## Manual Testing

### Basic Execution

```bash
# Test skill installation
skill install ./my-skill
# List available tools
skill list my-skill
# Test tool execution
skill run my-skill my-tool --param value
# Check output
echo $? # Exit code (0 = success)
```

### Dry Run Mode
```bash
# Preview without executing
skill run --dry-run my-skill my-tool --param value
# Shows what would be executed
```

### Debug Mode
```bash
# Enable debug logging
export SKILL_LOG_LEVEL=debug
skill run my-skill my-tool --param value
# Rust-level debugging
export RUST_LOG=skill_runtime=trace
skill run my-skill my-tool
```

### Test Different Instances
```bash
# Test development instance
skill run my-skill:dev my-tool
# Test staging instance
skill run my-skill:staging my-tool
# Test production instance
skill run my-skill:prod my-tool
```
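The `:dev`, `:staging`, and `:prod` suffixes select instances declared in the manifest. A sketch of such declarations, following the manifest format used under Integration Testing below (instance names and values are illustrative):

```toml
version = "1"

[skills.my-skill]
source = "./my-skill"
runtime = "wasm"

# One instance per environment, each with its own config and env
[skills.my-skill.instances.dev]
config.base_url = "http://localhost:8000"
env.LOG_LEVEL = "debug"

[skills.my-skill.instances.staging]
config.base_url = "https://staging.example.com"

[skills.my-skill.instances.prod]
config.base_url = "https://api.example.com"
env.LOG_LEVEL = "error"
```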
## Unit Testing

### WASM Skills (JavaScript/TypeScript)
```typescript
// src/skill.test.ts
import { describe, it, expect } from 'vitest';
import { execute } from './skill';

describe('MySkill', () => {
  it('should process valid input', async () => {
    const result = await execute({
      tool: 'process',
      parameters: {
        input: 'test data'
      }
    });

    expect(result.status).toBe('success');
    expect(result.output).toContain('processed');
  });

  it('should reject invalid input', async () => {
    await expect(
      execute({
        tool: 'process',
        parameters: {
          input: '' // Empty input
        }
      })
    ).rejects.toThrow('Input required');
  });

  it('should handle edge cases', async () => {
    const result = await execute({
      tool: 'process',
      parameters: {
        input: 'a'.repeat(10000) // Large input
      }
    });

    expect(result.status).toBe('success');
  });
});
```

Run tests:
```bash
npm test
```
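These tests import an `execute` entry point from `./skill`. A hypothetical sketch of what that module might look like under the assumptions the tests encode (the real signature depends on your Skill Engine SDK):

```typescript
// src/skill.ts (hypothetical sketch, not the actual SDK interface)
interface ExecuteRequest {
  tool: string;
  parameters: Record<string, unknown>;
}

interface ExecuteResult {
  status: 'success' | 'error';
  output: string;
}

export async function execute(req: ExecuteRequest): Promise<ExecuteResult> {
  if (req.tool !== 'process') {
    throw new Error(`Unknown tool: ${req.tool}`);
  }
  const input = String(req.parameters.input ?? '');
  // Empty input is rejected, as the unit tests above expect
  if (!input) {
    throw new Error('Input required');
  }
  return { status: 'success', output: `processed: ${input}` };
}
```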
### Native Skills (Node.js)

```javascript
// skill.test.js
const { execute } = require('./skill');

describe('Kubernetes Skill', () => {
  beforeEach(() => {
    // Mock kubectl command
    process.env.KUBECTL = 'echo';
  });

  test('get pods should return valid JSON', async () => {
    const result = await execute({
      tool: 'get',
      parameters: {
        resource: 'pods',
        namespace: 'default'
      }
    });

    expect(result).toBeInstanceOf(Object);
    expect(result.items).toBeInstanceOf(Array);
  });

  test('should validate resource names', async () => {
    await expect(
      execute({
        tool: 'get',
        parameters: {
          resource: 'invalid-resource',
          namespace: 'default'
        }
      })
    ).rejects.toThrow();
  });
});
```

Run tests:
```bash
npm test
# or
node --test skill.test.js
```

### Rust Skills
```rust
// src/lib.rs
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_execute_valid_input() {
        let params = ToolParameters {
            param1: "value1".to_string(),
            param2: Some("value2".to_string()),
        };

        let result = execute_tool("my-tool", params);
        assert!(result.is_ok());
        assert_eq!(result.unwrap().status, "success");
    }

    #[test]
    fn test_execute_invalid_input() {
        let params = ToolParameters {
            param1: "".to_string(), // Invalid
            param2: None,
        };

        let result = execute_tool("my-tool", params);
        assert!(result.is_err());
    }

    #[tokio::test]
    async fn test_async_execution() {
        let result = execute_async_tool("api-call").await;
        assert!(result.is_ok());
    }
}
```

Run tests:
```bash
cargo test
```
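The tests above reference `ToolParameters`, `execute_tool`, and `execute_async_tool`. A hypothetical sketch of those items, shaped only to satisfy the assertions (not the actual Skill Engine API):

```rust
// src/lib.rs (hypothetical sketch of the items the tests reference)
pub struct ToolParameters {
    pub param1: String,
    pub param2: Option<String>,
}

pub struct ToolResult {
    pub status: String,
}

pub fn execute_tool(tool: &str, params: ToolParameters) -> Result<ToolResult, String> {
    // Empty required parameters are rejected, as the tests expect
    if params.param1.is_empty() {
        return Err("param1 is required".to_string());
    }
    match tool {
        "my-tool" => Ok(ToolResult { status: "success".to_string() }),
        other => Err(format!("unknown tool: {other}")),
    }
}

pub async fn execute_async_tool(tool: &str) -> Result<ToolResult, String> {
    // Stand-in for a real async call (HTTP request, subprocess, etc.)
    Ok(ToolResult { status: format!("{tool}: success") })
}
```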
## Integration Testing

### Test Script
```bash
#!/bin/bash
# test-skill.sh

set -e

echo "Testing skill installation..."
skill install ./my-skill

echo "Testing tool execution..."
result=$(skill run my-skill my-tool --input "test")

if [[ "$result" == *"success"* ]]; then
  echo "✓ Test passed"
  exit 0
else
  echo "✗ Test failed"
  exit 1
fi
```

### Test Manifest
```toml
# test-manifest.toml
version = "1"
[skills.test-skill]
source = "./my-skill"
runtime = "wasm"
[skills.test-skill.instances.test]
config.base_url = "http://localhost:8000"
env.LOG_LEVEL = "debug"
capabilities.network_access = true
```

Run integration test:
```bash
skill --manifest test-manifest.toml run test-skill:test my-tool
```

### Automated Integration Tests
```bash
#!/bin/bash
# integration-tests.sh
set -e
# Setup
export MANIFEST=test-manifest.toml
skill --manifest $MANIFEST validate
# Test each tool
echo "Testing tool1..."
skill --manifest $MANIFEST run test-skill tool1 --param1 value1
echo "Testing tool2..."
skill --manifest $MANIFEST run test-skill tool2 --param2 value2
# Test error handling
echo "Testing error cases..."
if skill --manifest $MANIFEST run test-skill invalid-tool 2>/dev/null; then
  echo "✗ Should have failed"
  exit 1
fi
echo "✓ All integration tests passed"End-to-End Testing
### Claude Code Testing

Test with an actual AI agent:
```bash
# 1. Generate Claude Agent Skills
skill claude-bridge generate --skill my-skill
# 2. Start Claude Code
claude
# 3. Test with prompts
> Use my-skill to process data
> List available tools in my-skill
> Execute my-skill tool with parameters
```

Validation checklist (a scripted version follows the list):
- [ ] Skill appears in Claude's context
- [ ] Tools are discovered correctly
- [ ] Parameters are validated
- [ ] Execution succeeds
- [ ] Output is formatted properly
- [ ] Errors are handled gracefully
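A minimal sketch that automates the mechanical items, using only CLI commands shown earlier in this guide (it assumes `skill list` prints tool names; skill and tool names are placeholders):

```bash
#!/bin/bash
# e2e-smoke.sh: automates the mechanical checklist items
set -e

# Tools are discovered correctly
skill list my-skill | grep -q my-tool

# Execution succeeds and produces output
output=$(skill run my-skill my-tool --param value)
[[ -n "$output" ]]

# Errors are handled gracefully: an unknown tool must fail
if skill run my-skill no-such-tool 2>/dev/null; then
  echo "✗ Expected failure for unknown tool"
  exit 1
fi

echo "✓ Smoke checks passed"
```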
### MCP Testing
```bash
# `skill serve` speaks MCP (JSON-RPC) over stdio, so requests
# can be piped straight to a server instance.

# List available tools
echo '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}' | \
  skill serve
# Test tool execution via MCP
echo '{
  "jsonrpc": "2.0",
  "id": 2,
  "method": "tools/call",
  "params": {
    "name": "skill-engine/execute",
    "arguments": {
      "skill_name": "my-skill",
      "tool_name": "my-tool",
      "parameters": {"input": "test"}
    }
  }
}' | skill serve
```
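For `tools/list`, expect a JSON-RPC result whose `tools` array describes each exposed tool, roughly shaped like the following (field contents will vary by server version):

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "tools": [
      {
        "name": "skill-engine/execute",
        "description": "Execute a skill tool",
        "inputSchema": { "type": "object" }
      }
    ]
  }
}
```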
## Docker Skill Testing

### Test Docker Configuration
```bash
# Test docker configuration
cat > test-docker-skill.toml << 'EOF'
version = "1"
[skills.test-docker]
source = "docker:alpine:latest"
runtime = "docker"
[skills.test-docker.docker]
image = "alpine:latest"
entrypoint = "echo"
EOF
# Test execution
skill --manifest test-docker-skill.toml run test-docker "Hello"
```

### Volume Mounting Test
```bash
# Create a test file on the host
echo "test data" > /tmp/test-input.txt

# Test volume mounting; /input and /output are container-side
# mount points defined by the skill's volume configuration
skill run docker-skill process \
  --input /input/test-input.txt \
  --output /output/result.txt

# Verify the output file on the host side of the /output mount
# (the exact host path depends on your volume configuration)
cat /tmp/test-output.txt
```

## Mocking and Stubbing
### Mock External APIs
```javascript
// test-helpers.js
export function mockGitHubAPI() {
  return {
    getIssues: jest.fn().mockResolvedValue([
      { id: 1, title: 'Test issue' }
    ]),
    createIssue: jest.fn().mockResolvedValue({
      id: 2,
      title: 'New issue'
    })
  };
}

// skill.test.js
import { mockGitHubAPI } from './test-helpers';

test('should create issue', async () => {
  const api = mockGitHubAPI();

  // The mock must actually reach the skill under test, e.g. via an
  // injectable client or a module mock; shown here as a hypothetical
  // second argument to execute()
  const result = await execute(
    { tool: 'create-issue', parameters: { title: 'Bug' } },
    { api }
  );

  expect(api.createIssue).toHaveBeenCalled();
});
```

### Mock Native Commands
```javascript
// For native skills wrapping CLI tools
jest.mock('child_process', () => ({
  execFile: jest.fn((cmd, args, callback) => {
    // Mock kubectl output
    if (cmd === 'kubectl' && args[0] === 'get') {
      callback(null, JSON.stringify({
        items: [{ name: 'pod-1' }]
      }), '');
    } else {
      // Fail fast for unexpected commands instead of hanging
      callback(new Error(`unmocked command: ${cmd}`), '', '');
    }
  })
}));
```

## Test Data
### Fixtures
```
tests/
├── fixtures/
│   ├── valid-input.json
│   ├── invalid-input.json
│   ├── large-input.json
│   └── edge-cases.json
└── skill.test.ts
```
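A fixture is just the parameter payload a test feeds to a tool. For example, `valid-input.json` might contain (contents illustrative):

```json
{
  "input": "test data"
}
```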
Load in tests:

```typescript
import validInput from './fixtures/valid-input.json';

test('should handle valid input', async () => {
  const result = await execute({
    tool: 'process',
    parameters: validInput
  });

  expect(result.status).toBe('success');
});
```

### Test Manifests
```
tests/
├── manifests/
│   ├── minimal.toml
│   ├── full-features.toml
│   └── multi-instance.toml
└── integration.test.sh
```
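For instance, `minimal.toml` can hold the smallest manifest a test needs, mirroring the manifest format shown earlier (values illustrative):

```toml
# tests/manifests/minimal.toml
version = "1"

[skills.test-skill]
source = "./my-skill"
runtime = "wasm"
```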
## Continuous Integration

### GitHub Actions
```yaml
# .github/workflows/test.yml
name: Test Skills

on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - name: Install Skill Engine
        run: |
          curl -fsSL https://dqkbk9o7ynwhxfjx.public.blob.vercel-storage.com/install.sh | sh
          echo "$HOME/.skill/bin" >> $GITHUB_PATH

      - name: Run unit tests
        run: npm test

      - name: Run integration tests
        run: ./tests/integration-tests.sh

      - name: Validate manifest
        run: skill validate

      - name: Test skill execution
        run: |
          skill install ./my-skill
          skill run my-skill my-tool --input test
```

### GitLab CI
```yaml
# .gitlab-ci.yml
test:
  image: ubuntu:22.04
  before_script:
    - curl -fsSL https://install-url.com/install.sh | sh
    - export PATH="$HOME/.skill/bin:$PATH"
  script:
    - npm test
    - ./tests/integration-tests.sh
    - skill validate
```

## Performance Testing
### Measure Execution Time
```bash
#!/bin/bash
# benchmark.sh

echo "Running performance tests..."

for i in {1..100}; do
  start=$(date +%s%N)
  skill run my-skill my-tool --input "test $i"
  end=$(date +%s%N)
  duration=$((($end - $start) / 1000000)) # Convert to ms
  echo "$i,$duration" >> results.csv
done
echo "Average: $(awk -F, '{sum+=$2; count++} END {print sum/count}' results.csv) ms"Load Testing
```bash
#!/bin/bash
# load-test.sh

# Concurrent executions
for i in {1..10}; do
  skill run my-skill my-tool --input "test $i" &
done

wait
echo "Completed 10 concurrent executions"Benchmark with hyperfine
### Benchmark with hyperfine

```bash
# Install hyperfine
cargo install hyperfine
# Benchmark skill execution
hyperfine 'skill run my-skill my-tool --input test' \
--warmup 3 \
--min-runs 10
# Compare runtimes
hyperfine \
'skill run my-skill:wasm my-tool' \
'skill run my-skill:docker my-tool' \
'skill run my-skill:native my-tool'
```

## Coverage
### JavaScript/TypeScript
```bash
# Using vitest
npm test -- --coverage
# Using c8
c8 npm test
```
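Coverage options, including thresholds that fail the run, can also live in the Vitest config. A minimal sketch, assuming a recent Vitest version (threshold numbers are illustrative):

```typescript
// vitest.config.ts
import { defineConfig } from 'vitest/config';

export default defineConfig({
  test: {
    coverage: {
      provider: 'v8',
      reporter: ['text', 'html'],
      // Fail `npm test -- --coverage` below these levels
      thresholds: {
        lines: 80,
        functions: 80
      }
    }
  }
});
```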
### Rust

```bash
# Using tarpaulin
cargo install cargo-tarpaulin
cargo tarpaulin --out Html --output-dir coverage
```

## Best Practices
### 1. Test All Tool Parameters
```typescript
describe('Tool parameters', () => {
  it('should handle required parameters', async () => {
    const result = await execute({
      tool: 'my-tool',
      parameters: { required_param: 'value' }
    });

    expect(result.status).toBe('success');
  });

  it('should reject missing required parameters', async () => {
    await expect(
      execute({
        tool: 'my-tool',
        parameters: {}
      })
    ).rejects.toThrow('required_param is required');
  });

  it('should use default for optional parameters', async () => {
    const result = await execute({
      tool: 'my-tool',
      parameters: { required_param: 'value' }
    });

    // Omitted optional params should fall back to their defaults
    expect(result.status).toBe('success');
  });
});
```

### 2. Test Error Handling
```typescript
describe('Error handling', () => {
  it('should handle network errors', async () => {
    // Mock network failure
    mockAPI.mockRejectedValue(new Error('Network error'));

    await expect(
      execute({ tool: 'api-call', parameters: {} })
    ).rejects.toThrow('Network error');
  });

  it('should handle timeouts', async () => {
    // Mock slow response
    mockAPI.mockImplementation(() =>
      new Promise(resolve => setTimeout(resolve, 60000))
    );

    await expect(
      execute({ tool: 'slow-operation', parameters: {} })
    ).rejects.toThrow('timeout');
  });
});
```

### 3. Test Edge Cases
```typescript
describe('Edge cases', () => {
  it('should handle empty input', async () => {
    const result = await execute({
      tool: 'process',
      parameters: { input: '' }
    });

    expect(result.output).toBe('');
  });

  it('should handle very large input', async () => {
    const result = await execute({
      tool: 'process',
      parameters: { input: 'a'.repeat(1000000) }
    });

    expect(result.status).toBe('success');
  });

  it('should handle special characters', async () => {
    const result = await execute({
      tool: 'process',
      parameters: { input: '!@#$%^&*()' }
    });

    expect(result.status).toBe('success');
  });
});
```

### 4. Use Test Fixtures
```typescript
import * as fs from 'fs';
import * as path from 'path';

function loadFixture(name: string): any {
  const fixturePath = path.join(__dirname, 'fixtures', `${name}.json`);
  return JSON.parse(fs.readFileSync(fixturePath, 'utf-8'));
}

test('should handle complex input', async () => {
  const input = loadFixture('complex-input');

  const result = await execute({
    tool: 'process',
    parameters: input
  });

  expect(result.status).toBe('success');
});
```

### 5. Clean Up After Tests
```typescript
afterEach(async () => {
  // Clean up test data
  await cleanup();
});

afterAll(async () => {
  // Close connections
  await closeConnections();
});
```

## Troubleshooting Tests
### Tests Fail Intermittently

**Cause**: Race conditions or timing issues.

**Solution**:
```typescript
// Await concurrent operations properly instead of racing them
await Promise.all([
  operation1(),
  operation2()
]);

// Poll with a timeout instead of asserting once
await waitFor(() => expect(result).toBeDefined(), {
  timeout: 5000
});
```

### Tests Pass Locally, Fail in CI

**Cause**: Environment differences.

**Solution**:
```yaml
# Ensure consistent environment
env:
  NODE_ENV: test
  SKILL_LOG_LEVEL: error
```

### Slow Test Suite

**Cause**: Not using parallelization.

**Solution**:
```bash
# Run tests in parallel (the exact flag depends on your runner;
# Vitest and Jest parallelize by default, Mocha needs --parallel)
npm test -- --parallel

# Rust tests run in parallel by default
cargo test
```

## Related Documentation
- Skill Development - Creating skills
- Environment Variables - Test configuration
- Security Model - Security testing
- CLI Reference - Test commands
## External Resources
- [Vitest](https://vitest.dev) - Fast unit testing
- [Jest](https://jestjs.io) - JavaScript testing framework
- [cargo-tarpaulin](https://github.com/xd009642/tarpaulin) - Rust coverage
- [hyperfine](https://github.com/sharkdp/hyperfine) - Benchmarking tool