
ModelPilot API Documentation

ModelPilot provides OpenAI-compatible API endpoints for seamless migration from OpenAI services to your own deployments. Access your deployed models through familiar OpenAI API patterns.

Authentication

API Keys

Create API keys in your dashboard to access ModelPilot endpoints programmatically. API keys must have read and proxy permissions to use the OpenAI-compatible endpoints.

```bash
curl -X POST https://your-domain.com/api/v1/chat/completions \
  -H "Authorization: Bearer mp_live_your_api_key_here" \
  -H "Content-Type: application/json"
```

API Key Requirements

  • Requires read and proxy permissions
  • Session authentication (web UI) has full permissions automatically
  • API keys can be created and managed in your dashboard
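
In JavaScript, the required headers can be assembled with a small helper. This is a sketch, not part of any SDK; the `mp_` prefix check assumes your keys keep the prefix shown in the examples on this page:

```javascript
// Build the headers every ModelPilot request needs.
// Assumes API keys use the mp_ prefix shown in the examples above.
function modelPilotHeaders(apiKey) {
  if (!apiKey || !apiKey.startsWith('mp_')) {
    throw new Error('Expected a ModelPilot API key (mp_...)');
  }
  return {
    'Authorization': `Bearer ${apiKey}`,
    'Content-Type': 'application/json'
  };
}
```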

OpenAI-Compatible Endpoints

Chat Completions

Create chat completions using the OpenAI-compatible format. Requests are routed automatically to your deployed text models.

POST /api/v1/chat/completions

Request Example

JavaScript (fetch)

```javascript
const response = await fetch('https://your-domain.com/api/v1/chat/completions', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer mp_live_your_api_key',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    model: 'mistral',  // Your deployed model name
    messages: [
      { role: 'user', content: 'Hello, how are you?' }
    ],
    temperature: 0.7,
    max_tokens: 100
  })
});

const data = await response.json();
console.log(data.choices[0].message.content);
```

cURL

```bash
curl -X POST https://your-domain.com/api/v1/chat/completions \
  -H "Authorization: Bearer mp_live_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mistral",
    "messages": [
      {"role": "user", "content": "Hello, how are you?"}
    ],
    "temperature": 0.7,
    "max_tokens": 100
  }'
```

Response Example

```json
{
  "id": "chatcmpl-1234567890",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "mistral",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! I'm doing well, thank you for asking. How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 20,
    "total_tokens": 32
  },
  "system_fingerprint": "modelpilot-pod123",
  "x_modelpilot": {
    "deployment_id": "pod123",
    "model_identifier": "mistral:7b",
    "response_time_ms": 1250,
    "direct_endpoint": "https://pod123.proxy.runpod.net:11434"
  }
}
```
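
The `x_modelpilot` block is ModelPilot-specific metadata, so portable clients should treat it as optional. One way to read a response defensively (a sketch; field names follow the example above, and `summarizeResponse` is an illustrative helper, not part of any SDK):

```javascript
// Extract the reply text, token usage, and ModelPilot metadata
// from a parsed chat-completion response. The x_modelpilot block
// is ModelPilot-specific, so it is treated as optional here.
function summarizeResponse(data) {
  const { usage = {}, x_modelpilot: meta } = data;
  return {
    content: data.choices?.[0]?.message?.content ?? '',
    totalTokens: usage.total_tokens ?? 0,
    deploymentId: meta?.deployment_id ?? null,
    responseTimeMs: meta?.response_time_ms ?? null
  };
}
```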

Supported Parameters

| Parameter | Type | Description |
|---|---|---|
| model | string | Your deployed model name (e.g., "mistral", "gemma3") |
| messages | array | Array of message objects with role and content |
| temperature | number | Sampling temperature (0.0 to 2.0) |
| max_tokens | number | Maximum tokens to generate |
| top_p | number | Nucleus sampling parameter |
| stop | string \| array | Stop sequences |
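
A request body can be assembled from these parameters, dropping any option the caller leaves unset (a sketch; `buildChatRequest` is an illustrative helper, not part of any SDK):

```javascript
// Build a chat-completions payload from the supported parameters,
// omitting any optional parameter the caller leaves undefined.
function buildChatRequest(model, messages, options = {}) {
  const payload = { model, messages };
  for (const key of ['temperature', 'max_tokens', 'top_p', 'stop']) {
    if (options[key] !== undefined) payload[key] = options[key];
  }
  return payload;
}
```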

Health Monitoring

Deployment Health

Check the health status of your deployments to ensure services are running properly.

GET /api/deployments/{podId}/health

cURL Example

```bash
curl -X GET https://your-domain.com/api/deployments/pod123/health \
  -H "Authorization: Bearer mp_live_your_api_key"
```

Response Example

```json
{
  "status": "healthy",
  "timestamp": "2023-12-01T10:30:00.000Z",
  "services": {
    "ollama": "running",
    "webui": "running"
  },
  "deployment_status": "running",
  "response_time_ms": 125,
  "last_checked": "2023-12-01T10:30:00.000Z"
}
```

Status Values

● healthy - All services running
● degraded - Some services have issues
● unhealthy - Services are down
● starting - Deployment is starting up
● unknown - Status could not be determined
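
A client might map each status to an action like this (a sketch; the action names are illustrative, not part of the API):

```javascript
// Map a deployment health status to a suggested client action.
// Status values follow the list above.
function healthAction(status) {
  switch (status) {
    case 'healthy':   return 'proceed';          // all services running
    case 'degraded':  return 'proceed';          // usable, but investigate soon
    case 'starting':  return 'retry-later';      // wait for startup to finish
    case 'unhealthy': return 'alert';            // services are down
    case 'unknown':
    default:          return 'check-deployment'; // status could not be determined
  }
}
```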

Migration from OpenAI

Quick Migration Steps

1. Deploy Your Model: use the ModelPilot dashboard to deploy your preferred model.

2. Create API Key: generate an API key with proxy permissions in your dashboard.

3. Update Your Code: change the base URL and API key in your existing OpenAI code.

Code Changes

Before (OpenAI):

```javascript
const openai = new OpenAI({
  apiKey: 'sk-...',
  baseURL: 'https://api.openai.com/v1'
});

const response = await openai.chat.completions.create({
  model: 'gpt-3.5-turbo',
  messages: [{ role: 'user', content: 'Hello' }]
});
```

After (ModelPilot):

```javascript
const openai = new OpenAI({
  apiKey: 'mp_live_your_api_key',
  baseURL: 'https://your-domain.com/api/v1'
});

const response = await openai.chat.completions.create({
  model: 'mistral',  // Your deployed model
  messages: [{ role: 'user', content: 'Hello' }]
});
```

Error Handling

Common Errors

Model Not Found (404)

No active deployment found for the specified model. Deploy the model first via the dashboard.

```json
{
  "error": {
    "message": "No active deployment found for model 'mistral'",
    "type": "invalid_request_error",
    "param": "model",
    "code": "model_not_found"
  },
  "available_models": ["gemma3:7b", "deepseek-r1"]
}
```

Model Not Running (503)

The deployment exists but is not currently running. Start it via the dashboard.

```json
{
  "error": {
    "message": "Model 'mistral' deployment is not running (status: stopped)",
    "type": "invalid_request_error",
    "param": "model",
    "code": "model_not_available"
  }
}
```

Invalid Model Type (400)

The model is not a text model and cannot be used with chat completions.

```json
{
  "error": {
    "message": "Model 'flux-dev' is not a text model and cannot be used with chat completions",
    "type": "invalid_request_error",
    "param": "model",
    "code": "invalid_model_type"
  }
}
```
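
Putting these together, a client can turn an error response into a recovery hint (a sketch based on the error bodies above; `recoveryHint` is an illustrative helper, not part of any SDK):

```javascript
// Decide how to recover from a chat-completions error response.
// Status codes and error codes follow the examples above.
function recoveryHint(status, body) {
  const code = body?.error?.code;
  if (status === 404 && code === 'model_not_found') {
    const models = body.available_models ?? [];
    return `Deploy the model first; currently available: ${models.join(', ')}`;
  }
  if (status === 503 && code === 'model_not_available') {
    return 'Start the deployment from the dashboard, then retry.';
  }
  if (status === 400 && code === 'invalid_model_type') {
    return 'Use a text model for chat completions.';
  }
  return body?.error?.message ?? 'Unexpected error';
}
```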

Best Practices

Performance Tips

  • Keep deployments running for faster response times
  • Use appropriate temperature values (0.1-0.9 for most use cases)
  • Set reasonable max_tokens to control costs
  • Monitor deployment health regularly
  • Consider using direct endpoints for better performance

Cost Optimization

  • Stop deployments when not in use
  • Use smaller models for simple tasks
  • Monitor your credit usage in the dashboard
  • Set up usage alerts and limits
  • Consider batch processing for efficiency