Formation Model API Reference

Complete API reference for making inference requests to AI models on the Formation network. All endpoints follow the OpenAI API v1 specification for maximum compatibility.

Base URL

```
https://formation.ai
```

Authentication

All requests require ECDSA signature authentication. See the Inference Guide for details.

Required Headers

```
X-Formation-Address: 0x1234567890abcdef1234567890abcdef12345678
X-Formation-Signature: 0xabcdef1234567890abcdef1234567890abcdef1234567890abcdef1234567890abcdef1234567890abcdef1234567890abcdef1234567890abcdef1234567890ab
X-Formation-Message: Formation authentication request
Content-Type: application/json
```
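A small helper for assembling these headers can keep request code tidy. `formation_headers` is a hypothetical convenience function, not part of any SDK; it assumes you have already produced the ECDSA signature of the message with your wallet key, per the Inference Guide.

```python
def formation_headers(address: str, signature: str,
                      message: str = "Formation authentication request") -> dict:
    """Assemble the required Formation authentication headers.

    `address` is your Ethereum address; `signature` is the ECDSA
    signature of `message` produced by your signing key (see the
    Inference Guide for the signing procedure).
    """
    return {
        "X-Formation-Address": address,
        "X-Formation-Signature": signature,
        "X-Formation-Message": message,
        "Content-Type": "application/json",
    }
```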

Core Endpoints

List Models

Get a list of all available models.

```http
GET /v1/models
```

Response

```json
{
  "success": true,
  "models": [
    {
      "id": "llama2-7b-chat",
      "name": "Llama 2 7B Chat",
      "description": "Meta's Llama 2 7B parameter chat model",
      "type": "text_generation",
      "owner_id": "0x9876543210fedcba...",
      "is_private": false,
      "pricing": {
        "model": "per_token",
        "input_rate": 0.5,
        "output_rate": 1.0,
        "currency": "credits_per_1k_tokens"
      },
      "capabilities": ["chat", "text_generation"],
      "max_tokens": 4096,
      "context_length": 4096,
      "created_at": 1640995200,
      "updated_at": 1640995800
    }
  ],
  "total": 1
}
```

Response Fields

| Field | Type | Description |
|-------|------|-------------|
| `success` | boolean | Whether the request was successful |
| `models` | array | Array of model objects |
| `total` | integer | Total number of models |

Model Object Fields

| Field | Type | Description |
|-------|------|-------------|
| `id` | string | Unique model identifier |
| `name` | string | Human-readable model name |
| `description` | string | Model description |
| `type` | string | Model type (`text_generation`, `image_generation`, etc.) |
| `owner_id` | string | Ethereum address of model owner |
| `is_private` | boolean | Whether model is private |
| `pricing` | object | Pricing information |
| `capabilities` | array | List of model capabilities |
| `max_tokens` | integer | Maximum tokens per request |
| `context_length` | integer | Maximum context length |
| `created_at` | integer | Unix timestamp of creation |
| `updated_at` | integer | Unix timestamp of last update |
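Client code often needs to pick a model from this list by capability and price. A minimal sketch; `find_models` is an illustrative helper (not part of any SDK) that operates on the response shape documented above:

```python
def find_models(models_response: dict, capability: str) -> list:
    """Return models from a GET /v1/models response that advertise the
    given capability, cheapest first by output_rate."""
    matches = [
        m for m in models_response.get("models", [])
        if capability in m.get("capabilities", [])
    ]
    return sorted(matches, key=lambda m: m["pricing"]["output_rate"])
```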

Get Model Details

Get detailed information about a specific model.

```http
GET /v1/models/{model_id}
```

Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `model_id` | string | Yes | The ID of the model to retrieve |

Response

```json
{
  "success": true,
  "model": {
    "id": "llama2-7b-chat",
    "name": "Llama 2 7B Chat",
    "description": "Meta's Llama 2 7B parameter chat model optimized for conversational AI",
    "type": "text_generation",
    "owner_id": "0x9876543210fedcba...",
    "is_private": false,
    "pricing": {
      "model": "per_token",
      "input_rate": 0.5,
      "output_rate": 1.0,
      "currency": "credits_per_1k_tokens"
    },
    "capabilities": ["chat", "text_generation", "instruction_following"],
    "max_tokens": 4096,
    "context_length": 4096,
    "parameters": {
      "size": "7B",
      "architecture": "Llama",
      "precision": "fp16",
      "quantization": "none"
    },
    "supported_languages": ["en", "es", "fr", "de", "it", "pt", "ru", "ja", "ko", "zh"],
    "tags": ["conversational", "instruction", "chat"],
    "version": "2.0",
    "license": "custom",
    "created_at": 1640995200,
    "updated_at": 1640995800
  }
}
```

Model Inference

Make an inference request to a specific model.

```http
POST /v1/models/{model_id}/inference
```

Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `model_id` | string | Yes | The ID of the model to use for inference |

Request Body

The request body format depends on the model type. See Model-Specific Schemas for details.

Common Parameters

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| `max_tokens` | integer | No | 1000 | Maximum tokens to generate |
| `temperature` | number | No | 0.7 | Sampling temperature (0.0 to 2.0) |
| `top_p` | number | No | 1.0 | Nucleus sampling parameter |
| `stream` | boolean | No | false | Whether to stream the response |
| `stop` | string/array | No | null | Stop sequences |
| `presence_penalty` | number | No | 0.0 | Presence penalty (-2.0 to 2.0) |
| `frequency_penalty` | number | No | 0.0 | Frequency penalty (-2.0 to 2.0) |
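Validating these ranges client-side avoids a round trip that would end in a 400. A minimal sketch; `build_chat_payload` is a hypothetical helper that enforces the documented defaults and ranges:

```python
def build_chat_payload(messages, max_tokens=1000, temperature=0.7,
                       top_p=1.0, stream=False, stop=None,
                       presence_penalty=0.0, frequency_penalty=0.0) -> dict:
    """Build an inference request body from the common parameters,
    checking the documented ranges before sending."""
    if not 0.0 <= temperature <= 2.0:
        raise ValueError("temperature must be between 0.0 and 2.0")
    if not -2.0 <= presence_penalty <= 2.0:
        raise ValueError("presence_penalty must be between -2.0 and 2.0")
    if not -2.0 <= frequency_penalty <= 2.0:
        raise ValueError("frequency_penalty must be between -2.0 and 2.0")
    return {
        "messages": messages,
        "max_tokens": max_tokens,
        "temperature": temperature,
        "top_p": top_p,
        "stream": stream,
        "stop": stop,
        "presence_penalty": presence_penalty,
        "frequency_penalty": frequency_penalty,
    }
```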

Model-Specific Schemas

Chat Completion Models

For models that support chat-based interactions.

Request Schema

```json
{
  "messages": [
    { "role": "system", "content": "You are a helpful assistant." },
    { "role": "user", "content": "Hello, how are you?" }
  ],
  "max_tokens": 1000,
  "temperature": 0.7,
  "top_p": 1.0,
  "n": 1,
  "stream": false,
  "stop": null,
  "presence_penalty": 0,
  "frequency_penalty": 0,
  "logit_bias": {},
  "user": "user-123"
}
```

Message Object

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `role` | string | Yes | Message role (`system`, `user`, `assistant`) |
| `content` | string | Yes | Message content |
| `name` | string | No | Name of the message author |

Response Schema (Non-Streaming)

```json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1640995200,
  "model": "llama2-7b-chat",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! I'm doing well, thank you for asking. How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 13,
    "completion_tokens": 17,
    "total_tokens": 30
  }
}
```

Response Schema (Streaming)

```
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1640995200,"model":"llama2-7b-chat","choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1640995200,"model":"llama2-7b-chat","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1640995200,"model":"llama2-7b-chat","choices":[{"index":0,"delta":{"content":"!"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1640995200,"model":"llama2-7b-chat","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

data: [DONE]
```
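A small parser for this server-sent-event framing can accumulate the streamed text. `collect_stream` is an illustrative helper, not part of any SDK; it assumes each event arrives as a complete `data: ` line:

```python
import json

def collect_stream(lines) -> str:
    """Accumulate assistant text from SSE lines in the format shown
    above: each line is `data: <json chunk>` and the stream ends
    with `data: [DONE]`."""
    parts = []
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        delta = chunk["choices"][0]["delta"]
        parts.append(delta.get("content", ""))
    return "".join(parts)
```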

Text Completion Models

For models that complete text prompts.

Request Schema

```json
{
  "prompt": "Once upon a time",
  "max_tokens": 100,
  "temperature": 0.7,
  "top_p": 1.0,
  "n": 1,
  "stream": false,
  "logprobs": null,
  "echo": false,
  "stop": null,
  "presence_penalty": 0,
  "frequency_penalty": 0,
  "best_of": 1,
  "logit_bias": {},
  "user": "user-123"
}
```

Response Schema

```json
{
  "id": "cmpl-abc123",
  "object": "text_completion",
  "created": 1640995200,
  "model": "gpt-3.5-turbo-instruct",
  "choices": [
    {
      "text": ", there was a brave knight who embarked on a quest to save the kingdom.",
      "index": 0,
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 4,
    "completion_tokens": 16,
    "total_tokens": 20
  }
}
```

Image Generation Models

For models that generate images from text prompts.

Request Schema

```json
{
  "prompt": "A cute baby sea otter",
  "n": 1,
  "size": "1024x1024",
  "response_format": "url",
  "user": "user-123"
}
```

Parameters

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| `prompt` | string | Yes | - | Text description of the desired image |
| `n` | integer | No | 1 | Number of images to generate (1-10) |
| `size` | string | No | "1024x1024" | Image size (256x256, 512x512, 1024x1024) |
| `response_format` | string | No | "url" | Response format (`url` or `b64_json`) |
| `user` | string | No | - | Unique identifier for the user |

Response Schema

```json
{
  "created": 1640995200,
  "data": [
    {
      "url": "https://formation.ai/generated/images/abc123.png"
    }
  ]
}
```

Embedding Models

For models that generate text embeddings.

Request Schema

```json
{
  "input": ["The food was delicious and the waiter was friendly."],
  "model": "text-embedding-ada-002",
  "encoding_format": "float",
  "user": "user-123"
}
```

Parameters

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| `input` | string/array | Yes | - | Text to embed |
| `encoding_format` | string | No | "float" | Encoding format (`float` or `base64`) |
| `user` | string | No | - | Unique identifier for the user |

Response Schema

```json
{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "embedding": [0.0023064255, -0.009327292, ...],
      "index": 0
    }
  ],
  "model": "text-embedding-ada-002",
  "usage": {
    "prompt_tokens": 8,
    "total_tokens": 8
  }
}
```
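A common use of these vectors is semantic comparison. A minimal cosine-similarity helper over two embedding vectors (pure stdlib, no SDK dependency):

```python
import math

def cosine_similarity(a, b) -> float:
    """Cosine similarity between two embedding vectors of equal length.
    Returns 1.0 for identical directions, 0.0 for orthogonal vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)
```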

Response Objects

Usage Object

Appears in all inference responses to track token usage and costs.

```json
{
  "prompt_tokens": 13,
  "completion_tokens": 17,
  "total_tokens": 30,
  "cost_credits": 0.03
}
```

| Field | Type | Description |
|-------|------|-------------|
| `prompt_tokens` | integer | Number of tokens in the prompt |
| `completion_tokens` | integer | Number of tokens in the completion |
| `total_tokens` | integer | Total tokens used |
| `cost_credits` | number | Cost in Formation credits |
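For budgeting, you can estimate a request's cost from the usage object and the model's pricing block (rates are credits per 1k tokens). This is a client-side sketch only; the authoritative charge is the server-reported `cost_credits`, which may include rounding or fees not modeled here:

```python
def estimate_cost(usage: dict, pricing: dict) -> float:
    """Estimate cost in credits from a usage object and a model's
    pricing block, where rates are credits per 1k tokens."""
    input_cost = usage["prompt_tokens"] / 1000 * pricing["input_rate"]
    output_cost = usage["completion_tokens"] / 1000 * pricing["output_rate"]
    return input_cost + output_cost
```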

Choice Object

Represents a single completion choice.

```json
{
  "index": 0,
  "message": {
    "role": "assistant",
    "content": "Hello! How can I help you?"
  },
  "finish_reason": "stop"
}
```

| Field | Type | Description |
|-------|------|-------------|
| `index` | integer | Choice index |
| `message` | object | Message object (for chat completions) |
| `text` | string | Generated text (for text completions) |
| `finish_reason` | string | Reason completion finished |

Finish Reasons

| Reason | Description |
|--------|-------------|
| `stop` | Natural stopping point or stop sequence reached |
| `length` | Maximum token limit reached |
| `content_filter` | Content filtered due to policy violations |
| `function_call` | Model called a function (for function-calling models) |

Error Codes and Handling

HTTP Status Codes

| Status | Description |
|--------|-------------|
| 200 | Success |
| 400 | Bad Request - Invalid parameters |
| 401 | Unauthorized - Authentication failed |
| 403 | Forbidden - Access denied |
| 404 | Not Found - Model or resource not found |
| 429 | Too Many Requests - Rate limit exceeded |
| 500 | Internal Server Error |
| 502 | Bad Gateway - Model service unavailable |
| 503 | Service Unavailable - Temporary overload |

Error Response Format

```json
{
  "error": {
    "code": "ERROR_CODE",
    "message": "Human-readable error message",
    "type": "error_type",
    "param": "parameter_name",
    "details": {
      "additional": "context"
    }
  }
}
```
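Mapping this envelope onto an exception keeps error handling uniform across endpoints. `FormationError` and `raise_for_error` are illustrative helpers built on the documented format, not part of any SDK:

```python
class FormationError(Exception):
    """Carries the fields of a Formation error object."""

    def __init__(self, code, message, param=None, details=None):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.param = param
        self.details = details

def raise_for_error(body: dict) -> None:
    """Raise FormationError if a parsed response body contains an
    `error` object in the documented format; otherwise do nothing."""
    err = body.get("error")
    if err:
        raise FormationError(
            err.get("code"), err.get("message"),
            err.get("param"), err.get("details"),
        )
```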

Common Error Codes

Authentication Errors

```json
{
  "error": {
    "code": "AUTHENTICATION_FAILED",
    "message": "Invalid signature or address",
    "type": "authentication_error"
  }
}
```

```json
{
  "error": {
    "code": "INVALID_SIGNATURE",
    "message": "ECDSA signature verification failed",
    "type": "authentication_error",
    "details": {
      "provided_address": "0x1234567890abcdef...",
      "recovered_address": "0xabcdef1234567890..."
    }
  }
}
```

Request Validation Errors

```json
{
  "error": {
    "code": "INVALID_REQUEST",
    "message": "Missing required parameter: messages",
    "type": "invalid_request_error",
    "param": "messages"
  }
}
```

```json
{
  "error": {
    "code": "INVALID_PARAMETER",
    "message": "temperature must be between 0.0 and 2.0",
    "type": "invalid_request_error",
    "param": "temperature",
    "details": {
      "provided_value": 3.5,
      "valid_range": "0.0 - 2.0"
    }
  }
}
```

Model Errors

```json
{
  "error": {
    "code": "MODEL_NOT_FOUND",
    "message": "Model 'invalid-model-id' not found",
    "type": "invalid_request_error",
    "param": "model"
  }
}
```

```json
{
  "error": {
    "code": "MODEL_UNAVAILABLE",
    "message": "Model is temporarily unavailable",
    "type": "service_unavailable_error",
    "details": {
      "retry_after_seconds": 30,
      "estimated_wait_time": "1-2 minutes"
    }
  }
}
```

Rate Limiting Errors

```json
{
  "error": {
    "code": "RATE_LIMIT_EXCEEDED",
    "message": "Rate limit exceeded. Try again in 3600 seconds.",
    "type": "rate_limit_error",
    "details": {
      "retry_after": 3600,
      "limit_type": "requests_per_hour",
      "current_limit": 1000,
      "reset_time": 1640999200
    }
  }
}
```

Billing Errors

```json
{
  "error": {
    "code": "INSUFFICIENT_CREDITS",
    "message": "Insufficient credits for this request",
    "type": "billing_error",
    "details": {
      "required_credits": 50,
      "available_credits": 25,
      "shortfall": 25
    }
  }
}
```

```json
{
  "error": {
    "code": "BUDGET_EXCEEDED",
    "message": "Monthly budget limit exceeded",
    "type": "billing_error",
    "details": {
      "monthly_limit": 10000,
      "current_usage": 10050,
      "reset_date": "2024-02-01"
    }
  }
}
```

Content Policy Errors

```json
{
  "error": {
    "code": "CONTENT_POLICY_VIOLATION",
    "message": "Content violates usage policies",
    "type": "content_policy_error",
    "details": {
      "violation_type": "harmful_content",
      "flagged_content": "portion of content that triggered the filter"
    }
  }
}
```

Server Errors

```json
{
  "error": {
    "code": "INTERNAL_ERROR",
    "message": "An internal error occurred",
    "type": "server_error",
    "details": {
      "request_id": "req_abc123",
      "timestamp": 1640995200
    }
  }
}
```

```json
{
  "error": {
    "code": "MODEL_TIMEOUT",
    "message": "Model inference timed out",
    "type": "server_error",
    "details": {
      "timeout_seconds": 30,
      "partial_response": false
    }
  }
}
```

Rate Limiting

Formation implements rate limiting to ensure fair usage across all users.

Rate Limit Headers

All responses include rate limiting information:

```
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 999
X-RateLimit-Reset: 1640999200
X-RateLimit-Type: requests_per_hour
X-RateLimit-Retry-After: 3600
```

| Header | Description |
|--------|-------------|
| `X-RateLimit-Limit` | Maximum requests allowed in the time window |
| `X-RateLimit-Remaining` | Requests remaining in current window |
| `X-RateLimit-Reset` | Unix timestamp when the rate limit resets |
| `X-RateLimit-Type` | Type of rate limit (`requests_per_hour`, `tokens_per_hour`) |
| `X-RateLimit-Retry-After` | Seconds to wait before retrying (when rate limited) |
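Clients typically read these headers to decide how long to pause. A minimal sketch; `seconds_until_reset` is a hypothetical helper that prefers the explicit retry hint and falls back to the reset timestamp:

```python
import time

def seconds_until_reset(headers, now=None):
    """Seconds to wait before the rate-limit window resets.

    Prefers X-RateLimit-Retry-After when present; otherwise computes
    the gap to X-RateLimit-Reset (a Unix timestamp). `now` is
    injectable for testing and defaults to the current time.
    """
    retry_after = headers.get("X-RateLimit-Retry-After")
    if retry_after is not None:
        return float(retry_after)
    reset = float(headers.get("X-RateLimit-Reset", 0))
    now = time.time() if now is None else now
    return max(0.0, reset - now)
```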

Rate Limit Tiers

TierRequests/HourTokens/HourConcurrent Requests
Free10010,0002
Pro1,000100,0005
Pro Plus5,000500,00010
Power10,0001,000,00020
Power Plus50,0005,000,00050

Handling Rate Limits

When you exceed rate limits, implement exponential backoff:

```python
import time
import random

def handle_rate_limit(response):
    """Wait the server-suggested retry interval plus jitter after a 429.

    Returns True if the response was rate limited (and we waited),
    False otherwise.
    """
    if response.status_code == 429:
        retry_after = int(response.headers.get('X-RateLimit-Retry-After', 60))
        jitter = random.uniform(0.1, 0.3) * retry_after
        wait_time = retry_after + jitter
        print(f"Rate limited. Waiting {wait_time:.2f} seconds...")
        time.sleep(wait_time)
        return True
    return False
```
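For a fully exponential strategy, wrap the request in a retry loop that doubles a base delay on each consecutive 429. `with_backoff` is an illustrative sketch; `send` is any zero-argument callable that performs the HTTP request and returns a response object:

```python
import random
import time

def with_backoff(send, max_retries=5, base_delay=1.0):
    """Call `send()` and retry on HTTP 429, doubling the delay each
    attempt and adding jitter proportional to the base delay."""
    for attempt in range(max_retries):
        response = send()
        if response.status_code != 429:
            return response
        delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
        time.sleep(delay)
    return send()  # final attempt; caller inspects the status code
```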

Webhooks (Optional)

Some models support webhook notifications for long-running inference requests.

Webhook Request

```json
{
  "messages": [{"role": "user", "content": "Generate a long story"}],
  "max_tokens": 4000,
  "webhook_url": "https://your-app.com/webhooks/inference",
  "webhook_secret": "your-webhook-secret"
}
```

Webhook Payload

```json
{
  "request_id": "req_abc123",
  "model_id": "llama2-7b-chat",
  "status": "completed",
  "result": {
    "id": "chatcmpl-abc123",
    "object": "chat.completion",
    "created": 1640995200,
    "model": "llama2-7b-chat",
    "choices": [
      {
        "index": 0,
        "message": {
          "role": "assistant",
          "content": "Once upon a time..."
        },
        "finish_reason": "stop"
      }
    ],
    "usage": {
      "prompt_tokens": 13,
      "completion_tokens": 1500,
      "total_tokens": 1513
    }
  },
  "timestamp": 1640995800
}
```
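Your webhook endpoint should verify that deliveries really come from Formation before trusting the payload. The exact signing scheme (header name, hash, encoding) is not specified here, so the HMAC-SHA256 approach below is an assumption to confirm against Formation's webhook documentation; it uses the `webhook_secret` from the request:

```python
import hashlib
import hmac

def verify_webhook(raw_body: bytes, received_signature: str, secret: str) -> bool:
    """Check a webhook delivery by comparing an HMAC-SHA256 hex digest
    of the raw request body against the signature sent with it.

    NOTE: the signing scheme is assumed, not documented above; verify
    the header name and digest format against Formation's docs.
    """
    expected = hmac.new(secret.encode(), raw_body, hashlib.sha256).hexdigest()
    # compare_digest avoids timing side channels
    return hmac.compare_digest(expected, received_signature)
```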

SDK Examples

Python

```python
import requests
from typing import Dict, Any

class FormationClient:
    def __init__(self, private_key: str):
        self.base_url = "https://formation.ai"
        self.headers = self._generate_auth_headers(private_key)

    def chat_completion(self, model_id: str, messages: list, **kwargs) -> Dict[str, Any]:
        """Create a chat completion"""
        payload = {"messages": messages, **kwargs}
        response = requests.post(
            f"{self.base_url}/v1/models/{model_id}/inference",
            headers=self.headers,
            json=payload
        )
        response.raise_for_status()
        return response.json()

    def list_models(self) -> Dict[str, Any]:
        """List available models"""
        response = requests.get(
            f"{self.base_url}/v1/models",
            headers=self.headers
        )
        response.raise_for_status()
        return response.json()

# Usage
client = FormationClient("0x1234567890abcdef...")
models = client.list_models()
result = client.chat_completion(
    "llama2-7b-chat",
    [{"role": "user", "content": "Hello!"}]
)
```

JavaScript

```javascript
class FormationClient {
  constructor(privateKey) {
    this.baseUrl = 'https://formation.ai';
    this.headers = this._generateAuthHeaders(privateKey);
  }

  async chatCompletion(modelId, messages, options = {}) {
    const payload = { messages, ...options };
    const response = await fetch(`${this.baseUrl}/v1/models/${modelId}/inference`, {
      method: 'POST',
      headers: this.headers,
      body: JSON.stringify(payload)
    });
    if (!response.ok) {
      throw new Error(`HTTP ${response.status}: ${response.statusText}`);
    }
    return response.json();
  }

  async listModels() {
    const response = await fetch(`${this.baseUrl}/v1/models`, {
      headers: this.headers
    });
    if (!response.ok) {
      throw new Error(`HTTP ${response.status}: ${response.statusText}`);
    }
    return response.json();
  }
}

// Usage
const client = new FormationClient('0x1234567890abcdef...');
const models = await client.listModels();
const result = await client.chatCompletion(
  'llama2-7b-chat',
  [{ role: 'user', content: 'Hello!' }]
);
```

OpenAI Compatibility

Formation's API is designed to be a drop-in replacement for OpenAI's API. Here's how to migrate:

URL Changes

```python
# OpenAI
openai.api_base = "https://api.openai.com/v1"

# Formation
formation_base = "https://formation.ai/v1/models/{model_id}/inference"
```

Authentication Changes

```python
# OpenAI
openai.api_key = "sk-..."

# Formation
headers = {
    "X-Formation-Address": "0x...",
    "X-Formation-Signature": "0x...",
    "X-Formation-Message": "Formation authentication request"
}
```

Request Changes

```python
# OpenAI
response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Hello"}]
)

# Formation
response = requests.post(
    "https://formation.ai/v1/models/llama2-7b-chat/inference",
    headers=headers,
    json={
        "messages": [{"role": "user", "content": "Hello"}]
    }
)
```

Next Steps


Need help with the API? Check our troubleshooting guide or contact support! 🚀