API Documentation

Everything you need to integrate Sofya into your AI agent.

Quick Start

1. Sign up via GitHub

Visit the dashboard and sign in with GitHub to get started.

2. Get credits

Free plan includes 1,000 credits/month. Need more? Buy credits or subscribe to a plan from the dashboard.

3. Search the web

curl -X POST https://mcp.sofya.yusufgurdogan.com/v1/search \
  -H "Authorization: Bearer ay_live_..." \
  -H "Content-Type: application/json" \
  -d '{"query": "latest AI news"}'

Authentication

All API requests require an API key in the Authorization header.

Authorization: Bearer ay_live_your_key_here

MCP (Model Context Protocol)

Connect Sofya directly to Claude Code, Cursor, or any MCP-compatible client. Your AI agent gets search, fetch, extract, and research tools; no REST calls needed.

Go to your dashboard and click the copy button for your client. The command is pre-filled with your API key.

Claude Code

claude mcp add --transport http sofya https://mcp.sofya.yusufgurdogan.com/mcp \
  --header "Authorization: Bearer ay_live_..."

Cursor · ~/.cursor/mcp.json

{
  "mcpServers": {
    "sofya": {
      "url": "https://mcp.sofya.yusufgurdogan.com/mcp",
      "headers": { "Authorization": "Bearer ay_live_..." }
    }
  }
}

Codex · ~/.codex/config.toml

[mcp_servers.sofya]
url = "https://mcp.sofya.yusufgurdogan.com/mcp"
http_headers = { "Authorization" = "Bearer ay_live_..." }

Windsurf · ~/.codeium/windsurf/mcp_config.json

{
  "mcpServers": {
    "sofya": {
      "serverUrl": "https://mcp.sofya.yusufgurdogan.com/mcp",
      "headers": { "Authorization": "Bearer ay_live_..." }
    }
  }
}

VS Code Copilot · .vscode/mcp.json

{
  "servers": {
    "sofya": {
      "type": "http",
      "url": "https://mcp.sofya.yusufgurdogan.com/mcp",
      "headers": { "Authorization": "Bearer ay_live_..." }
    }
  }
}

Your API key is sent via HTTP header. The AI model never sees it.

Available Tools

search

Web search with page content extraction and optional AI answers. 1-10 credits.

fetch

Fetch URLs as clean markdown. 1 credit per URL.

extract

AI-powered structured data extraction. 5 credits.

research

Multi-query deep research with AI synthesis. 25 credits.
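For budgeting, the per-tool costs above can be encoded in a small helper. This is a sketch: the prices come straight from the list above, but the helper itself (`estimate_credits`) is ours, not part of the API.

```python
# Credit costs from the tool list above. Search varies by depth and options.
SEARCH_DEPTH_COST = {"snippets": 1, "basic": 3, "advanced": 5}

def estimate_credits(tool: str, **opts) -> int:
    """Estimate the credit cost of one Sofya call before making it."""
    if tool == "search":
        cost = SEARCH_DEPTH_COST[opts.get("search_depth", "basic")]
        if opts.get("include_answer"):
            cost += 5  # AI-synthesized answer adds 5 credits
        return cost
    if tool == "fetch":
        return len(opts.get("urls", []))  # 1 credit per URL
    if tool == "extract":
        return 5
    if tool == "research":
        return 25
    raise ValueError(f"unknown tool: {tool}")
```

For example, an advanced search with an AI answer costs 10 credits, while fetching two URLs costs 2.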

Tool Definitions for Claude & GPT

Copy-paste these tool schemas into your Anthropic or OpenAI API calls. Your model gets Sofya's tools without you writing any definitions yourself.

Anthropic (Claude)

Pass this array as the tools parameter in your /v1/messages request. When Claude returns a tool_use block, call the matching Sofya REST endpoint and return the result as a tool_result.

[
  {
    "name": "sofya_search",
    "description": "Search the web for current information. Returns extracted page content, not just snippets. Set topic='news' for current events. Set include_answer=true for an AI-synthesized answer (+5 credits). Returns: query, answer, results [{title, url, content, published_date}], credits_used.",
    "input_schema": {
      "type": "object",
      "required": ["query"],
      "properties": {
        "query": {"type": "string", "description": "The search query"},
        "search_depth": {"type": "string", "description": "\"snippets\" (1 credit), \"basic\" (3, default), or \"advanced\" (5)"},
        "max_results": {"type": "integer", "description": "Number of results, 1-20 (default 10)"},
        "include_answer": {"type": "boolean", "description": "Add AI answer synthesized from results (+5 credits)"},
        "topic": {"type": "string", "description": "\"general\" (default) or \"news\""},
        "freshness": {"type": "string", "description": "\"day\", \"week\", \"month\", \"year\", or \"YYYY-MM-DD:YYYY-MM-DD\""},
        "include_domains": {"type": "array", "items": {"type": "string"}, "description": "Only these domains (max 10)"},
        "exclude_domains": {"type": "array", "items": {"type": "string"}, "description": "Exclude these domains (max 10)"}
      }
    }
  },
  {
    "name": "sofya_fetch",
    "description": "Fetch one or more URLs and return their content as clean markdown. Supports web pages, PDF, DOCX, and other document formats. 1 credit per URL, max 10 URLs. Failed URLs are not charged. Returns: results [{title, url, content, raw_html, published_time, success, error}], credits_used.",
    "input_schema": {
      "type": "object",
      "required": ["urls"],
      "properties": {
        "urls": {"type": "array", "items": {"type": "string"}, "description": "URLs to fetch (max 10)"},
        "include_raw_html": {"type": "boolean", "description": "Include raw HTML source in response (default false)"}
      }
    }
  },
  {
    "name": "sofya_extract",
    "description": "Fetch a URL and extract specific information using AI. Use when you need structured data (pricing, specs, contact info) rather than raw content. 5 credits. Returns: content, url.",
    "input_schema": {
      "type": "object",
      "required": ["url", "prompt"],
      "properties": {
        "url": {"type": "string", "description": "The URL to extract from"},
        "prompt": {"type": "string", "description": "What to extract, e.g. \"list all pricing tiers with features\""}
      }
    }
  },
  {
    "name": "sofya_research",
    "description": "Deep research on a topic. Decomposes query into sub-queries, searches and reads multiple sources in parallel, synthesizes a structured report with citations. 25 credits. Returns: report, sources, sub_queries.",
    "input_schema": {
      "type": "object",
      "required": ["query"],
      "properties": {
        "query": {"type": "string", "description": "The research question or topic"},
        "topic": {"type": "string", "description": "\"general\" (default) or \"news\""},
        "freshness": {"type": "string", "description": "\"day\", \"week\", \"month\", \"year\", or \"YYYY-MM-DD:YYYY-MM-DD\""},
        "max_sources": {"type": "integer", "description": "Max sources to use, 5-30 (default 20)"}
      }
    }
  }
]
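Converting a tool_use block into the tool_result message Claude expects takes only a few lines. A sketch: `handle_tool_use` is our name for the helper, and `call_sofya` stands for any function that POSTs to the matching Sofya endpoint, such as the one under "Example: wiring it up".

```python
import json

def handle_tool_use(block: dict, call_sofya) -> dict:
    """Convert one {"type": "tool_use"} block from Claude's response content
    into the user message carrying the matching tool_result."""
    result = call_sofya(block["name"], block["input"])
    return {
        "role": "user",
        "content": [{
            "type": "tool_result",
            "tool_use_id": block["id"],
            "content": json.dumps(result),
        }],
    }
```

Append the returned message to your conversation and call /v1/messages again so Claude can use the result.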

OpenAI (GPT)

Pass this array as the tools parameter in your /chat/completions request. When the model returns tool_calls, call the matching Sofya REST endpoint and return the result as a role: "tool" message.

[
  {
    "type": "function",
    "function": {
      "name": "sofya_search",
      "description": "Search the web for current information. Returns extracted page content, not just snippets. Set topic='news' for current events. Set include_answer=true for an AI-synthesized answer (+5 credits). Returns: query, answer, results [{title, url, content, published_date}], credits_used.",
      "parameters": {
        "type": "object",
        "required": ["query"],
        "properties": {
          "query": {"type": "string", "description": "The search query"},
          "search_depth": {"type": "string", "description": "\"snippets\" (1 credit), \"basic\" (3, default), or \"advanced\" (5)"},
          "max_results": {"type": "integer", "description": "Number of results, 1-20 (default 10)"},
          "include_answer": {"type": "boolean", "description": "Add AI answer synthesized from results (+5 credits)"},
          "topic": {"type": "string", "description": "\"general\" (default) or \"news\""},
          "freshness": {"type": "string", "description": "\"day\", \"week\", \"month\", \"year\", or \"YYYY-MM-DD:YYYY-MM-DD\""},
          "include_domains": {"type": "array", "items": {"type": "string"}, "description": "Only these domains (max 10)"},
          "exclude_domains": {"type": "array", "items": {"type": "string"}, "description": "Exclude these domains (max 10)"}
        },
        "additionalProperties": false
      }
    }
  },
  {
    "type": "function",
    "function": {
      "name": "sofya_fetch",
      "description": "Fetch one or more URLs and return their content as clean markdown. Supports web pages, PDF, DOCX, and other document formats. 1 credit per URL, max 10 URLs. Failed URLs are not charged. Returns: results [{title, url, content, raw_html, published_time, success, error}], credits_used.",
      "parameters": {
        "type": "object",
        "required": ["urls"],
        "properties": {
          "urls": {"type": "array", "items": {"type": "string"}, "description": "URLs to fetch (max 10)"},
          "include_raw_html": {"type": "boolean", "description": "Include raw HTML source in response (default false)"}
        },
        "additionalProperties": false
      }
    }
  },
  {
    "type": "function",
    "function": {
      "name": "sofya_extract",
      "description": "Fetch a URL and extract specific information using AI. Use when you need structured data (pricing, specs, contact info) rather than raw content. 5 credits. Returns: content, url.",
      "parameters": {
        "type": "object",
        "required": ["url", "prompt"],
        "properties": {
          "url": {"type": "string", "description": "The URL to extract from"},
          "prompt": {"type": "string", "description": "What to extract, e.g. \"list all pricing tiers with features\""}
        },
        "additionalProperties": false
      }
    }
  },
  {
    "type": "function",
    "function": {
      "name": "sofya_research",
      "description": "Deep research on a topic. Decomposes query into sub-queries, searches and reads multiple sources in parallel, synthesizes a structured report with citations. 25 credits. Returns: report, sources, sub_queries.",
      "parameters": {
        "type": "object",
        "required": ["query"],
        "properties": {
          "query": {"type": "string", "description": "The research question or topic"},
          "topic": {"type": "string", "description": "\"general\" (default) or \"news\""},
          "freshness": {"type": "string", "description": "\"day\", \"week\", \"month\", \"year\", or \"YYYY-MM-DD:YYYY-MM-DD\""},
          "max_sources": {"type": "integer", "description": "Max sources to use, 5-30 (default 20)"}
        },
        "additionalProperties": false
      }
    }
  }
]
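The OpenAI side is symmetric: each entry in tool_calls becomes a role: "tool" message. A sketch under the same assumptions as above; note that OpenAI delivers the arguments as a JSON string, so they must be parsed first.

```python
import json

def handle_tool_call(tool_call: dict, call_sofya) -> dict:
    """Convert one entry from message.tool_calls into the role "tool"
    message the model expects back. Arguments arrive as a JSON string."""
    args = json.loads(tool_call["function"]["arguments"])
    result = call_sofya(tool_call["function"]["name"], args)
    return {
        "role": "tool",
        "tool_call_id": tool_call["id"],
        "content": json.dumps(result),
    }
```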

Example: wiring it up

When the model calls a tool, map the tool name to the Sofya endpoint and forward the arguments:

# Python - handle tool calls from Claude or GPT
import httpx

TOOL_TO_ENDPOINT = {
    "sofya_search": "/v1/search",
    "sofya_fetch": "/v1/fetch",
    "sofya_extract": "/v1/extract",
    "sofya_research": "/v1/research",
}

def call_sofya(tool_name: str, args: dict) -> dict:
    resp = httpx.post(
        f"https://mcp.sofya.yusufgurdogan.com{TOOL_TO_ENDPOINT[tool_name]}",
        headers={"Authorization": "Bearer ay_live_..."},
        json=args,
        timeout=120,  # research can take a while
    )
    resp.raise_for_status()  # surface 401/402/429 instead of parsing an error body
    return resp.json()

Core Tools

POST /v1/fetch 1 credit per URL

Fetch one or more URLs and return their content as clean markdown.

Request Body

{
  "urls": ["string", ...],       // required, max 10
  "include_raw_html": false      // optional - include raw HTML source
}

Response

{
  "results": [
    {
      "title": "Example Page",
      "url": "https://example.com",
      "content": "# Markdown content...",
      "raw_html": null,
      "published_time": null,
      "success": true,
      "error": null
    }
  ],
  "credits_used": 1,
  "credits_remaining": 999
}
cURL

curl -X POST https://mcp.sofya.yusufgurdogan.com/v1/fetch \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ay_live_..." \
  -d '{"urls": ["https://example.com"]}'

Python

import httpx

resp = httpx.post("https://mcp.sofya.yusufgurdogan.com/v1/fetch",
    headers={"Authorization": "Bearer ay_live_..."},
    json={"urls": ["https://example.com"]})
print(resp.json())

JavaScript

const resp = await fetch("https://mcp.sofya.yusufgurdogan.com/v1/fetch", {
  method: "POST",
  headers: { "Content-Type": "application/json", "Authorization": "Bearer ay_live_..." },
  body: JSON.stringify({ urls: ["https://example.com"] })
});
console.log(await resp.json());

POST /v1/extract 5 credits

Fetch a webpage and extract specific information using AI. Costs 5 credits.

Request Body

{
  "url": "string",          // required
  "prompt": "string"        // required, what to extract
}

Response

{
  "content": "Extracted information...",
  "url": "https://example.com",
  "credits_used": 5,
  "credits_remaining": 995,
  "usage": { "input_tokens": 90, "output_tokens": 24 }
}
cURL

curl -X POST https://mcp.sofya.yusufgurdogan.com/v1/extract \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ay_live_..." \
  -d '{"url": "https://example.com", "prompt": "Summarize this page"}'

Python

import httpx

resp = httpx.post("https://mcp.sofya.yusufgurdogan.com/v1/extract",
    headers={"Authorization": "Bearer ay_live_..."},
    json={"url": "https://example.com", "prompt": "Summarize this page"})
print(resp.json())

JavaScript

const resp = await fetch("https://mcp.sofya.yusufgurdogan.com/v1/extract", {
  method: "POST",
  headers: { "Content-Type": "application/json", "Authorization": "Bearer ay_live_..." },
  body: JSON.stringify({ url: "https://example.com", prompt: "Summarize this page" })
});
console.log(await resp.json());

POST /v1/research 25 credits

Deep research on any topic. Decomposes your query into sub-queries, searches and reads multiple sources in parallel, then synthesizes a structured report with citations. Costs 25 credits.

Request Body

{
  "query": "string",              // required
  "topic": "general",             // "general" or "news"
  "freshness": null,              // "day", "week", "month", "year", or "YYYY-MM-DD:YYYY-MM-DD"
  "max_sources": 20               // 5-30
}

Response

{
  "query": "How do modern LLMs handle long context?",
  "report": "## Key Findings\n\n- ...",
  "sources": [
    {
      "title": "Scaling Transformer Context Windows",
      "url": "https://arxiv.org/abs/...",
      "fetched": true
    }
  ],
  "sub_queries": [
    "transformer context window scaling techniques",
    "RoPE positional encoding extensions"
  ],
  "credits_used": 25,
  "credits_remaining": 975,
  "usage": { "input_tokens": 12400, "output_tokens": 1850 }
}
cURL

curl -X POST https://mcp.sofya.yusufgurdogan.com/v1/research \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ay_live_..." \
  -d '{"query": "How do modern LLMs handle long context?"}'

Python

import httpx

resp = httpx.post("https://mcp.sofya.yusufgurdogan.com/v1/research",
    headers={"Authorization": "Bearer ay_live_..."},
    json={"query": "How do modern LLMs handle long context?"},
    timeout=120)
print(resp.json())

JavaScript

const resp = await fetch("https://mcp.sofya.yusufgurdogan.com/v1/research", {
  method: "POST",
  headers: { "Content-Type": "application/json", "Authorization": "Bearer ay_live_..." },
  body: JSON.stringify({ query: "How do modern LLMs handle long context?" })
});
console.log(await resp.json());

Account

GET /v1/auth/me

Get your account info including plan, credits, and total requests.

Response

{
  "plan": "free",
  "credits": 997,
  "credits_reset_at": "2026-04-04T12:00:00Z",
  "total_requests": 3,
  "api_key": "ay_live_...",
  "last_login_method": "github",
  "email": "user@example.com",
  "github_username": "octocat"
}
GET /v1/auth/transactions

Get your recent credit transactions (top-ups). Returns the last 50.

Response

[
  {
    "id": "uuid",
    "type": "credit",
    "amount": 5000,
    "endpoint": "top-up",
    "balance_after": 6000,
    "created_at": "2026-02-22T12:00:00Z"
  }
]
GET /v1/auth/usage

Get your daily usage breakdown. Returns the last 30 days, per endpoint.

Response

[
  {
    "date": "2026-02-22",
    "endpoint": "/v1/search",
    "request_count": 312,
    "total_credits": 312
  }
]
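Since the response is a flat list of per-day, per-endpoint rows, a per-endpoint total takes a few lines. A sketch: the field names match the response shape above, but the helper itself is ours.

```python
from collections import defaultdict

def credits_by_endpoint(usage_rows: list) -> dict:
    """Sum total_credits per endpoint across the rows returned by
    GET /v1/auth/usage."""
    totals = defaultdict(int)
    for row in usage_rows:
        totals[row["endpoint"]] += row["total_credits"]
    return dict(totals)
```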

Billing

POST /v1/billing/checkout

Buy credits (pay-as-you-go). Minimum purchase: 2,000 credits ($10). Returns a checkout URL for the payment page.

Request Body

{
  "credits": 5000           // required, minimum 2000
}

Response

{
  "checkout_url": "https://checkout.creem.io/pay/..."
}

Rate Limits

REST API endpoints are rate limited to 1 request per second per API key. This applies to all /v1/* endpoints. MCP (/mcp) is not rate limited.

If you exceed the limit, the API returns 429 Too Many Requests with a Retry-After header indicating how many seconds to wait before retrying.

429 Response

HTTP/1.1 429 Too Many Requests
Retry-After: 0.85

{
  "detail": "Rate limit exceeded. 1 request per second."
}

Rate-limited requests do not consume credits. Implement exponential backoff or respect the Retry-After header for best results.
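One way to honor the header is a small retry wrapper. A sketch: `post_with_retry` and `do_post` are our names; `do_post` stands for any zero-argument callable returning a response with `.status_code` and `.headers`, e.g. a closure around `httpx.post`.

```python
import time

def post_with_retry(do_post, max_attempts: int = 5):
    """Retry on 429, sleeping for the server-provided Retry-After.
    Rate-limited attempts cost no credits, so retrying is safe."""
    for _ in range(max_attempts):
        resp = do_post()
        if resp.status_code != 429:
            return resp
        time.sleep(float(resp.headers.get("Retry-After", 1.0)))
    return resp  # still rate limited after max_attempts
```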

Error Codes

Code Status Description
400 Bad Request Invalid parameters (e.g. missing query, bad freshness format)
401 Unauthorized Invalid or missing API key
402 Payment Required Insufficient credits
403 Forbidden Access denied for this resource
429 Too Many Requests Rate limited. Check Retry-After header.
502 Bad Gateway Internal error. Retry the request.
504 Gateway Timeout Research timed out. Try a simpler query or fewer sources.