API Documentation

Everything you need to integrate Sofya into your AI agent.

Quick Start

1. Sign up via GitHub

Visit the dashboard and sign in with GitHub to get started.

2. Get credits

Free plan includes 1,000 credits/month. Need more? Buy credits or subscribe to a plan from the dashboard.

3. Search the web

curl -X POST https://mcp.sofya.yusufgurdogan.com/v1/search \
  -H "Authorization: Bearer ay_live_..." \
  -H "Content-Type: application/json" \
  -d '{"query": "latest AI news"}'

Authentication

All API requests require an API key in the Authorization header.

Authorization: Bearer ay_live_your_key_here

MCP (Model Context Protocol)

Connect Sofya directly to Claude Code, Cursor, or any MCP-compatible client. Your AI agent gets search, fetch, extract, and research tools; no REST calls needed.

Go to your dashboard and click the copy button for your client. The command is pre-filled with your API key.

Claude Code

claude mcp add --transport http sofya https://mcp.sofya.yusufgurdogan.com/mcp \
  --header "Authorization: Bearer ay_live_..."

Cursor · ~/.cursor/mcp.json

{
  "mcpServers": {
    "sofya": {
      "url": "https://mcp.sofya.yusufgurdogan.com/mcp",
      "headers": { "Authorization": "Bearer ay_live_..." }
    }
  }
}

Codex · ~/.codex/config.toml

[mcp_servers.sofya]
url = "https://mcp.sofya.yusufgurdogan.com/mcp"
http_headers = { "Authorization" = "Bearer ay_live_..." }

Windsurf · ~/.codeium/windsurf/mcp_config.json

{
  "mcpServers": {
    "sofya": {
      "serverUrl": "https://mcp.sofya.yusufgurdogan.com/mcp",
      "headers": { "Authorization": "Bearer ay_live_..." }
    }
  }
}

VS Code Copilot · .vscode/mcp.json

{
  "servers": {
    "sofya": {
      "type": "http",
      "url": "https://mcp.sofya.yusufgurdogan.com/mcp",
      "headers": { "Authorization": "Bearer ay_live_..." }
    }
  }
}

Your API key is sent via HTTP header. The AI model never sees it.

Available Tools

search

Web search with page content extraction and optional AI answers. 1-10 credits.

fetch

Fetch URLs as clean markdown. 1 credit per URL.

extract

AI-powered structured data extraction. 5 credits.

research

Multi-query deep research with AI synthesis. 25 credits.
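For budgeting, the per-tool costs above can be encoded in a small helper. This is a sketch: the prices come straight from the list above, but the helper itself (`estimate_credits`) is ours, not part of the API.

```python
# Credit costs from the tool list above. Search varies by depth and options.
SEARCH_DEPTH_COST = {"snippets": 1, "basic": 3, "advanced": 5}

def estimate_credits(tool: str, **opts) -> int:
    """Estimate the credit cost of one Sofya call before making it."""
    if tool == "search":
        cost = SEARCH_DEPTH_COST[opts.get("search_depth", "basic")]
        if opts.get("include_answer"):
            cost += 5  # AI-synthesized answer adds 5 credits
        return cost
    if tool == "fetch":
        return len(opts.get("urls", []))  # 1 credit per URL
    if tool == "extract":
        return 5
    if tool == "research":
        return 25
    raise ValueError(f"unknown tool: {tool}")
```

For example, an advanced search with an AI answer costs 10 credits, while fetching two URLs costs 2.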

Tool Definitions for Claude & GPT

Copy-paste these tool schemas into your Anthropic or OpenAI API calls. Your model gets Sofya's tools without you writing any definitions yourself.

Anthropic (Claude)

Pass this array as the tools parameter in your /v1/messages request. When Claude returns a tool_use block, call the matching Sofya REST endpoint and return the result as a tool_result.

[
  {
    "name": "sofya_search",
    "description": "Search the web for current information. Returns extracted page content, not just snippets. Set topic='news' for current events. Set include_answer=true for an AI-synthesized answer (+5 credits). Returns: query, answer, results [{title, url, content, published_date}], credits_used.",
    "input_schema": {
      "type": "object",
      "required": ["query"],
      "properties": {
        "query": {"type": "string", "description": "The search query"},
        "search_depth": {"type": "string", "description": "\"snippets\" (1 credit), \"basic\" (3, default), or \"advanced\" (5)"},
        "max_results": {"type": "integer", "description": "Number of results, 1-20 (default 10)"},
        "include_answer": {"type": "boolean", "description": "Add AI answer synthesized from results (+5 credits)"},
        "topic": {"type": "string", "description": "\"general\" (default) or \"news\""},
        "freshness": {"type": "string", "description": "\"day\", \"week\", \"month\", \"year\", or \"YYYY-MM-DD:YYYY-MM-DD\""},
        "include_domains": {"type": "array", "items": {"type": "string"}, "description": "Only these domains (max 10)"},
        "exclude_domains": {"type": "array", "items": {"type": "string"}, "description": "Exclude these domains (max 10)"}
      }
    }
  },
  {
    "name": "sofya_fetch",
    "description": "Fetch one or more URLs and return their content as clean markdown. Supports web pages, PDF, DOCX, and other document formats. 1 credit per URL, max 10 URLs. Failed URLs are not charged. Returns: results [{title, url, content, raw_html, published_time, success, error}], credits_used.",
    "input_schema": {
      "type": "object",
      "required": ["urls"],
      "properties": {
        "urls": {"type": "array", "items": {"type": "string"}, "description": "URLs to fetch (max 10)"},
        "include_raw_html": {"type": "boolean", "description": "Include raw HTML source in response (default false)"}
      }
    }
  },
  {
    "name": "sofya_extract",
    "description": "Fetch a URL and extract specific information using AI. Use when you need structured data (pricing, specs, contact info) rather than raw content. 5 credits. Returns: content, url.",
    "input_schema": {
      "type": "object",
      "required": ["url", "prompt"],
      "properties": {
        "url": {"type": "string", "description": "The URL to extract from"},
        "prompt": {"type": "string", "description": "What to extract, e.g. \"list all pricing tiers with features\""}
      }
    }
  },
  {
    "name": "sofya_research",
    "description": "Deep research on a topic. Decomposes query into sub-queries, searches and reads multiple sources in parallel, synthesizes a structured report with citations. 25 credits. Returns: report, sources, sub_queries.",
    "input_schema": {
      "type": "object",
      "required": ["query"],
      "properties": {
        "query": {"type": "string", "description": "The research question or topic"},
        "topic": {"type": "string", "description": "\"general\" (default) or \"news\""},
        "freshness": {"type": "string", "description": "\"day\", \"week\", \"month\", \"year\", or \"YYYY-MM-DD:YYYY-MM-DD\""},
        "max_sources": {"type": "integer", "description": "Max sources to use, 5-30 (default 20)"}
      }
    }
  }
]
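Converting a tool_use block into the tool_result message Claude expects takes only a few lines. A sketch: `handle_tool_use` is our name for the helper, and `call_sofya` stands for any function that POSTs to the matching Sofya endpoint, such as the one under "Example: wiring it up".

```python
import json

def handle_tool_use(block: dict, call_sofya) -> dict:
    """Convert one {"type": "tool_use"} block from Claude's response content
    into the user message carrying the matching tool_result."""
    result = call_sofya(block["name"], block["input"])
    return {
        "role": "user",
        "content": [{
            "type": "tool_result",
            "tool_use_id": block["id"],
            "content": json.dumps(result),
        }],
    }
```

Append the returned message to your conversation and call /v1/messages again so Claude can use the result.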

OpenAI (GPT)

Pass this array as the tools parameter in your /chat/completions request. When the model returns tool_calls, call the matching Sofya REST endpoint and return the result as a role: "tool" message.

[
  {
    "type": "function",
    "function": {
      "name": "sofya_search",
      "description": "Search the web for current information. Returns extracted page content, not just snippets. Set topic='news' for current events. Set include_answer=true for an AI-synthesized answer (+5 credits). Returns: query, answer, results [{title, url, content, published_date}], credits_used.",
      "parameters": {
        "type": "object",
        "required": ["query"],
        "properties": {
          "query": {"type": "string", "description": "The search query"},
          "search_depth": {"type": "string", "description": "\"snippets\" (1 credit), \"basic\" (3, default), or \"advanced\" (5)"},
          "max_results": {"type": "integer", "description": "Number of results, 1-20 (default 10)"},
          "include_answer": {"type": "boolean", "description": "Add AI answer synthesized from results (+5 credits)"},
          "topic": {"type": "string", "description": "\"general\" (default) or \"news\""},
          "freshness": {"type": "string", "description": "\"day\", \"week\", \"month\", \"year\", or \"YYYY-MM-DD:YYYY-MM-DD\""},
          "include_domains": {"type": "array", "items": {"type": "string"}, "description": "Only these domains (max 10)"},
          "exclude_domains": {"type": "array", "items": {"type": "string"}, "description": "Exclude these domains (max 10)"}
        },
        "additionalProperties": false
      }
    }
  },
  {
    "type": "function",
    "function": {
      "name": "sofya_fetch",
      "description": "Fetch one or more URLs and return their content as clean markdown. Supports web pages, PDF, DOCX, and other document formats. 1 credit per URL, max 10 URLs. Failed URLs are not charged. Returns: results [{title, url, content, raw_html, published_time, success, error}], credits_used.",
      "parameters": {
        "type": "object",
        "required": ["urls"],
        "properties": {
          "urls": {"type": "array", "items": {"type": "string"}, "description": "URLs to fetch (max 10)"},
          "include_raw_html": {"type": "boolean", "description": "Include raw HTML source in response (default false)"}
        },
        "additionalProperties": false
      }
    }
  },
  {
    "type": "function",
    "function": {
      "name": "sofya_extract",
      "description": "Fetch a URL and extract specific information using AI. Use when you need structured data (pricing, specs, contact info) rather than raw content. 5 credits. Returns: content, url.",
      "parameters": {
        "type": "object",
        "required": ["url", "prompt"],
        "properties": {
          "url": {"type": "string", "description": "The URL to extract from"},
          "prompt": {"type": "string", "description": "What to extract, e.g. \"list all pricing tiers with features\""}
        },
        "additionalProperties": false
      }
    }
  },
  {
    "type": "function",
    "function": {
      "name": "sofya_research",
      "description": "Deep research on a topic. Decomposes query into sub-queries, searches and reads multiple sources in parallel, synthesizes a structured report with citations. 25 credits. Returns: report, sources, sub_queries.",
      "parameters": {
        "type": "object",
        "required": ["query"],
        "properties": {
          "query": {"type": "string", "description": "The research question or topic"},
          "topic": {"type": "string", "description": "\"general\" (default) or \"news\""},
          "freshness": {"type": "string", "description": "\"day\", \"week\", \"month\", \"year\", or \"YYYY-MM-DD:YYYY-MM-DD\""},
          "max_sources": {"type": "integer", "description": "Max sources to use, 5-30 (default 20)"}
        },
        "additionalProperties": false
      }
    }
  }
]
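The OpenAI side is symmetric: each entry in tool_calls becomes a role: "tool" message. A sketch under the same assumptions as above; note that OpenAI delivers the arguments as a JSON string, so they must be parsed first.

```python
import json

def handle_tool_call(tool_call: dict, call_sofya) -> dict:
    """Convert one entry from message.tool_calls into the role "tool"
    message the model expects back. Arguments arrive as a JSON string."""
    args = json.loads(tool_call["function"]["arguments"])
    result = call_sofya(tool_call["function"]["name"], args)
    return {
        "role": "tool",
        "tool_call_id": tool_call["id"],
        "content": json.dumps(result),
    }
```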

Example: wiring it up

When the model calls a tool, map the tool name to the Sofya endpoint and forward the arguments:

# Python - handle tool calls from Claude or GPT
import httpx

TOOL_TO_ENDPOINT = {
    "sofya_search": "/v1/search",
    "sofya_fetch": "/v1/fetch",
    "sofya_extract": "/v1/extract",
    "sofya_research": "/v1/research",
}

def call_sofya(tool_name: str, args: dict) -> dict:
    resp = httpx.post(
        f"https://mcp.sofya.yusufgurdogan.com{TOOL_TO_ENDPOINT[tool_name]}",
        headers={"Authorization": "Bearer ay_live_..."},
        json=args,
        timeout=120,  # research can take a while
    )
    resp.raise_for_status()  # surface 401/402/429 instead of parsing an error body
    return resp.json()

Core Tools

POST /v1/fetch 1 credit per URL

Fetch one or more URLs and return their content as clean markdown.

Request Body

{
  "urls": ["string", ...],       // required, max 10
  "include_raw_html": false      // optional - include raw HTML source
}

Response

{
  "results": [
    {
      "title": "Example Page",
      "url": "https://example.com",
      "content": "# Markdown content...",
      "raw_html": null,
      "published_time": null,
      "success": true,
      "error": null
    }
  ],
  "credits_used": 1,
  "credits_remaining": 999
}
cURL

curl -X POST https://mcp.sofya.yusufgurdogan.com/v1/fetch \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ay_live_..." \
  -d '{"urls": ["https://example.com"]}'

Python

import httpx

resp = httpx.post("https://mcp.sofya.yusufgurdogan.com/v1/fetch",
    headers={"Authorization": "Bearer ay_live_..."},
    json={"urls": ["https://example.com"]})
print(resp.json())

JavaScript

const resp = await fetch("https://mcp.sofya.yusufgurdogan.com/v1/fetch", {
  method: "POST",
  headers: { "Content-Type": "application/json", "Authorization": "Bearer ay_live_..." },
  body: JSON.stringify({ urls: ["https://example.com"] })
});
console.log(await resp.json());

POST /v1/extract 5 credits

Fetch a webpage and extract specific information using AI. Costs 5 credits.

Request Body

{
  "url": "string",          // required
  "prompt": "string"        // required, what to extract
}

Response

{
  "content": "Extracted information...",
  "url": "https://example.com",
  "credits_used": 5,
  "credits_remaining": 995,
  "usage": { "input_tokens": 90, "output_tokens": 24 }
}
cURL

curl -X POST https://mcp.sofya.yusufgurdogan.com/v1/extract \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ay_live_..." \
  -d '{"url": "https://example.com", "prompt": "Summarize this page"}'

Python

import httpx

resp = httpx.post("https://mcp.sofya.yusufgurdogan.com/v1/extract",
    headers={"Authorization": "Bearer ay_live_..."},
    json={"url": "https://example.com", "prompt": "Summarize this page"})
print(resp.json())

JavaScript

const resp = await fetch("https://mcp.sofya.yusufgurdogan.com/v1/extract", {
  method: "POST",
  headers: { "Content-Type": "application/json", "Authorization": "Bearer ay_live_..." },
  body: JSON.stringify({ url: "https://example.com", prompt: "Summarize this page" })
});
console.log(await resp.json());

POST /v1/research 25 credits

Deep research on any topic. Decomposes your query into sub-queries, searches and reads multiple sources in parallel, then synthesizes a structured report with citations. Costs 25 credits.

Request Body

{
  "query": "string",              // required
  "topic": "general",             // "general" or "news"
  "freshness": null,              // "day", "week", "month", "year", or "YYYY-MM-DD:YYYY-MM-DD"
  "max_sources": 20               // 5-30
}

Response

{
  "query": "How do modern LLMs handle long context?",
  "report": "## Key Findings\n\n- ...",
  "sources": [
    {
      "title": "Scaling Transformer Context Windows",
      "url": "https://arxiv.org/abs/...",
      "fetched": true
    }
  ],
  "sub_queries": [
    "transformer context window scaling techniques",
    "RoPE positional encoding extensions"
  ],
  "credits_used": 25,
  "credits_remaining": 975,
  "usage": { "input_tokens": 12400, "output_tokens": 1850 }
}
cURL

curl -X POST https://mcp.sofya.yusufgurdogan.com/v1/research \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ay_live_..." \
  -d '{"query": "How do modern LLMs handle long context?"}'

Python

import httpx

resp = httpx.post("https://mcp.sofya.yusufgurdogan.com/v1/research",
    headers={"Authorization": "Bearer ay_live_..."},
    json={"query": "How do modern LLMs handle long context?"},
    timeout=120)
print(resp.json())

JavaScript

const resp = await fetch("https://mcp.sofya.yusufgurdogan.com/v1/research", {
  method: "POST",
  headers: { "Content-Type": "application/json", "Authorization": "Bearer ay_live_..." },
  body: JSON.stringify({ query: "How do modern LLMs handle long context?" })
});
console.log(await resp.json());

Account

GET /v1/auth/me

Get your account info including plan, credits, and total requests.

Response

{
  "plan": "free",
  "credits": 997,
  "credits_reset_at": "2026-04-04T12:00:00Z",
  "total_requests": 3,
  "api_key": "ay_live_...",
  "last_login_method": "github",
  "email": "user@example.com",
  "github_username": "octocat"
}
GET /v1/auth/transactions

Get your recent credit transactions (top-ups). Returns the last 50.

Response

[
  {
    "id": "uuid",
    "type": "credit",
    "amount": 5000,
    "endpoint": "top-up",
    "balance_after": 6000,
    "created_at": "2026-02-22T12:00:00Z"
  }
]
GET /v1/auth/usage

Get your daily usage breakdown. Returns the last 30 days, per endpoint.

Response

[
  {
    "date": "2026-02-22",
    "endpoint": "/v1/search",
    "request_count": 312,
    "total_credits": 312
  }
]
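Since the response is a flat list of per-day, per-endpoint rows, a per-endpoint total takes a few lines. A sketch: the field names match the response shape above, but the helper itself is ours.

```python
from collections import defaultdict

def credits_by_endpoint(usage_rows: list) -> dict:
    """Sum total_credits per endpoint across the rows returned by
    GET /v1/auth/usage."""
    totals = defaultdict(int)
    for row in usage_rows:
        totals[row["endpoint"]] += row["total_credits"]
    return dict(totals)
```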

Billing

POST /v1/billing/checkout

Buy credits (pay-as-you-go). Minimum purchase: 2,000 credits ($10). Returns a checkout URL for the payment page.

Request Body

{
  "credits": 5000           // required, minimum 2000
}

Response

{
  "checkout_url": "https://checkout.creem.io/pay/..."
}

Rate Limits

REST API endpoints are rate limited to 1 request per second per API key. This applies to all /v1/* endpoints. MCP (/mcp) is not rate limited.

If you exceed the limit, the API returns 429 Too Many Requests with a Retry-After header indicating how many seconds to wait before retrying.

429 Response

HTTP/1.1 429 Too Many Requests
Retry-After: 0.85

{
  "detail": "Rate limit exceeded. 1 request per second."
}

Rate-limited requests do not consume credits. Implement exponential backoff or respect the Retry-After header for best results.
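One way to honor the header is a small retry wrapper. A sketch: `post_with_retry` and `do_post` are our names; `do_post` stands for any zero-argument callable returning a response with `.status_code` and `.headers`, e.g. a closure around `httpx.post`.

```python
import time

def post_with_retry(do_post, max_attempts: int = 5):
    """Retry on 429, sleeping for the server-provided Retry-After.
    Rate-limited attempts cost no credits, so retrying is safe."""
    for _ in range(max_attempts):
        resp = do_post()
        if resp.status_code != 429:
            return resp
        time.sleep(float(resp.headers.get("Retry-After", 1.0)))
    return resp  # still rate limited after max_attempts
```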

Error Codes

Code Status Description
400 Bad Request Invalid parameters (e.g. missing query, bad freshness format)
401 Unauthorized Invalid or missing API key
402 Payment Required Insufficient credits
403 Forbidden Access denied for this resource
429 Too Many Requests Rate limited. Check Retry-After header.
502 Bad Gateway Internal error. Retry the request.
504 Gateway Timeout Research timed out. Try a simpler query or fewer sources.