
# Chat Completions

The core endpoint of the GPT-GOB API. Send a list of messages, get back a goblin-aligned completion.

```http
POST https://api.gpt-gob.ai/v1/chat/completions
```

## Request body

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `model` | string | yes | - | Model ID. See Models. |
| `messages` | array | yes | - | Conversation history. See below. |
| `temperature` | number | no | `0.85` | 0.0 to 2.0. Higher = more chaotic goblin. |
| `max_tokens` | integer | no | `4096` | Cap output length. |
| `top_p` | number | no | `0.95` | Nucleus sampling threshold. |
| `stream` | boolean | no | `false` | Stream tokens via SSE. |
| `mining_depth` | integer | no | `3` | 1–7. Depth of Deep Context Mining. |
| `shadow_attention` | boolean | no | `true` | Enable Shadow Attention layer. |
| `horde_mode` | string | no | `"auto"` | One of `auto`, `focused`, `broad`. |
| `grs_target` | number | no | `null` | Target Goblin Reward Signal score (0.0–1.0). The model will prefer outputs scoring above this threshold. |
| `loot_drops` | boolean | no | `false` | Append a "loot" line at the end of substantial answers. |
| `response_format` | object | no | `null` | `{ "type": "json_object" }` for forced JSON output. |
| `stop` | string \| array | no | `null` | Up to 4 stop sequences. |
| `user` | string | no | `null` | End-user ID for abuse monitoring. |
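
The ranges in the table can be checked client-side before a request is sent. The following is a minimal sketch, not part of any official SDK: `build_request` and `HORDE_MODES` are hypothetical names, and the checks only cover the limits stated in the table above.

```python
from typing import Any

# Allowed values for horde_mode, as listed in the parameter table.
HORDE_MODES = {"auto", "focused", "broad"}


def build_request(model: str, messages: list, **options: Any) -> dict:
    """Build a request body, rejecting values outside the documented ranges.

    Hypothetical helper: the API itself may validate differently.
    """
    if not model:
        raise ValueError("model is required")
    if not messages:
        raise ValueError("messages is required")

    temperature = options.get("temperature")
    if temperature is not None and not 0.0 <= temperature <= 2.0:
        raise ValueError("temperature must be between 0.0 and 2.0")

    mining_depth = options.get("mining_depth")
    if mining_depth is not None and not 1 <= mining_depth <= 7:
        raise ValueError("mining_depth must be between 1 and 7")

    horde_mode = options.get("horde_mode")
    if horde_mode is not None and horde_mode not in HORDE_MODES:
        raise ValueError("horde_mode must be one of auto, focused, broad")

    grs_target = options.get("grs_target")
    if grs_target is not None and not 0.0 <= grs_target <= 1.0:
        raise ValueError("grs_target must be between 0.0 and 1.0")

    stop = options.get("stop")
    if isinstance(stop, list) and len(stop) > 4:
        raise ValueError("at most 4 stop sequences are allowed")

    # Leave unset options out so the server applies its documented defaults.
    body = {"model": model, "messages": messages}
    body.update({key: value for key, value in options.items() if value is not None})
    return body
```

Omitting unset options instead of sending the defaults explicitly keeps the request small and avoids pinning a default that might change server-side.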

## Messages

Each message has a `role` and `content`:

```json
{
  "messages": [
    {"role": "system", "content": "be brief."},
    {"role": "user", "content": "what's a transformer?"},
    {"role": "assistant", "content": "..."},
    {"role": "user", "content": "and attention?"}
  ]
}
```

Roles:

- `system`: high-level instructions. Optional. The Goblin Personality Core is always on regardless.
- `user`: input from the human.
- `assistant`: prior model output.
- `tool`: output from a tool call (see Function Calling).
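
A conversation history is just a list of such messages, so multi-turn state lives entirely on the client. As a rough sketch, with `add_message` and `VALID_ROLES` being hypothetical names rather than anything the API provides:

```python
# The four roles listed above.
VALID_ROLES = {"system", "user", "assistant", "tool"}


def add_message(history: list, role: str, content: str) -> list:
    """Return a new history with one message appended (hypothetical helper)."""
    if role not in VALID_ROLES:
        raise ValueError(f"unknown role: {role!r}")
    # Build a new list so the caller's history is left unmodified.
    return history + [{"role": role, "content": content}]


history = add_message([], "system", "be brief.")
history = add_message(history, "user", "what's a transformer?")
```

After each response, the assistant's reply would be appended with `add_message(history, "assistant", ...)` before the next user turn is added.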

## Response

```json
{
  "id": "gob-cmpl-9F2kQxLm7HpVbNa3",
  "object": "chat.completion",
  "created": 1746998400,
  "model": "gob-5.5",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "the answer text..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 89,
    "total_tokens": 101,
    "grs_score": 0.87,
    "mining_passes": 3,
    "horde_size": 7
  }
}
```

### `finish_reason` values

| Value | Meaning |
|---|---|
| `stop` | Natural stop (hit a stop sequence or the model decided it was done) |
| `length` | Hit `max_tokens` |
| `content_filter` | Filtered by safety layer |
| `tool_calls` | Model wants to invoke a tool. See Function Calling. |
| `goblin_distracted` | Model abandoned the response mid-generation. Rare. Retry. |
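
A client typically branches on `finish_reason` right after reading the response. The sketch below maps each documented value to a follow-up step. The action labels (`"done"`, `"truncated"`, and so on) and the name `next_action` are illustrative assumptions, not API values.

```python
# Map each documented finish_reason to a client-side follow-up.
# The action names on the right are hypothetical labels for this sketch.
ACTIONS = {
    "stop": "done",                # complete answer, nothing more to do
    "length": "truncated",         # hit max_tokens; the text may be cut off
    "content_filter": "filtered",  # blocked by the safety layer
    "tool_calls": "run_tools",     # execute the requested tools, then call again
    "goblin_distracted": "retry",  # rare; resend the same request
}


def next_action(response: dict) -> str:
    """Return the follow-up step for the first choice in a response."""
    reason = response["choices"][0]["finish_reason"]
    if reason not in ACTIONS:
        raise ValueError(f"unexpected finish_reason: {reason!r}")
    return ACTIONS[reason]
```

Raising on unknown values rather than treating them as `stop` makes new finish reasons visible instead of silently accepted.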

### Extended `usage` fields

GPT-GOB returns three fields beyond the standard OpenAI `usage` object:

- `grs_score`: Goblin Reward Signal score for the completion (0.0–1.0)
- `mining_passes`: actual Deep Context Mining passes used (≤ `mining_depth`)
- `horde_size`: number of parameter clusters activated by Horde Routing
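
These fields can be read into a typed structure and compared against the request's `grs_target`. This is a minimal sketch: `Usage`, `parse_usage`, and `meets_target` are hypothetical names, and treating a score equal to the target as a pass is an assumption.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class Usage:
    prompt_tokens: int
    completion_tokens: int
    total_tokens: int
    grs_score: float
    mining_passes: int
    horde_size: int


def parse_usage(response: dict) -> Usage:
    """Extract the usage block, ignoring any keys this sketch does not know."""
    raw = response["usage"]
    return Usage(**{f.name: raw[f.name] for f in fields(Usage)})


def meets_target(usage: Usage, grs_target: Optional[float]) -> bool:
    """True when no target was set or the score reached it (assumed >=)."""
    return grs_target is None or usage.grs_score >= grs_target
```

With the sample response above, `parse_usage(response).mining_passes` is 3, which matches the default `mining_depth`.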

## Streaming

Set `stream: true` to receive server-sent events instead of a single response.
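
This page does not specify the event payload, so the following is only a sketch under loud assumptions: that each event arrives as a `data: <json>` line, that each chunk carries text at `choices[0].delta.content`, and that the stream ends with `data: [DONE]`. These mirror the OpenAI streaming convention and are not confirmed for GPT-GOB. `iter_content` is a hypothetical helper that works on any iterable of decoded lines.

```python
import json
from typing import Iterable, Iterator


def iter_content(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from SSE lines (payload shape is ASSUMED)."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # ASSUMPTION: OpenAI-style end-of-stream marker
            break
        chunk = json.loads(payload)
        # ASSUMPTION: incremental text lives under choices[0].delta.content
        text = chunk["choices"][0].get("delta", {}).get("content")
        if text:
            yield text
```

Joining the yielded fragments with `"".join(...)` reconstructs the full answer. In practice the lines would come from a streaming HTTP response rather than a list.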

## Function calling

Pass a `tools` array to let the goblin call your functions:

```json
{
  "model": "gob-5.5",
  "messages": [{"role": "user", "content": "what's the weather in moscow?"}],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "get the current weather in a city",
        "parameters": {
          "type": "object",
          "properties": {
            "city": {"type": "string"}
          },
          "required": ["city"]
        }
      }
    }
  ]
}
```

The model responds with `finish_reason: "tool_calls"` and a `tool_calls` array containing the function name and arguments. Execute the function yourself, append its result as a `tool` message, and call the API again to get the final response.
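
The client-side half of that loop might look like the sketch below. The exact shape of `tool_calls` is not shown on this page. The code assumes the array sits on `choices[0].message` and that each entry has `function.name` plus `function.arguments` as a JSON-encoded string, as in the OpenAI format. `run_tool_calls`, `TOOLS`, and the stub `get_weather` are hypothetical.

```python
import json
from typing import Callable, Dict


def get_weather(city: str) -> str:
    """Stand-in implementation; a real one would query a weather service."""
    return f"(weather lookup for {city} goes here)"


# Registry of local functions the model is allowed to call.
TOOLS: Dict[str, Callable[..., str]] = {"get_weather": get_weather}


def run_tool_calls(messages: list, response: dict) -> list:
    """Return the history extended with the assistant turn and tool results."""
    message = response["choices"][0]["message"]
    updated = messages + [message]  # keep the assistant's tool request in history
    for call in message.get("tool_calls", []):
        function = call["function"]
        # ASSUMPTION: arguments arrive as a JSON string, e.g. '{"city": "moscow"}'
        arguments = json.loads(function["arguments"])
        result = TOOLS[function["name"]](**arguments)
        updated.append({"role": "tool", "content": result})
    return updated
```

The returned list is what goes into `messages` on the follow-up request. Looking functions up in an explicit registry means the model can only trigger code that was deliberately exposed.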

## Errors

See Errors for the full list. Common cases:

| HTTP | Code | Meaning |
|---|---|---|
| 400 | `invalid_request_error` | Malformed request |
| 401 | `authentication_error` | Bad API key |
| 429 | `rate_limit_exceeded` | Slow down |
| 500 | `cave_collapsed` | Server error, retry with exponential backoff |
| 503 | `horde_unavailable` | Specific model overloaded, try a different tier |
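
Exponential backoff for the retryable cases can be sketched as follows. `call_with_backoff` is a hypothetical helper, `send` stands for whatever function performs the HTTP request and returns `(status, body)`, and the delay schedule (1 s, 2 s, 4 s, ...) is an assumption, since this page does not recommend specific values. A 503 is returned to the caller unretried, since the table suggests switching tiers rather than waiting.

```python
import time
from typing import Callable, Tuple

# 429 (slow down) and 500 (server error) are worth retrying after a pause.
RETRYABLE = {429, 500}


def call_with_backoff(
    send: Callable[[], Tuple[int, dict]],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, dict]:
    """Call send() until it returns a non-retryable status or attempts run out."""
    for attempt in range(max_attempts):
        status, body = send()
        if status not in RETRYABLE:
            return status, body
        if attempt < max_attempts - 1:
            # Double the wait after each failure: base, 2*base, 4*base, ...
            sleep(base_delay * 2 ** attempt)
    return status, body
```

Passing `sleep` as a parameter keeps the helper easy to test without real delays. Production code would usually also add random jitter so that many clients do not retry in lockstep.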