# Chat Completions

The core endpoint of the GPT-GOB API. Send a list of messages, get back a goblin-aligned completion.

```http
POST https://api.gpt-gob.ai/v1/chat/completions
```

## Request body

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `model` | string | yes | (none) | Model ID. See Models. |
| `messages` | array | yes | (none) | Conversation history. See below. |
| `temperature` | number | no | `0.85` | 0.0 to 2.0. Higher = more chaotic goblin. |
| `max_tokens` | integer | no | `4096` | Cap output length. |
| `top_p` | number | no | `0.95` | Nucleus sampling threshold. |
| `stream` | boolean | no | `false` | Stream tokens via SSE. |
| `mining_depth` | integer | no | `3` | 1–7. Depth of Deep Context Mining. |
| `shadow_attention` | boolean | no | `true` | Enable Shadow Attention layer. |
| `horde_mode` | string | no | `"auto"` | One of `auto`, `focused`, `broad`. |
| `grs_target` | number | no | `null` | Target Goblin Reward Signal score (0.0–1.0). The model will prefer outputs scoring above this threshold. |
| `loot_drops` | boolean | no | `false` | Append a "loot" line at the end of substantial answers. |
| `response_format` | object | no | `null` | `{ "type": "json_object" }` for forced JSON output. |
| `stop` | string or array | no | `null` | Up to 4 stop sequences. |
| `user` | string | no | `null` | End-user ID for abuse monitoring. |
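As a rough illustration of the ranges in the table above, here is a minimal Python sketch that assembles a request body and rejects out-of-range values before anything is sent. The helper name `build_chat_request` is hypothetical and not part of any GPT-GOB SDK, and the server may validate differently from this client-side check.

```python
# Hypothetical helper: not part of an official client library.
ALLOWED_HORDE_MODES = {"auto", "focused", "broad"}


def build_chat_request(model, messages, **options):
    """Build a request body, checking the ranges documented in the table."""
    if not model:
        raise ValueError("model is required")
    if not messages:
        raise ValueError("messages is required")

    temperature = options.get("temperature")
    if temperature is not None and not 0.0 <= temperature <= 2.0:
        raise ValueError("temperature must be between 0.0 and 2.0")

    mining_depth = options.get("mining_depth")
    if mining_depth is not None and not 1 <= mining_depth <= 7:
        raise ValueError("mining_depth must be between 1 and 7")

    horde_mode = options.get("horde_mode")
    if horde_mode is not None and horde_mode not in ALLOWED_HORDE_MODES:
        raise ValueError("horde_mode must be auto, focused, or broad")

    grs_target = options.get("grs_target")
    if grs_target is not None and not 0.0 <= grs_target <= 1.0:
        raise ValueError("grs_target must be between 0.0 and 1.0")

    stop = options.get("stop")
    if isinstance(stop, list) and len(stop) > 4:
        raise ValueError("at most 4 stop sequences are allowed")

    # Leave unset options out so the server applies its documented defaults.
    body = {"model": model, "messages": messages}
    body.update({key: value for key, value in options.items() if value is not None})
    return body
```

Serializing the returned dictionary with `json.dumps` gives the JSON to POST to the endpoint.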

## Messages

Each message has a `role` and a `content`:

```json
{
  "messages": [
    {"role": "system", "content": "be brief."},
    {"role": "user", "content": "what's a transformer?"},
    {"role": "assistant", "content": "..."},
    {"role": "user", "content": "and attention?"}
  ]
}
```

Roles:

- `system`: high-level instructions. Optional. The Goblin Personality Core is always on regardless.
- `user`: input from the human.
- `assistant`: prior model output.
- `tool`: output from a tool call (see Function Calling).
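A conversation is just a growing list of these messages. The sketch below, in Python, shows one way to extend a history while checking that each role is one of the four listed above. Both function names are illustrative assumptions, not part of the API.

```python
VALID_ROLES = {"system", "user", "assistant", "tool"}


def validate_messages(messages):
    """Raise ValueError if a message has an unknown role or no content key."""
    for index, message in enumerate(messages):
        role = message.get("role")
        if role not in VALID_ROLES:
            raise ValueError(f"message {index}: unknown role {role!r}")
        if "content" not in message:
            raise ValueError(f"message {index}: missing content")
    return messages


def append_turn(messages, role, content):
    """Return a new history with one more message; the input list is not mutated."""
    return validate_messages(messages + [{"role": role, "content": content}])
```

To continue a conversation, append the assistant's previous reply and then the next user message before sending the full list again.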

## Response

```json
{
  "id": "gob-cmpl-9F2kQxLm7HpVbNa3",
  "object": "chat.completion",
  "created": 1746998400,
  "model": "gob-5.5",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "the answer text..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 89,
    "total_tokens": 101,
    "grs_score": 0.87,
    "mining_passes": 3,
    "horde_size": 7
  }
}
```

### `finish_reason` values

| Value | Meaning |
|---|---|
| `stop` | Natural stop (hit a stop token or model decided it was done) |
| `length` | Hit `max_tokens` |
| `content_filter` | Filtered by safety layer |
| `tool_calls` | Model wants to invoke a tool. See Function Calling. |
| `goblin_distracted` | Model abandoned the response mid-generation. Rare. Retry. |
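One way a client might branch on these values is sketched below in Python. The action labels (`done`, `truncated`, and so on) are made up for this example; only the `finish_reason` strings come from the table.

```python
# Maps each documented finish_reason to a client-side action label.
# The labels on the right are illustrative, not API values.
ACTIONS = {
    "stop": "done",
    "length": "truncated",        # consider raising max_tokens
    "content_filter": "filtered",
    "tool_calls": "run_tools",
    "goblin_distracted": "retry",
}


def next_action(response):
    """Return the action label for the first choice of a parsed response."""
    reason = response["choices"][0]["finish_reason"]
    try:
        return ACTIONS[reason]
    except KeyError:
        raise ValueError(f"unexpected finish_reason: {reason!r}") from None
```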

### Extended `usage` fields

GPT-GOB returns three fields beyond standard OpenAI usage:

- `grs_score`: Goblin Reward Signal score for the completion (0.0–1.0)
- `mining_passes`: actual Deep Context Mining passes used (≤ `mining_depth`)
- `horde_size`: number of parameter clusters activated by Horde Routing
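The following Python sketch pulls the answer text and the extended usage fields out of a parsed response like the sample above. The `Completion` dataclass and `parse_completion` are hypothetical names. Reading the extended fields with `.get()` assumes a response might omit them, which the documentation does not state either way.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Completion:
    text: Optional[str]
    finish_reason: str
    total_tokens: Optional[int]
    grs_score: Optional[float]
    mining_passes: Optional[int]
    horde_size: Optional[int]


def parse_completion(response: dict) -> Completion:
    """Extract the first choice and the usage block from a response dict."""
    choice = response["choices"][0]
    usage = response.get("usage", {})
    return Completion(
        text=choice["message"].get("content"),
        finish_reason=choice["finish_reason"],
        total_tokens=usage.get("total_tokens"),
        # Extended fields: tolerate their absence (an assumption).
        grs_score=usage.get("grs_score"),
        mining_passes=usage.get("mining_passes"),
        horde_size=usage.get("horde_size"),
    )


def met_grs_target(completion: Completion, grs_target: float) -> bool:
    """True if the reported grs_score reached the target sent in the request."""
    return completion.grs_score is not None and completion.grs_score >= grs_target
```

Because `grs_target` is described as a preference, a check like `met_grs_target` lets the caller decide what to do when a completion scores below it.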

## Streaming

Set `stream` to `true` to receive server-sent events instead of a single response.
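This page does not specify the chunk format, so the Python sketch below leans on assumptions borrowed from OpenAI-style APIs: each event arrives as a `data: <json>` line, text increments live under `choices[0].delta.content`, and the stream ends with `data: [DONE]`. Treat all three as unverified for GPT-GOB. The function name `iter_stream_text` is hypothetical.

```python
import json


def iter_stream_text(lines):
    """Yield text increments from an iterable of decoded SSE lines.

    Assumes OpenAI-style chunks; the real GPT-GOB chunk shape may differ.
    """
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith(":"):
            continue  # blank event separators and SSE comment lines
        if not line.startswith("data:"):
            continue  # ignore other SSE fields such as "event:" or "id:"
        data = line[len("data:"):].strip()
        if data == "[DONE]":  # assumed end-of-stream sentinel
            break
        chunk = json.loads(data)
        delta = chunk["choices"][0].get("delta", {})
        text = delta.get("content")
        if text:
            yield text
```

The function takes any iterable of lines, so it can be fed from an HTTP client's line iterator or from a recorded stream.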

## Function calling

Pass a tools array to let the goblin call your functions:

```json
{
  "model": "gob-5.5",
  "messages": [{"role": "user", "content": "what's the weather in moscow?"}],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "get the current weather in a city",
        "parameters": {
          "type": "object",
          "properties": {
            "city": {"type": "string"}
          },
          "required": ["city"]
        }
      }
    }
  ]
}
```

The model responds with `finish_reason: "tool_calls"` and a `tool_calls` array containing the function name and arguments. Execute the function, append the result as a `tool` message, and call the API again to get the final response.
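The execute-and-append step can be sketched in Python as follows. The exact layout of a tool call is not shown on this page, so the example assumes an OpenAI-style shape: each entry has an `id` and a `function` object with `name` and JSON-encoded `arguments`, and the `tool` message echoes the id as `tool_call_id`. Those field names, the `tool_messages_for` helper, and the stand-in `get_weather` body are all assumptions.

```python
import json


def get_weather(city):
    # Stand-in implementation; a real function would query a weather service.
    return {"city": city, "forecast": "placeholder"}


# Maps tool names declared in the request to local Python callables.
TOOLS = {"get_weather": get_weather}


def tool_messages_for(assistant_message, registry):
    """Run each requested tool and return the `tool` messages to append."""
    results = []
    for call in assistant_message.get("tool_calls", []):
        function = call["function"]
        arguments = function["arguments"]
        if isinstance(arguments, str):  # assumed to arrive as a JSON string
            arguments = json.loads(arguments)
        output = registry[function["name"]](**arguments)
        results.append({
            "role": "tool",
            "tool_call_id": call.get("id"),  # assumed correlation field
            "content": json.dumps(output),
        })
    return results
```

Append the assistant message and the returned `tool` messages to the history, then send the request again to get the final answer.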

## Errors

See Errors for the full list. Common cases:

| HTTP | Code | Meaning |
|---|---|---|
| 400 | `invalid_request_error` | Malformed request |
| 401 | `authentication_error` | Bad API key |
| 429 | `rate_limit_exceeded` | Slow down |
| 500 | `cave_collapsed` | Server error, retry with exponential backoff |
| 503 | `horde_unavailable` | Specific model overloaded, try a different tier |
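A minimal Python sketch of the retry advice follows. The table only prescribes exponential backoff for 500; applying the same backoff to 429 is an assumption of this example, and 503 is deliberately not retried because the table suggests switching tiers instead. `ApiError`, `call_with_backoff`, and the delay values are hypothetical.

```python
import random
import time

RETRYABLE_STATUSES = {429, 500}  # retrying 429 with backoff is an assumption


class ApiError(Exception):
    """Hypothetical error type carrying the HTTP status and error code."""

    def __init__(self, status, code):
        super().__init__(f"{status} {code}")
        self.status = status
        self.code = code


def call_with_backoff(send, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call send() until it succeeds, doubling the wait after each failure."""
    for attempt in range(max_attempts):
        try:
            return send()
        except ApiError as err:
            is_last_attempt = attempt == max_attempts - 1
            if err.status not in RETRYABLE_STATUSES or is_last_attempt:
                raise
            delay = base_delay * (2 ** attempt)
            # Random jitter keeps many clients from retrying in lockstep.
            sleep(delay + random.uniform(0, delay / 2))
```

Here `send` is any zero-argument callable that performs the request and raises `ApiError` on a non-2xx response. Passing `sleep` as a parameter makes the waiting easy to stub out in tests.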