GenAI developer docs

Build with AIGrid

Ship chat, embeddings, OCR, and agent workflows on Algeria's AI provider with OpenAI-compatible endpoints and organization-scoped keys.

API base

http://app.ai-grid.io:4000/v1

Public docs use the AIGrid inference endpoint. Sign in to deploy models and create keys.

Product context

Use the API in production

Deploy a model in your AIGrid workspace to obtain an **organization-scoped API key**. Every request includes your instance `model` id and uses **Bearer authentication**, the same scheme OpenAI-compatible proxies use. Treat your deployed base URL as secret infrastructure: snippets in this hub use a neutral placeholder until you authenticate, then switch to your provisioned hostname so you can test without editing them.

Why OpenAI-compatible

Libraries and MCP clients already assume `/v1/chat/completions`. AIGrid keeps that contract so LangChain tool binders, LangGraph checkpoints, and LlamaIndex `Settings.llm` work with **minimal configuration changes**.

Next steps after sign-up

Create an organization, open **Models**, deploy an instance, and open **How to use** inside the dashboard. That dialog shows your live base URL and key-handling guidance. This documentation hub complements it with runnable patterns for the Python and Node ecosystems.

Quickstart

Make your first request

Three things matter: the endpoint, the model id, and the instance key. Everything else can come later.

Step 1

Create a key

Deploy a model from **Models**, then copy the instance API key from **How to use**.

Step 2

Choose a model id

Use the exact model value shown in the model library or on your deployed instance.

Step 3

Call the API

Send OpenAI-compatible requests to your AIGrid endpoint with a bearer token.

Minimal chat request

curl http://app.ai-grid.io:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "gpt-oss-120b",
    "messages": [{ "role": "user", "content": "Summarize AIGrid in one sentence." }]
  }'
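The same three inputs (endpoint, model id, instance key) can be assembled into a request object without any third-party package. The sketch below uses only the Python standard library; the helper name `build_chat_request` is illustrative and not part of any AIGrid SDK, and the base URL and model id are the placeholders used elsewhere on this page.

```python
import json
import urllib.request


def build_chat_request(base_url, model, api_key, prompt):
    """Assemble an OpenAI-compatible chat completion request (nothing is sent yet)."""
    body = json.dumps(
        {"model": model, "messages": [{"role": "user", "content": prompt}]}
    ).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/chat/completions",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


# Placeholder values: swap in your own base URL, model id, and instance key.
req = build_chat_request(
    "http://app.ai-grid.io:4000/v1", "gpt-oss-120b", "YOUR_API_KEY", "Ping"
)
print(req.get_method(), req.full_url)
```

To actually send it, pass `req` to `urllib.request.urlopen(req, timeout=60)` and decode the JSON body; the reply text sits under `choices[0].message.content`, as in the other examples on this page.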

Choose your workflow

The AIGrid docs are organized by what you are building rather than by framework.

Chat or agent

Text LLM

Use chat completions for assistants, structured prompts, tool calls, and agent loops.

Search and RAG

Embeddings

Create vectors for semantic search, document retrieval, clustering, and recommendations.

OCR and documents

Vision

Extract text from scans, screenshots, rendered PDF pages, and document images.

Production backend

Security

Keep keys server-side, proxy browser traffic, rotate leaked credentials, and monitor spend.

HTTP clients

Direct API examples

Use these for scripts, notebooks, and backend smoke tests. Browser apps should call your backend instead of storing keys client-side.

Model-specific examples

Browser and CORS

Calling the inference endpoint directly from a browser requires permissive CORS and exposes secrets if you embed keys in bundles. Ship a backend route that attaches the Authorization header server-side.
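A backend route of that kind can be quite small. The following is a minimal sketch using only the Python standard library, not a production server: the `/api/chat` path, the `ChatProxy` class, and the `serve` helper are hypothetical names chosen for illustration, and the upstream URL is the placeholder used elsewhere on this page. The browser posts its JSON body to the route, and the key is read from the server's environment and never reaches the client.

```python
import os
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Placeholder endpoint from this page; replace with your provisioned base URL.
UPSTREAM = "http://app.ai-grid.io:4000/v1/chat/completions"


def build_upstream_request(body, api_key):
    """Wrap the client's JSON body in a request that carries the server-side key."""
    return urllib.request.Request(
        url=UPSTREAM,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


class ChatProxy(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/api/chat":  # hypothetical route name
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        req = build_upstream_request(body, os.environ["AIGRID_API_KEY"])
        try:
            with urllib.request.urlopen(req, timeout=60) as upstream:
                status, payload = upstream.status, upstream.read()
        except urllib.error.HTTPError as err:
            # Relay upstream error status and body to the caller.
            status, payload = err.code, err.read()
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port=8080):
    """Start the proxy on localhost; call this explicitly when you want to run it."""
    HTTPServer(("127.0.0.1", port), ChatProxy).serve_forever()
```

Calling `serve()` starts the proxy locally. A real deployment would add authentication for its own users, payload validation, and rate limiting in front of this handler.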

Node.js, browser demo, and Python requests

Node.js fetch

// Node 18+ (run: node chat.mjs)
const base = "http://app.ai-grid.io:4000";
const apiKey = process.env.AIGRID_API_KEY;

const res = await fetch(`${base}/v1/chat/completions`, {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    Authorization: `Bearer ${apiKey}`,
  },
  body: JSON.stringify({
    model: "gpt-oss-120b",
    messages: [{ role: "user", content: "Ping" }],
  }),
});

if (!res.ok) throw new Error(await res.text());
const data = await res.json();
console.log(data.choices[0].message.content);

Browser fetch demo

// Educational only — never ship a real API key in front-end bundles.
// Prefer a small backend route that injects the key server-side.
async function askAIGrid(prompt) {
  const res = await fetch("http://app.ai-grid.io:4000/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: "Bearer YOUR_API_KEY",
    },
    body: JSON.stringify({
      model: "gpt-oss-120b",
      messages: [{ role: "user", content: prompt }],
    }),
  });
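  // Surface HTTP errors instead of failing later on data.choices.
  if (!res.ok) throw new Error(await res.text());
  const data = await res.json();
  return data.choices[0].message.content;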
}

Python requests

import os
import requests

url = "http://app.ai-grid.io:4000/v1/chat/completions"
headers = {
    "Authorization": f"Bearer {os.environ['AIGRID_API_KEY']}",
    "Content-Type": "application/json",
}
payload = {
    "model": "gpt-oss-120b",
    "messages": [{"role": "user", "content": "Hello from requests!"}],
}
r = requests.post(url, json=payload, headers=headers, timeout=60)
r.raise_for_status()
print(r.json()["choices"][0]["message"]["content"])

Streaming responses

Set `"stream": true` in the request body to receive server-sent events, each carrying an incremental `delta` of the content.

curl with stream

curl http://app.ai-grid.io:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "gpt-oss-120b",
    "stream": true,
    "messages": [{ "role": "user", "content": "Write a haiku about APIs." }]
  }'
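If you consume the raw stream yourself rather than through an SDK, each event has to be parsed by hand. The sketch below assumes the endpoint follows the OpenAI streaming convention (lines prefixed with `data:`, JSON chunks with `choices[0].delta.content`, and a final `data: [DONE]` sentinel). The function name `iter_delta_text` and the sample lines are illustrative, not captured AIGrid output.

```python
import json


def iter_delta_text(lines):
    """Yield incremental text pieces from OpenAI-style server-sent event lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        choices = chunk.get("choices") or []
        if not choices:
            continue  # some chunks carry no choices at all
        text = choices[0].get("delta", {}).get("content")
        if text:
            yield text


# Hand-written sample in the assumed wire format.
sample = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
]
print("".join(iter_delta_text(sample)))  # prints "Hello"
```

With `requests`, the same generator could be fed from `response.iter_lines(decode_unicode=True)` on a call made with `stream=True`.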

Python OpenAI SDK stream

from openai import OpenAI
import os

client = OpenAI(
    api_key=os.environ["AIGRID_API_KEY"],
    base_url="http://app.ai-grid.io:4000/v1",
)

stream = client.chat.completions.create(
    model="gpt-oss-120b",
    messages=[{"role": "user", "content": "Stream three bullet tips."}],
    stream=True,
)
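for chunk in stream:
    # Some servers emit chunks with an empty choices list (for example, usage-only chunks).
    if not chunk.choices:
        continue
    part = chunk.choices[0].delta.content or ""
    print(part, end="", flush=True)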

Frameworks

Advanced integration examples

Framework examples are here when you need them; the quickstart above does not depend on any of them.

LangChain

Use `ChatOpenAI` with your AIGrid base URL and deployment model id.

ChatOpenAI + messages

from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage, SystemMessage
import os

llm = ChatOpenAI(
    model="gpt-oss-120b",
    openai_api_key=os.environ["AIGRID_API_KEY"],
    openai_api_base="http://app.ai-grid.io:4000/v1",
    temperature=0.2,
)

messages = [
    SystemMessage(content="You are a concise support agent."),
    HumanMessage(content="How do I rotate an API key?"),
]
print(llm.invoke(messages).content)

LCEL: prompt | model | parser

from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
import os

llm = ChatOpenAI(
    model="gpt-oss-120b",
    openai_api_key=os.environ["AIGRID_API_KEY"],
    openai_api_base="http://app.ai-grid.io:4000/v1",
)

prompt = ChatPromptTemplate.from_messages([
    ("system", "Reply as JSON with keys title and summary."),
    ("human", "{text}"),
])

chain = prompt | llm | StrOutputParser()
print(chain.invoke({"text": "AIGrid deploys model instances with dedicated keys."}))

RAG-style LCEL

# LCEL RAG-style chain: retrieve locally, answer via your AIGrid API.
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_core.vectorstores import InMemoryVectorStore
from langchain_core.documents import Document
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
import os

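# Note: these embeddings call OpenAI's hosted API with a separate OPENAI_API_KEY;
# only the vector store is local. Swap in another embeddings class if you need
# to avoid that dependency.
embeddings = OpenAIEmbeddings(openai_api_key=os.environ["OPENAI_API_KEY"])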
docs = [
    Document(page_content="AIGrid issues per-model API keys for each deployment."),
    Document(page_content="Swap model name to match your deployment."),
]
store = InMemoryVectorStore.from_documents(docs, embedding=embeddings)
retriever = store.as_retriever(search_kwargs={"k": 2})

llm = ChatOpenAI(
    model="gpt-oss-120b",
    openai_api_key=os.environ["AIGRID_API_KEY"],
    openai_api_base="http://app.ai-grid.io:4000/v1",
)

def format_docs(d):
    return "\n".join(doc.page_content for doc in d)

prompt = ChatPromptTemplate.from_template(
    "Answer using only the context.\n\nContext:\n{context}\n\nQuestion: {question}"
)

chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

print(chain.invoke("What is AIGrid?"))

Tool calling

# Tool calling (bind_tools) against your AIGrid deployment
from langchain_openai import ChatOpenAI
from langchain_core.tools import tool
from pydantic import BaseModel, Field
import os

class CityArgs(BaseModel):
    city: str = Field(description="City to summarize weather for")

@tool(args_schema=CityArgs)
def weather_stub(city: str) -> str:
    """Return a fake forecast for demos."""
    return f"Sunny and warm in {city}."

llm = ChatOpenAI(
    model="gpt-oss-120b",
    openai_api_key=os.environ["AIGRID_API_KEY"],
    openai_api_base="http://app.ai-grid.io:4000/v1",
).bind_tools([weather_stub])

msg = llm.invoke("What is the weather in Oran tomorrow?")
if msg.tool_calls:
    print("Tool calls:", msg.tool_calls)
else:
    print(msg.content)

LangGraph

Use LangGraph when your GenAI flow needs explicit routing, retries, tools, or human approval steps.

Single-node graph

from typing import TypedDict, Annotated
import operator
import os
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage
from langgraph.graph import StateGraph, END

class S(TypedDict):
    messages: Annotated[list, operator.add]

def llm_node(state: S):
    llm = ChatOpenAI(
        model="gpt-oss-120b",
        openai_api_key=os.environ["AIGRID_API_KEY"],
        openai_api_base="http://app.ai-grid.io:4000/v1",
    )
    return {"messages": [llm.invoke(state["messages"])]}

g = StateGraph(S)
g.add_node("model", llm_node)
g.set_entry_point("model")
g.add_edge("model", END)
app = g.compile()

print(app.invoke({"messages": [HumanMessage(content="One-line pitch for AIGrid.")]})["messages"][-1].content)

ReAct agent with tools

# ReAct-style agent (LangGraph prebuilt). pip install langgraph langchain-openai
import os
from langchain_openai import ChatOpenAI
from langchain_core.tools import tool
from langchain_core.messages import HumanMessage
from langgraph.prebuilt import create_react_agent

@tool
def ticket_lookup(query: str) -> str:
    """Stub CRM lookup — replace with your database."""
    return f"(demo) no rows for: {query}"

llm = ChatOpenAI(
    model="gpt-oss-120b",
    openai_api_key=os.environ["AIGRID_API_KEY"],
    openai_api_base="http://app.ai-grid.io:4000/v1",
)

agent = create_react_agent(llm, [ticket_lookup])
result = agent.invoke(
    {"messages": [HumanMessage(content="Find ticket ACME-42 in the CRM stub.")]}
)
print(result["messages"][-1].content)

Two-step retrieve-then-generate graph

# Minimal graph RAG: retrieve locally, generate with your AIGrid LLM
from typing import TypedDict
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_core.documents import Document
from langchain_core.vectorstores import InMemoryVectorStore
from langchain_core.messages import HumanMessage, SystemMessage
from langgraph.graph import StateGraph, END
import os

class S(TypedDict):
    question: str
    context: str
    answer: str

def retrieve(state: S) -> S:
    # Embeddings call OpenAI's hosted API (separate OPENAI_API_KEY); only the store is local.
    emb = OpenAIEmbeddings(openai_api_key=os.environ["OPENAI_API_KEY"])
    docs = [Document(page_content="AIGrid routes org traffic through per-instance keys.")]
    store = InMemoryVectorStore.from_documents(docs, embedding=emb)
    hits = store.similarity_search(state["question"], k=1)
    return {**state, "context": "\n".join(d.page_content for d in hits)}

def generate(state: S) -> S:
    llm = ChatOpenAI(
        model="gpt-oss-120b",
        openai_api_key=os.environ["AIGRID_API_KEY"],
        openai_api_base="http://app.ai-grid.io:4000/v1",
    )
    msgs = [
        SystemMessage(content="Answer using CONTEXT only."),
        HumanMessage(content=f"CONTEXT:\n{state['context']}\n\nQ: {state['question']}"),
    ]
    out = llm.invoke(msgs)
    return {**state, "answer": out.content}

g = StateGraph(S)
g.add_node("retrieve", retrieve)
g.add_node("generate", generate)
g.set_entry_point("retrieve")
g.add_edge("retrieve", "generate")
g.add_edge("generate", END)

app = g.compile()
print(app.invoke({"question": "How are keys scoped?", "context": "", "answer": ""})["answer"])

LlamaIndex

Point LlamaIndex at your AIGrid API root, then keep model selection and keys aligned with the deployed instance.

Settings.llm + VectorStoreIndex

import os
from llama_index.llms.openai import OpenAI as LlamaOpenAI
from llama_index.core import Document, VectorStoreIndex, Settings

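# Some LlamaIndex releases reject model names they do not recognize as OpenAI
# models. If that happens, OpenAILike (package llama-index-llms-openai-like)
# is the usual alternative for custom model ids.
Settings.llm = LlamaOpenAI(
    model="gpt-oss-120b",
    api_key=os.environ["AIGRID_API_KEY"],
    api_base="http://app.ai-grid.io:4000/v1",
    temperature=0.1,
)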

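docs = [Document(text="AIGrid routes traffic through org-scoped deployment keys.")]
# VectorStoreIndex embeds documents with Settings.embed_model. The LlamaIndex
# default is OpenAI embeddings, which reads OPENAI_API_KEY, so set
# Settings.embed_model explicitly if you do not want that dependency.
index = VectorStoreIndex.from_documents(docs)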
qe = index.as_query_engine()
print(qe.query("What routes traffic?").response)

Multi-turn chat messages

import os
from llama_index.llms.openai import OpenAI as LlamaOpenAI
from llama_index.core import Settings
from llama_index.core.llms import ChatMessage

Settings.llm = LlamaOpenAI(
    model="gpt-oss-120b",
    api_key=os.environ["AIGRID_API_KEY"],
    api_base="http://app.ai-grid.io:4000/v1",
)

history = [
    ChatMessage(role="user", content="We use AIGrid for all LLM traffic."),
    ChatMessage(role="assistant", content="Understood."),
    ChatMessage(role="user", content="List three ops checks before go-live."),
]
reply = Settings.llm.chat(history)
print(reply.message.content)

Agent bootstrap

# LlamaIndex agent pattern (imports vary by release — align with yours)
# Typical flow: RouterQueryEngine → OpenAIAgent / ReActAgent with Tools
import os
from llama_index.core.tools import FunctionTool
from llama_index.llms.openai import OpenAI as LlamaOpenAI
from llama_index.core import Settings

Settings.llm = LlamaOpenAI(
    model="gpt-oss-120b",
    api_key=os.environ["AIGRID_API_KEY"],
    api_base="http://app.ai-grid.io:4000/v1",
)

def ping_env() -> str:
    return "AIGrid endpoint reachable (stub)."

tool = FunctionTool.from_defaults(fn=ping_env, name="ping_env", description="Checks stub connectivity.")

# Compose an agent via your LlamaIndex version:
# agent = OpenAIAgent.from_tools([tool], llm=Settings.llm, verbose=True)
# print(agent.chat("Run ping_env and summarize Algeria AI posture in one sentence."))
print(
    "Install the agent package matching your LlamaIndex version, uncomment the agent lines, "
    "and keep api_base pointing at your AIGrid /v1 root."
)

Production security

Keep production API keys out of front-end bundles. Use a backend-for-frontend or server action, rotate leaked keys from Models, and monitor usage for anomalous spikes.

Validate request payloads before forwarding them.

Log request IDs to correlate app events with AIGrid dashboards.

Use separate keys for workloads with different spend or risk profiles.
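Payload validation before forwarding might look like the sketch below. The allow-lists and size limits are hypothetical policy values chosen for illustration, not AIGrid requirements, so tune them to your own deployment. The function returns a list of problems so the backend can reject a request with a clear message before any tokens are spent.

```python
# Hypothetical policy values; adjust for your own deployment.
ALLOWED_MODELS = {"gpt-oss-120b"}
ALLOWED_ROLES = {"system", "user", "assistant"}
MAX_MESSAGES = 50
MAX_CONTENT_CHARS = 8000


def validate_chat_payload(payload):
    """Return a list of validation errors; an empty list means the payload may be forwarded."""
    if not isinstance(payload, dict):
        return ["payload must be a JSON object"]
    errors = []
    if payload.get("model") not in ALLOWED_MODELS:
        errors.append("model is not in the allow-list")
    messages = payload.get("messages")
    if not isinstance(messages, list) or not messages:
        errors.append("messages must be a non-empty list")
        return errors
    if len(messages) > MAX_MESSAGES:
        errors.append("too many messages")
    for i, msg in enumerate(messages):
        if not isinstance(msg, dict) or msg.get("role") not in ALLOWED_ROLES:
            errors.append(f"messages[{i}].role is invalid")
            continue
        content = msg.get("content")
        if not isinstance(content, str) or not content.strip():
            errors.append(f"messages[{i}].content must be a non-empty string")
        elif len(content) > MAX_CONTENT_CHARS:
            errors.append(f"messages[{i}].content is too long")
    return errors


good = {"model": "gpt-oss-120b", "messages": [{"role": "user", "content": "Ping"}]}
bad = {
    "model": "other",
    "messages": [{"role": "root", "content": "x"}, {"role": "user", "content": ""}],
}
print(validate_chat_payload(good))  # prints []
print(len(validate_chat_payload(bad)))  # prints 3
```

Rejecting unknown fields, or stripping everything except an explicit set of keys, is a stricter variant worth considering when the route is exposed to untrusted clients.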
