sigil-sdk records normalized LLM generation and tool-execution telemetry. Generations are exported to Sigil ingest; traces and metrics flow through your existing OpenTelemetry tracer/meter setup.
Use this package when you want:
- A provider-agnostic generation record (same schema for OpenAI, Anthropic, Gemini, or custom adapters).
- OTel-aligned tracing attributes for generation and tool spans.
- Async export with retry/backoff, queueing, batching, and explicit shutdown semantics.
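The retry/backoff behavior of the export path can be illustrated with a minimal sketch. This is illustrative only: the SDK's actual worker, batch sizing, and jitter strategy are not documented here, and `export_with_retry` is a hypothetical helper, not part of the SDK.

```python
import time

def export_with_retry(send, batch, max_attempts=4, base_delay=0.5):
    """Try to export a batch, backing off exponentially between failed attempts."""
    for attempt in range(max_attempts):
        try:
            return send(batch)
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise  # give up after the final attempt
            time.sleep(base_delay * (2 ** attempt))  # 0.5s, 1s, 2s, ...
```

In the SDK, export runs asynchronously off a queue, so recording calls do not block on network I/O.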
```
pip install sigil-sdk
```

Run the shared core conformance suite for the Python SDK from the repo root:

```
mise run test:py:sdk-conformance
```

Run the cross-language aggregate core conformance suite from the repo root:

```
mise run sdk:conformance
```

Optional provider helper packages:

```
pip install sigil-sdk-openai
pip install sigil-sdk-anthropic
pip install sigil-sdk-gemini
```

Optional framework modules:

```
pip install sigil-sdk-langchain
pip install sigil-sdk-langgraph
pip install sigil-sdk-openai-agents
pip install sigil-sdk-llamaindex
pip install sigil-sdk-google-adk
```

Framework handler usage:
```python
from sigil_sdk import Client
from sigil_sdk_langchain import with_sigil_langchain_callbacks
from sigil_sdk_langgraph import with_sigil_langgraph_callbacks
from sigil_sdk_openai_agents import with_sigil_openai_agents_hooks
from sigil_sdk_llamaindex import with_sigil_llamaindex_callbacks
from sigil_sdk_google_adk import with_sigil_google_adk_callbacks

client = Client()
chain_config = with_sigil_langchain_callbacks(None, client=client, provider_resolver="auto")
graph_config = with_sigil_langgraph_callbacks(None, client=client, provider_resolver="auto")
openai_agents_run_options = with_sigil_openai_agents_hooks(None, client=client, provider_resolver="auto")
llamaindex_config = with_sigil_llamaindex_callbacks(None, client=client, provider_resolver="auto")
google_adk_agent_config = with_sigil_google_adk_callbacks(None, client=client, provider_resolver="auto")
```

Framework handlers inject framework tags/metadata on recorded generations:

- `sigil.framework.name` (`langchain`, `langgraph`, `openai-agents`, `llamaindex`, or `google-adk`)
- `sigil.framework.source=handler`
- `sigil.framework.language=python`
- `metadata["sigil.framework.run_id"]`
- `metadata["sigil.framework.thread_id"]` (when present)
- `metadata["sigil.framework.parent_run_id"]` (when available)
- `metadata["sigil.framework.component_name"]`
- `metadata["sigil.framework.run_type"]`
- `metadata["sigil.framework.tags"]`
- `metadata["sigil.framework.retry_attempt"]` (when available)
- `metadata["sigil.framework.event_id"]` (when available)
- `metadata["sigil.framework.langgraph.node"]` (LangGraph, when available)
Conversation mapping is conversation-first:

- `conversation_id`/`session_id`/`group_id` from framework context first
- then `thread_id`
- deterministic fallback: `sigil:framework:<framework_name>:<run_id>`
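The resolution order above can be sketched as follows. `resolve_conversation_id` is a hypothetical helper written for illustration; the SDK's real implementation and signature may differ.

```python
def resolve_conversation_id(context_id, thread_id, framework_name, run_id):
    """Conversation-first mapping: explicit id, then thread id, then deterministic fallback."""
    if context_id:  # conversation_id / session_id / group_id from framework context
        return context_id
    if thread_id:
        return thread_id
    return f"sigil:framework:{framework_name}:{run_id}"
```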
When present in generation metadata, low-cardinality framework keys are copied onto generation span attributes.
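The copy rule can be sketched like this. The allow-list below is an assumption picked from the metadata keys listed earlier; the authoritative list lives in the SDK.

```python
# Assumed allow-list for illustration; the SDK defines the real one.
LOW_CARDINALITY_KEYS = (
    "sigil.framework.run_type",
    "sigil.framework.component_name",
    "sigil.framework.langgraph.node",
)

def copy_framework_attrs(metadata, span_attrs):
    """Copy allow-listed, low-cardinality framework keys onto span attributes."""
    for key in LOW_CARDINALITY_KEYS:
        if key in metadata:
            span_attrs[key] = metadata[key]
    return span_attrs
```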
For LangGraph persistence, pass `configurable.thread_id` and reuse it across invocations:

```python
thread_config = {
    **with_sigil_langgraph_callbacks(None, client=client, provider_resolver="auto"),
    "configurable": {"thread_id": "customer-42"},
}
graph.invoke({"prompt": "Remember my timezone is UTC+1.", "answer": ""}, config=thread_config)
graph.invoke({"prompt": "What timezone did I give you?", "answer": ""}, config=thread_config)
```

Full framework examples:

- LangChain: ../python-frameworks/langchain/README.md
- LangGraph: ../python-frameworks/langgraph/README.md
- OpenAI Agents: ../python-frameworks/openai-agents/README.md
- LlamaIndex: ../python-frameworks/llamaindex/README.md
- Google ADK: ../python-frameworks/google-adk/README.md
```python
from sigil_sdk import (
    Client,
    ClientConfig,
    GenerationStart,
    ModelRef,
    assistant_text_message,
    user_text_message,
)

client = Client(
    ClientConfig(
        generation_export_endpoint="http://localhost:8080/api/v1/generations:export",
    )
)

with client.start_generation(
    GenerationStart(
        conversation_id="conv-1",
        agent_name="my-service",
        agent_version="1.0.0",
        model=ModelRef(provider="openai", name="gpt-5"),
    )
) as rec:
    rec.set_result(
        input=[user_text_message("What is the weather in Paris?")],
        output=[assistant_text_message("It is 18C and sunny.")],
    )
    # Recorder errors are local SDK errors (validation/enqueue/shutdown),
    # not provider call failures.
    if rec.err() is not None:
        raise rec.err()

client.shutdown()
```

Configure OTEL exporters (traces/metrics) in your application's OTEL SDK setup. You can optionally pass `tracer` and `meter` via `ClientConfig`.
Quick OTEL setup pattern before creating the Sigil client:

```python
from opentelemetry import metrics, trace
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.sdk.trace import TracerProvider

trace.set_tracer_provider(TracerProvider())
metrics.set_meter_provider(MeterProvider())
```

Use `start_streaming_generation(...)` when the upstream provider call is streaming.
```python
from sigil_sdk import GenerationStart, ModelRef

with client.start_streaming_generation(
    GenerationStart(
        conversation_id="conv-stream",
        model=ModelRef(provider="anthropic", name="claude-sonnet-4-5"),
    )
) as rec:
    rec.set_result(output=[assistant_text_message("partial stream summary")])
```

Use `start_embedding(...)` for embedding API calls. Embedding recording emits OTel spans and SDK metrics only; it does not enqueue generation exports.
```python
from sigil_sdk import EmbeddingResult, EmbeddingStart, ModelRef

with client.start_embedding(
    EmbeddingStart(
        agent_name="retrieval-worker",
        agent_version="1.0.0",
        model=ModelRef(provider="openai", name="text-embedding-3-small"),
    )
) as rec:
    response = openai.embeddings.create(model="text-embedding-3-small", input=["hello", "world"])
    rec.set_result(
        EmbeddingResult(
            input_count=2,
            input_tokens=response.usage.prompt_tokens,
            input_texts=["hello", "world"],  # captured only when embedding_capture.capture_input=True
            response_model=response.model,
        )
    )
```

Input text capture is opt-in:
```python
from sigil_sdk import ClientConfig, EmbeddingCaptureConfig

cfg = ClientConfig(
    embedding_capture=EmbeddingCaptureConfig(
        capture_input=True,
        max_input_items=20,
        max_text_length=1024,
    )
)
```

`capture_input` may expose PII/document content in spans. Keep it disabled by default and enable it only for scoped debugging.
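The effect of the two capture limits can be sketched as below. This is an assumption about the truncation semantics (keep the first `max_input_items` texts, clip each to `max_text_length` characters); check the SDK source for the authoritative behavior.

```python
def apply_capture_limits(texts, max_input_items=20, max_text_length=1024):
    """Keep at most max_input_items texts, each clipped to max_text_length chars."""
    return [t[:max_text_length] for t in texts[:max_input_items]]
```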
TraceQL examples:

- `traces{gen_ai.operation.name="embeddings"}`
- `traces{gen_ai.operation.name="embeddings" && gen_ai.request.model="text-embedding-3-small"}`
- `traces{gen_ai.operation.name="embeddings" && error.type!=""}`
Tool spans are recorded independently of generation export.

```python
from sigil_sdk import ToolExecutionStart

with client.start_tool_execution(
    ToolExecutionStart(
        tool_name="weather",
        tool_call_id="call_weather_1",
        tool_type="function",
        include_content=True,
    )
) as rec:
    rec.set_result(arguments={"city": "Paris"}, result={"temp_c": 18})
```

- Generation and tool spans always include `sigil.sdk.name=sdk-python`.
- Normalized generation metadata always includes the same key.
- If caller metadata provides a conflicting value for this key, the SDK overwrites it.
Use context helpers to set defaults once per request/task boundary.

```python
from sigil_sdk import with_agent_name, with_agent_version, with_conversation_id

with with_conversation_id("conv-ctx"), with_agent_name("planner"), with_agent_version("2026.02"):
    with client.start_generation(
        GenerationStart(model=ModelRef(provider="gemini", name="gemini-2.5-pro"))
    ) as rec:
        rec.set_result(output=[assistant_text_message("ok")])
```

HTTP generation export configuration:

```python
from sigil_sdk import ApiConfig, AuthConfig, ClientConfig, GenerationExportConfig

cfg = ClientConfig(
    generation_export=GenerationExportConfig(
        protocol="http",
        endpoint="http://localhost:8080/api/v1/generations:export",
        auth=AuthConfig(mode="tenant", tenant_id="dev-tenant"),
    ),
    api=ApiConfig(endpoint="http://localhost:8080"),
)
```

gRPC generation export configuration:

```python
cfg = ClientConfig(
    generation_export=GenerationExportConfig(
        protocol="grpc",
        endpoint="localhost:50051",
        insecure=True,
        auth=AuthConfig(mode="tenant", tenant_id="dev-tenant"),
    ),
    api=ApiConfig(endpoint="http://localhost:8080"),
)
```

Auth is resolved for `generation_export`.
- `mode="none"`
- `mode="tenant"` (requires `tenant_id`, injects `X-Scope-OrgID`)
- `mode="bearer"` (requires `bearer_token`, injects `Authorization: Bearer <token>`)

Invalid mode/field combinations fail fast in `resolve_config(...)`.

If explicit headers already include `Authorization` or `X-Scope-OrgID`, the explicit headers win.
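The precedence rules can be sketched with a small helper. `resolve_export_headers` is hypothetical and written for illustration; the SDK applies these rules internally during config resolution.

```python
def resolve_export_headers(explicit_headers, mode, tenant_id=None, bearer_token=None):
    """Build auth headers for generation export; explicit headers always win."""
    headers = {}
    if mode == "tenant":
        if not tenant_id:
            raise ValueError("mode='tenant' requires tenant_id")
        headers["X-Scope-OrgID"] = tenant_id
    elif mode == "bearer":
        if not bearer_token:
            raise ValueError("mode='bearer' requires bearer_token")
        headers["Authorization"] = f"Bearer {bearer_token}"
    elif mode != "none":
        raise ValueError(f"unknown auth mode: {mode}")
    headers.update(explicit_headers)  # explicit Authorization / X-Scope-OrgID win
    return headers
```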
```python
from sigil_sdk import ApiConfig, AuthConfig, ClientConfig, GenerationExportConfig

cfg = ClientConfig(
    generation_export=GenerationExportConfig(
        protocol="http",
        endpoint="http://localhost:8080/api/v1/generations:export",
        auth=AuthConfig(mode="tenant", tenant_id="prod-tenant"),
    ),
    api=ApiConfig(endpoint="http://localhost:8080"),
)
```

The SDK does not auto-load env vars. Resolve env values in your application and pass them into config explicitly.
```python
import os

from sigil_sdk import AuthConfig, ClientConfig

cfg = ClientConfig()
gen_token = (os.getenv("SIGIL_GEN_BEARER_TOKEN") or "").strip()
if gen_token:
    cfg.generation_export.auth = AuthConfig(mode="bearer", bearer_token=gen_token)
```

Common topology:

- Generations direct to Sigil: generation `tenant` mode.
- Traces/metrics via OTEL Collector/Alloy: configure exporters in your app OTEL SDK setup.
- Enterprise proxy: generation `bearer` mode to the proxy; the proxy authenticates and forwards the tenant header upstream.
Use the SDK helper to submit user-facing ratings:

```python
from sigil_sdk import ConversationRatingInput, ConversationRatingValue

result = client.submit_conversation_rating(
    "conv-123",
    ConversationRatingInput(
        rating_id="rat-123",
        rating=ConversationRatingValue.BAD,
        comment="Answer ignored user context",
        metadata={"channel": "assistant-ui"},
        source="sdk-python",
    ),
)
print(result.rating.rating, result.summary.has_bad_rating)
```

`submit_conversation_rating(...)` sends requests to `ClientConfig.api.endpoint` (default `http://localhost:8080`) and uses the same generation-export auth headers (tenant or bearer) already configured on the SDK client.
Set `generation_export.protocol="none"` to keep generation/tool instrumentation and spans while disabling generation transport.

```python
from sigil_sdk import Client, ClientConfig, GenerationExportConfig

cfg = ClientConfig(
    generation_export=GenerationExportConfig(
        protocol="none",
    ),
)
client = Client(cfg)
```

- `flush()` forces immediate export of queued generations.
- `shutdown()` flushes pending generations, then closes generation exporters.
- Always call `shutdown()` during process teardown to avoid dropped telemetry.
- `recorder.set_call_error(exc)` marks provider-call failures on the generation payload and span status.
- `recorder.err()` is for local SDK runtime errors only (validation, queue full, payload too large, shutdown).
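The split between the two error channels can be illustrated with a stub recorder. These are toy classes written for this sketch, not the SDK's types; only the `set_call_error(...)`/`err()` method names come from the SDK surface.

```python
class StubRecorder:
    """Toy stand-in showing the two error channels the SDK exposes."""
    def __init__(self):
        self.call_error = None  # provider failure -> generation payload + span status
        self.sdk_error = None   # local failure -> surfaced via err()

    def set_call_error(self, exc):
        self.call_error = exc

    def err(self):
        return self.sdk_error

def call_provider_and_record(rec, provider_call):
    """Mark provider failures on the recorder, then re-raise for the caller."""
    try:
        return provider_call()
    except Exception as exc:
        rec.set_call_error(exc)
        raise
```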
The SDK emits these OTel histograms through your configured OTEL meter provider:

- `gen_ai.client.operation.duration`
- `gen_ai.client.token.usage`
- `gen_ai.client.time_to_first_token`
- `gen_ai.client.tool_calls_per_operation`
Core client and lifecycle:

- `Client`
- `Client.start_generation(...)`
- `Client.start_streaming_generation(...)`
- `Client.start_tool_execution(...)`
- `Client.flush()`
- `Client.shutdown()`

Typed payloads:

- `GenerationStart`, `Generation`, `ModelRef`
- `Message`, `Part`, `ToolDefinition`, `TokenUsage`
- `ToolExecutionStart`, `ToolExecutionEnd`

Helpers:

- `user_text_message(...)`, `assistant_text_message(...)`
- `with_conversation_id(...)`, `with_agent_name(...)`, `with_agent_version(...)`

Validation:

- `validate_generation(...)`
Provider wrappers are wrapper-first and mapper-explicit:

- `sigil-sdk-openai`
- `sigil-sdk-anthropic`
- `sigil-sdk-gemini`
Each package exposes sync + async wrappers and explicit mapper functions for custom integration points.
Install dev dependencies once:

```
python3 -m pip install -e 'sdks/python[dev]'
```

Then regenerate:

```
./sdks/python/scripts/generate_proto.sh
```

This regenerates `sigil_sdk/internal/gen/sigil/v1/*_pb2*.py` from `sigil/proto/sigil/v1/generation_ingest.proto`.