Skip to content

Latest commit

 

History

History
446 lines (331 loc) · 13.2 KB

File metadata and controls

446 lines (331 loc) · 13.2 KB

Grafana Sigil Python SDK

sigil-sdk records normalized LLM generation and tool-execution telemetry. It exports normalized generations to Sigil ingest and uses your OpenTelemetry tracer/meter setup for traces and metrics.

Use this package when you want:

  • A provider-agnostic generation record (same schema for OpenAI, Anthropic, Gemini, or custom adapters).
  • OTel-aligned tracing attributes for generation and tool spans.
  • Async export with retry/backoff, queueing, batching, and explicit shutdown semantics.

Installation

pip install sigil-sdk

Validation

Run the shared core conformance suite for the Python SDK from the repo root:

mise run test:py:sdk-conformance

Run the cross-language aggregate core conformance suite from the repo root:

mise run sdk:conformance

Optional provider helper packages:

pip install sigil-sdk-openai
pip install sigil-sdk-anthropic
pip install sigil-sdk-gemini

Optional framework modules:

pip install sigil-sdk-langchain
pip install sigil-sdk-langgraph
pip install sigil-sdk-openai-agents
pip install sigil-sdk-llamaindex
pip install sigil-sdk-google-adk

Framework handler usage:

from sigil_sdk import Client
from sigil_sdk_langchain import with_sigil_langchain_callbacks
from sigil_sdk_langgraph import with_sigil_langgraph_callbacks
from sigil_sdk_openai_agents import with_sigil_openai_agents_hooks
from sigil_sdk_llamaindex import with_sigil_llamaindex_callbacks
from sigil_sdk_google_adk import with_sigil_google_adk_callbacks

client = Client()
chain_config = with_sigil_langchain_callbacks(None, client=client, provider_resolver="auto")
graph_config = with_sigil_langgraph_callbacks(None, client=client, provider_resolver="auto")
openai_agents_run_options = with_sigil_openai_agents_hooks(None, client=client, provider_resolver="auto")
llamaindex_config = with_sigil_llamaindex_callbacks(None, client=client, provider_resolver="auto")
google_adk_agent_config = with_sigil_google_adk_callbacks(None, client=client, provider_resolver="auto")

Framework handlers inject framework tags/metadata on recorded generations:

  • sigil.framework.name (langchain, langgraph, openai-agents, llamaindex, or google-adk)
  • sigil.framework.source=handler
  • sigil.framework.language=python
  • metadata["sigil.framework.run_id"]
  • metadata["sigil.framework.thread_id"] (when present)
  • metadata["sigil.framework.parent_run_id"] (when available)
  • metadata["sigil.framework.component_name"]
  • metadata["sigil.framework.run_type"]
  • metadata["sigil.framework.tags"]
  • metadata["sigil.framework.retry_attempt"] (when available)
  • metadata["sigil.framework.event_id"] (when available)
  • metadata["sigil.framework.langgraph.node"] (LangGraph when available)

Conversation mapping is conversation-first:

  • conversation_id / session_id / group_id from framework context first
  • then thread_id
  • deterministic fallback sigil:framework:<framework_name>:<run_id>

When present in generation metadata, low-cardinality framework keys are copied onto generation span attributes.

For LangGraph persistence, pass configurable.thread_id and reuse it across invocations:

thread_config = {
    **with_sigil_langgraph_callbacks(None, client=client, provider_resolver="auto"),
    "configurable": {"thread_id": "customer-42"},
}
graph.invoke({"prompt": "Remember my timezone is UTC+1.", "answer": ""}, config=thread_config)
graph.invoke({"prompt": "What timezone did I give you?", "answer": ""}, config=thread_config)

Full framework examples:

  • LangChain: ../python-frameworks/langchain/README.md
  • LangGraph: ../python-frameworks/langgraph/README.md
  • OpenAI Agents: ../python-frameworks/openai-agents/README.md
  • LlamaIndex: ../python-frameworks/llamaindex/README.md
  • Google ADK: ../python-frameworks/google-adk/README.md

Quick Start (Sync Generation)

from sigil_sdk import (
    Client,
    ClientConfig,
    GenerationStart,
    ModelRef,
    assistant_text_message,
    user_text_message,
)

client = Client(
    ClientConfig(
        generation_export_endpoint="http://localhost:8080/api/v1/generations:export",
    )
)

with client.start_generation(
    GenerationStart(
        conversation_id="conv-1",
        agent_name="my-service",
        agent_version="1.0.0",
        model=ModelRef(provider="openai", name="gpt-5"),
    )
) as rec:
    rec.set_result(
        input=[user_text_message("What is the weather in Paris?")],
        output=[assistant_text_message("It is 18C and sunny.")],
    )

    # Recorder errors are local SDK errors (validation/enqueue/shutdown),
    # not provider call failures.
    if rec.err() is not None:
        raise rec.err()

client.shutdown()

Configure OTEL exporters (traces/metrics) in your application OTEL SDK setup. You can optionally pass tracer and meter via ClientConfig.

Quick OTEL setup pattern before creating the Sigil client:

from opentelemetry import metrics, trace
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.sdk.trace import TracerProvider

trace.set_tracer_provider(TracerProvider())
metrics.set_meter_provider(MeterProvider())

Streaming Generation

Use start_streaming_generation(...) when the upstream provider call is streaming.

from sigil_sdk import GenerationStart, ModelRef

with client.start_streaming_generation(
    GenerationStart(
        conversation_id="conv-stream",
        model=ModelRef(provider="anthropic", name="claude-sonnet-4-5"),
    )
) as rec:
    rec.set_result(output=[assistant_text_message("partial stream summary")])

Embedding Observability

Use start_embedding(...) for embedding API calls. Embedding recording emits OTel spans and SDK metrics only, and does not enqueue generation exports.

from sigil_sdk import EmbeddingResult, EmbeddingStart, ModelRef

with client.start_embedding(
    EmbeddingStart(
        agent_name="retrieval-worker",
        agent_version="1.0.0",
        model=ModelRef(provider="openai", name="text-embedding-3-small"),
    )
) as rec:
    response = openai.embeddings.create(model="text-embedding-3-small", input=["hello", "world"])
    rec.set_result(
        EmbeddingResult(
            input_count=2,
            input_tokens=response.usage.prompt_tokens,
            input_texts=["hello", "world"],  # captured only when embedding_capture.capture_input=True
            response_model=response.model,
        )
    )

Input text capture is opt-in:

from sigil_sdk import ClientConfig, EmbeddingCaptureConfig

cfg = ClientConfig(
    embedding_capture=EmbeddingCaptureConfig(
        capture_input=True,
        max_input_items=20,
        max_text_length=1024,
    )
)

capture_input may expose PII/document content in spans. Keep it disabled by default and enable only for scoped debugging.

TraceQL examples:

  • traces{gen_ai.operation.name="embeddings"}
  • traces{gen_ai.operation.name="embeddings" && gen_ai.request.model="text-embedding-3-small"}
  • traces{gen_ai.operation.name="embeddings" && error.type!=""}

Tool Execution Span Recording

Tool spans are recorded independently of generation export.

from sigil_sdk import ToolExecutionStart

with client.start_tool_execution(
    ToolExecutionStart(
        tool_name="weather",
        tool_call_id="call_weather_1",
        tool_type="function",
        include_content=True,
    )
) as rec:
    rec.set_result(arguments={"city": "Paris"}, result={"temp_c": 18})

SDK identity attributes

  • Generation and tool spans always include:
    • sigil.sdk.name=sdk-python
  • Normalized generation metadata always includes the same key.
  • If caller metadata provides a conflicting value for this key, the SDK overwrites it.

Context Defaults

Use context helpers to set defaults once per request/task boundary.

from sigil_sdk import with_agent_name, with_agent_version, with_conversation_id

with with_conversation_id("conv-ctx"), with_agent_name("planner"), with_agent_version("2026.02"):
    with client.start_generation(
        GenerationStart(model=ModelRef(provider="gemini", name="gemini-2.5-pro"))
    ) as rec:
        rec.set_result(output=[assistant_text_message("ok")])

Export Configuration

HTTP generation export

from sigil_sdk import ApiConfig, AuthConfig, ClientConfig, GenerationExportConfig

cfg = ClientConfig(
    generation_export=GenerationExportConfig(
        protocol="http",
        endpoint="http://localhost:8080/api/v1/generations:export",
        auth=AuthConfig(mode="tenant", tenant_id="dev-tenant"),
    ),
    api=ApiConfig(endpoint="http://localhost:8080"),
)

gRPC generation export

cfg = ClientConfig(
    generation_export=GenerationExportConfig(
        protocol="grpc",
        endpoint="localhost:50051",
        insecure=True,
        auth=AuthConfig(mode="tenant", tenant_id="dev-tenant"),
    ),
    api=ApiConfig(endpoint="http://localhost:8080"),
)

Generation export auth modes

Auth is resolved for generation_export.

  • mode="none"
  • mode="tenant" (requires tenant_id, injects X-Scope-OrgID)
  • mode="bearer" (requires bearer_token, injects Authorization: Bearer <token>)

Invalid mode/field combinations fail fast in resolve_config(...).

If explicit headers already include Authorization or X-Scope-OrgID, explicit headers win.

from sigil_sdk import ApiConfig, AuthConfig, ClientConfig, GenerationExportConfig

cfg = ClientConfig(
    generation_export=GenerationExportConfig(
        protocol="http",
        endpoint="http://localhost:8080/api/v1/generations:export",
        auth=AuthConfig(mode="tenant", tenant_id="prod-tenant"),
    ),
    api=ApiConfig(endpoint="http://localhost:8080"),
)

Env-secret wiring example

The SDK does not auto-load env vars. Resolve env values in your application and pass them into config explicitly.

import os
from sigil_sdk import AuthConfig, ClientConfig

cfg = ClientConfig()

gen_token = (os.getenv("SIGIL_GEN_BEARER_TOKEN") or "").strip()
if gen_token:
    cfg.generation_export.auth = AuthConfig(mode="bearer", bearer_token=gen_token)

Common topology:

  • Generations direct to Sigil: generation tenant mode.
  • Traces/metrics via OTEL Collector/Alloy: configure exporters in your app OTEL SDK setup.
  • Enterprise proxy: generation bearer mode to proxy; proxy authenticates and forwards tenant header upstream.

Conversation Ratings

Use the SDK helper to submit user-facing ratings:

from sigil_sdk import ConversationRatingInput, ConversationRatingValue

result = client.submit_conversation_rating(
    "conv-123",
    ConversationRatingInput(
        rating_id="rat-123",
        rating=ConversationRatingValue.BAD,
        comment="Answer ignored user context",
        metadata={"channel": "assistant-ui"},
        source="sdk-python",
    ),
)

print(result.rating.rating, result.summary.has_bad_rating)

submit_conversation_rating(...) sends requests to ClientConfig.api.endpoint (default http://localhost:8080) and uses the same generation-export auth headers (tenant or bearer) already configured on the SDK client.

Instrumentation-only mode (no generation send)

Set generation_export.protocol="none" to keep generation/tool instrumentation and spans while disabling generation transport.

from sigil_sdk import Client, ClientConfig, GenerationExportConfig

cfg = ClientConfig(
    generation_export=GenerationExportConfig(
        protocol="none",
    ),
)

client = Client(cfg)

Lifecycle and Error Semantics

  • flush() forces immediate export of queued generations.
  • shutdown() flushes pending generations, then closes generation exporters.
  • Always call shutdown() during process teardown to avoid dropped telemetry.
  • recorder.set_call_error(exc) marks provider-call failures on the generation payload and span status.
  • recorder.err() is for local SDK runtime errors only (validation, queue full, payload too large, shutdown).

SDK metrics

The SDK emits these OTel histograms through your configured OTEL meter provider:

  • gen_ai.client.operation.duration
  • gen_ai.client.token.usage
  • gen_ai.client.time_to_first_token
  • gen_ai.client.tool_calls_per_operation

Public API Overview

Core client and lifecycle:

  • Client
  • Client.start_generation(...)
  • Client.start_streaming_generation(...)
  • Client.start_tool_execution(...)
  • Client.flush()
  • Client.shutdown()

Typed payloads:

  • GenerationStart, Generation, ModelRef
  • Message, Part, ToolDefinition, TokenUsage
  • ToolExecutionStart, ToolExecutionEnd

Helpers:

  • user_text_message(...), assistant_text_message(...)
  • with_conversation_id(...), with_agent_name(...), with_agent_version(...)

Validation:

  • validate_generation(...)

Provider Helper Packages

Provider wrappers are wrapper-first and mapper-explicit:

  • sigil-sdk-openai
  • sigil-sdk-anthropic
  • sigil-sdk-gemini

Each package exposes sync + async wrappers and explicit mapper functions for custom integration points.

Regenerating gRPC Stubs

Install dev dependencies once:

python3 -m pip install -e 'sdks/python[dev]'

Then regenerate:

./sdks/python/scripts/generate_proto.sh

This regenerates sigil_sdk/internal/gen/sigil/v1/*_pb2*.py from sigil/proto/sigil/v1/generation_ingest.proto.