# Python SDK Reference

## Shared Types

```python
from llama_stack_client.types import (
    AgentConfig,
    BatchCompletion,
    CompletionMessage,
    ContentDelta,
    Document,
    InterleavedContent,
    InterleavedContentItem,
    Message,
    ParamType,
    QueryConfig,
    QueryResult,
    ReturnType,
    SafetyViolation,
    SamplingParams,
    ScoringResult,
    SystemMessage,
    ToolCall,
    ToolParamDefinition,
    ToolResponseMessage,
    URL,
    UserMessage,
)
```

## Toolgroups

Types:

```python
from llama_stack_client.types import (
    ListToolGroupsResponse,
    ToolGroup,
    ToolgroupListResponse,
)
```

Methods:

## Tools

Types:

```python
from llama_stack_client.types import ListToolsResponse, Tool, ToolListResponse
```

Methods:

## ToolRuntime

Types:

```python
from llama_stack_client.types import ToolDef, ToolInvocationResult
```

Methods:

### RagTool

Methods:

## Agents

Types:

```python
from llama_stack_client.types import (
    InferenceStep,
    MemoryRetrievalStep,
    ShieldCallStep,
    ToolExecutionStep,
    ToolResponse,
    AgentCreateResponse,
)
```

Methods:

### Session

Types:

```python
from llama_stack_client.types.agents import Session, SessionCreateResponse
```

Methods:

### Steps

Types:

```python
from llama_stack_client.types.agents import StepRetrieveResponse
```

Methods:

### Turn

Types:

```python
from llama_stack_client.types.agents import Turn, TurnCreateResponse
```

Methods:

## BatchInference

Types:

```python
from llama_stack_client.types import BatchInferenceChatCompletionResponse
```

Methods:

## Datasets

Types:

```python
from llama_stack_client.types import (
    ListDatasetsResponse,
    DatasetRetrieveResponse,
    DatasetListResponse,
)
```

Methods:

## Eval

Types:

```python
from llama_stack_client.types import EvaluateResponse, Job
```

Methods:

### Jobs

Types:

```python
from llama_stack_client.types.eval import JobStatusResponse
```

Methods:

- `client.eval.jobs.retrieve(job_id, *, benchmark_id) -> EvaluateResponse`
- `client.eval.jobs.cancel(job_id, *, benchmark_id) -> None`
- `client.eval.jobs.status(job_id, *, benchmark_id) -> Optional[JobStatusResponse]`
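The three `Jobs` methods above combine naturally into a polling loop. The sketch below is illustrative rather than part of the SDK: it assumes a `client` object shaped like `LlamaStackClient` connected to a running Llama Stack server, and that the `JobStatusResponse` returned by `status()` compares equal to the literal string `"in_progress"` while the job is still running; the function name and `poll_seconds` parameter are hypothetical.

```python
import time


def wait_for_eval_job(client, job_id: str, benchmark_id: str, poll_seconds: float = 5.0):
    """Poll an eval job until it leaves the in-progress state, then fetch its results.

    `client` is assumed to expose the documented methods
    client.eval.jobs.status(job_id, *, benchmark_id) and
    client.eval.jobs.retrieve(job_id, *, benchmark_id).
    """
    while True:
        # status() returns Optional[JobStatusResponse]; None means the job is unknown.
        status = client.eval.jobs.status(job_id, benchmark_id=benchmark_id)
        if status is None or status != "in_progress":
            break
        time.sleep(poll_seconds)
    # retrieve() returns the EvaluateResponse holding the benchmark's scores.
    return client.eval.jobs.retrieve(job_id, benchmark_id=benchmark_id)
```

A job that should be abandoned instead can be stopped with `client.eval.jobs.cancel(job_id, benchmark_id=benchmark_id)`.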

## Inspect

Types:

```python
from llama_stack_client.types import HealthInfo, ProviderInfo, RouteInfo, VersionInfo
```

Methods:

## Inference

Types:

```python
from llama_stack_client.types import (
    CompletionResponse,
    EmbeddingsResponse,
    TokenLogProbs,
    InferenceChatCompletionResponse,
    InferenceCompletionResponse,
)
```

Methods:

## VectorIo

Types:

```python
from llama_stack_client.types import QueryChunksResponse
```

Methods:

## VectorDBs

Types:

```python
from llama_stack_client.types import (
    ListVectorDBsResponse,
    VectorDBRetrieveResponse,
    VectorDBListResponse,
    VectorDBRegisterResponse,
)
```

Methods:

## Models

Types:

```python
from llama_stack_client.types import ListModelsResponse, Model, ModelListResponse
```

Methods:

## PostTraining

Types:

```python
from llama_stack_client.types import ListPostTrainingJobsResponse, PostTrainingJob
```

Methods:

### Job

Types:

```python
from llama_stack_client.types.post_training import (
    JobListResponse,
    JobArtifactsResponse,
    JobStatusResponse,
)
```

Methods:

## Providers

Types:

```python
from llama_stack_client.types import ListProvidersResponse, ProviderListResponse
```

Methods:

## Routes

Types:

```python
from llama_stack_client.types import ListRoutesResponse, RouteListResponse
```

Methods:

## Safety

Types:

```python
from llama_stack_client.types import RunShieldResponse
```

Methods:

## Shields

Types:

```python
from llama_stack_client.types import ListShieldsResponse, Shield, ShieldListResponse
```

Methods:

## SyntheticDataGeneration

Types:

```python
from llama_stack_client.types import SyntheticDataGenerationResponse
```

Methods:

## Telemetry

Types:

```python
from llama_stack_client.types import (
    QuerySpansResponse,
    SpanWithStatus,
    Trace,
    TelemetryGetSpanResponse,
    TelemetryGetSpanTreeResponse,
    TelemetryQuerySpansResponse,
    TelemetryQueryTracesResponse,
)
```

Methods:

## Datasetio

Types:

```python
from llama_stack_client.types import PaginatedRowsResult
```

Methods:

## Scoring

Types:

```python
from llama_stack_client.types import ScoringScoreResponse, ScoringScoreBatchResponse
```

Methods:

## ScoringFunctions

Types:

```python
from llama_stack_client.types import (
    ListScoringFunctionsResponse,
    ScoringFn,
    ScoringFunctionListResponse,
)
```

Methods:

## Benchmarks

Types:

```python
from llama_stack_client.types import (
    Benchmark,
    ListBenchmarksResponse,
    BenchmarkListResponse,
)
```

Methods: