Commit graph

3 commits

Author SHA1 Message Date
Sébastien Han
ac5fd57387
chore: remove nested imports (#2515)
# What does this PR do?

* Given that our API packages use `import *` in `__init__.py`, we don't
need to import from `llama_stack.apis.models.models`; importing from
`llama_stack.apis.models` is enough (see the sketch below). The decision to
use `import *` is debatable and should probably be revisited at some point.

* Remove the unneeded Ruff F401 rule
* Consolidate the Ruff F403 rule in `pyproject.toml`
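
For illustration, a minimal sketch of the re-export pattern this relies on; the `Model` symbol and the `__init__.py` contents shown are assumptions, not the PR's actual code:

```python
# Assumed contents of llama_stack/apis/models/__init__.py:
#
#     from .models import *  # noqa: F403
#
# Because of that re-export, reaching into the nested module is unnecessary.

# before: from llama_stack.apis.models.models import Model
from llama_stack.apis.models import Model  # "Model" is a hypothetical symbol
```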

Signed-off-by: Sébastien Han <seb@redhat.com>
2025-06-26 08:01:05 +05:30
Eran Cohen
747e594680
feat: expand set of known gemini models (#2471)
Some checks failed
(CI summary: Pre-commit and Test Llama Stack Build / generate-matrix passed; the Python Package Build Test, Unit Tests, Test Llama Stack Build, Test External Providers, and Integration Tests test-matrix jobs all failed.)
feat: Add Gemini 2.0 and 2.5 models

This commit expands the set of known Gemini models by introducing:
- `gemini/gemini-2.0-flash`
- `gemini/gemini-2.5-flash`
- `gemini/gemini-2.5-pro`

These new models are added to `LLM_MODEL_IDS` for broader compatibility
and to `run.yaml` so they can be used immediately in starter
configurations (see the sketch below).
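
As a rough sketch of the shape of the change (the pre-existing entries and the file's real layout are assumptions):

```python
# Hypothetical excerpt of the Gemini provider's known-model list;
# the 1.5 entries are assumed to pre-date this PR.
LLM_MODEL_IDS = [
    "gemini/gemini-1.5-flash",
    "gemini/gemini-1.5-pro",
    "gemini/gemini-2.0-flash",  # added by this PR
    "gemini/gemini-2.5-flash",  # added by this PR
    "gemini/gemini-2.5-pro",    # added by this PR
]
```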

Signed-off-by: Eran Cohen <eranco@redhat.com>
2025-06-19 12:19:37 -04:00
Ashwin Bharambe
63e6acd0c3
feat: add (openai, anthropic, gemini) providers via litellm (#1267)
# What does this PR do?

This PR introduces more non-Llama model support to Llama Stack.
Providers introduced: openai, anthropic, and gemini. All of these
providers share essentially the same code -- the implementation works via
the `litellm` library (see the sketch below).

For each provider we enable, we expose only specific models, making sure
they all work well and pass tests. This setup (instead of automatically
enabling _all_ providers and models allowed by LiteLLM) also ensures we
can perform any needed prompt tuning on a per-model basis (just as we do
for Llama models).
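
A minimal sketch of the shared pattern (not the PR's actual adapter code; it assumes `litellm` is installed and the providers' API keys are set in the environment):

```python
# All three providers route through litellm's unified completion API,
# which is why a single implementation can back openai, anthropic, and gemini.
import litellm

for model in [
    "openai/gpt-4o",
    "anthropic/claude-3-5-sonnet-latest",
    "gemini/gemini-1.5-flash",
]:
    response = litellm.completion(
        model=model,
        messages=[{"role": "user", "content": "Say hello in one word."}],
    )
    print(model, "->", response.choices[0].message.content)
```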

## Test Plan

```bash
#!/bin/bash
# Run the client SDK text-inference tests against one model from each
# new provider; any extra arguments are forwarded to pytest.

args=("$@")
for model in openai/gpt-4o anthropic/claude-3-5-sonnet-latest gemini/gemini-1.5-flash; do
    LLAMA_STACK_CONFIG=dev pytest -s -v tests/client-sdk/inference/test_text_inference.py \
        --embedding-model=all-MiniLM-L6-v2 \
        --vision-inference-model="" \
        --inference-model="$model" "${args[@]}"
done
```
2025-02-25 22:07:33 -08:00