llama-stack-mirror/llama_stack/distribution
Hardik Shah a51c8b4efc
Convert SamplingParams.strategy to a union (#767)
# What does this PR do?

Cleans up how we provide sampling params. Earlier, strategy was an enum
and all params (top_p, temperature, top_k) across all strategies were
grouped. We now have a strategy union object with each strategy (greedy,
top_p, top_k) having its corresponding params.
Earlier:
```
class SamplingParams:
    strategy: Enum  # greedy | top_p | top_k
    # top_p, temperature, top_k, and other params, grouped together
    # regardless of which strategy was selected
```
However, the `strategy` field was not used by any provider, so the exact
sampling behavior could not be determined from the params alone: you could
pass temperature, top_p, and top_k together, and it was unclear how a
provider would interpret them.

Hence we introduced a union where each strategy is clubbed together with its
relevant params, avoiding this confusion.

Have updated all providers, tests, notebooks, the README, and other places
where sampling params were being used to use the new format.
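As a rough sketch of the shape this change takes (names and defaults here are illustrative, not the exact llama-stack definitions), the union ties each strategy to only the params that apply to it, so a provider can dispatch on the strategy type instead of guessing from loose fields:

```python
# Hypothetical sketch of a union-style SamplingParams (illustrative names).
from dataclasses import dataclass, field
from typing import Union


@dataclass
class GreedySamplingStrategy:
    type: str = "greedy"


@dataclass
class TopPSamplingStrategy:
    temperature: float = 1.0
    top_p: float = 0.95
    type: str = "top_p"


@dataclass
class TopKSamplingStrategy:
    top_k: int = 40
    type: str = "top_k"


SamplingStrategy = Union[
    GreedySamplingStrategy, TopPSamplingStrategy, TopKSamplingStrategy
]


@dataclass
class SamplingParams:
    # Strategy-specific params live inside the strategy object itself.
    strategy: SamplingStrategy = field(default_factory=GreedySamplingStrategy)
    max_tokens: int = 0
    repetition_penalty: float = 1.0


def to_provider_kwargs(params: SamplingParams) -> dict:
    """A provider can now dispatch unambiguously on the strategy type."""
    s = params.strategy
    if isinstance(s, TopPSamplingStrategy):
        return {"temperature": s.temperature, "top_p": s.top_p}
    if isinstance(s, TopKSamplingStrategy):
        return {"top_k": s.top_k}
    return {"temperature": 0.0}  # greedy decoding
```

With the old grouped layout, a caller could set both `top_p` and `top_k` and each provider would resolve the conflict differently; with the union, that combination is unrepresentable.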
   

## Test Plan
`pytest llama_stack/providers/tests/inference/groq/test_groq_utils.py`
// inference on ollama, fireworks and together 
`with-proxy pytest -v -s -k "ollama"
--inference-model="meta-llama/Llama-3.1-8B-Instruct"
llama_stack/providers/tests/inference/test_text_inference.py `
// agents on fireworks 
`pytest -v -s -k 'fireworks and create_agent'
--inference-model="meta-llama/Llama-3.1-8B-Instruct"
llama_stack/providers/tests/agents/test_agents.py
--safety-shield="meta-llama/Llama-Guard-3-8B"`

## Before submitting

- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [X] Ran pre-commit to handle lint / formatting issues.
- [X] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
      Pull Request section?
- [X] Updated relevant documentation.
- [X] Wrote necessary unit or integration tests.

---------

Co-authored-by: Hardik Shah <hjshah@fb.com>
2025-01-15 05:38:51 -08:00
routers remove conflicting default for tool prompt format in chat completion (#742) 2025-01-10 10:41:53 -08:00
server rename LLAMASTACK_PORT to LLAMA_STACK_PORT for consistency with other env vars (#744) 2025-01-10 11:09:49 -08:00
store Fix broken tests in test_registry (#707) 2025-01-14 14:33:15 -08:00
ui Convert SamplingParams.strategy to a union (#767) 2025-01-15 05:38:51 -08:00
utils Ensure model_local_dir does not mangle "C:\" on Windows 2024-11-24 14:18:59 -08:00
__init__.py API Updates (#73) 2024-09-17 19:51:35 -07:00
build.py Switch to use importlib instead of deprecated pkg_resources (#678) 2025-01-13 20:20:02 -08:00
build_conda_env.sh added support of PYPI_VERSION in stack build (#762) 2025-01-14 13:45:42 -08:00
build_container.sh added support of PYPI_VERSION in stack build (#762) 2025-01-14 13:45:42 -08:00
build_venv.sh Miscellaneous fixes around telemetry, library client and run yaml autogen 2024-12-08 20:40:22 -08:00
client.py use API version in "remote" stack client 2024-11-19 15:59:47 -08:00
common.sh API Updates (#73) 2024-09-17 19:51:35 -07:00
configure.py [remove import *] clean up import *'s (#689) 2024-12-27 15:45:44 -08:00
configure_container.sh docker: Check for selinux before using --security-opt (#167) 2024-10-02 10:37:41 -07:00
datatypes.py agents to use tools api (#673) 2025-01-08 19:01:00 -08:00
distribution.py Tools API with brave and MCP providers (#639) 2024-12-19 21:25:17 -08:00
inspect.py add --version to llama stack CLI & /version endpoint (#732) 2025-01-08 16:30:06 -08:00
library_client.py [bugfix] fix streaming GeneratorExit exception with LlamaStackAsLibraryClient (#760) 2025-01-14 10:58:46 -08:00
request_headers.py Add X-LlamaStack-Client-Version, rename ProviderData -> Provider-Data (#735) 2025-01-09 11:51:36 -08:00
resolver.py agents to use tools api (#673) 2025-01-08 19:01:00 -08:00
stack.py Switch to use importlib instead of deprecated pkg_resources (#678) 2025-01-13 20:20:02 -08:00
start_conda_env.sh Move to use argparse, fix issues with multiple --env cmdline options 2024-11-18 16:31:59 -08:00
start_container.sh rename LLAMASTACK_PORT to LLAMA_STACK_PORT for consistency with other env vars (#744) 2025-01-10 11:09:49 -08:00