# What does this PR do?
now that /v1/inference/completion has been removed, no docs should refer
to it
this cleans up remaining references
## Test Plan
ci
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
# What does this PR do?
- Documentation update and fix for the NVIDIA Inference provider.
- Update the `run_moderation` for safety API with a
`NotImplementedError` placeholder. Otherwise initialization NVIDIA
inference client will raise an error.
## Test Plan
N/A
# What does this PR do?
NVIDIA asymmetric embedding models (e.g.,
`nvidia/llama-3.2-nv-embedqa-1b-v2`) require an `input_type` parameter
not present in the standard OpenAI embeddings API. This PR adds the
`input_type="query"` as default and updates the documentation to suggest
using the `embedding` API for passage embeddings.
<!-- If resolving an issue, uncomment and update the line below -->
Resolves#2892
## Test Plan
```
pytest -s -v tests/integration/inference/test_openai_embeddings.py --stack-config="inference=nvidia" --embedding-model="nvidia/llama-3.2-nv-embedqa-1b-v2" --env NVIDIA_API_KEY={nvidia_api_key} --env NVIDIA_BASE_URL="https://integrate.api.nvidia.com"
```
As the title says. Distributions is in, Templates is out.
`llama stack build --template` --> `llama stack build --distro`. For
backward compatibility, the previous option is kept but results in a
warning.
Updated `server.py` to remove the "config_or_template" backward
compatibility since it has been a couple releases since that change.
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
This PR is responsible for removal of Conda support in Llama Stack
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
Closes#2539
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
# What does this PR do?
Add NVIDIA platform docs that serve as a starting point for Llama Stack
users and explains all supported microservices.
[//]: # (If resolving an issue, uncomment and update the line below)
[//]: # (Closes #[issue-number])
## Test Plan
[Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.*]
[//]: # (## Documentation)
---------
Co-authored-by: Jash Gulabrai <jgulabrai@nvidia.com>