Commit graph

5 commits

Author SHA1 Message Date
Xi Yan
6765fd76ff
fix llama stack build for together & llama stack build from templates (#479)
# What does this PR do?

- Fix an issue with `llama stack build` when using the together template
<img width="669" alt="image"
src="https://github.com/user-attachments/assets/1cbef052-d902-40b9-98f8-37efb494d117">

- For builds from templates, copy the
`templates/<template-name>/run.yaml` file to
`~/.llama/distributions/<name>/<name>-run.yaml` instead of rebuilding
the run config.
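
The copy step described above might look roughly like the following sketch. The function name `copy_template_run_config` and its parameters are hypothetical and not taken from the llama-stack code base; only the two path patterns come from this description.

```python
import shutil
from pathlib import Path


def copy_template_run_config(
    templates_root: Path,
    template_name: str,
    distributions_root: Path,
    name: str,
) -> Path:
    """Copy <templates_root>/<template_name>/run.yaml to
    <distributions_root>/<name>/<name>-run.yaml and return the new path.

    Hypothetical helper for illustration only.
    """
    src = templates_root / template_name / "run.yaml"
    dest_dir = distributions_root / name
    # Create the distribution directory if it does not exist yet.
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / f"{name}-run.yaml"
    # Reuse the template's run config verbatim instead of regenerating it.
    shutil.copy(src, dest)
    return dest
```

Under these assumptions, a caller would pass `Path.home() / ".llama" / "distributions"` as `distributions_root` to match the destination pattern above.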


## Test Plan

```
$ llama stack build --template together --image-type conda
..
Build spec configuration saved at /opt/anaconda3/envs/llamastack-together/together-build.yaml
Build Successful! Next steps:
   1. Set the environment variables: LLAMASTACK_PORT, TOGETHER_API_KEY
   2. `llama stack run /Users/xiyan/.llama/distributions/llamastack-together/together-run.yaml`
```

```
$ llama stack run /Users/xiyan/.llama/distributions/llamastack-together/together-run.yaml
```

```
$ llama-stack-client models list
$ pytest -v -s -m remote agents/test_agents.py --env REMOTE_STACK_URL=http://localhost:5000 --inference-model meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo
```
<img width="764" alt="image"
src="https://github.com/user-attachments/assets/b805b6c5-a316-4561-8fe3-24fc3b1f8b80">


2024-11-18 22:29:16 -08:00
Ashwin Bharambe
2a31163178
Auto-generate distro yamls + docs (#468)
# What does this PR do?

Automatically generates
- build.yaml
- run.yaml
- run-with-safety.yaml
- parts of markdown docs

for the distributions.

## Test Plan

At this point, this only updates the YAMLs and the docs. Some testing
(especially with ollama and vllm) has been performed, but much more
testing is needed.
2024-11-18 14:57:06 -08:00
Dinesh Yeduguru
fdff24e77a
Inference to use provider resource id to register and validate (#428)
This PR changes how a model id is translated to the final model name
that is passed through to the provider.
Major changes include:
1) Providers are responsible for registering an object and, as part of
the registration, returning the object with the correct provider-specific
name of the model (provider_resource_id).
2) To help with common lookups across different names, a new ModelLookup
class is created.
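
The registration flow described above could be sketched as follows. This is an illustration under stated assumptions, not the actual implementation: `AliasLookup`, `Model`, and `register_model` are hypothetical names, and the real ModelLookup class may be structured differently.

```python
from dataclasses import dataclass
from typing import Dict, Optional


class AliasLookup:
    """Hypothetical helper: maps any known name of a model to the
    provider-specific name (the provider_resource_id)."""

    def __init__(self) -> None:
        self._alias_to_provider_id: Dict[str, str] = {}

    def add(self, provider_resource_id: str, *aliases: str) -> None:
        # The provider-specific name always resolves to itself.
        self._alias_to_provider_id[provider_resource_id] = provider_resource_id
        for alias in aliases:
            self._alias_to_provider_id[alias] = provider_resource_id

    def resolve(self, name: str) -> Optional[str]:
        return self._alias_to_provider_id.get(name)


@dataclass
class Model:
    identifier: str
    provider_resource_id: Optional[str] = None


def register_model(model: Model, lookup: AliasLookup) -> Model:
    """Validate the model against the lookup and return it with the
    provider-specific name filled in (hypothetical provider-side step)."""
    requested = model.provider_resource_id or model.identifier
    resolved = lookup.resolve(requested)
    if resolved is None:
        raise ValueError(f"Model '{requested}' is not supported by this provider")
    model.provider_resource_id = resolved
    return model
```

In this sketch, a provider would populate the lookup once with the names it supports, then call `register_model` for each incoming registration; an unknown name fails at registration time rather than at inference time.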



Tested all inference providers, including together, fireworks, vllm,
ollama, meta reference, and bedrock.
2024-11-12 20:02:00 -08:00
Xi Yan
36e2538eb0
fix together inference validator (#393)
2024-11-07 11:31:53 -08:00
Ashwin Bharambe
994732e2e0
impls -> inline, adapters -> remote (#381)
2024-11-06 14:54:05 -08:00