forked from phoenix-oss/llama-stack-mirror
cerebras template update for memory (#792)
# What does this PR do?

- we no longer have meta-reference as a memory provider; update the cerebras template accordingly

## Test Plan

```
python llama_stack/scripts/distro_codegen.py
```

## Sources

Please link relevant resources if necessary.

## Before submitting

- [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
- [ ] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md), Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
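The change described above (dropping `meta-reference` as the memory provider) would surface in the template's generated `run.yaml`. A minimal sketch of what the updated memory provider entry might look like, assuming the replacement is the inline FAISS provider — the `provider_id`/`provider_type` values here are illustrative assumptions, not taken from this diff:

```yaml
# Hypothetical fragment of llama_stack/templates/cerebras/run.yaml
# (provider_id and provider_type are assumed, not confirmed by this PR)
providers:
  memory:
  - provider_id: faiss
    provider_type: inline::faiss
    config: {}
```

Running `python llama_stack/scripts/distro_codegen.py` (the test plan above) regenerates the distribution files, so template edits like this are validated by checking the codegen output is unchanged.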
This commit is contained in:

parent 48b12b9777
commit d1f3b032c9

37 changed files with 14 additions and 39 deletions
```diff
@@ -1,5 +1,4 @@
 version: '2'
 name: meta-reference-quantized-gpu
 distribution_spec:
   description: Use Meta Reference with fp8, int4 quantization for running LLM inference
   providers:
```
```diff
@@ -1,6 +1,5 @@
 version: '2'
 image_name: meta-reference-quantized-gpu
 conda_env: meta-reference-quantized-gpu
 apis:
 - agents
 - datasetio
```