llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-12-03 09:53:45 +00:00


Wen Zhou 4bca4af3e4	2025-07-06 09:07:37 +05:30

refactor: set proper name for embedding all-minilm:l6-v2 and update to use "starter" in detailed_tutorial (#2627)

# What does this PR do?

- we are using `all-minilm:l6-v2` but the model we download from ollama is `all-minilm:latest`
  latest: https://ollama.com/library/all-minilm:latest 1b226e2802db
  l6-v2: https://ollama.com/library/all-minilm:l6-v2 pin 1b226e2802db
- even currently they are exactly the same model but if [all-minilm:l12-v2](https://ollama.com/library/all-minilm:l12-v2) is updated, "latest" might not be the same for l6-v2.
- the only change in this PR is pin the model id in ollama
- also update detailed_tutorial with "starter" to replace deprecated "ollama".

## Test Plan

```
>INFERENCE_MODEL="meta-llama/Llama-3.2-3B-Instruct"
>llama stack build --run --template ollama --image-type venv
...
Build Successful!
You can find the newly-built template here: /home/wenzhou/zdtsw-forking/lls/llama-stack/llama_stack/templates/ollama/run.yaml
....
- metadata:
    embedding_dimension: 384
  model_id: all-MiniLM-L6-v2
  model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
  - embedding
  provider_id: ollama
  provider_model_id: all-minilm:l6-v2
...
```

test

```
>llama-stack-client inference chat-completion --message "Write me a 2-sentence poem about the moon"
INFO:httpx:HTTP Request: GET http://localhost:8321/v1/models "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:8321/v1/openai/v1/chat/completions "HTTP/1.1 200 OK"
OpenAIChatCompletion(
    id='chatcmpl-04f99071-3da2-44ba-a19f-03b5b7fc70b7',
    choices=[
        OpenAIChatCompletionChoice(
            finish_reason='stop',
            index=0,
            message=OpenAIChatCompletionChoiceMessageOpenAIAssistantMessageParam(
                role='assistant',
                content="Here is a 2-sentence poem about the moon:\n\nSilver crescent in the midnight sky,\nLuna's gentle face, a beauty to the eye.",
                name=None,
                tool_calls=None,
                refusal=None,
                annotations=None,
                audio=None,
                function_call=None
            ),
            logprobs=None
        )
    ],
    created=1751644429,
    model='llama3.2:3b-instruct-fp16',
    object='chat.completion',
    service_tier=None,
    system_fingerprint='fp_ollama',
    usage={'completion_tokens': 33, 'prompt_tokens': 36, 'total_tokens': 69, 'completion_tokens_details': None, 'prompt_tokens_details': None}
)
```

---------

Signed-off-by: Wen Zhou <wenzhou@redhat.com>
building_applications	feat: improve telemetry (#2590)	2025-07-04 17:29:09 +02:00
concepts	docs: specify the ability to train non-Llama models (#2573)	2025-07-01 19:29:06 +05:30
contributing	docs: revamp testing documentation (#2155)	2025-05-13 11:28:29 -07:00
distributions	refactor: set proper name for embedding all-minilm:l6-v2 and update to use "starter" in detailed_tutorial (#2627)	2025-07-06 09:07:37 +05:30
getting_started	refactor: set proper name for embedding all-minilm:l6-v2 and update to use "starter" in detailed_tutorial (#2627)	2025-07-06 09:07:37 +05:30
introduction	docs: Remove mentions of focus on Llama models (#1690)	2025-03-19 00:17:22 -04:00
openai	docs: Add OpenAI API compatibility page (#2316)	2025-06-04 06:51:52 -04:00
playground	chore: simplify running the demo UI (#1907)	2025-04-09 11:22:29 -07:00
providers	feat: improve telemetry (#2590)	2025-07-04 17:29:09 +02:00
references	chore: remove last instances of code-interpreter provider (#2143)	2025-05-12 10:54:43 -07:00
conf.py	fix: use pypi browser agent (#2260)	2025-05-24 23:26:30 -07:00
index.md	docs: update full list of providers with matched APIs and dockerhub images (#2452)	2025-07-03 10:12:56 +02:00