llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-12-27 23:31:59 +00:00

Matthew Farrellee 99bd39cc30 feat: use openai-python for openai inference provider

fixes #2121

this implementation splits responsibility between the litellm and openai libraries -

| Inference Method           | Implementation Source    |
|----------------------------|--------------------------|
| completion                 | LiteLLMOpenAIMixin       |
| chat_completion            | LiteLLMOpenAIMixin       |
| embedding                  | LiteLLMOpenAIMixin       |
| batch_completion           | LiteLLMOpenAIMixin       |
| batch_chat_completion      | LiteLLMOpenAIMixin       |
| openai_completion          | AsyncOpenAI              |
| openai_chat_completion     | AsyncOpenAI              |

test with -

$ OPENAI_API_KEY=$LLAMA_API_KEY OPENAI_BASE_URL=https://api.llama.com/compat/v1 llama stack build --image-type conda --image-name openai --providers inference=remote::openai --run

$ llama-stack-client models register Llama-4-Scout-17B-16E-Instruct-FP8

$ curl "http://localhost:8321/v1/openai/v1/chat/completions" -H "Content-Type: application/json" \
  -d '{
    "model": "Llama-4-Scout-17B-16E-Instruct-FP8",
    "messages": [
      {"role": "user", "content": "Hello Llama! Can you give me a quick intro?"}
    ]
  }'

{"id":"AmPwrrkc5JgVjejPdIPrpT2","choices":[{"finish_reason":"stop","index":0,"logprobs":{"content":null,"refusal":null},"message":{"content":"Hello! I'm Llama, a Meta-designed model that adapts to your conversational style. Whether you need quick answers, deep dives into ideas, or just want to vent, joke, or brainstorm—I'm here for it. What’s on your mind?","refusal":"","role":"assistant","annotations":null,"audio":null,"function_call":null,"tool_calls":null,"id":"AmPwrrkc5JgVjejPdIPrpT2"}}],"created":1747410061,"model":"Llama-4-Scout-17B-16E-Instruct-FP8","object":"chat.completions","service_tier":null,"system_fingerprint":null,"usage":{"completion_tokens":54,"prompt_tokens":22,"total_tokens":76,"completion_tokens_details":null,"prompt_tokens_details":null}}

2025-05-16 11:47:02 -04:00
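
The curl test in the commit message can also be expressed in Python. The following is a minimal sketch, not part of the commit: it uses only the standard library, takes the URL, model name, and prompt from the commit message, and assumes a llama stack server is listening on localhost:8321 with the model already registered. The helper names `build_chat_request` and `send` are hypothetical and exist only for this illustration.

```python
import json
import urllib.request

# Values copied from the commit message's curl example.
BASE_URL = "http://localhost:8321/v1/openai/v1"
MODEL = "Llama-4-Scout-17B-16E-Instruct-FP8"


def build_chat_request(prompt, model=MODEL, base_url=BASE_URL):
    """Build (but do not send) the POST request that the curl command issues.

    Hypothetical helper for illustration; it is not part of llama-stack.
    """
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send the request and decode the JSON reply.

    Assumes the server from the commit message's test steps is running.
    """
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Against a running server, `send(build_chat_request("Hello Llama! Can you give me a quick intro?"))` should return a dictionary shaped like the JSON response shown in the commit message, with the reply text under `["choices"][0]["message"]["content"]`.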
agents	test: add unit test to ensure all config types are instantiable (#1601)	2025-03-12 22:29:58 -07:00
datasetio	chore(refact): move paginate_records fn outside of datasetio (#2137)	2025-05-12 10:56:14 -07:00
eval	chore: enable pyupgrade fixes (#1806)	2025-05-01 14:23:50 -07:00
inference	feat: use openai-python for openai inference provider	2025-05-16 11:47:02 -04:00
post_training	chore: enable pyupgrade fixes (#1806)	2025-05-01 14:23:50 -07:00
safety	chore: enable pyupgrade fixes (#1806)	2025-05-01 14:23:50 -07:00
tool_runtime	chore: enable pyupgrade fixes (#1806)	2025-05-01 14:23:50 -07:00
vector_io	fix: chromadb type hint (#2136)	2025-05-12 06:27:01 -07:00
__init__.py	`impls` -> `inline`, `adapters` -> `remote` (#381)	2024-11-06 14:54:05 -08:00