Ashwin Bharambe
6229562760
Organize references
2024-11-22 16:46:45 -08:00
Ashwin Bharambe
97dc5b68e5
model -> model_id for TGI
2024-11-22 15:40:08 -08:00
Ashwin Bharambe
c2c53d0272
More doc cleanup
2024-11-22 14:37:22 -08:00
Ashwin Bharambe
900b0556e7
Much more documentation work, things are getting a bit consumable right now
2024-11-22 14:06:18 -08:00
Ashwin Bharambe
98e213e96c
More docs work
2024-11-22 14:06:18 -08:00
Ashwin Bharambe
eb2063bc3d
Updates to the main doc page
2024-11-22 14:06:18 -08:00
Ashwin Bharambe
a0a00f1345
Update telemetry to have TEXT be the default log format
2024-11-21 15:18:45 -08:00
Ashwin Bharambe
55c55b9f51
Update Quick Start significantly
2024-11-21 13:20:55 -08:00
Ashwin Bharambe
cf079a22a0
Plurals
2024-11-20 23:24:59 -08:00
Ashwin Bharambe
cd6ccb664c
Integrate distro docs into the restructured docs
2024-11-20 23:20:05 -08:00
Ashwin Bharambe
2411a44833
Update more distribution docs to be simpler and partially codegen'ed
2024-11-20 22:03:44 -08:00
Dinesh Yeduguru
b3f9e8b2f2
Restructure docs ( #494 )
...
Rendered docs at: https://llama-stack.readthedocs.io/en/doc-simplify/
2024-11-20 15:54:47 -08:00
Xi Yan
b0fdf7552a
docs
2024-11-19 16:41:45 -08:00
Xi Yan
c49acc5226
docs
2024-11-19 16:39:40 -08:00
Xi Yan
f78200b189
docs
2024-11-19 16:37:30 -08:00
Xi Yan
2da93c8835
fix 3.2-1b fireworks
2024-11-19 14:20:07 -08:00
Xi Yan
189df6358a
codegen docs
2024-11-19 14:16:00 -08:00
Xi Yan
1b0f5fff5a
fix curl endpoint
2024-11-19 10:26:05 -08:00
Ashwin Bharambe
e8d3eee095
Fix docs yet again
2024-11-18 23:51:35 -08:00
Ashwin Bharambe
d463d68e1e
Update docs
2024-11-18 23:21:25 -08:00
Ashwin Bharambe
7693786322
Use HF names for registering fireworks and together models
2024-11-18 22:34:47 -08:00
Riandy
2108a779f2
Update kotlin client docs ( #476 )
...
# What does this PR do?
Add Kotlin package link into readme docs
2024-11-19 08:43:20 +05:30
Ashwin Bharambe
939056e265
More documentation fixes
2024-11-18 17:06:13 -08:00
Ashwin Bharambe
e40404625b
Update to docs
2024-11-18 16:52:48 -08:00
Ashwin Bharambe
afa4f0b19f
Update remote vllm docs
2024-11-18 16:34:33 -08:00
Ashwin Bharambe
47c37fd831
Fixes
2024-11-18 16:03:53 -08:00
Ashwin Bharambe
3aedde2ab4
Add a pre-commit for distro_codegen but it does not work yet
2024-11-18 15:21:13 -08:00
Ashwin Bharambe
2a31163178
Auto-generate distro yamls + docs ( #468 )
...
# What does this PR do?
Automatically generates
- build.yaml
- run.yaml
- run-with-safety.yaml
- parts of markdown docs
for the distributions.
## Test Plan
At this point, this only updates the YAMLs and the docs. Some testing
(especially with ollama and vllm) has been performed, but much more
testing is needed.
2024-11-18 14:57:06 -08:00
Xi Yan
59a65e34d3
Update new_api_provider.md
2024-11-13 00:02:13 -05:00
Dinesh Yeduguru
fdff24e77a
Inference to use provider resource id to register and validate ( #428 )
...
This PR changes how a model ID is translated into the final model name
that is passed to the provider.
Major changes include:
1) Providers are responsible for registering an object and, as part of
the registration, returning the object with the correct provider-specific
model name set in provider_resource_id.
2) A new ModelLookup class is added to help with common lookups across
the different names.
Tested all inference providers including together, fireworks, vllm,
ollama, meta reference and bedrock
2024-11-12 20:02:00 -08:00
Ashwin Bharambe
3d7561e55c
Rename all inline providers with an inline:: prefix ( #423 )
2024-11-11 22:19:16 -08:00
Ashwin Bharambe
c1f7ba3aed
Split safety into (llama-guard, prompt-guard, code-scanner) ( #400 )
...
Splits the meta-reference safety implementation into three distinct providers:
- inline::llama-guard
- inline::prompt-guard
- inline::code-scanner
Note that this PR is a backward-incompatible change to the llama stack server. I have added a deprecation_error field to ProviderSpec -- the server reads it and immediately barfs. This is used to direct the user with a specific message on what action to perform. An automagical "config upgrade" is a bit too much work to implement right now :/
(Note that we will be gradually prefixing all inline providers with inline:: -- I am only doing this for this set of new providers because otherwise existing configuration files will break even more badly.)
2024-11-11 09:29:18 -08:00
Xi Yan
b0b9c905b3
docs
2024-11-09 10:22:41 -08:00
Xi Yan
cc61fd8083
docs
2024-11-09 09:00:18 -08:00
Xi Yan
0c14761453
docs
2024-11-09 08:57:51 -08:00
Ashwin Bharambe
4986e46188
Distributions updates (slight updates to ollama, add inline-vllm and remote-vllm) ( #408 )
...
* remote vllm distro
* add inline-vllm details, fix things
* Write some docs
2024-11-08 18:09:39 -08:00
Xi Yan
bd0622ef10
update docs
2024-11-08 12:47:05 -08:00
Xi Yan
7ee9f8d8ac
rename
2024-11-08 10:34:48 -08:00
Xi Yan
b1d7376730
kill tgi/cpu
2024-11-08 10:33:45 -08:00
Xi Yan
8350f2df4c
[docs] refactor remote-hosted distro ( #402 )
...
* move docs
* docs
2024-11-07 19:16:38 -08:00
Ashwin Bharambe
064d2a5287
Remove the safety adapter for Together; we can just use "meta-reference" ( #387 )
2024-11-06 17:36:57 -08:00
Ashwin Bharambe
994732e2e0
impls
-> inline
, adapters
-> remote
(#381 )
2024-11-06 14:54:05 -08:00
Dinesh Yeduguru
093c9f1987
add bedrock distribution code ( #358 )
...
* add bedrock distribution code
* fix linter error
* add bedrock shields support
* linter fixes
* working bedrock safety
* change to return only one violation
* remove env var reading
* refreshable boto credentials
* remove env vars
* address raghu's feedback
* fix session_ttl passing
---------
Co-authored-by: Dinesh Yeduguru <dineshyv@fb.com>
2024-11-06 14:39:11 -08:00
Xi Yan
748606195b
Kill llama stack configure ( #371 )
...
* remove configure
* build msg
* wip
* build->run
* delete prints
* docs
* fix docs, kill configure
* precommit
* update fireworks build
* docs
* clean up build
* comments
* fix
* test
* remove baking build.yaml into docker
* fix msg, urls
* configure msg
2024-11-06 13:32:10 -08:00
Xi Yan
db30809141
precommit
2024-11-05 15:26:13 -08:00
Xi Yan
0706f6c82f
add Llama3.2-3B-Instruct:int4-qlora-eo8
2024-11-05 15:22:26 -08:00
Xi Yan
16b7fa4614
quantized model docs
2024-11-05 15:21:13 -08:00
Xi Yan
c810a4184d
[docs] update documentations ( #356 )
...
* move docs -> source
* Add files via upload
* mv image
* Add files via upload
* colocate iOS setup doc
* delete image
* Add files via upload
* fix
* delete image
* Add files via upload
* Update developer_cookbook.md
* toctree
* wip subfolder
* docs update
* subfolder
* updates
* name
* updates
* index
* updates
* refactor structure
* depth
* docs
* content
* docs
* getting started
* distributions
* fireworks
* fireworks
* update
* theme
* theme
* theme
* pdj theme
* pytorch theme
* css
* theme
* agents example
* format
* index
* headers
* copy button
* test tabs
* test tabs
* fix
* tabs
* tab
* tabs
* sphinx_design
* quick start commands
* size
* width
* css
* css
* download models
* aesthetic fix
* tab format
* update
* css
* width
* css
* docs
* tab based
* tab
* tabs
* docs
* style
* image
* css
* color
* typo
* update docs
* missing links
* list templates
* links
* links update
* troubleshooting
* fix
* distributions
* docs
* fix table
* kill llamastack-local-gpu/cpu
* Update index.md
* Update index.md
* mv ios_setup.md
* Update ios_setup.md
* Add remote_or_local.gif
* Update ios_setup.md
* release notes
* typos
* Add ios_setup to index
* nav bar
* hide toctree
* ios image
* links update
* rename
* rename
* docs
* rename
* links
* distributions
* distributions
* distributions
* distributions
* remove release
* remote
---------
Co-authored-by: dltn <6599399+dltn@users.noreply.github.com>
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
2024-11-04 16:52:38 -08:00
Ashwin Bharambe
4aa1bf6a60
Kill --name from llama stack build ( #340 )
2024-10-28 23:07:32 -07:00
raghotham
e2a5a2e10d
first version of readthedocs ( #278 )
2024-10-22 10:15:58 +05:30