Update kotlin docs to 0.0.58 (#614)
Docs changes to reflect latest SDK version 0.0.58
parent 2a9b13dd52
commit 53b3a1e345

1 changed file with 28 additions and 11 deletions
@@ -8,12 +8,14 @@ Features:
 - Remote Inferencing: Perform inferencing tasks remotely with Llama models hosted on a remote connection (or serverless localhost).
 - Simple Integration: With easy-to-use APIs, a developer can quickly integrate Llama Stack into their Android app. The difference between local and remote inferencing is also minimal.
 
-Latest Release Notes: [v0.0.54.1](https://github.com/meta-llama/llama-stack-client-kotlin/releases/tag/v0.0.54.1)
+Latest Release Notes: [v0.0.58](https://github.com/meta-llama/llama-stack-client-kotlin/releases/tag/v0.0.58)
 
 *Tagged releases are stable versions of the project. While we strive to maintain a stable main branch, it's not guaranteed to be free of bugs or issues.*
 
 ## Android Demo App
-Check out our demo app to see how to integrate Llama Stack into your Android app: [Android Demo App](https://github.com/meta-llama/llama-stack-apps/tree/main/examples/android_app)
+Check out our demo app to see how to integrate Llama Stack into your Android app: [Android Demo App](https://github.com/meta-llama/llama-stack-apps/tree/android-kotlin-app-latest/examples/android_app)
 
-The key files in the app are `LlamaStackLocalInference.kt`, `LlamaStackRemoteInference.kts`, and `MainActivity.java`. With encompassed business logic, the app shows how to use Llama Stack for both the environments.
+The key files in the app are `ExampleLlamaStackLocalInference.kt`, `ExampleLlamaStackRemoteInference.kts`, and `MainActivity.java`. Together with the surrounding business logic, they show how to use Llama Stack in both environments.
 
 ## Quick Start
@@ -22,7 +24,7 @@ The key files in the app are `LlamaStackLocalInference.kt`, `LlamaStackRemoteInf
 Add the following dependency in your `build.gradle.kts` file:
 ```
 dependencies {
-    implementation("com.llama.llamastack:llama-stack-client-kotlin:0.0.54.1")
+    implementation("com.llama.llamastack:llama-stack-client-kotlin:0.0.58")
 }
 ```
 This will download the jar files into your Gradle cache, in a directory like `~/.gradle/caches/modules-2/files-2.1/com.llama.llamastack/`.
@@ -34,10 +36,10 @@ If you plan on doing remote inferencing this is sufficient to get started.
 For local inferencing, you must include the ExecuTorch library in your app.
 
 Include the ExecuTorch library by:
-1. Download the `download-prebuilt-et-lib.sh` script file from the [llama-stack-client-kotlin-client-local](https://github.com/meta-llama/llama-stack-client-kotlin/blob/release/0.0.54.1/llama-stack-client-kotlin-client-local/download-prebuilt-et-lib.sh) directory to your local machine.
+1. Download the `download-prebuilt-et-lib.sh` script file from the [llama-stack-client-kotlin-client-local](https://github.com/meta-llama/llama-stack-client-kotlin/blob/release/0.0.58/llama-stack-client-kotlin-client-local/download-prebuilt-et-lib.sh) directory to your local machine.
 2. Move the script to the top level of your Android app, where the `app` directory resides:
 <p align="center">
-<img src="https://raw.githubusercontent.com/meta-llama/llama-stack-client-kotlin/refs/heads/release/0.0.54.1/doc/img/example_android_app_directory.png" style="width:300px">
+<img src="https://raw.githubusercontent.com/meta-llama/llama-stack-client-kotlin/refs/heads/release/0.0.58/doc/img/example_android_app_directory.png" style="width:300px">
 </p>
 
 3. Run `sh download-prebuilt-et-lib.sh` to create an `app/libs` directory and download `executorch.aar` into that path (a consumption sketch follows this list). This generates an ExecuTorch library for the XNNPACK delegate from commit [0a12e33](https://github.com/pytorch/executorch/commit/0a12e33d22a3d44d1aa2af5f0d0673d45b962553).
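For context, once step 3 has placed `executorch.aar` in `app/libs`, the matching `build.gradle.kts` wiring is roughly the sketch below; that the app consumes the AAR via a plain file dependency is an assumption to verify against the demo app:
```
dependencies {
    // Prebuilt ExecuTorch AAR placed in app/libs by download-prebuilt-et-lib.sh
    // (path assumed from step 3 above).
    implementation(files("libs/executorch.aar"))
}
```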
@@ -58,12 +60,14 @@ Start a Llama Stack server on localhost. Here is an example of how you can do th
 ```
 conda create -n stack-fireworks python=3.10
 conda activate stack-fireworks
-pip install llama-stack=0.0.54
+pip install llama-stack==0.0.58
 llama stack build --template fireworks --image-type conda
 export FIREWORKS_API_KEY=<SOME_KEY>
 llama stack run /Users/<your_username>/.llama/distributions/llamastack-fireworks/fireworks-run.yaml --port=5050
 ```
 
 Ensure the Llama Stack server version is the same as the Kotlin SDK library for maximum compatibility.
 
 Other inference providers: [Table](https://llama-stack.readthedocs.io/en/latest/index.html#supported-llama-stack-implementations)
 
 How to set remote localhost in Demo App: [Settings](https://github.com/meta-llama/llama-stack-apps/tree/main/examples/android_app#settings)
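With the server running, a minimal sketch of pointing the Kotlin client at it; the `LlamaStackClientOkHttpClient` builder comes from the SDK, but the exact import path and the emulator address are assumptions to verify for your setup:
```
import com.llama.llamastack.client.okhttp.LlamaStackClientOkHttpClient

// Sketch: the base URL must match the `llama stack run` port above (5050).
// From the Android emulator, the host machine's localhost is reachable as 10.0.2.2.
val client = LlamaStackClientOkHttpClient
    .builder()
    .baseUrl("http://10.0.2.2:5050")
    .build()
```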
@@ -109,7 +113,6 @@ With the Kotlin Library managing all the major operational logic, there are mini
 val result = client!!.inference().chatCompletion(
     InferenceChatCompletionParams.builder()
         .modelId(modelName)
-        .putAdditionalQueryParam("seq_len", sequenceLength.toString())
         .messages(listOfMessages)
         .build()
 )
@@ -118,9 +121,23 @@ val result = client!!.inference().chatCompletion(
 var response = result.asChatCompletionResponse().completionMessage().content().string();
 ```
 
-### Setup Tool Calling
+[Remote only] For inference with a streaming response:
 
-Android demo app for more details: [Tool Calling](https://github.com/meta-llama/llama-stack-apps/tree/main/examples/android_app#tool-calling)
+```
+val result = client!!.inference().chatCompletionStreaming(
+    InferenceChatCompletionParams.builder()
+        .modelId(modelName)
+        .messages(listOfMessages)
+        .build()
+)
+
+// The response can be received as an asChatCompletionResponseStreamChunk via a callback.
+// See the Android demo app for a detailed implementation example.
+```
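Side by side, the blocking and streaming entry points above take the same parameters; a compact sketch using only calls that appear in this doc (`modelName` and `listOfMessages` are app-provided, as in the demo app):
```
// One params object drives both entry points shown above.
val params = InferenceChatCompletionParams.builder()
    .modelId(modelName)
    .messages(listOfMessages)
    .build()

// Blocking call: the full response arrives at once.
val text = client!!.inference().chatCompletion(params)
    .asChatCompletionResponse().completionMessage().content().string()

// Streaming call: chunks arrive via a callback (see the Android demo app).
val streaming = client!!.inference().chatCompletionStreaming(params)
```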
+
+### Setup Custom Tool Calling
+
+See the Android demo app for more details: [Custom Tool Calling](https://github.com/meta-llama/llama-stack-apps/tree/main/examples/android_app#tool-calling)
 
 ## Advanced Users
 
@@ -129,7 +146,7 @@ The purpose of this section is to share more details with users that would like
 ### Prerequisite
 
 You must complete the following steps:
-1. Clone the repo (`git clone https://github.com/meta-llama/llama-stack-client-kotlin.git -b release/0.0.54.1`)
+1. Clone the repo (`git clone https://github.com/meta-llama/llama-stack-client-kotlin.git -b release/0.0.58`)
 2. Port the appropriate ExecuTorch libraries over into your Llama Stack Kotlin library environment.
 ```
 cd llama-stack-client-kotlin-client-local