diff --git a/docs/my-website/docs/observability/wandb_integration.md b/docs/my-website/docs/observability/wandb_integration.md deleted file mode 100644 index 37057f43db..0000000000 --- a/docs/my-website/docs/observability/wandb_integration.md +++ /dev/null @@ -1,61 +0,0 @@ -import Image from '@theme/IdealImage'; - -# Weights & Biases - Logging LLM Input/Output - - -:::tip - -This is community maintained, Please make an issue if you run into a bug -https://github.com/BerriAI/litellm - -::: - - -Weights & Biases helps AI developers build better models faster https://wandb.ai - - - -:::info -We want to learn how we can make the callbacks better! Meet the LiteLLM [founders](https://calendly.com/d/4mp-gd3-k5k/berriai-1-1-onboarding-litellm-hosted-version) or -join our [discord](https://discord.gg/wuPM9dRgDw) -::: - -## Pre-Requisites -Ensure you have run `pip install wandb` for this integration -```shell -pip install wandb litellm -``` - -## Quick Start -Use just 2 lines of code, to instantly log your responses **across all providers** with Weights & Biases - -```python -litellm.success_callback = ["wandb"] -``` -```python -# pip install wandb -import litellm -import os - -os.environ["WANDB_API_KEY"] = "" -# LLM API Keys -os.environ['OPENAI_API_KEY']="" - -# set wandb as a callback, litellm will send the data to Weights & Biases -litellm.success_callback = ["wandb"] - -# openai call -response = litellm.completion( - model="gpt-3.5-turbo", - messages=[ - {"role": "user", "content": "Hi 👋 - i'm openai"} - ] -) -``` - -## Support & Talk to Founders - -- [Schedule Demo 👋](https://calendly.com/d/4mp-gd3-k5k/berriai-1-1-onboarding-litellm-hosted-version) -- [Community Discord 💭](https://discord.gg/wuPM9dRgDw) -- Our numbers 📞 +1 (770) 8783-106 / ‭+1 (412) 618-6238‬ -- Our emails ✉️ ishaan@berri.ai / krrish@berri.ai \ No newline at end of file diff --git a/docs/my-website/docs/observability/weave_integration.md b/docs/my-website/docs/observability/weave_integration.md new file 
mode 100644 index 0000000000..c688c8f18e --- /dev/null +++ b/docs/my-website/docs/observability/weave_integration.md @@ -0,0 +1,171 @@ +import Image from '@theme/IdealImage'; + +# Weights & Biases Weave - Tracing and Evaluation + +## What is W&B Weave? + +Weights & Biases (W&B) Weave is a framework for tracking, experimenting with, evaluating, deploying, and improving LLM-based applications. Designed for flexibility and scalability, Weave supports every stage of your LLM application development workflow. + +W&B Weave's integration with LiteLLM enables you to trace, version-control, and debug your LLM applications. It also lets you easily evaluate your AI systems with the flexibility of LiteLLM. + +Get started with just 2 lines of code and track your LiteLLM calls with W&B Weave. Learn more about W&B Weave [here](https://weave-docs.wandb.ai). + + + +## Quick Start + +Install W&B Weave: +```shell +pip install weave +``` + +Use just 2 lines of code to instantly log your responses **across all providers** with Weave. + +```python +import weave + +weave_client = weave.init("my-llm-application") +``` + +You will be asked to set your W&B API key for authentication. Get your free API key [here](https://wandb.ai/authorize). + +Once done, you can use LiteLLM as usual. + +```python +import litellm +import os + +# Set your LLM provider's API key +os.environ["OPENAI_API_KEY"] = "" + +# Call LiteLLM with the model you want to use +messages = [ + {"role": "user", "content": "What is the meaning of life?"} +] + +response = litellm.completion(model="gpt-4o", messages=messages) +print(response) +``` + +You will get a Weave URL in stdout. Open it to see the trace, cost, token usage, and more! + + + +## Building a simple LLM application + +Now let's use LiteLLM and W&B Weave to build a simple LLM application that translates text from a source language to a target language. 
+ +The function `translate` takes a text and a target language, and returns the translated text using the model of your choice. Note that the `translate` function is decorated with [`weave.op()`](https://weave-docs.wandb.ai/guides/tracking/ops). This is how W&B Weave knows that this function is part of your application; each call is traced along with the function's inputs and outputs. + +Since the underlying LiteLLM calls are automatically traced, you also get a nested trace of the LiteLLM call(s) made, with details like the model, cost, and token usage. + +```python +import litellm +import weave + +@weave.op() +def translate(text: str, target_language: str, model: str) -> str: + response = litellm.completion( + model=model, + messages=[ + {"role": "user", "content": f"Translate '{text}' to {target_language}"} + ], + ) + return response.choices[0].message.content + +print(translate("Hello, how are you?", "French", "gpt-4o")) +``` + + + + +## Building an evaluation pipeline + +LiteLLM is powerful for building evaluation pipelines because of the flexibility it provides. Together with W&B Weave, building such pipelines is straightforward. + +Below we build an evaluation pipeline to evaluate an LLM's ability to solve maths problems. We first need an evaluation dataset. + +```python +samples = [ + {"question": "What is the sum of 45 and 67?", "answer": "112"}, + {"question": "If a triangle has sides 3 cm, 4 cm, and 5 cm, what is its area?", "answer": "6 square cm"}, + {"question": "What is the derivative of x^2 + 3x with respect to x?", "answer": "2x + 3"}, + {"question": "What is the result of 12 multiplied by 8?", "answer": "96"}, + {"question": "What is the value of 10! (10 factorial)?", "answer": "3628800"} +] +``` + +Next, we write a simple function that takes a sample question and returns the solution to the problem. 
We will write this function as a method (`predict`) of our `SimpleMathsSolver` class, which inherits from the [`weave.Model`](https://weave-docs.wandb.ai/guides/core-types/models) class. This allows us to easily track the attributes (hyperparameters) of our model. + +```python +class SimpleMathsSolver(weave.Model): + model_name: str + temperature: float + + @weave.op() + def predict(self, question: str) -> str: + response = litellm.completion( + model=self.model_name, + messages=[ + { + "role": "system", + "content": "You are given maths problems. Think step by step to solve them. Only return the exact answer, without any explanation, in \\boxed{}" + }, + { + "role": "user", + "content": f"{question}" + } + ], + ) + return response.choices[0].message.content + +maths_solver = SimpleMathsSolver( + model_name="gpt-4o", + temperature=0.0, +) + +print(maths_solver.predict("What is 2+3?")) +``` + + + +Now that we have the dataset and the model, let's define a simple exact-match evaluation metric and set up our evaluation pipeline using [`weave.Evaluation`](https://weave-docs.wandb.ai/guides/core-types/evaluations). + +```python +import asyncio +import re + +@weave.op() +def exact_match(answer: str, output: str): + pattern = r"\\boxed\{(.+?)\}" + match = re.search(pattern, output) + + if match: + extracted_value = match.group(1) + is_correct = extracted_value == answer + return is_correct + else: + return None + +evaluation_pipeline = weave.Evaluation( + dataset=samples, scorers=[exact_match] +) + +asyncio.run(evaluation_pipeline.evaluate(maths_solver)) +``` + +The evaluation page is shown below. Here you can see the overall score as well as the score for each sample. This is a powerful way to debug the limitations of your LLM application while keeping track of everything that matters. + + + +Now say you want to compare the performance of your current model with a different model using the comparison feature in the UI. 
LiteLLM's flexibility makes swapping in a different model easy, and the W&B Weave evaluation pipeline helps you compare the two in a structured way. + +```python +new_maths_solver = SimpleMathsSolver( + model_name="gpt-3.5-turbo", + temperature=0.0, +) + +asyncio.run(evaluation_pipeline.evaluate(new_maths_solver)) +``` + + + +## Support + +* For advanced usage of Weave, visit the [Weave documentation](https://weave-docs.wandb.ai). +* For any questions or issues with this integration, please [submit an issue](https://github.com/wandb/weave/issues/new?template=Blank+issue) on our [GitHub](https://github.com/wandb/weave) repository! diff --git a/docs/my-website/img/weave_comparison_view.png b/docs/my-website/img/weave_comparison_view.png new file mode 100644 index 0000000000..eab1a5a087 Binary files /dev/null and b/docs/my-website/img/weave_comparison_view.png differ diff --git a/docs/my-website/img/weave_evaluation.png b/docs/my-website/img/weave_evaluation.png new file mode 100644 index 0000000000..d4dccb8935 Binary files /dev/null and b/docs/my-website/img/weave_evaluation.png differ diff --git a/docs/my-website/img/weave_litellm.png b/docs/my-website/img/weave_litellm.png new file mode 100644 index 0000000000..d9e86e6d3b Binary files /dev/null and b/docs/my-website/img/weave_litellm.png differ diff --git a/docs/my-website/img/weave_maths_solver.png b/docs/my-website/img/weave_maths_solver.png new file mode 100644 index 0000000000..c8acf2872a Binary files /dev/null and b/docs/my-website/img/weave_maths_solver.png differ diff --git a/docs/my-website/img/weave_trace.png b/docs/my-website/img/weave_trace.png new file mode 100644 index 0000000000..50bc308737 Binary files /dev/null and b/docs/my-website/img/weave_trace.png differ diff --git a/docs/my-website/img/weave_trace_application.png b/docs/my-website/img/weave_trace_application.png new file mode 100644 index 0000000000..d646e6a5fa Binary files /dev/null and b/docs/my-website/img/weave_trace_application.png differ diff --git 
a/docs/my-website/sidebars.js b/docs/my-website/sidebars.js index 60030a59bb..e971ed70cd 100644 --- a/docs/my-website/sidebars.js +++ b/docs/my-website/sidebars.js @@ -434,7 +434,7 @@ const sidebars = { "observability/helicone_integration", "observability/openmeter", "observability/promptlayer_integration", - "observability/wandb_integration", + "observability/weave_integration", "observability/slack_integration", "observability/athina_integration", "observability/greenscale_integration", diff --git a/litellm/integrations/weights_biases.py b/litellm/integrations/weights_biases.py index 63d87c9bd9..cc0b7fcc3e 100644 --- a/litellm/integrations/weights_biases.py +++ b/litellm/integrations/weights_biases.py @@ -197,6 +197,7 @@ class WeightsBiasesLogger: try: print_verbose(f"W&B Logging - Enters logging function for model {kwargs}") + print_verbose("`WeightsBiasesLogger` is deprecated. Please use the new W&B `weave` integration instead.") run = wandb.init() print_verbose(response_obj)