add v4

2023-07-29 07:12:19 -07:00 · 2023-07-29 07:12:19 -07:00 · a168cf8b9c
commit a168cf8b9c
parent 2cf949990e
832 changed files with 161273 additions and 0 deletions
--- a/docs/snippets/modules/model_io/models/chat/get_started.mdx
+++ b/docs/snippets/modules/model_io/models/chat/get_started.mdx
@@ -0,0 +1,120 @@
+### Setup
+
+To start we'll need to install the OpenAI Python package:
+
+```bash
+pip install openai
+```
+
+Accessing the API requires an API key, which you can get by creating an account and heading [here](https://platform.openai.com/account/api-keys). Once we have a key we'll want to set it as an environment variable by running:
+
+```bash
+export OPENAI_API_KEY="..."
+```
+If you'd prefer not to set an environment variable, you can pass the key in directly via the `openai_api_key` named parameter when initializing the `ChatOpenAI` class:
+
+```python
+from langchain.chat_models import ChatOpenAI
+
+chat = ChatOpenAI(openai_api_key="...")
+```
+
+Otherwise, you can initialize it without any parameters:
+```python
+from langchain.chat_models import ChatOpenAI
+
+chat = ChatOpenAI()
+```
+
+### Messages
+
+The chat model interface is based around messages rather than raw text.
+The types of messages currently supported in LangChain are `AIMessage`, `HumanMessage`, `SystemMessage`, and `ChatMessage`; `ChatMessage` takes an arbitrary role parameter. Most of the time, you'll just be dealing with `HumanMessage`, `AIMessage`, and `SystemMessage`.
+
+### `__call__`
+#### Messages in -> message out
+
+You can get chat completions by passing one or more messages to the chat model. The response will be a message.
+
+```python
+from langchain.schema import (
+    AIMessage,
+    HumanMessage,
+    SystemMessage
+)
+
+chat([HumanMessage(content="Translate this sentence from English to French: I love programming.")])
+```
+
+<CodeOutputBlock lang="python">
+
+```
+    AIMessage(content="J'aime programmer.", additional_kwargs={})
+```
+
+</CodeOutputBlock>
+
+OpenAI's chat model supports multiple messages as input. See [here](https://platform.openai.com/docs/guides/chat/chat-vs-completions) for more information. Here is an example of sending a system and user message to the chat model:
+
+
+```python
+messages = [
+    SystemMessage(content="You are a helpful assistant that translates English to French."),
+    HumanMessage(content="I love programming.")
+]
+chat(messages)
+```
+
+<CodeOutputBlock lang="python">
+
+```
+    AIMessage(content="J'aime programmer.", additional_kwargs={})
+```
+
+</CodeOutputBlock>
+
+### `generate`
+#### Batch calls, richer outputs
+
+You can go one step further and generate completions for multiple sets of messages using `generate`. This returns an `LLMResult` whose generations are `ChatGeneration` objects, each carrying an additional `message` attribute.
+
+```python
+batch_messages = [
+    [
+        SystemMessage(content="You are a helpful assistant that translates English to French."),
+        HumanMessage(content="I love programming.")
+    ],
+    [
+        SystemMessage(content="You are a helpful assistant that translates English to French."),
+        HumanMessage(content="I love artificial intelligence.")
+    ],
+]
+result = chat.generate(batch_messages)
+result
+```
+
+<CodeOutputBlock lang="python">
+
+```
+    LLMResult(generations=[[ChatGeneration(text="J'aime programmer.", generation_info=None, message=AIMessage(content="J'aime programmer.", additional_kwargs={}))], [ChatGeneration(text="J'aime l'intelligence artificielle.", generation_info=None, message=AIMessage(content="J'aime l'intelligence artificielle.", additional_kwargs={}))]], llm_output={'token_usage': {'prompt_tokens': 57, 'completion_tokens': 20, 'total_tokens': 77}})
+```
+
+</CodeOutputBlock>
+
+You can recover things like token usage from this `LLMResult`:
+
+
+```python
+result.llm_output
+```
+
+<CodeOutputBlock lang="python">
+
+```
+    {'token_usage': {'prompt_tokens': 57,
+      'completion_tokens': 20,
+      'total_tokens': 77}}
+```
+
+</CodeOutputBlock>
+
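A minimal sketch, not part of the committed snippet, of the "messages in, message out" idea using only the standard library. The `Message` class, the `ROLE_BY_KIND` mapping, and `to_payload` are hypothetical stand-ins, not LangChain's classes; the sketch only assumes that system, human, and AI messages correspond to the `system`, `user`, and `assistant` roles of an OpenAI-style chat payload.

```python
from dataclasses import dataclass


@dataclass
class Message:
    """Hypothetical stand-in for a chat message: a kind plus text content."""
    kind: str      # "system", "human", or "ai"
    content: str


# Assumed mapping from message kinds to OpenAI-style role names.
ROLE_BY_KIND = {"system": "system", "human": "user", "ai": "assistant"}


def to_payload(messages):
    """Convert stand-in messages into a list of role/content dicts."""
    return [{"role": ROLE_BY_KIND[m.kind], "content": m.content} for m in messages]


messages = [
    Message("system", "You are a helpful assistant that translates English to French."),
    Message("human", "I love programming."),
]
payload = to_payload(messages)
print(payload)
```

Keeping the role explicit on every message is what lets a chat model tell instructions (system) apart from user input (human) and its own earlier replies (AI).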
--- a/docs/snippets/modules/model_io/models/chat/how_to/chat_model_caching.mdx
+++ b/docs/snippets/modules/model_io/models/chat/how_to/chat_model_caching.mdx
@@ -0,0 +1,97 @@
+```python
+import langchain
+from langchain.chat_models import ChatOpenAI
+
+llm = ChatOpenAI()
+```
+
+## In Memory Cache
+
+
+```python
+from langchain.cache import InMemoryCache
+langchain.llm_cache = InMemoryCache()
+
+# The first time, it is not yet in cache, so it should take longer
+llm.predict("Tell me a joke")
+```
+
+<CodeOutputBlock lang="python">
+
+```
+    CPU times: user 35.9 ms, sys: 28.6 ms, total: 64.6 ms
+    Wall time: 4.83 s
+    
+
+    "\n\nWhy couldn't the bicycle stand up by itself? It was...two tired!"
+```
+
+</CodeOutputBlock>
+
+
+```python
+# The second time it is, so it goes faster
+llm.predict("Tell me a joke")
+```
+
+<CodeOutputBlock lang="python">
+
+```
+    CPU times: user 238 µs, sys: 143 µs, total: 381 µs
+    Wall time: 1.76 ms
+
+
+    '\n\nWhy did the chicken cross the road?\n\nTo get to the other side.'
+```
+
+</CodeOutputBlock>
+
+## SQLite Cache
+
+
+```bash
+rm -f .langchain.db
+```
+
+
+```python
+# We can do the same thing with a SQLite cache
+from langchain.cache import SQLiteCache
+langchain.llm_cache = SQLiteCache(database_path=".langchain.db")
+```
+
+
+```python
+# The first time, it is not yet in cache, so it should take longer
+llm.predict("Tell me a joke")
+```
+
+<CodeOutputBlock lang="python">
+
+```
+    CPU times: user 17 ms, sys: 9.76 ms, total: 26.7 ms
+    Wall time: 825 ms
+
+
+    '\n\nWhy did the chicken cross the road?\n\nTo get to the other side.'
+```
+
+</CodeOutputBlock>
+
+
+```python
+# The second time it is, so it goes faster
+llm.predict("Tell me a joke")
+```
+
+<CodeOutputBlock lang="python">
+
+```
+    CPU times: user 2.46 ms, sys: 1.23 ms, total: 3.7 ms
+    Wall time: 2.67 ms
+
+
+    '\n\nWhy did the chicken cross the road?\n\nTo get to the other side.'
+```
+
+</CodeOutputBlock>
--- a/docs/snippets/modules/model_io/models/chat/how_to/llm_chain.mdx
+++ b/docs/snippets/modules/model_io/models/chat/how_to/llm_chain.mdx
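@@ -0,0 +1,19 @@
+```python
+from langchain.chains import LLMChain
+
+# Assumes `chat` (a ChatOpenAI instance) and `chat_prompt` (a ChatPromptTemplate) are already defined.
+chain = LLMChain(llm=chat, prompt=chat_prompt)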
+```
+
+
+```python
+chain.run(input_language="English", output_language="French", text="I love programming.")
+```
+
+<CodeOutputBlock lang="python">
+
+```
+    "J'adore la programmation."
+```
+
+</CodeOutputBlock>
--- a/docs/snippets/modules/model_io/models/chat/how_to/prompts.mdx
+++ b/docs/snippets/modules/model_io/models/chat/how_to/prompts.mdx
@@ -0,0 +1,47 @@
+You can make use of templating by using a `MessagePromptTemplate`. You can build a `ChatPromptTemplate` from one or more `MessagePromptTemplates`. `ChatPromptTemplate`'s `format_prompt` method returns a `PromptValue`, which you can convert to a string or to a list of message objects, depending on whether you want to use the formatted value as input to an LLM or a chat model.
+
+For convenience, each message prompt template exposes a `from_template` method. Using it looks like this:
+
+
+```python
+from langchain import PromptTemplate
+from langchain.prompts.chat import (
+    ChatPromptTemplate,
+    SystemMessagePromptTemplate,
+    AIMessagePromptTemplate,
+    HumanMessagePromptTemplate,
+)
+
+template="You are a helpful assistant that translates {input_language} to {output_language}."
+system_message_prompt = SystemMessagePromptTemplate.from_template(template)
+human_template="{text}"
+human_message_prompt = HumanMessagePromptTemplate.from_template(human_template)
+```
+
+
+```python
+chat_prompt = ChatPromptTemplate.from_messages([system_message_prompt, human_message_prompt])
+
+# get a chat completion from the formatted messages
+chat(chat_prompt.format_prompt(input_language="English", output_language="French", text="I love programming.").to_messages())
+```
+
+<CodeOutputBlock lang="python">
+
+```
+    AIMessage(content="J'adore la programmation.", additional_kwargs={})
+```
+
+</CodeOutputBlock>
+
+If you wanted to construct the `MessagePromptTemplate` more directly, you could create a `PromptTemplate` outside and then pass it in, e.g.:
+
+
+```python
+prompt=PromptTemplate(
+    template="You are a helpful assistant that translates {input_language} to {output_language}.",
+    input_variables=["input_language", "output_language"],
+)
+system_message_prompt = SystemMessagePromptTemplate(prompt=prompt)
+```
+
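A minimal sketch, not part of the committed snippet, of what templating a chat prompt amounts to: each message has a role and a template string, and formatting fills the placeholders to produce concrete messages. It uses plain `str.format` and a hypothetical `format_messages` helper rather than LangChain's template classes.

```python
def format_messages(templates, **variables):
    """Fill each (role, template) pair with the given variables."""
    return [(role, template.format(**variables)) for role, template in templates]


# Same template strings as in the snippet above; the tuple layout is an assumption.
chat_templates = [
    ("system", "You are a helpful assistant that translates {input_language} to {output_language}."),
    ("human", "{text}"),
]

formatted = format_messages(
    chat_templates,
    input_language="English",
    output_language="French",
    text="I love programming.",
)
for role, content in formatted:
    print(f"{role}: {content}")
```

`str.format` ignores keyword arguments a template does not use, so one set of variables can be shared across all message templates.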
--- a/docs/snippets/modules/model_io/models/chat/how_to/streaming.mdx
+++ b/docs/snippets/modules/model_io/models/chat/how_to/streaming.mdx
@@ -0,0 +1,59 @@
+```python
+from langchain.chat_models import ChatOpenAI
+from langchain.schema import (
+    HumanMessage,
+)
+
+
+from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
+chat = ChatOpenAI(streaming=True, callbacks=[StreamingStdOutCallbackHandler()], temperature=0)
+resp = chat([HumanMessage(content="Write me a song about sparkling water.")])
+```
+
+<CodeOutputBlock lang="python">
+
+```
+    Verse 1:
+    Bubbles rising to the top
+    A refreshing drink that never stops
+    Clear and crisp, it's pure delight
+    A taste that's sure to excite
+    
+    Chorus:
+    Sparkling water, oh so fine
+    A drink that's always on my mind
+    With every sip, I feel alive
+    Sparkling water, you're my vibe
+    
+    Verse 2:
+    No sugar, no calories, just pure bliss
+    A drink that's hard to resist
+    It's the perfect way to quench my thirst
+    A drink that always comes first
+    
+    Chorus:
+    Sparkling water, oh so fine
+    A drink that's always on my mind
+    With every sip, I feel alive
+    Sparkling water, you're my vibe
+    
+    Bridge:
+    From the mountains to the sea
+    Sparkling water, you're the key
+    To a healthy life, a happy soul
+    A drink that makes me feel whole
+    
+    Chorus:
+    Sparkling water, oh so fine
+    A drink that's always on my mind
+    With every sip, I feel alive
+    Sparkling water, you're my vibe
+    
+    Outro:
+    Sparkling water, you're the one
+    A drink that's always so much fun
+    I'll never let you go, my friend
+    Sparkling
+```
+
+</CodeOutputBlock>
--- a/docs/snippets/modules/model_io/models/llms/get_started.mdx
+++ b/docs/snippets/modules/model_io/models/llms/get_started.mdx
@@ -0,0 +1,108 @@
+### Setup
+
+To start we'll need to install the OpenAI Python package:
+
+```bash
+pip install openai
+```
+
+Accessing the API requires an API key, which you can get by creating an account and heading [here](https://platform.openai.com/account/api-keys). Once we have a key we'll want to set it as an environment variable by running:
+
+```bash
+export OPENAI_API_KEY="..."
+```
+
+If you'd prefer not to set an environment variable, you can pass the key in directly via the `openai_api_key` named parameter when initializing the `OpenAI` LLM class:
+
+```python
+from langchain.llms import OpenAI
+
+llm = OpenAI(openai_api_key="...")
+```
+
+Otherwise, you can initialize it without any parameters:
+```python
+from langchain.llms import OpenAI
+
+llm = OpenAI()
+```
+
+### `__call__`: string in -> string out
+The simplest way to use an LLM is as a callable: pass in a string, get a string completion.
+
+```python
+llm("Tell me a joke")
+```
+
+<CodeOutputBlock lang="python">
+
+```
+    'Why did the chicken cross the road?\n\nTo get to the other side.'
+```
+
+</CodeOutputBlock>
+
+### `generate`: batch calls, richer outputs
+`generate` lets you call the model with a list of strings, getting back a more complete response than just the text. This complete response can include things like multiple top responses and other LLM provider-specific information:
+
+```python
+llm_result = llm.generate(["Tell me a joke", "Tell me a poem"]*15)
+```
+
+
+```python
+len(llm_result.generations)
+```
+
+<CodeOutputBlock lang="python">
+
+```
+    30
+```
+
+</CodeOutputBlock>
+
+
+```python
+llm_result.generations[0]
+```
+
+<CodeOutputBlock lang="python">
+
+```
+    [Generation(text='\n\nWhy did the chicken cross the road?\n\nTo get to the other side!'),
+     Generation(text='\n\nWhy did the chicken cross the road?\n\nTo get to the other side.')]
+```
+
+</CodeOutputBlock>
+
+
+```python
+llm_result.generations[-1]
+```
+
+<CodeOutputBlock lang="python">
+
+```
+    [Generation(text="\n\nWhat if love neverspeech\n\nWhat if love never ended\n\nWhat if love was only a feeling\n\nI'll never know this love\n\nIt's not a feeling\n\nBut it's what we have for each other\n\nWe just know that love is something strong\n\nAnd we can't help but be happy\n\nWe just feel what love is for us\n\nAnd we love each other with all our heart\n\nWe just don't know how\n\nHow it will go\n\nBut we know that love is something strong\n\nAnd we'll always have each other\n\nIn our lives."),
+     Generation(text='\n\nOnce upon a time\n\nThere was a love so pure and true\n\nIt lasted for centuries\n\nAnd never became stale or dry\n\nIt was moving and alive\n\nAnd the heart of the love-ick\n\nIs still beating strong and true.')]
+```
+
+</CodeOutputBlock>
+
+You can also access provider-specific information that is returned. This information is NOT standardized across providers.
+
+
+```python
+llm_result.llm_output
+```
+
+<CodeOutputBlock lang="python">
+
+```
+    {'token_usage': {'completion_tokens': 3903,
+      'total_tokens': 4023,
+      'prompt_tokens': 120}}
+```
+
+</CodeOutputBlock>
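A small sketch, not part of the committed snippet, of how the nested result of `generate` can be read. It assumes only the shape visible in the outputs above: one inner list per prompt, one entry per completion, plus a provider-specific `llm_output` dict. The placeholder texts are hypothetical; the token counts are copied from the output shown above.

```python
# Shape taken from the outputs above; the texts are hypothetical placeholders.
generations = [
    ["joke A", "joke B"],  # completions for the first prompt
    ["poem A", "poem B"],  # completions for the second prompt
]
llm_output = {
    "token_usage": {"completion_tokens": 3903, "total_tokens": 4023, "prompt_tokens": 120}
}

# Take the first completion for every prompt.
first_choices = [candidates[0] for candidates in generations]

# llm_output is not standardized, so read it defensively with .get().
usage = llm_output.get("token_usage", {})
total = usage.get("prompt_tokens", 0) + usage.get("completion_tokens", 0)

print(first_choices)
print(total)
```

Reading `llm_output` with `.get` keeps the code working for providers that return no `token_usage` entry at all.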
--- a/docs/snippets/modules/model_io/models/llms/how_to/llm_caching.mdx
+++ b/docs/snippets/modules/model_io/models/llms/how_to/llm_caching.mdx
@@ -0,0 +1,177 @@
+```python
+import langchain
+from langchain.llms import OpenAI
+
+# To make the caching really obvious, let's use a slower model.
+llm = OpenAI(model_name="text-davinci-002", n=2, best_of=2)
+```
+
+## In Memory Cache
+
+
+```python
+from langchain.cache import InMemoryCache
+langchain.llm_cache = InMemoryCache()
+
+# The first time, it is not yet in cache, so it should take longer
+llm.predict("Tell me a joke")
+```
+
+<CodeOutputBlock lang="python">
+
+```
+    CPU times: user 35.9 ms, sys: 28.6 ms, total: 64.6 ms
+    Wall time: 4.83 s
+    
+
+    "\n\nWhy couldn't the bicycle stand up by itself? It was...two tired!"
+```
+
+</CodeOutputBlock>
+
+
+```python
+# The second time it is, so it goes faster
+llm.predict("Tell me a joke")
+```
+
+<CodeOutputBlock lang="python">
+
+```
+    CPU times: user 238 µs, sys: 143 µs, total: 381 µs
+    Wall time: 1.76 ms
+
+
+    '\n\nWhy did the chicken cross the road?\n\nTo get to the other side.'
+```
+
+</CodeOutputBlock>
+
+## SQLite Cache
+
+
+```bash
+rm -f .langchain.db
+```
+
+
+```python
+# We can do the same thing with a SQLite cache
+from langchain.cache import SQLiteCache
+langchain.llm_cache = SQLiteCache(database_path=".langchain.db")
+```
+
+
+```python
+# The first time, it is not yet in cache, so it should take longer
+llm.predict("Tell me a joke")
+```
+
+<CodeOutputBlock lang="python">
+
+```
+    CPU times: user 17 ms, sys: 9.76 ms, total: 26.7 ms
+    Wall time: 825 ms
+
+
+    '\n\nWhy did the chicken cross the road?\n\nTo get to the other side.'
+```
+
+</CodeOutputBlock>
+
+
+```python
+# The second time it is, so it goes faster
+llm.predict("Tell me a joke")
+```
+
+<CodeOutputBlock lang="python">
+
+```
+    CPU times: user 2.46 ms, sys: 1.23 ms, total: 3.7 ms
+    Wall time: 2.67 ms
+
+
+    '\n\nWhy did the chicken cross the road?\n\nTo get to the other side.'
+```
+
+</CodeOutputBlock>
+
+## Optional Caching in Chains
+You can also turn off caching for particular nodes in chains. Note that because of certain interfaces, it's often easier to construct the chain first, and then edit the LLM afterwards.
+
+As an example, we will load a map-reduce summarization chain. We will cache results for the map step, but not for the combine (reduce) step.
+
+
+```python
+llm = OpenAI(model_name="text-davinci-002")
+no_cache_llm = OpenAI(model_name="text-davinci-002", cache=False)
+```
+
+
+```python
+from langchain.text_splitter import CharacterTextSplitter
+from langchain.chains.mapreduce import MapReduceChain
+
+text_splitter = CharacterTextSplitter()
+```
+
+
+```python
+with open('../../../state_of_the_union.txt') as f:
+    state_of_the_union = f.read()
+texts = text_splitter.split_text(state_of_the_union)
+```
+
+
+```python
+from langchain.docstore.document import Document
+docs = [Document(page_content=t) for t in texts[:3]]
+from langchain.chains.summarize import load_summarize_chain
+```
+
+
+```python
+chain = load_summarize_chain(llm, chain_type="map_reduce", reduce_llm=no_cache_llm)
+```
+
+
+```python
+chain.run(docs)
+```
+
+<CodeOutputBlock lang="python">
+
+```
+    CPU times: user 452 ms, sys: 60.3 ms, total: 512 ms
+    Wall time: 5.09 s
+
+
+    '\n\nPresident Biden is discussing the American Rescue Plan and the Bipartisan Infrastructure Law, which will create jobs and help Americans. He also talks about his vision for America, which includes investing in education and infrastructure. In response to Russian aggression in Ukraine, the United States is joining with European allies to impose sanctions and isolate Russia. American forces are being mobilized to protect NATO countries in the event that Putin decides to keep moving west. The Ukrainians are bravely fighting back, but the next few weeks will be hard for them. Putin will pay a high price for his actions in the long run. Americans should not be alarmed, as the United States is taking action to protect its interests and allies.'
+```
+
+</CodeOutputBlock>
+
+When we run it again, we see that it runs substantially faster but the final answer is different. This is due to caching at the map steps, but not at the reduce step.
+
+
+```python
+chain.run(docs)
+```
+
+<CodeOutputBlock lang="python">
+
+```
+    CPU times: user 11.5 ms, sys: 4.33 ms, total: 15.8 ms
+    Wall time: 1.04 s
+
+
+    '\n\nPresident Biden is discussing the American Rescue Plan and the Bipartisan Infrastructure Law, which will create jobs and help Americans. He also talks about his vision for America, which includes investing in education and infrastructure.'
+```
+
+</CodeOutputBlock>
+
+
+```bash
+rm -f .langchain.db sqlite.db
+```
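A rough sketch, not part of the committed snippet, combining two ideas from this file: a cache persisted in SQLite, and opting out of caching for one step (as with the reduce step above). It uses the standard-library `sqlite3` module; `SQLitePromptCache`, `fake_model`, `predict`, and the table layout are all hypothetical and are not LangChain's implementation.

```python
import sqlite3


class SQLitePromptCache:
    """Hypothetical SQLite-backed cache keyed on (prompt, model_key)."""

    def __init__(self, database_path=":memory:"):
        self._conn = sqlite3.connect(database_path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS llm_cache ("
            "prompt TEXT, model_key TEXT, response TEXT, "
            "PRIMARY KEY (prompt, model_key))"
        )

    def lookup(self, prompt, model_key):
        row = self._conn.execute(
            "SELECT response FROM llm_cache WHERE prompt = ? AND model_key = ?",
            (prompt, model_key),
        ).fetchone()
        return row[0] if row else None

    def update(self, prompt, model_key, response):
        self._conn.execute(
            "INSERT OR REPLACE INTO llm_cache VALUES (?, ?, ?)",
            (prompt, model_key, response),
        )
        self._conn.commit()


calls = []


def fake_model(prompt):
    """Stand-in for a model whose answer differs on every real call."""
    calls.append(prompt)
    return f"summary #{len(calls)} of: {prompt}"


def predict(prompt, cache=None, model_key="fake-model"):
    """Use the cache when one is given; bypass it entirely when cache is None."""
    if cache is None:
        return fake_model(prompt)
    cached = cache.lookup(prompt, model_key)
    if cached is not None:
        return cached
    response = fake_model(prompt)
    cache.update(prompt, model_key, response)
    return response


cache = SQLitePromptCache()  # pass a file path instead of ":memory:" to persist
map_first = predict("chunk 1", cache=cache)
map_second = predict("chunk 1", cache=cache)  # served from SQLite
reduce_first = predict("combine")             # uncached step
reduce_second = predict("combine")            # uncached step, runs again
print(map_first == map_second, reduce_first == reduce_second)
```

With a file path the cached rows survive restarts, which is why the SQLite examples above delete `.langchain.db` before starting from a clean state.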
--- a/docs/snippets/modules/model_io/models/llms/how_to/streaming_llm.mdx
+++ b/docs/snippets/modules/model_io/models/llms/how_to/streaming_llm.mdx
@@ -0,0 +1,70 @@
+Streaming is currently supported for a broad range of LLM implementations, including but not limited to `OpenAI`, `ChatOpenAI`, `ChatAnthropic`, `Hugging Face Text Generation Inference`, and `Replicate`, and it has been extended to most models. To use streaming, pass a [`CallbackHandler`](https://github.com/hwchase17/langchain/blob/master/langchain/callbacks/base.py) that implements `on_llm_new_token`. In this example, we are using `StreamingStdOutCallbackHandler`.
+```python
+from langchain.llms import OpenAI
+from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
+
+
+llm = OpenAI(streaming=True, callbacks=[StreamingStdOutCallbackHandler()], temperature=0)
+resp = llm("Write me a song about sparkling water.")
+```
+
+<CodeOutputBlock lang="python">
+
+```
+    Verse 1
+    I'm sippin' on sparkling water,
+    It's so refreshing and light,
+    It's the perfect way to quench my thirst
+    On a hot summer night.
+    
+    Chorus
+    Sparkling water, sparkling water,
+    It's the best way to stay hydrated,
+    It's so crisp and so clean,
+    It's the perfect way to stay refreshed.
+    
+    Verse 2
+    I'm sippin' on sparkling water,
+    It's so bubbly and bright,
+    It's the perfect way to cool me down
+    On a hot summer night.
+    
+    Chorus
+    Sparkling water, sparkling water,
+    It's the best way to stay hydrated,
+    It's so crisp and so clean,
+    It's the perfect way to stay refreshed.
+    
+    Verse 3
+    I'm sippin' on sparkling water,
+    It's so light and so clear,
+    It's the perfect way to keep me cool
+    On a hot summer night.
+    
+    Chorus
+    Sparkling water, sparkling water,
+    It's the best way to stay hydrated,
+    It's so crisp and so clean,
+    It's the perfect way to stay refreshed.
+```
+
+</CodeOutputBlock>
+
+We still have access to the final `LLMResult` if using `generate`. However, `token_usage` is not currently supported for streaming.
+
+
+```python
+llm.generate(["Tell me a joke."])
+```
+
+<CodeOutputBlock lang="python">
+
+```
+    Q: What did the fish say when it hit the wall?
+    A: Dam!
+
+
+    LLMResult(generations=[[Generation(text='\n\nQ: What did the fish say when it hit the wall?\nA: Dam!', generation_info={'finish_reason': 'stop', 'logprobs': None})]], llm_output={'token_usage': {}, 'model_name': 'text-davinci-003'})
+```
+
+</CodeOutputBlock>
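A minimal sketch, not part of the committed snippet, of the callback pattern behind streaming: the model emits tokens one at a time and each registered handler's `on_llm_new_token` is invoked as they arrive. `StdOutTokenHandler`, `fake_stream`, and `run_streaming` are hypothetical stand-ins; the real handler interface has more hooks and arguments than shown here.

```python
import sys


class StdOutTokenHandler:
    """Hypothetical handler: prints each token as it arrives and keeps a copy."""

    def __init__(self):
        self.tokens = []

    def on_llm_new_token(self, token, **kwargs):
        self.tokens.append(token)
        sys.stdout.write(token)
        sys.stdout.flush()


def fake_stream(text):
    """Stand-in for a streaming model: yields the text one word at a time."""
    for word in text.split(" "):
        yield word + " "


def run_streaming(reply_text, handlers):
    """Notify every handler per token, then return the assembled final text."""
    pieces = []
    for token in fake_stream(reply_text):
        for handler in handlers:
            handler.on_llm_new_token(token)
        pieces.append(token)
    return "".join(pieces).rstrip()


handler = StdOutTokenHandler()
final_text = run_streaming("Sparkling water, oh so fine", [handler])
```

The caller still receives the full text at the end, while the handler sees every piece as soon as it is produced; this mirrors how the final result remains available alongside the streamed output.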