Tutorial on LangChain-based document question answering and evaluation

wisdom-pan
2023-06-03 15:37:23 +08:00
parent 8642c12014
commit e51ecc1b03
2 changed files with 1835 additions and 0 deletions


@ -0,0 +1,836 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "52824b89-532a-4e54-87e9-1410813cd39e",
"metadata": {},
"source": [
"# 第四章 基于LangChain的文档问答\n",
"本章内容主要利用langchain构建向量数据库可以在文档上方或关于文档回答问题因此给定从PDF文件、网页或某些公司的内部文档收集中提取的文本使用llm回答有关这些文档内容的问题"
]
},
{
"cell_type": "markdown",
"id": "4aac484b",
"metadata": {
"height": 30
},
"source": [
"## 环境配置\n",
"\n",
"安装langchain设置chatGPT的OPENAI_API_KEY\n",
"* 安装langchain\n",
"```\n",
"pip install langchain\n",
"```\n",
"* 安装docarray\n",
"```\n",
"pip install docarray\n",
"```\n",
"* 设置API-KEY环境变量\n",
"```\n",
"export OPENAI_API_KEY='api-key'\n",
"\n",
"```"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "b7ed03ed-1322-49e3-b2a2-33e94fb592ef",
"metadata": {
"height": 81,
"tags": []
},
"outputs": [],
"source": [
"import os\n",
"\n",
"from dotenv import load_dotenv, find_dotenv\n",
"_ = load_dotenv(find_dotenv()) #读取环境变量"
]
},
{
"cell_type": "code",
"execution_count": 52,
"id": "af8c3c96",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'\\n\\n人工智能是一项极具前景的技术它的发展正在改变人类的生活方式带来了无数的便利也被认为是未来发展的重要标志。人工智能的发展让许多复杂的任务变得更加容易更高效的完成节省了大量的时间和精力为人类发展带来了极大的帮助。'"
]
},
"execution_count": 52,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from langchain.llms import OpenAI\n",
"\n",
"llm = OpenAI(model_name=\"text-davinci-003\",max_tokens=1024)\n",
"llm(\"怎么评价人工智能\")"
]
},
{
"cell_type": "markdown",
"id": "8cb7a7ec",
"metadata": {
"height": 30
},
"source": [
"### 导入embedding模型和向量存储组件\n",
"使用Dock Array内存搜索向量存储作为一个内存向量存储不需要连接外部数据库"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "974acf8e-8f88-42de-88f8-40a82cb58e8b",
"metadata": {
"height": 98
},
"outputs": [],
"source": [
"from langchain.chains import RetrievalQA #检索QA链在文档上进行检索\n",
"from langchain.chat_models import ChatOpenAI #openai模型\n",
"from langchain.document_loaders import CSVLoader #文档加载器采用csv格式存储\n",
"from langchain.vectorstores import DocArrayInMemorySearch #向量存储\n",
"from IPython.display import display, Markdown #在jupyter显示信息的工具"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "7249846e",
"metadata": {
"height": 75
},
"outputs": [],
"source": [
"#读取文件\n",
"file = 'OutdoorClothingCatalog_1000.csv'\n",
"loader = CSVLoader(file_path=file)"
]
},
{
"cell_type": "code",
"execution_count": 24,
"id": "7724f00e",
"metadata": {
"height": 30
},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>0</th>\n",
" <th>1</th>\n",
" <th>2</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>NaN</td>\n",
" <td>name</td>\n",
" <td>description</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>0.0</td>\n",
" <td>Women's Campside Oxfords</td>\n",
" <td>This ultracomfortable lace-to-toe Oxford boast...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>1.0</td>\n",
" <td>Recycled Waterhog Dog Mat, Chevron Weave</td>\n",
" <td>Protect your floors from spills and splashing ...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>2.0</td>\n",
" <td>Infant and Toddler Girls' Coastal Chill Swimsu...</td>\n",
" <td>She'll love the bright colors, ruffles and exc...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>3.0</td>\n",
" <td>Refresh Swimwear, V-Neck Tankini Contrasts</td>\n",
" <td>Whether you're going for a swim or heading out...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>...</th>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>996</th>\n",
" <td>995.0</td>\n",
" <td>Men's Classic Denim, Standard Fit</td>\n",
" <td>Crafted from premium denim that will last wash...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>997</th>\n",
" <td>996.0</td>\n",
" <td>CozyPrint Sweater Fleece Pullover</td>\n",
" <td>The ultimate sweater fleece - made from superi...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>998</th>\n",
" <td>997.0</td>\n",
" <td>Women's NRS Endurance Spray Paddling Pants</td>\n",
" <td>These comfortable and affordable splash paddli...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>999</th>\n",
" <td>998.0</td>\n",
" <td>Women's Stop Flies Hoodie</td>\n",
" <td>This great-looking hoodie uses No Fly Zone Tec...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1000</th>\n",
" <td>999.0</td>\n",
" <td>Modern Utility Bag</td>\n",
" <td>This US-made crossbody bag is built with the s...</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>1001 rows × 3 columns</p>\n",
"</div>"
],
"text/plain": [
" 0 1 \n",
"0 NaN name \\\n",
"1 0.0 Women's Campside Oxfords \n",
"2 1.0 Recycled Waterhog Dog Mat, Chevron Weave \n",
"3 2.0 Infant and Toddler Girls' Coastal Chill Swimsu... \n",
"4 3.0 Refresh Swimwear, V-Neck Tankini Contrasts \n",
"... ... ... \n",
"996 995.0 Men's Classic Denim, Standard Fit \n",
"997 996.0 CozyPrint Sweater Fleece Pullover \n",
"998 997.0 Women's NRS Endurance Spray Paddling Pants \n",
"999 998.0 Women's Stop Flies Hoodie \n",
"1000 999.0 Modern Utility Bag \n",
"\n",
" 2 \n",
"0 description \n",
"1 This ultracomfortable lace-to-toe Oxford boast... \n",
"2 Protect your floors from spills and splashing ... \n",
"3 She'll love the bright colors, ruffles and exc... \n",
"4 Whether you're going for a swim or heading out... \n",
"... ... \n",
"996 Crafted from premium denim that will last wash... \n",
"997 The ultimate sweater fleece - made from superi... \n",
"998 These comfortable and affordable splash paddli... \n",
"999 This great-looking hoodie uses No Fly Zone Tec... \n",
"1000 This US-made crossbody bag is built with the s... \n",
"\n",
"[1001 rows x 3 columns]"
]
},
"execution_count": 24,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#查看数据\n",
"import pandas as pd\n",
"data = pd.read_csv(file,header=None)\n",
"data"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "3bd6422c",
"metadata": {},
"source": [
"提供了一个户外服装的CSV文件我们将使用它与语言模型结合使用"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "2963fc63",
"metadata": {},
"source": [
"#### 创建向量存储\n",
"将导入一个索引,即向量存储索引创建器"
]
},
{
"cell_type": "code",
"execution_count": 25,
"id": "5bfaba30",
"metadata": {
"height": 30
},
"outputs": [],
"source": [
"from langchain.indexes import VectorstoreIndexCreator #导入向量存储索引创建器"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9e200726",
"metadata": {
"height": 64
},
"outputs": [],
"source": [
"'''\n",
"将指定向量存储类,创建完成后,我们将从加载器中调用,通过文档记载器列表加载\n",
"'''\n",
"\n",
"index = VectorstoreIndexCreator(\n",
" vectorstore_cls=DocArrayInMemorySearch\n",
").from_loaders([loader])"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "34562d81",
"metadata": {
"height": 47
},
"outputs": [],
"source": [
"query =\"Please list all your shirts with sun protection \\\n",
"in a table in markdown and summarize each one.\""
]
},
{
"cell_type": "code",
"execution_count": 21,
"id": "cfd0cc37",
"metadata": {
"height": 30
},
"outputs": [],
"source": [
"response = index.query(query)#使用索引查询创建一个响应,并传入这个查询"
]
},
{
"cell_type": "code",
"execution_count": 23,
"id": "ae21f1ff",
"metadata": {
"height": 30,
"scrolled": true
},
"outputs": [
{
"data": {
"text/markdown": [
"\n",
"\n",
"| Name | Description |\n",
"| --- | --- |\n",
"| Men's Tropical Plaid Short-Sleeve Shirt | UPF 50+ rated, 100% polyester, wrinkle-resistant, front and back cape venting, two front bellows pockets |\n",
"| Men's Plaid Tropic Shirt, Short-Sleeve | UPF 50+ rated, 52% polyester and 48% nylon, machine washable and dryable, front and back cape venting, two front bellows pockets |\n",
"| Men's TropicVibe Shirt, Short-Sleeve | UPF 50+ rated, 71% Nylon, 29% Polyester, 100% Polyester knit mesh, machine wash and dry, front and back cape venting, two front bellows pockets |\n",
"| Sun Shield Shirt by | UPF 50+ rated, 78% nylon, 22% Lycra Xtra Life fiber, handwash, line dry, wicks moisture, fits comfortably over swimsuit, abrasion resistant |\n",
"\n",
"All four shirts provide UPF 50+ sun protection, blocking 98% of the sun's harmful rays. The Men's Tropical Plaid Short-Sleeve Shirt is made of 100% polyester and is wrinkle-resistant"
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"display(Markdown(response))#查看查询返回的内容"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "eb74cc79",
"metadata": {},
"source": [
"得到了一个Markdown表格其中包含所有带有防晒衣的衬衫的名称和描述还得到了一个语言模型提供的不错的小总结"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "dd34e50e",
"metadata": {},
"source": [
"#### 使用语言模型与文档结合使用\n",
"想要使用语言模型并将其与我们的许多文档结合使用但是语言模型一次只能检查几千个单词如果我们有非常大的文档如何让语言模型回答关于其中所有内容的问题呢通过embedding和向量存储实现\n",
"* embedding \n",
"文本片段创建数值表示文本语义,相似内容的文本片段将具有相似的向量,这使我们可以在向量空间中比较文本片段\n",
"* 向量数据库 \n",
"向量数据库是存储我们在上一步中创建的这些向量表示的一种方式,我们创建这个向量数据库的方式是用来自传入文档的文本块填充它。\n",
"当我们获得一个大的传入文档时我们首先将其分成较小的块因为我们可能无法将整个文档传递给语言模型因此采用分块embedding的方式储存到向量数据库中。这就是创建索引的过程。\n",
"\n",
"通过运行时使用索引来查找与传入查询最相关的文本片段然后我们将其与向量数据库中的所有向量进行比较并选择最相似的n个返回语言模型得到最终答案"
]
},
{
"cell_type": "code",
"execution_count": 26,
"id": "631396c6",
"metadata": {
"height": 30
},
"outputs": [],
"source": [
"#创建一个文档加载器通过csv格式加载\n",
"loader = CSVLoader(file_path=file)\n",
"docs = loader.load()"
]
},
{
"cell_type": "code",
"execution_count": 27,
"id": "4a977f44",
"metadata": {
"height": 30
},
"outputs": [
{
"data": {
"text/plain": [
"Document(page_content=\": 0\\nname: Women's Campside Oxfords\\ndescription: This ultracomfortable lace-to-toe Oxford boasts a super-soft canvas, thick cushioning, and quality construction for a broken-in feel from the first time you put them on. \\n\\nSize & Fit: Order regular shoe size. For half sizes not offered, order up to next whole size. \\n\\nSpecs: Approx. weight: 1 lb.1 oz. per pair. \\n\\nConstruction: Soft canvas material for a broken-in feel and look. Comfortable EVA innersole with Cleansport NXT® antimicrobial odor control. Vintage hunt, fish and camping motif on innersole. Moderate arch contour of innersole. EVA foam midsole for cushioning and support. Chain-tread-inspired molded rubber outsole with modified chain-tread pattern. Imported. \\n\\nQuestions? Please contact us for any inquiries.\", metadata={'source': 'OutdoorClothingCatalog_1000.csv', 'row': 0})"
]
},
"execution_count": 27,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"docs[0]#查看单个文档我们可以看到每个文档对应于CSV中的一个块"
]
},
{
"cell_type": "code",
"execution_count": 31,
"id": "e875693a",
"metadata": {
"height": 47
},
"outputs": [],
"source": [
"'''\n",
"因为这些文档已经非常小了,所以我们实际上不需要在这里进行任何分块,可以直接进行embedding\n",
"'''\n",
"\n",
"from langchain.embeddings import OpenAIEmbeddings #要创建可以直接进行embedding我们将使用OpenAI的可以直接进行embedding类\n",
"embeddings = OpenAIEmbeddings() #初始化"
]
},
{
"cell_type": "code",
"execution_count": 32,
"id": "779bec75",
"metadata": {
"height": 30
},
"outputs": [],
"source": [
"embed = embeddings.embed_query(\"Hi my name is Harrison\")#让我们使用embedding上的查询方法为特定文本创建embedding"
]
},
{
"cell_type": "code",
"execution_count": 33,
"id": "699aaaf9",
"metadata": {
"height": 30
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"1536\n"
]
}
],
"source": [
"print(len(embed))#查看这个embedding我们可以看到有超过一千个不同的元素"
]
},
{
"cell_type": "code",
"execution_count": 34,
"id": "9d00d346",
"metadata": {
"height": 30
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[-0.021933607757091522, 0.006697045173496008, -0.01819835603237152, -0.039113257080316544, -0.014060650952160358]\n"
]
}
],
"source": [
"print(embed[:5])#每个元素都是不同的数字值,组合起来,这就创建了这段文本的总体数值表示"
]
},
{
"cell_type": "code",
"execution_count": 35,
"id": "27ad0bb0",
"metadata": {
"height": 81
},
"outputs": [],
"source": [
"'''\n",
"为刚才的文本创建embedding准备将它们存储在向量存储中使用向量存储上的from documents方法来实现。\n",
"该方法接受文档列表、嵌入对象,然后我们将创建一个总体向量存储\n",
"'''\n",
"db = DocArrayInMemorySearch.from_documents(\n",
" docs, \n",
" embeddings\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 36,
"id": "0329bfd5",
"metadata": {
"height": 30
},
"outputs": [],
"source": [
"query = \"Please suggest a shirt with sunblocking\""
]
},
{
"cell_type": "code",
"execution_count": 37,
"id": "7909c6b7",
"metadata": {
"height": 30
},
"outputs": [],
"source": [
"docs = db.similarity_search(query)#使用这个向量存储来查找与传入查询类似的文本,如果我们在向量存储中使用相似性搜索方法并传入一个查询,我们将得到一个文档列表"
]
},
{
"cell_type": "code",
"execution_count": 38,
"id": "43321853",
"metadata": {
"height": 30
},
"outputs": [
{
"data": {
"text/plain": [
"4"
]
},
"execution_count": 38,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"len(docs)# 我们可以看到它返回了四个文档"
]
},
{
"cell_type": "code",
"execution_count": 39,
"id": "6eba90b5",
"metadata": {
"height": 30
},
"outputs": [
{
"data": {
"text/plain": [
"Document(page_content=': 255\\nname: Sun Shield Shirt by\\ndescription: \"Block the sun, not the fun our high-performance sun shirt is guaranteed to protect from harmful UV rays. \\n\\nSize & Fit: Slightly Fitted: Softly shapes the body. Falls at hip.\\n\\nFabric & Care: 78% nylon, 22% Lycra Xtra Life fiber. UPF 50+ rated the highest rated sun protection possible. Handwash, line dry.\\n\\nAdditional Features: Wicks moisture for quick-drying comfort. Fits comfortably over your favorite swimsuit. Abrasion resistant for season after season of wear. Imported.\\n\\nSun Protection That Won\\'t Wear Off\\nOur high-performance fabric provides SPF 50+ sun protection, blocking 98% of the sun\\'s harmful rays. This fabric is recommended by The Skin Cancer Foundation as an effective UV protectant.', metadata={'source': 'OutdoorClothingCatalog_1000.csv', 'row': 255})"
]
},
"execution_count": 39,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"docs[0] #,如果我们看第一个文档,我们可以看到它确实是一件关于防晒的衬衫"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "fe41b36f",
"metadata": {},
"source": [
"### 如何回答我们文档的相关问题\n",
"首先,我们需要从这个向量存储中创建一个检索器,检索器是一个通用接口,可以由任何接受查询并返回文档的方法支持。接下来,因为我们想要进行文本生成并返回自然语言响应\n"
]
},
{
"cell_type": "code",
"execution_count": 40,
"id": "c0c3596e",
"metadata": {
"height": 30
},
"outputs": [],
"source": [
"retriever = db.as_retriever() #创建检索器通用接口"
]
},
{
"cell_type": "code",
"execution_count": 55,
"id": "0625f5e8",
"metadata": {
"height": 47
},
"outputs": [],
"source": [
"llm = ChatOpenAI(temperature = 0.0,max_tokens=1024) #导入语言模型\n"
]
},
{
"cell_type": "code",
"execution_count": 43,
"id": "a573f58a",
"metadata": {
"height": 47
},
"outputs": [],
"source": [
"qdocs = \"\".join([docs[i].page_content for i in range(len(docs))]) # 将合并文档中的所有页面内容到一个变量中\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "14682d95",
"metadata": {
"height": 64
},
"outputs": [],
"source": [
"response = llm.call_as_llm(f\"{qdocs} Question: Please list all your \\\n",
"shirts with sun protection in a table in markdown and summarize each one.\") #列出所有具有防晒功能的衬衫并在Markdown表格中总结每个衬衫的语言模型\n"
]
},
{
"cell_type": "code",
"execution_count": 28,
"id": "8bba545b",
"metadata": {
"height": 30
},
"outputs": [
{
"data": {
"text/markdown": [
"| Name | Description |\n",
"| --- | --- |\n",
"| Sun Shield Shirt | High-performance sun shirt with UPF 50+ sun protection, moisture-wicking, and abrasion-resistant fabric. Recommended by The Skin Cancer Foundation. |\n",
"| Men's Plaid Tropic Shirt | Ultracomfortable shirt with UPF 50+ sun protection, wrinkle-free fabric, and front/back cape venting. Made with 52% polyester and 48% nylon. |\n",
"| Men's TropicVibe Shirt | Men's sun-protection shirt with built-in UPF 50+ and front/back cape venting. Made with 71% nylon and 29% polyester. |\n",
"| Men's Tropical Plaid Short-Sleeve Shirt | Lightest hot-weather shirt with UPF 50+ sun protection, front/back cape venting, and two front bellows pockets. Made with 100% polyester and is wrinkle-resistant. |\n",
"\n",
"All of these shirts provide UPF 50+ sun protection, blocking 98% of the sun's harmful rays. They are made with high-performance fabrics that are moisture-wicking, wrinkle-resistant, and abrasion-resistant. The Men's Plaid Tropic Shirt and Men's Tropical Plaid Short-Sleeve Shirt both have front/back cape venting for added breathability. The Sun Shield Shirt is recommended by The Skin Cancer Foundation as an effective UV protectant."
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"display(Markdown(response))"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "12f042e7",
"metadata": {},
"source": [
"在此处打印响应,我们可以看到我们得到了一个表格,正如我们所要求的那样"
]
},
{
"cell_type": "code",
"execution_count": 56,
"id": "32c94d22",
"metadata": {
"height": 115
},
"outputs": [],
"source": [
"''' \n",
"通过LangChain链封装起来\n",
"创建一个检索QA链对检索到的文档进行问题回答要创建这样的链我们将传入几个不同的东西\n",
"1、语言模型在最后进行文本生成\n",
"2、传入链类型这里使用stuff将所有文档塞入上下文并对语言模型进行一次调用\n",
"3、传入一个检索器\n",
"'''\n",
"\n",
"\n",
"qa_stuff = RetrievalQA.from_chain_type(\n",
" llm=llm, \n",
" chain_type=\"stuff\", \n",
" retriever=retriever, \n",
" verbose=True\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 46,
"id": "e4769316",
"metadata": {
"height": 47
},
"outputs": [],
"source": [
"query = \"Please list all your shirts with sun protection in a table \\\n",
"in markdown and summarize each one.\"#创建一个查询并在此查询上运行链"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1fc3c2f3",
"metadata": {
"height": 30
},
"outputs": [],
"source": [
"response = qa_stuff.run(query)"
]
},
{
"cell_type": "code",
"execution_count": 58,
"id": "fba1a5db",
"metadata": {
"height": 30
},
"outputs": [
{
"data": {
"text/markdown": [
"\n",
"\n",
"| Name | Description |\n",
"| --- | --- |\n",
"| Men's Tropical Plaid Short-Sleeve Shirt | UPF 50+ rated, 100% polyester, wrinkle-resistant, front and back cape venting, two front bellows pockets |\n",
"| Men's Plaid Tropic Shirt, Short-Sleeve | UPF 50+ rated, 52% polyester and 48% nylon, machine washable and dryable, front and back cape venting, two front bellows pockets |\n",
"| Men's TropicVibe Shirt, Short-Sleeve | UPF 50+ rated, 71% Nylon, 29% Polyester, 100% Polyester knit mesh, machine wash and dry, front and back cape venting, two front bellows pockets |\n",
"| Sun Shield Shirt by | UPF 50+ rated, 78% nylon, 22% Lycra Xtra Life fiber, handwash, line dry, wicks moisture, fits comfortably over swimsuit, abrasion resistant |\n",
"\n",
"All four shirts provide UPF 50+ sun protection, blocking 98% of the sun's harmful rays. The Men's Tropical Plaid Short-Sleeve Shirt is made of 100% polyester and is wrinkle-resistant"
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"display(Markdown(response))#使用 display 和 markdown 显示它"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "e28c5657",
"metadata": {},
"source": [
"这两个方式返回相同的结果"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "44f1fa38",
"metadata": {},
"source": [
"#### 不同类型的chain链\n",
"想在许多不同类型的块上执行相同类型的问答该怎么办之前的实验中只返回了4个文档如果有多个文档那么我们可以使用几种不同的方法\n",
"* Map Reduce \n",
"将所有块与问题一起传递给语言模型,获取回复,使用另一个语言模型调用将所有单独的回复总结成最终答案,它可以在任意数量的文档上运行。可以并行处理单个问题,同时也需要更多的调用。它将所有文档视为独立的\n",
"* Refine \n",
"用于循环许多文档际上是迭代的建立在先前文档的答案之上非常适合前后因果信息并随时间逐步构建答案依赖于先前调用的结果。它通常需要更长的时间并且基本上需要与Map Reduce一样多的调用\n",
"* Map Re-rank \n",
"对每个文档进行单个语言模型调用,要求它返回一个分数,选择最高分,这依赖于语言模型知道分数应该是什么,需要告诉它,如果它与文档相关,则应该是高分,并在那里精细调整说明,可以批量处理它们相对较快,但是更加昂贵\n",
"* Stuff \n",
"将所有内容组合成一个文档"
]
},
{
"cell_type": "markdown",
"id": "7988f412",
"metadata": {},
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.16"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
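
The retrieval step described in the Chapter 4 notebook above (embed each chunk, store the vectors, then return the n chunks most similar to the query) can be sketched without any external service. This is a minimal illustration under stated assumptions, not LangChain's implementation: `toy_embed` is a hypothetical bag-of-words stand-in for a real embedding model such as `OpenAIEmbeddings`, and `TinyVectorStore` is a name made up for this example.

```python
import math
from collections import Counter


def toy_embed(text, vocabulary):
    """Hypothetical stand-in for a real embedding model: word counts over a fixed vocabulary."""
    counts = Counter(text.lower().split())
    return [float(counts[word]) for word in vocabulary]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class TinyVectorStore:
    """Illustrative in-memory store: keeps (text, vector) pairs and ranks them by similarity."""

    def __init__(self, texts, vocabulary):
        self.vocabulary = vocabulary
        self.entries = [(text, toy_embed(text, vocabulary)) for text in texts]

    def similarity_search(self, query, k=4):
        # Embed the query, score every stored vector, and keep the k best matches.
        query_vector = toy_embed(query, self.vocabulary)
        ranked = sorted(
            self.entries,
            key=lambda entry: cosine_similarity(query_vector, entry[1]),
            reverse=True,
        )
        return [text for text, _ in ranked[:k]]


vocabulary = ["sun", "shirt", "protection", "dog", "mat", "jacket", "down"]
texts = [
    "sun protection shirt with upf rating",
    "dog mat for muddy paws",
    "down jacket for cold weather",
]
store = TinyVectorStore(texts, vocabulary)
top_matches = store.similarity_search("shirt with sun protection", k=2)
print(top_matches)
```

A real vector store works the same way in outline, but uses learned embeddings, so texts with similar meaning score high even when they share no words.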

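The difference between the Stuff and Map Reduce chain types listed in the Chapter 4 notebook above comes down to how many language model calls are made and what each call sees. The sketch below counts calls using `fake_llm`, a hypothetical stand-in that only reports the size of its prompt. It is not LangChain's implementation, and all function names here are assumptions made for illustration.

```python
def make_fake_llm():
    """Return a hypothetical stand-in LLM plus a log of the prompts it received."""
    prompts_seen = []

    def fake_llm(prompt):
        prompts_seen.append(prompt)
        return f"answer based on {len(prompt)} characters"

    return fake_llm, prompts_seen


def stuff_chain(llm, docs, question):
    """Stuff: concatenate every document into one prompt and call the model once."""
    context = "\n\n".join(docs)
    return llm(f"{context}\n\nQuestion: {question}")


def map_reduce_chain(llm, docs, question):
    """Map Reduce: one call per document, then one extra call to combine the partial answers."""
    partial_answers = [llm(f"{doc}\n\nQuestion: {question}") for doc in docs]
    combined = "\n".join(partial_answers)
    return llm(f"Partial answers:\n{combined}\n\nCombine them to answer: {question}")


docs = ["Shirt A has UPF 50+.", "Shirt B is wrinkle-resistant.", "Mat C is for dogs."]
question = "Which shirts have sun protection?"

stuff_llm, stuff_prompts = make_fake_llm()
stuff_chain(stuff_llm, docs, question)

map_reduce_llm, map_reduce_prompts = make_fake_llm()
map_reduce_chain(map_reduce_llm, docs, question)

# Number of model calls made by each strategy
print(len(stuff_prompts), len(map_reduce_prompts))
```

With three documents, Stuff makes a single call whose prompt grows with the number of documents, while Map Reduce makes one call per document plus one combining call, each with a smaller prompt.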

@ -0,0 +1,999 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"id": "52824b89-532a-4e54-87e9-1410813cd39e",
"metadata": {},
"source": [
"# 第五章 如何评估基于LLM的应用程序\n",
"当使用llm构建复杂应用程序时评估应用程序的表现是一个重要但有时棘手的步骤它是否满足某些准确性标准\n",
"通常更有用的是从许多不同的数据点中获得更全面的模型表现情况\n",
"一种是使用语言模型本身和链本身来评估其他语言模型、其他链和其他应用程序"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "b7ed03ed-1322-49e3-b2a2-33e94fb592ef",
"metadata": {
"height": 81,
"tags": []
},
"outputs": [],
"source": [
"import os\n",
"\n",
"from dotenv import load_dotenv, find_dotenv\n",
"_ = load_dotenv(find_dotenv()) #读取环境变量"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "28008949",
"metadata": {},
"source": [
"## 创建LLM应用\n",
"按照langchain链的方式进行构建"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "974acf8e-8f88-42de-88f8-40a82cb58e8b",
"metadata": {
"height": 98
},
"outputs": [],
"source": [
"from langchain.chains import RetrievalQA #检索QA链在文档上进行检索\n",
"from langchain.chat_models import ChatOpenAI #openai模型\n",
"from langchain.document_loaders import CSVLoader #文档加载器采用csv格式存储\n",
"from langchain.indexes import VectorstoreIndexCreator #导入向量存储索引创建器\n",
"from langchain.vectorstores import DocArrayInMemorySearch #向量存储\n"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "9ec1106d",
"metadata": {
"height": 64
},
"outputs": [],
"source": [
"#加载数据\n",
"file = 'OutdoorClothingCatalog_1000.csv'\n",
"loader = CSVLoader(file_path=file)\n",
"data = loader.load()"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "06b1ffae",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>0</th>\n",
" <th>1</th>\n",
" <th>2</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>NaN</td>\n",
" <td>name</td>\n",
" <td>description</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>0.0</td>\n",
" <td>Women's Campside Oxfords</td>\n",
" <td>This ultracomfortable lace-to-toe Oxford boast...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>1.0</td>\n",
" <td>Recycled Waterhog Dog Mat, Chevron Weave</td>\n",
" <td>Protect your floors from spills and splashing ...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>2.0</td>\n",
" <td>Infant and Toddler Girls' Coastal Chill Swimsu...</td>\n",
" <td>She'll love the bright colors, ruffles and exc...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>3.0</td>\n",
" <td>Refresh Swimwear, V-Neck Tankini Contrasts</td>\n",
" <td>Whether you're going for a swim or heading out...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>...</th>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>996</th>\n",
" <td>995.0</td>\n",
" <td>Men's Classic Denim, Standard Fit</td>\n",
" <td>Crafted from premium denim that will last wash...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>997</th>\n",
" <td>996.0</td>\n",
" <td>CozyPrint Sweater Fleece Pullover</td>\n",
" <td>The ultimate sweater fleece - made from superi...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>998</th>\n",
" <td>997.0</td>\n",
" <td>Women's NRS Endurance Spray Paddling Pants</td>\n",
" <td>These comfortable and affordable splash paddli...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>999</th>\n",
" <td>998.0</td>\n",
" <td>Women's Stop Flies Hoodie</td>\n",
" <td>This great-looking hoodie uses No Fly Zone Tec...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1000</th>\n",
" <td>999.0</td>\n",
" <td>Modern Utility Bag</td>\n",
" <td>This US-made crossbody bag is built with the s...</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>1001 rows × 3 columns</p>\n",
"</div>"
],
"text/plain": [
" 0 1 \n",
"0 NaN name \\\n",
"1 0.0 Women's Campside Oxfords \n",
"2 1.0 Recycled Waterhog Dog Mat, Chevron Weave \n",
"3 2.0 Infant and Toddler Girls' Coastal Chill Swimsu... \n",
"4 3.0 Refresh Swimwear, V-Neck Tankini Contrasts \n",
"... ... ... \n",
"996 995.0 Men's Classic Denim, Standard Fit \n",
"997 996.0 CozyPrint Sweater Fleece Pullover \n",
"998 997.0 Women's NRS Endurance Spray Paddling Pants \n",
"999 998.0 Women's Stop Flies Hoodie \n",
"1000 999.0 Modern Utility Bag \n",
"\n",
" 2 \n",
"0 description \n",
"1 This ultracomfortable lace-to-toe Oxford boast... \n",
"2 Protect your floors from spills and splashing ... \n",
"3 She'll love the bright colors, ruffles and exc... \n",
"4 Whether you're going for a swim or heading out... \n",
"... ... \n",
"996 Crafted from premium denim that will last wash... \n",
"997 The ultimate sweater fleece - made from superi... \n",
"998 These comfortable and affordable splash paddli... \n",
"999 This great-looking hoodie uses No Fly Zone Tec... \n",
"1000 This US-made crossbody bag is built with the s... \n",
"\n",
"[1001 rows x 3 columns]"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#查看数据\n",
"import pandas as pd\n",
"test_data = pd.read_csv(file,header=None)\n",
"test_data"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5bb816a6",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"os.environ[\"OPENAI_API_KEY\"] = \"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\""
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "b31c218f",
"metadata": {
"height": 64
},
"outputs": [],
"source": [
"'''\n",
"将指定向量存储类,创建完成后,我们将从加载器中调用,通过文档记载器列表加载\n",
"'''\n",
"index = VectorstoreIndexCreator(\n",
" vectorstore_cls=DocArrayInMemorySearch\n",
").from_loaders([loader])"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "a2006054",
"metadata": {
"height": 183
},
"outputs": [],
"source": [
"#通过指定语言模型、链类型、检索器和我们要打印的详细程度来创建检索QA链\n",
"llm = ChatOpenAI(temperature = 0.0)\n",
"qa = RetrievalQA.from_chain_type(\n",
" llm=llm, \n",
" chain_type=\"stuff\", \n",
" retriever=index.vectorstore.as_retriever(), \n",
" verbose=True,\n",
" chain_type_kwargs = {\n",
" \"document_separator\": \"<<<<>>>>>\"\n",
" }\n",
")"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "791ebd73",
"metadata": {},
"source": [
"### 创建评估数据点\n",
"们需要做的第一件事是真正弄清楚我们想要评估它的一些数据点,我们将介绍几种不同的方法来完成这个任务\n",
"1、将自己想出好的数据点作为例子查看一些数据然后想出例子问题和答案以便以后用于评估"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "fb04a0f9",
"metadata": {
"height": 30
},
"outputs": [
{
"data": {
"text/plain": [
"Document(page_content=\": 10\\nname: Cozy Comfort Pullover Set, Stripe\\ndescription: Perfect for lounging, this striped knit set lives up to its name. We used ultrasoft fabric and an easy design that's as comfortable at bedtime as it is when we have to make a quick run out.\\n\\nSize & Fit\\n- Pants are Favorite Fit: Sits lower on the waist.\\n- Relaxed Fit: Our most generous fit sits farthest from the body.\\n\\nFabric & Care\\n- In the softest blend of 63% polyester, 35% rayon and 2% spandex.\\n\\nAdditional Features\\n- Relaxed fit top with raglan sleeves and rounded hem.\\n- Pull-on pants have a wide elastic waistband and drawstring, side pockets and a modern slim leg.\\n\\nImported.\", metadata={'source': 'OutdoorClothingCatalog_1000.csv', 'row': 10})"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"data[10]#查看这里的一些文档,我们可以对其中发生的事情有所了解"
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "fe4a88c2",
"metadata": {
"height": 30
},
"outputs": [
{
"data": {
"text/plain": [
"Document(page_content=': 11\\nname: Ultra-Lofty 850 Stretch Down Hooded Jacket\\ndescription: This technical stretch down jacket from our DownTek collection is sure to keep you warm and comfortable with its full-stretch construction providing exceptional range of motion. With a slightly fitted style that falls at the hip and best with a midweight layer, this jacket is suitable for light activity up to 20° and moderate activity up to -30°. The soft and durable 100% polyester shell offers complete windproof protection and is insulated with warm, lofty goose down. Other features include welded baffles for a no-stitch construction and excellent stretch, an adjustable hood, an interior media port and mesh stash pocket and a hem drawcord. Machine wash and dry. Imported.', metadata={'source': 'OutdoorClothingCatalog_1000.csv', 'row': 11})"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"data[11]"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "b9c52116",
"metadata": {},
"source": [
"看起来第一个文档中有这个套头衫,第二个文档中有这个夹克,从这些细节中,我们可以创建一些例子查询和答案"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "8d548aef",
"metadata": {},
"source": [
"### 创建测试用例数据\n"
]
},
{
"cell_type": "code",
"execution_count": 17,
"id": "c2d59bf2",
"metadata": {
"height": 217
},
"outputs": [],
"source": [
"examples = [\n",
" {\n",
" \"query\": \"Do the Cozy Comfort Pullover Set\\\n",
" have side pockets?\",\n",
" \"answer\": \"Yes\"\n",
" },\n",
" {\n",
" \"query\": \"What collection is the Ultra-Lofty \\\n",
" 850 Stretch Down Hooded Jacket from?\",\n",
" \"answer\": \"The DownTek collection\"\n",
" }\n",
"]"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "b73ce510",
"metadata": {},
"source": [
"因此,我们可以问一个简单的问题,这个舒适的套头衫套装有侧口袋吗?,我们可以通过上面的内容看到,它确实有一些侧口袋,答案为是\n",
"对于第二个文档我们可以看到这件夹克来自某个系列即down tech系列答案是down tech系列。"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "c7ce3e4f",
"metadata": {},
"source": [
"### 通过LLM生成测试用例"
]
},
{
"cell_type": "code",
"execution_count": 18,
"id": "d44f8376",
"metadata": {
"height": 47
},
"outputs": [],
"source": [
"from langchain.evaluation.qa import QAGenerateChain #导入QA生成链它将接收文档并从每个文档中创建一个问题答案对\n"
]
},
{
"cell_type": "code",
"execution_count": 19,
"id": "34e87816",
"metadata": {
"height": 30
},
"outputs": [],
"source": [
"example_gen_chain = QAGenerateChain.from_llm(ChatOpenAI())#通过传递chat open AI语言模型来创建这个链"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "62abae09",
"metadata": {
"height": 64
},
"outputs": [],
"source": [
"new_examples = example_gen_chain.apply_and_parse(\n",
" [{\"doc\": t} for t in data[:5]]\n",
") #我们可以创建许多例子"
]
},
{
"cell_type": "code",
"execution_count": 21,
"id": "31c9f786",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[{'query': \"What is the weight of the Women's Campside Oxfords?\",\n",
" 'answer': \"The Women's Campside Oxfords weigh approximately 1 lb.1 oz. per pair.\"},\n",
" {'query': 'What are the dimensions of the medium Recycled Waterhog dog mat?',\n",
" 'answer': 'The dimensions of the medium Recycled Waterhog dog mat are 22.5\" x 34.5\".'},\n",
" {'query': \"What are some features of the Infant and Toddler Girls' Coastal Chill Swimsuit?\",\n",
" 'answer': \"The swimsuit has bright colors, ruffles, and exclusive whimsical prints. It is made of four-way-stretch and chlorine-resistant fabric that keeps its shape and resists snags. The swimsuit is also UPF 50+ rated, providing the highest rated sun protection possible, blocking 98% of the sun's harmful rays. The crossover no-slip straps and fully lined bottom ensure a secure fit and maximum coverage. It is machine washable and should be line dried for best results. The swimsuit is imported.\"},\n",
" {'query': 'What is the fabric composition of the Refresh Swimwear, V-Neck Tankini Contrasts?',\n",
" 'answer': 'The Refresh Swimwear, V-Neck Tankini Contrasts is made of 82% recycled nylon with 18% Lycra® spandex for the body and 90% recycled nylon with 10% Lycra® spandex for the lining.'},\n",
" {'query': 'What is the name of the waterproof pants and what technology makes them more breathable?',\n",
" 'answer': 'The name of the pants is EcoFlex 3L Storm Pants and the TEK O2 technology makes them more breathable.'}]"
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"new_examples  # Inspect the generated examples"
]
},
{
"cell_type": "code",
"execution_count": 22,
"id": "97ab28b5",
"metadata": {
"height": 30
},
"outputs": [
{
"data": {
"text/plain": [
"{'query': \"What is the weight of the Women's Campside Oxfords?\",\n",
" 'answer': \"The Women's Campside Oxfords weigh approximately 1 lb.1 oz. per pair.\"}"
]
},
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"new_examples[0]"
]
},
{
"cell_type": "code",
"execution_count": 23,
"id": "0ebe4228",
"metadata": {
"height": 30
},
"outputs": [
{
"data": {
"text/plain": [
"Document(page_content=\": 0\\nname: Women's Campside Oxfords\\ndescription: This ultracomfortable lace-to-toe Oxford boasts a super-soft canvas, thick cushioning, and quality construction for a broken-in feel from the first time you put them on. \\n\\nSize & Fit: Order regular shoe size. For half sizes not offered, order up to next whole size. \\n\\nSpecs: Approx. weight: 1 lb.1 oz. per pair. \\n\\nConstruction: Soft canvas material for a broken-in feel and look. Comfortable EVA innersole with Cleansport NXT® antimicrobial odor control. Vintage hunt, fish and camping motif on innersole. Moderate arch contour of innersole. EVA foam midsole for cushioning and support. Chain-tread-inspired molded rubber outsole with modified chain-tread pattern. Imported. \\n\\nQuestions? Please contact us for any inquiries.\", metadata={'source': 'OutdoorClothingCatalog_1000.csv', 'row': 0})"
]
},
"execution_count": 23,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"data[0]"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "faf25f2f",
"metadata": {},
"source": [
"### Combining the examples"
]
},
{
"cell_type": "code",
"execution_count": 24,
"id": "ada2a3fc",
"metadata": {
"height": 30
},
"outputs": [],
"source": [
"examples += new_examples"
]
},
{
"cell_type": "code",
"execution_count": 25,
"id": "9cdf5cf5",
"metadata": {
"height": 30
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new RetrievalQA chain...\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"'The Cozy Comfort Pullover Set, Stripe has side pockets.'"
]
},
"execution_count": 25,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"qa.run(examples[0][\"query\"])"
]
},
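{
"attachments": {},
"cell_type": "markdown",
"id": "63f3cb08",
"metadata": {},
"source": [
"## Manual evaluation\n",
"We now have a set of examples, but how do we evaluate what the chain is actually doing?\n",
"The first step is to run a single example through the chain and inspect the output it produces.\n",
"Here we pass in a query and get back an answer. But what actually happens in between? What is the exact prompt that goes into the language model?  \n",
"Which documents does it retrieve?  \n",
"What are the intermediate results?  \n",
"Looking only at the final answer is usually not enough to understand what went wrong, or what might go wrong, in the chain."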
]
},
{
"cell_type": "code",
"execution_count": 26,
"id": "fcaf622e",
"metadata": {
"height": 47
},
"outputs": [],
"source": [
"# Setting langchain.debug = True shows the intermediate steps\n",
"# an example goes through as it runs through the chain\n",
"import langchain\n",
"langchain.debug = True"
]
},
{
"cell_type": "code",
"execution_count": 27,
"id": "1e1deab0",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[32;1m\u001b[1;3m[chain/start]\u001b[0m \u001b[1m[1:chain:RetrievalQA] Entering Chain run with input:\n",
"\u001b[0m{\n",
" \"query\": \"Do the Cozy Comfort Pullover Set have side pockets?\"\n",
"}\n",
"\u001b[32;1m\u001b[1;3m[chain/start]\u001b[0m \u001b[1m[1:chain:RetrievalQA > 2:chain:StuffDocumentsChain] Entering Chain run with input:\n",
"\u001b[0m[inputs]\n",
"\u001b[32;1m\u001b[1;3m[chain/start]\u001b[0m \u001b[1m[1:chain:RetrievalQA > 2:chain:StuffDocumentsChain > 3:chain:LLMChain] Entering Chain run with input:\n",
"\u001b[0m{\n",
" \"question\": \"Do the Cozy Comfort Pullover Set have side pockets?\",\n",
" \"context\": \": 10\\nname: Cozy Comfort Pullover Set, Stripe\\ndescription: Perfect for lounging, this striped knit set lives up to its name. We used ultrasoft fabric and an easy design that's as comfortable at bedtime as it is when we have to make a quick run out.\\n\\nSize & Fit\\n- Pants are Favorite Fit: Sits lower on the waist.\\n- Relaxed Fit: Our most generous fit sits farthest from the body.\\n\\nFabric & Care\\n- In the softest blend of 63% polyester, 35% rayon and 2% spandex.\\n\\nAdditional Features\\n- Relaxed fit top with raglan sleeves and rounded hem.\\n- Pull-on pants have a wide elastic waistband and drawstring, side pockets and a modern slim leg.\\n\\nImported.<<<<>>>>>: 73\\nname: Cozy Cuddles Knit Pullover Set\\ndescription: Perfect for lounging, this knit set lives up to its name. We used ultrasoft fabric and an easy design that's as comfortable at bedtime as it is when we have to make a quick run out. \\n\\nSize & Fit \\nPants are Favorite Fit: Sits lower on the waist. \\nRelaxed Fit: Our most generous fit sits farthest from the body. \\n\\nFabric & Care \\nIn the softest blend of 63% polyester, 35% rayon and 2% spandex.\\n\\nAdditional Features \\nRelaxed fit top with raglan sleeves and rounded hem. \\nPull-on pants have a wide elastic waistband and drawstring, side pockets and a modern slim leg. \\nImported.<<<<>>>>>: 632\\nname: Cozy Comfort Fleece Pullover\\ndescription: The ultimate sweater fleece \\u2013 made from superior fabric and offered at an unbeatable price. \\n\\nSize & Fit\\nSlightly Fitted: Softly shapes the body. Falls at hip. \\n\\nWhy We Love It\\nOur customers (and employees) love the rugged construction and heritage-inspired styling of our popular Sweater Fleece Pullover and wear it for absolutely everything. From high-intensity activities to everyday tasks, you'll find yourself reaching for it every time.\\n\\nFabric & Care\\nRugged sweater-knit exterior and soft brushed interior for exceptional warmth and comfort. 
Made from soft, 100% polyester. Machine wash and dry.\\n\\nAdditional Features\\nFeatures our classic Mount Katahdin logo. Snap placket. Front princess seams create a feminine shape. Kangaroo handwarmer pockets. Cuffs and hem reinforced with jersey binding. Imported.\\n\\n \\u2013 Official Supplier to the U.S. Ski Team\\nTHEIR WILL TO WIN, WOVEN RIGHT IN. LEARN MORE<<<<>>>>>: 151\\nname: Cozy Quilted Sweatshirt\\ndescription: Our sweatshirt is an instant classic with its great quilted texture and versatile weight that easily transitions between seasons. With a traditional fit that is relaxed through the chest, sleeve, and waist, this pullover is lightweight enough to be worn most months of the year. The cotton blend fabric is super soft and comfortable, making it the perfect casual layer. To make dressing easy, this sweatshirt also features a snap placket and a heritage-inspired Mt. Katahdin logo patch. For care, machine wash and dry. Imported.\"\n",
"}\n",
"\u001b[32;1m\u001b[1;3m[llm/start]\u001b[0m \u001b[1m[1:chain:RetrievalQA > 2:chain:StuffDocumentsChain > 3:chain:LLMChain > 4:llm:ChatOpenAI] Entering LLM run with input:\n",
"\u001b[0m{\n",
" \"prompts\": [\n",
" \"System: Use the following pieces of context to answer the users question. \\nIf you don't know the answer, just say that you don't know, don't try to make up an answer.\\n----------------\\n: 10\\nname: Cozy Comfort Pullover Set, Stripe\\ndescription: Perfect for lounging, this striped knit set lives up to its name. We used ultrasoft fabric and an easy design that's as comfortable at bedtime as it is when we have to make a quick run out.\\n\\nSize & Fit\\n- Pants are Favorite Fit: Sits lower on the waist.\\n- Relaxed Fit: Our most generous fit sits farthest from the body.\\n\\nFabric & Care\\n- In the softest blend of 63% polyester, 35% rayon and 2% spandex.\\n\\nAdditional Features\\n- Relaxed fit top with raglan sleeves and rounded hem.\\n- Pull-on pants have a wide elastic waistband and drawstring, side pockets and a modern slim leg.\\n\\nImported.<<<<>>>>>: 73\\nname: Cozy Cuddles Knit Pullover Set\\ndescription: Perfect for lounging, this knit set lives up to its name. We used ultrasoft fabric and an easy design that's as comfortable at bedtime as it is when we have to make a quick run out. \\n\\nSize & Fit \\nPants are Favorite Fit: Sits lower on the waist. \\nRelaxed Fit: Our most generous fit sits farthest from the body. \\n\\nFabric & Care \\nIn the softest blend of 63% polyester, 35% rayon and 2% spandex.\\n\\nAdditional Features \\nRelaxed fit top with raglan sleeves and rounded hem. \\nPull-on pants have a wide elastic waistband and drawstring, side pockets and a modern slim leg. \\nImported.<<<<>>>>>: 632\\nname: Cozy Comfort Fleece Pullover\\ndescription: The ultimate sweater fleece \\u2013 made from superior fabric and offered at an unbeatable price. \\n\\nSize & Fit\\nSlightly Fitted: Softly shapes the body. Falls at hip. \\n\\nWhy We Love It\\nOur customers (and employees) love the rugged construction and heritage-inspired styling of our popular Sweater Fleece Pullover and wear it for absolutely everything. 
From high-intensity activities to everyday tasks, you'll find yourself reaching for it every time.\\n\\nFabric & Care\\nRugged sweater-knit exterior and soft brushed interior for exceptional warmth and comfort. Made from soft, 100% polyester. Machine wash and dry.\\n\\nAdditional Features\\nFeatures our classic Mount Katahdin logo. Snap placket. Front princess seams create a feminine shape. Kangaroo handwarmer pockets. Cuffs and hem reinforced with jersey binding. Imported.\\n\\n \\u2013 Official Supplier to the U.S. Ski Team\\nTHEIR WILL TO WIN, WOVEN RIGHT IN. LEARN MORE<<<<>>>>>: 151\\nname: Cozy Quilted Sweatshirt\\ndescription: Our sweatshirt is an instant classic with its great quilted texture and versatile weight that easily transitions between seasons. With a traditional fit that is relaxed through the chest, sleeve, and waist, this pullover is lightweight enough to be worn most months of the year. The cotton blend fabric is super soft and comfortable, making it the perfect casual layer. To make dressing easy, this sweatshirt also features a snap placket and a heritage-inspired Mt. Katahdin logo patch. For care, machine wash and dry. Imported.\\nHuman: Do the Cozy Comfort Pullover Set have side pockets?\"\n",
" ]\n",
"}\n",
"\u001b[36;1m\u001b[1;3m[llm/end]\u001b[0m \u001b[1m[1:chain:RetrievalQA > 2:chain:StuffDocumentsChain > 3:chain:LLMChain > 4:llm:ChatOpenAI] [1.68s] Exiting LLM run with output:\n",
"\u001b[0m{\n",
" \"generations\": [\n",
" [\n",
" {\n",
" \"text\": \"The Cozy Comfort Pullover Set, Stripe has side pockets on the pull-on pants.\",\n",
" \"generation_info\": null,\n",
" \"message\": {\n",
" \"content\": \"The Cozy Comfort Pullover Set, Stripe has side pockets on the pull-on pants.\",\n",
" \"additional_kwargs\": {},\n",
" \"example\": false\n",
" }\n",
" }\n",
" ]\n",
" ],\n",
" \"llm_output\": {\n",
" \"token_usage\": {\n",
" \"prompt_tokens\": 734,\n",
" \"completion_tokens\": 18,\n",
" \"total_tokens\": 752\n",
" },\n",
" \"model_name\": \"gpt-3.5-turbo\"\n",
" }\n",
"}\n",
"\u001b[36;1m\u001b[1;3m[chain/end]\u001b[0m \u001b[1m[1:chain:RetrievalQA > 2:chain:StuffDocumentsChain > 3:chain:LLMChain] [1.68s] Exiting Chain run with output:\n",
"\u001b[0m{\n",
" \"text\": \"The Cozy Comfort Pullover Set, Stripe has side pockets on the pull-on pants.\"\n",
"}\n",
"\u001b[36;1m\u001b[1;3m[chain/end]\u001b[0m \u001b[1m[1:chain:RetrievalQA > 2:chain:StuffDocumentsChain] [1.71s] Exiting Chain run with output:\n",
"\u001b[0m{\n",
" \"output_text\": \"The Cozy Comfort Pullover Set, Stripe has side pockets on the pull-on pants.\"\n",
"}\n",
"\u001b[36;1m\u001b[1;3m[chain/end]\u001b[0m \u001b[1m[1:chain:RetrievalQA] [2.86s] Exiting Chain run with output:\n",
"\u001b[0m{\n",
" \"result\": \"The Cozy Comfort Pullover Set, Stripe has side pockets on the pull-on pants.\"\n",
"}\n"
]
},
{
"data": {
"text/plain": [
"'The Cozy Comfort Pullover Set, Stripe has side pockets on the pull-on pants.'"
]
},
"execution_count": 27,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"qa.run(examples[0][\"query\"])  # Rerun the same example as above; much more information is printed this time"
]
},
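{
"attachments": {},
"cell_type": "markdown",
"id": "8dee0f24",
"metadata": {},
"source": [
"We can see that it first enters the RetrievalQA chain and then goes down into a documents chain. As mentioned above, we are using the stuff method, so it enters StuffDocumentsChain and passes the context into an LLMChain. This context is built from the different documents that were retrieved. When question answering returns a wrong result, it is often not the language model itself that went wrong but the retrieval step, so looking closely at the exact question and context helps to debug the cause.  \n",
"We can then go one level further down and see exactly what goes into the language model, ChatOpenAI itself. Here we see the full prompt that is passed. It starts with a system message describing the task, which is the prompt used by the question answering chain: \"Use the following pieces of context to answer the users question. If you don't know the answer, just say that you don't know, don't try to make up an answer.\"\n",
"After that come the context pieces inserted earlier, followed by the human question. We also get more information about the actual return type: not just the answer string but also the token usage (here 734 prompt tokens and 18 completion tokens, 752 in total) and the model name.\n",
"\n",
"\n",
"Because this is a relatively simple chain, we can now see the final response, \"The Cozy Comfort Pullover Set, Stripe has side pockets on the pull-on pants.\", bubbling up through the chain and being returned to the user. That covers how to inspect and debug what happens for a single input to the chain.\n",
"\n",
"\n"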
]
},
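{
"attachments": {},
"cell_type": "markdown",
"id": "7b37c7bc",
"metadata": {},
"source": [
"##### How to evaluate the newly created examples\n",
"As when creating them, we can run the chain over all the examples, then look at the outputs and work out what happened and whether each answer is correct."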
]
},
{
"cell_type": "code",
"execution_count": 28,
"id": "b3d6bef0",
"metadata": {
"height": 47
},
"outputs": [],
"source": [
"# We need predictions for all the examples; turn debug mode off first so that not everything is printed to the screen\n",
"langchain.debug = False"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "d5bdbdce",
"metadata": {},
"source": [
"## LLM-assisted evaluation of the examples"
]
},
{
"cell_type": "code",
"execution_count": 32,
"id": "a4dca05a",
"metadata": {
"height": 30
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new RetrievalQA chain...\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\n",
"\n",
"\u001b[1m> Entering new RetrievalQA chain...\u001b[0m\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"Retrying langchain.chat_models.openai.ChatOpenAI.completion_with_retry.<locals>._completion_with_retry in 1.0 seconds as it raised RateLimitError: Rate limit reached for default-gpt-3.5-turbo in organization org-Nul9WsPYdsnjttqS3f0hDSWd on requests per min. Limit: 3 / min. Please try again in 20s. Contact us through our help center at help.openai.com if you continue to have issues. Please add a payment method to your account to increase your rate limit. Visit https://platform.openai.com/account/billing to add a payment method..\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\n",
"\n",
"\u001b[1m> Entering new RetrievalQA chain...\u001b[0m\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"Retrying langchain.chat_models.openai.ChatOpenAI.completion_with_retry.<locals>._completion_with_retry in 1.0 seconds as it raised RateLimitError: Rate limit reached for default-gpt-3.5-turbo in organization org-Nul9WsPYdsnjttqS3f0hDSWd on requests per min. Limit: 3 / min. Please try again in 20s. Contact us through our help center at help.openai.com if you continue to have issues. Please add a payment method to your account to increase your rate limit. Visit https://platform.openai.com/account/billing to add a payment method..\n",
"Retrying langchain.chat_models.openai.ChatOpenAI.completion_with_retry.<locals>._completion_with_retry in 2.0 seconds as it raised RateLimitError: Rate limit reached for default-gpt-3.5-turbo in organization org-Nul9WsPYdsnjttqS3f0hDSWd on requests per min. Limit: 3 / min. Please try again in 20s. Contact us through our help center at help.openai.com if you continue to have issues. Please add a payment method to your account to increase your rate limit. Visit https://platform.openai.com/account/billing to add a payment method..\n",
"Retrying langchain.chat_models.openai.ChatOpenAI.completion_with_retry.<locals>._completion_with_retry in 4.0 seconds as it raised RateLimitError: Rate limit reached for default-gpt-3.5-turbo in organization org-Nul9WsPYdsnjttqS3f0hDSWd on requests per min. Limit: 3 / min. Please try again in 20s. Contact us through our help center at help.openai.com if you continue to have issues. Please add a payment method to your account to increase your rate limit. Visit https://platform.openai.com/account/billing to add a payment method..\n",
"Retrying langchain.chat_models.openai.ChatOpenAI.completion_with_retry.<locals>._completion_with_retry in 8.0 seconds as it raised RateLimitError: Rate limit reached for default-gpt-3.5-turbo in organization org-Nul9WsPYdsnjttqS3f0hDSWd on requests per min. Limit: 3 / min. Please try again in 20s. Contact us through our help center at help.openai.com if you continue to have issues. Please add a payment method to your account to increase your rate limit. Visit https://platform.openai.com/account/billing to add a payment method..\n",
"Retrying langchain.chat_models.openai.ChatOpenAI.completion_with_retry.<locals>._completion_with_retry in 16.0 seconds as it raised RateLimitError: Rate limit reached for default-gpt-3.5-turbo in organization org-Nul9WsPYdsnjttqS3f0hDSWd on requests per min. Limit: 3 / min. Please try again in 20s. Contact us through our help center at help.openai.com if you continue to have issues. Please add a payment method to your account to increase your rate limit. Visit https://platform.openai.com/account/billing to add a payment method..\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\n",
"\n",
"\u001b[1m> Entering new RetrievalQA chain...\u001b[0m\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"Retrying langchain.chat_models.openai.ChatOpenAI.completion_with_retry.<locals>._completion_with_retry in 1.0 seconds as it raised RateLimitError: Rate limit reached for default-gpt-3.5-turbo in organization org-Nul9WsPYdsnjttqS3f0hDSWd on requests per min. Limit: 3 / min. Please try again in 20s. Contact us through our help center at help.openai.com if you continue to have issues. Please add a payment method to your account to increase your rate limit. Visit https://platform.openai.com/account/billing to add a payment method..\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\n",
"\n",
"\u001b[1m> Entering new RetrievalQA chain...\u001b[0m\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"Retrying langchain.chat_models.openai.ChatOpenAI.completion_with_retry.<locals>._completion_with_retry in 1.0 seconds as it raised RateLimitError: Rate limit reached for default-gpt-3.5-turbo in organization org-Nul9WsPYdsnjttqS3f0hDSWd on requests per min. Limit: 3 / min. Please try again in 20s. Contact us through our help center at help.openai.com if you continue to have issues. Please add a payment method to your account to increase your rate limit. Visit https://platform.openai.com/account/billing to add a payment method..\n",
"Retrying langchain.chat_models.openai.ChatOpenAI.completion_with_retry.<locals>._completion_with_retry in 2.0 seconds as it raised RateLimitError: Rate limit reached for default-gpt-3.5-turbo in organization org-Nul9WsPYdsnjttqS3f0hDSWd on requests per min. Limit: 3 / min. Please try again in 20s. Contact us through our help center at help.openai.com if you continue to have issues. Please add a payment method to your account to increase your rate limit. Visit https://platform.openai.com/account/billing to add a payment method..\n",
"Retrying langchain.chat_models.openai.ChatOpenAI.completion_with_retry.<locals>._completion_with_retry in 4.0 seconds as it raised RateLimitError: Rate limit reached for default-gpt-3.5-turbo in organization org-Nul9WsPYdsnjttqS3f0hDSWd on requests per min. Limit: 3 / min. Please try again in 20s. Contact us through our help center at help.openai.com if you continue to have issues. Please add a payment method to your account to increase your rate limit. Visit https://platform.openai.com/account/billing to add a payment method..\n",
"Retrying langchain.chat_models.openai.ChatOpenAI.completion_with_retry.<locals>._completion_with_retry in 8.0 seconds as it raised RateLimitError: Rate limit reached for default-gpt-3.5-turbo in organization org-Nul9WsPYdsnjttqS3f0hDSWd on requests per min. Limit: 3 / min. Please try again in 20s. Contact us through our help center at help.openai.com if you continue to have issues. Please add a payment method to your account to increase your rate limit. Visit https://platform.openai.com/account/billing to add a payment method..\n",
"Retrying langchain.chat_models.openai.ChatOpenAI.completion_with_retry.<locals>._completion_with_retry in 16.0 seconds as it raised RateLimitError: Rate limit reached for default-gpt-3.5-turbo in organization org-Nul9WsPYdsnjttqS3f0hDSWd on requests per min. Limit: 3 / min. Please try again in 20s. Contact us through our help center at help.openai.com if you continue to have issues. Please add a payment method to your account to increase your rate limit. Visit https://platform.openai.com/account/billing to add a payment method..\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\n",
"\n",
"\u001b[1m> Entering new RetrievalQA chain...\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\n",
"\n",
"\u001b[1m> Entering new RetrievalQA chain...\u001b[0m\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"Retrying langchain.chat_models.openai.ChatOpenAI.completion_with_retry.<locals>._completion_with_retry in 1.0 seconds as it raised RateLimitError: Rate limit reached for default-gpt-3.5-turbo in organization org-Nul9WsPYdsnjttqS3f0hDSWd on requests per min. Limit: 3 / min. Please try again in 20s. Contact us through our help center at help.openai.com if you continue to have issues. Please add a payment method to your account to increase your rate limit. Visit https://platform.openai.com/account/billing to add a payment method..\n",
"Retrying langchain.chat_models.openai.ChatOpenAI.completion_with_retry.<locals>._completion_with_retry in 2.0 seconds as it raised RateLimitError: Rate limit reached for default-gpt-3.5-turbo in organization org-Nul9WsPYdsnjttqS3f0hDSWd on requests per min. Limit: 3 / min. Please try again in 20s. Contact us through our help center at help.openai.com if you continue to have issues. Please add a payment method to your account to increase your rate limit. Visit https://platform.openai.com/account/billing to add a payment method..\n",
"Retrying langchain.chat_models.openai.ChatOpenAI.completion_with_retry.<locals>._completion_with_retry in 4.0 seconds as it raised RateLimitError: Rate limit reached for default-gpt-3.5-turbo in organization org-Nul9WsPYdsnjttqS3f0hDSWd on requests per min. Limit: 3 / min. Please try again in 20s. Contact us through our help center at help.openai.com if you continue to have issues. Please add a payment method to your account to increase your rate limit. Visit https://platform.openai.com/account/billing to add a payment method..\n",
"Retrying langchain.chat_models.openai.ChatOpenAI.completion_with_retry.<locals>._completion_with_retry in 8.0 seconds as it raised RateLimitError: Rate limit reached for default-gpt-3.5-turbo in organization org-Nul9WsPYdsnjttqS3f0hDSWd on requests per min. Limit: 3 / min. Please try again in 20s. Contact us through our help center at help.openai.com if you continue to have issues. Please add a payment method to your account to increase your rate limit. Visit https://platform.openai.com/account/billing to add a payment method..\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
}
],
"source": [
"predictions = qa.apply(examples)  # Create predictions for all the examples"
]
},
{
"cell_type": "code",
"execution_count": 35,
"id": "6012a3e0",
"metadata": {
"height": 30
},
"outputs": [],
"source": [
"# Evaluate the predictions: import the QA evaluation chain,\n",
"# which is itself created from a language model\n",
"from langchain.evaluation.qa import QAEvalChain"
]
},
{
"cell_type": "code",
"execution_count": 36,
"id": "724b1c0b",
"metadata": {
"height": 47
},
"outputs": [],
"source": [
"# Use ChatGPT as the grader\n",
"llm = ChatOpenAI(temperature=0)\n",
"eval_chain = QAEvalChain.from_llm(llm)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8b46ae55",
"metadata": {
"height": 30
},
"outputs": [],
"source": [
"graded_outputs = eval_chain.evaluate(examples, predictions)  # Call evaluate on the chain to grade the predictions"
]
},
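{
"attachments": {},
"cell_type": "markdown",
"id": "9ad64f72",
"metadata": {},
"source": [
"##### Evaluation approach\n",
"For each example we print the question, the real answer, the predicted answer and the grade. The real answer was generated by the language model when it had the whole document in front of it (or, for the hand-written examples, written by us). The predicted answer is produced by the QA chain: it retrieves with embeddings and the vector database, passes the retrieved documents into the language model, and then tries to guess the answer. The grade is also generated by a language model, when we ask the evaluation chain to judge whether the prediction is correct or incorrect. Looping over all the examples and printing them out gives a detailed view of each one."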
]
},
{
"cell_type": "code",
"execution_count": 24,
"id": "3437cfbe",
"metadata": {
"height": 132
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Example 0:\n",
"Question: Do the Cozy Comfort Pullover Set have side pockets?\n",
"Real Answer: Yes\n",
"Predicted Answer: The Cozy Comfort Pullover Set, Stripe does have side pockets.\n",
"Predicted Grade: CORRECT\n",
"\n",
"Example 1:\n",
"Question: What collection is the Ultra-Lofty 850 Stretch Down Hooded Jacket from?\n",
"Real Answer: The DownTek collection\n",
"Predicted Answer: The Ultra-Lofty 850 Stretch Down Hooded Jacket is from the DownTek collection.\n",
"Predicted Grade: CORRECT\n",
"\n",
"Example 2:\n",
"Question: What is the weight of each pair of Women's Campside Oxfords?\n",
"Real Answer: The approximate weight of each pair of Women's Campside Oxfords is 1 lb. 1 oz.\n",
"Predicted Answer: The weight of each pair of Women's Campside Oxfords is approximately 1 lb. 1 oz.\n",
"Predicted Grade: CORRECT\n",
"\n",
"Example 3:\n",
"Question: What are the dimensions of the small and medium Recycled Waterhog Dog Mat?\n",
"Real Answer: The dimensions of the small Recycled Waterhog Dog Mat are 18\" x 28\" and the dimensions of the medium Recycled Waterhog Dog Mat are 22.5\" x 34.5\".\n",
"Predicted Answer: The small Recycled Waterhog Dog Mat has dimensions of 18\" x 28\" and the medium size has dimensions of 22.5\" x 34.5\".\n",
"Predicted Grade: CORRECT\n",
"\n",
"Example 4:\n",
"Question: What are some features of the Infant and Toddler Girls' Coastal Chill Swimsuit?\n",
"Real Answer: The swimsuit features bright colors, ruffles, and exclusive whimsical prints. It is made of four-way-stretch and chlorine-resistant fabric, ensuring that it keeps its shape and resists snags. The swimsuit is also UPF 50+ rated, providing the highest rated sun protection possible by blocking 98% of the sun's harmful rays. The crossover no-slip straps and fully lined bottom ensure a secure fit and maximum coverage. Finally, it can be machine washed and line dried for best results.\n",
"Predicted Answer: The Infant and Toddler Girls' Coastal Chill Swimsuit is a two-piece swimsuit with bright colors, ruffles, and exclusive whimsical prints. It is made of four-way-stretch and chlorine-resistant fabric that keeps its shape and resists snags. The swimsuit has UPF 50+ rated fabric that provides the highest rated sun protection possible, blocking 98% of the sun's harmful rays. The crossover no-slip straps and fully lined bottom ensure a secure fit and maximum coverage. It is machine washable and should be line dried for best results.\n",
"Predicted Grade: CORRECT\n",
"\n",
"Example 5:\n",
"Question: What is the fabric composition of the Refresh Swimwear V-Neck Tankini Contrasts?\n",
"Real Answer: The body of the Refresh Swimwear V-Neck Tankini Contrasts is made of 82% recycled nylon and 18% Lycra® spandex, while the lining is made of 90% recycled nylon and 10% Lycra® spandex.\n",
"Predicted Answer: The Refresh Swimwear V-Neck Tankini Contrasts is made of 82% recycled nylon with 18% Lycra® spandex for the body and 90% recycled nylon with 10% Lycra® spandex for the lining.\n",
"Predicted Grade: CORRECT\n",
"\n",
"Example 6:\n",
"Question: What is the fabric composition of the EcoFlex 3L Storm Pants?\n",
"Real Answer: The EcoFlex 3L Storm Pants are made of 100% nylon, exclusive of trim.\n",
"Predicted Answer: The fabric composition of the EcoFlex 3L Storm Pants is 100% nylon, exclusive of trim.\n",
"Predicted Grade: CORRECT\n",
"\n"
]
}
],
"source": [
"#我们将传入示例和预测,得到一堆分级输出,循环遍历它们打印答案\n",
"for i, eg in enumerate(examples):\n",
" print(f\"Example {i}:\")\n",
" print(\"Question: \" + predictions[i]['query'])\n",
" print(\"Real Answer: \" + predictions[i]['answer'])\n",
" print(\"Predicted Answer: \" + predictions[i]['result'])\n",
" print(\"Predicted Grade: \" + graded_outputs[i]['text'])\n",
" print()"
]
},
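{
"attachments": {},
"cell_type": "markdown",
"id": "87ecb476",
"metadata": {},
"source": [
"#### Analysis of the results\n",
"Every example looks correct. Take the first one.  \n",
"The question is \"Do the Cozy Comfort Pullover Set have side pockets?\". The real answer, which we wrote ourselves, is \"Yes\". The model's predicted answer is \"The Cozy Comfort Pullover Set, Stripe does have side pockets.\" The two mean the same thing, so the evaluation chain grades the prediction as CORRECT.  \n",
"\n",
"#### Advantages of using a model for evaluation\n",
"\n",
"The answers are arbitrary strings. There is no single ground-truth string that is the best possible answer; there are many variants, and as long as they have the same meaning they should be graded as equivalent. Exact matching, for example with regular expressions, loses this semantic information, and many of the evaluation metrics proposed so far are not good enough. One of the most interesting and popular approaches at the moment is to use a language model to do the evaluation."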
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.16"
}
},
"nbformat": 4,
"nbformat_minor": 5
}