Merge pull request #47 from joyenjoye/edit
Edit for LangChain for LLM Application Development
35
content/LangChain Chat with Data/1.简介 Introduction.md
Normal file
@ -0,0 +1,35 @@
# Chapter 1: Introduction

This course was developed by Harrison Chase (the creator of LangChain) in collaboration with DeepLearning.AI. It explains how to use LangChain to chat with your own data.

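## 1. Background

Large language models (LLMs) such as ChatGPT can answer questions on many topics. However, an LLM's knowledge comes only from its training data. It contains no information about the user (for example, personal data or a company's proprietary data) and nothing about recent events (articles or news published after the model was trained). This limits the answers it can give.

If the model could draw on the information in our own data, in addition to what it learned from its training data, it could give much more useful answers.
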
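## 2. Course Overview

In this course, we learn how to use LangChain to chat with our own data.

LangChain is an open-source framework for building LLM applications, available as both a Python package and a JavaScript package. It is built around modular composition: it has many individual components that can be used together or on their own. LangChain's components include:

- Prompts: how we get the model to do something.
- Models: large language models, chat models, and text embedding models, with integrations for many different models.
- Indexes: ways of ingesting data so that it can be combined with models.
- Chains: end-to-end implementations of a task.
- Agents: use a model as a reasoning engine.

LangChain also comes with many use cases that show how these modular components can be chained together to form end-to-end applications. To learn the basics of LangChain, see the course LangChain for LLM Application Development.

This course focuses on one common LangChain use case: chatting with your own data. We first cover how to use LangChain document loaders to load documents from different data sources. We then learn how to split these documents into semantically meaningful chunks. This step looks simple, but different choices here can have a large effect on the results. Next, we briefly introduce semantic search, the basic information-retrieval method that fetches the information most relevant to a user's question. The method is simple, but it fails in some cases; we analyze those cases and present solutions. Finally, we show how to pass the retrieved documents to the LLM so that it can answer questions about them.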
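
The load, split, retrieve, and answer steps described above can be sketched without any framework. The snippet below is a minimal illustration, not LangChain's API: `split_into_chunks`, `embed`, `retrieve`, and `build_prompt` are hypothetical helper names, the bag-of-words `embed` is a toy stand-in for a real embedding model, and the sample text is made up for the example.

```python
import math
import re
from collections import Counter


def split_into_chunks(text: str, max_words: int = 12) -> list[str]:
    """Greedily pack whole sentences into chunks of at most max_words words."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    chunks, current = [], []
    for sentence in sentences:
        words = sentence.split()
        # Start a new chunk when adding this sentence would exceed the limit.
        if current and len(current) + len(words) > max_words:
            chunks.append(" ".join(current))
            current = []
        current.extend(words)
    if current:
        chunks.append(" ".join(current))
    return chunks


def embed(text: str) -> Counter:
    """Toy embedding: a bag-of-words count vector (assumption, not a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question."""
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


def build_prompt(question: str, context_chunks: list[str]) -> str:
    """Assemble the prompt that would be sent to an LLM."""
    context = "\n".join(f"- {c}" for c in context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Made-up sample document, used only for illustration.
document = (
    "Employees get 25 vacation days per year. "
    "The office is closed on national holidays. "
    "Travel costs are reimbursed monthly. "
    "Sick leave must be reported before 9 am."
)
question = "How many vacation days do employees get?"
chunks = split_into_chunks(document, max_words=8)
top = retrieve(question, chunks, k=1)
prompt = build_prompt(question, top)
print(prompt)
```

In a real application the toy `embed` would be replaced by an embedding model and the sorted list by a vector store, but the flow stays the same: split, embed, compare, and pass the best matches to the model.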

## 3. Acknowledgments

Finally, special thanks to those who contributed to this course:

- Ankush Gola (LangChain)
- Lance Martin (LangChain)
- Geoff Ladwig (DeepLearning.AI)
- Diala Ezzedine (DeepLearning.AI)
819
content/LangChain Chat with Data/2.文档加载 Document Loading.ipynb
Normal file
File diff suppressed because one or more lines are too long
@ -0,0 +1,119 @@
|
||||
# Blendle's Employee Handbook
|
||||
|
||||
This is a living document with everything we've learned working with people while running a startup. And, of course, we continue to learn. Therefore it's a document that will continue to change.
|
||||
|
||||
**Everything related to working at Blendle and the people of Blendle, made public.**
|
||||
|
||||
These are the lessons from three years of working with the people of Blendle. It contains everything from [how our leaders lead](https://www.notion.so/ecfb7e647136468a9a0a32f1771a8f52?pvs=21) to [how we increase salaries](https://www.notion.so/Salary-Review-e11b6161c6d34f5c9568bb3e83ed96b6?pvs=21), from [how we hire](https://www.notion.so/Hiring-451bbcfe8d9b49438c0633326bb7af0a?pvs=21) and [fire](https://www.notion.so/Firing-5567687a2000496b8412e53cd58eed9d?pvs=21) to [how we think people should give each other feedback](https://www.notion.so/Our-Feedback-Process-eb64f1de796b4350aeab3bc068e3801f?pvs=21) — and much more.
|
||||
|
||||
We've made this document public because we want to learn from you. We're very much interested in your feedback (including weeding out typo's and Dunglish ;)). Email us at hr@blendle.com. If you're starting your own company or if you're curious as to how we do things at Blendle, we hope that our employee handbook inspires you.
|
||||
|
||||
If you want to work at Blendle you can check our [job ads here](https://blendle.homerun.co/). If you want to be kept in the loop about Blendle, you can sign up for [our behind the scenes newsletter](https://blendle.homerun.co/yes-keep-me-posted/tr/apply?token=8092d4128c306003d97dd3821bad06f2).
|
||||
|
||||
## Blendle general
|
||||
|
||||
*Information gap closing in 3... 2... 1...*
|
||||
|
||||
---
|
||||
|
||||
[To Do/Read in your first week](https://www.notion.so/To-Do-Read-in-your-first-week-9ef69b65b63a4ec7b8394ec703856c32?pvs=21)
|
||||
|
||||
[History](https://www.notion.so/History-29b2b8fd36dd48db80dc682119aaefef?pvs=21)
|
||||
|
||||
[DNA & culture](https://www.notion.so/DNA-culture-7723839e26124ed2ba3adafe8de0a080?pvs=21)
|
||||
|
||||
[General & practical ](https://www.notion.so/General-practical-87085be150824011b79891eb30ca9530?pvs=21)
|
||||
|
||||
## People operations
|
||||
|
||||
*You can tell a company's DNA by looking at how they deal with the practical stuff.*
|
||||
|
||||
---
|
||||
|
||||
[Office](https://www.notion.so/Office-b014d3d2c62240308865d11bba495322?pvs=21)
|
||||
|
||||
[Time off: holidays and national holidays](https://www.notion.so/Time-off-holidays-and-national-holidays-bd94b931280a45a6b8eb3f29c2c4b42a?pvs=21)
|
||||
|
||||
[Calling in sick/better](https://www.notion.so/Calling-in-sick-better-b82ec184fd544a8e9aa926ac37bb1ab1?pvs=21)
|
||||
|
||||
[Perks and benefits](https://www.notion.so/Perks-and-benefits-820593b38ebc44209fe35ae553100de6?pvs=21)
|
||||
|
||||
[Travel costs and reimbursements](https://www.notion.so/Travel-costs-and-reimbursements-e76623c6e0664863a769aeed028954e2?pvs=21)
|
||||
|
||||
[Parenthood](https://www.notion.so/Parenthood-a6d62b65a9d84489a75586a3c542b3f1?pvs=21)
|
||||
|
||||
## People topics
|
||||
|
||||
*Themes we care about.*
|
||||
|
||||
---
|
||||
|
||||
[Blendle Social Code](https://www.notion.so/Blendle-Social-Code-685a79c8df154ee09f35b35cc147af6b?pvs=21)
|
||||
|
||||
[Diversity and inclusion](https://www.notion.so/Diversity-and-inclusion-d7f9d3e6b6ef4a1ab8f2c0a7b3ea3eec?pvs=21)
|
||||
|
||||
[#letstalkaboutstress](https://www.notion.so/letstalkaboutstress-d46961f6ac98432ab07b5d5afc52c2d0?pvs=21)
|
||||
|
||||
## Feedback and development
|
||||
|
||||
*The number 1 reason for people to work at Blendle is growth and learning from smart people.*
|
||||
|
||||
---
|
||||
|
||||
[Your 1st month ](https://www.notion.so/Your-1st-month-85909edc55a34f349bbed522c5245a65?pvs=21)
|
||||
|
||||
[Goals](https://www.notion.so/Goals-122bff69bd634c519cd3c6dc01dbc282?pvs=21)
|
||||
|
||||
[Feedback cycle](https://www.notion.so/Feedback-cycle-5f32358dba874c39be5ca5aa464c310e?pvs=21)
|
||||
|
||||
[The Matrix™ (job profiles)](https://www.notion.so/The-Matrix-job-profiles-da91736ff35545458559eceb0075ed66?pvs=21)
|
||||
|
||||
[Blendle library](https://www.notion.so/Blendle-library-f34188e536234c9a8976c9d4602b0be3?pvs=21)
|
||||
|
||||
## **Hiring**
|
||||
|
||||
*The coolest and most impactful thing when done right.*
|
||||
|
||||
---
|
||||
|
||||
[Rating systems](https://www.notion.so/Rating-systems-2ba332377459427194acc798e5f8869c?pvs=21)
|
||||
|
||||
[Getting people in (branding&sourcing)](https://www.notion.so/Getting-people-in-branding-sourcing-a3277fef078041a881f56556e24f0d8a?pvs=21)
|
||||
|
||||
[Highly Skilled Migrants and relocation](https://www.notion.so/Highly-Skilled-Migrants-and-relocation-84a6576fb27d4a8fae2f73e4eae57d21?pvs=21)
|
||||
|
||||
## How to lead at Blendle
|
||||
|
||||
*Here are some tips and tools to help you become a great leader.*
|
||||
|
||||
---
|
||||
|
||||
[How to lead at Blendle ](https://www.notion.so/How-to-lead-at-Blendle-f8c6b1d989d841bb87510fc2ab1ba970?pvs=21)
|
||||
|
||||
[Your check-list](https://www.notion.so/Your-check-list-aaca857a846848688da3a37f28682c15?pvs=21)
|
||||
|
||||
[Leading Feedback ](https://www.notion.so/Leading-Feedback-a1970c9f7b70443d881ca92d4e98be25?pvs=21)
|
||||
|
||||
[Salary talks](https://www.notion.so/Salary-talks-35681ab732c048a9bbdf8c50babe64b5?pvs=21)
|
||||
|
||||
[Hiring ](https://www.notion.so/Hiring-0bdf54d3d25f4c59bfdf3712a5104bbc?pvs=21)
|
||||
|
||||
[Firing](https://www.notion.so/Firing-e0da1de62b304751bbd95a681908c7ad?pvs=21)
|
||||
|
||||
[Party and study budget](https://www.notion.so/Party-and-study-budget-4e31001531c24d0fa447bbfcd6ccfd3f?pvs=21)
|
||||
|
||||
[Holidays](https://www.notion.so/Holidays-1529506bb8884f0aa11cc799ced11ed0?pvs=21)
|
||||
|
||||
[Sickness absence](https://www.notion.so/Sickness-absence-79a495f601df4004801475ea79b3d198?pvs=21)
|
||||
|
||||
[Personal User Guide](https://www.notion.so/Personal-User-Guide-be2238ccb597412e8a517d40cda7e7d5?pvs=21)
|
||||
|
||||
[Soft shizzle](https://www.notion.so/Soft-shizzle-41255d79fbe84492b153121cd7a2e3e8?pvs=21)
|
||||
|
||||
## About this document
|
||||
|
||||
---
|
||||
|
||||
*Lessons from three years of HR*
|
||||
|
||||
[About this document and the author](https://www.notion.so/About-this-document-and-the-author-ee1faab1bcae4456b8c62043a8a194cd?pvs=21)
|
||||
Binary file not shown.
@ -0,0 +1,32 @@
# Chapter 1: Introduction

Welcome to the short course on LangChain for LLM application development 👏🏻👏🏻

This course was developed by Harrison Chase (the creator of LangChain) in collaboration with DeepLearning.AI to teach you how to use this remarkable tool.

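## 1. The Origin and Growth of LangChain

By prompting an LLM (large language model), we can now build AI applications faster than ever before. However, a single application may need to prompt the model many times and parse its output each time.

That takes a lot of glue code. To meet this need, Harrison Chase created LangChain to make development smoother.

The LangChain open-source community is growing fast: it already has hundreds of contributors and ships new code and features at a remarkable pace.
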
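## 2. Course Overview

LangChain is an open-source framework for building LLM applications, available as both a Python package and a JavaScript package. It is built around modular composition: it has many individual components that can be used together or on their own. LangChain also comes with many use cases that show how these modular components can be chained together to form end-to-end applications.

In this course, we cover LangChain's common components. Specifically, we discuss the following:

- Models
- Prompts: how we get the model to do something
- Indexes: ways of ingesting data so that it can be combined with models
- Chains: end-to-end implementations of a task
- Agents: use a model as a reasoning engine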
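
To make the idea of modular components concrete, here is a minimal sketch of a prompt and a model composed into a chain. It does not use LangChain itself: `ToyPrompt`, `ToyChain`, and `fake_model` are hypothetical names invented for this illustration, and `fake_model` only upper-cases its input instead of calling a real LLM.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ToyPrompt:
    """A prompt template with named placeholders (hypothetical helper)."""
    template: str

    def format(self, **kwargs: str) -> str:
        return self.template.format(**kwargs)


def fake_model(prompt: str) -> str:
    """Stand-in for an LLM call: returns the prompt in upper case."""
    return prompt.upper()


@dataclass
class ToyChain:
    """Chains a prompt and a model: format the prompt, then call the model."""
    prompt: ToyPrompt
    model: Callable[[str], str]

    def run(self, **kwargs: str) -> str:
        return self.model(self.prompt.format(**kwargs))


chain = ToyChain(ToyPrompt("Translate to French: {text}"), fake_model)
result = chain.run(text="hello")
print(result)
```

Because the chain only depends on "something that formats a prompt" and "something that maps text to text", either part can be swapped out independently, which is the point of the modular design described above.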

## 3. Acknowledgments

Finally, special thanks to Ankush Gola (co-author of LangChain), Geoff Ladwig, Eddy Shyu, and Diala Ezzedine, who also contributed a great deal to this course.
@ -1,87 +0,0 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "cfab521b-77fa-41be-a964-1f50f2ef4689",
|
||||
"metadata": {},
|
||||
"source": [
"# 1. Introduction\n",
"<div class=\"toc\"><ul class=\"toc-item\"><li><span><a href=\"#1.1-LangChain的诞生和发展\" data-toc-modified-id=\"1.1-LangChain的诞生和发展-1\">1.1 The Origin and Growth of LangChain</a></span></li><li><span><a href=\"#1.2-课程基本内容\" data-toc-modified-id=\"1.2-课程基本内容-2\">1.2 Course Overview</a></span></li><li><span><a href=\"#1.3-致谢课程重要贡献者\" data-toc-modified-id=\"1.3-致谢课程重要贡献者-3\">1.3 Acknowledgments</a></span></li></ul></div>\n",
"\n",
"Welcome to the short course on LangChain for LLM application development 👏🏻👏🏻\n",
"\n",
"This course was developed by Harrison Chase (the creator of LangChain) in collaboration with DeepLearning.AI to teach you how to use this remarkable tool.\n",
"\n",
"\n",
"\n",
"## 1.1 The Origin and Growth of LangChain\n",
"\n",
"By prompting an LLM (large language model), we can now build AI applications faster than ever before. However, a single application may need to prompt the model many times and parse its output each time.\n",
"\n",
"That takes a lot of glue code. To meet this need, Harrison Chase created LangChain to make development smoother.\n",
"\n",
"The LangChain open-source community is growing fast: it already has hundreds of contributors and ships new code and features at a remarkable pace.\n",
"\n",
"\n",
"## 1.2 Course Overview\n",
"\n",
"LangChain is an open-source framework for building LLM applications, available as both a Python package and a JavaScript package. It is built around modular composition: it has many individual components that can be used together or on their own. LangChain also comes with many use cases that show how these modular components can be chained together to form end-to-end applications.\n",
"\n",
"In this course, we cover LangChain's common components. Specifically, we discuss the following:\n",
"- Models\n",
"- Prompts: how we get the model to do something\n",
"- Indexes: ways of ingesting data so that it can be combined with models\n",
"- Chains: end-to-end implementations of a task\n",
"- Agents: use a model as a reasoning engine\n",
"\n",
" \n",
"\n",
"## 1.3 Acknowledgments\n",
"\n",
"Finally, special thanks to Ankush Gola (co-author of LangChain), Geoff Ladwig, Eddy Shyu, and Diala Ezzedine, who also contributed a great deal to this course. "
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "e3618ca8",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": []
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3 (ipykernel)",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.9.12"
|
||||
},
|
||||
"toc": {
|
||||
"base_numbering": 1,
|
||||
"nav_menu": {},
|
||||
"number_sections": false,
|
||||
"sideBar": true,
|
||||
"skip_h1_title": false,
|
||||
"title_cell": "Table of Contents",
|
||||
"title_sidebar": "Contents",
|
||||
"toc_cell": false,
|
||||
"toc_position": {},
|
||||
"toc_section_display": true,
|
||||
"toc_window_display": true
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 5
|
||||
}
|
||||
File diff suppressed because one or more lines are too long
@ -1,848 +0,0 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "f200ba9a",
|
||||
"metadata": {},
|
||||
"source": [
"# 5 Question Answering over Documents\n",
"<div class=\"toc\"><ul class=\"toc-item\"><li><span><a href=\"#5.1-导入embedding模型和向量存储组件\" data-toc-modified-id=\"5.1-导入embedding模型和向量存储组件-1\">5.1 Importing the Embedding Model and Vector Store Components</a></span><ul class=\"toc-item\"><li><span><a href=\"#5.1.2-创建向量存储\" data-toc-modified-id=\"5.1.2-创建向量存储-1.1\">5.1.2 Creating the Vector Store</a></span></li><li><span><a href=\"#5.1.3-使用语言模型与文档结合使用\" data-toc-modified-id=\"5.1.3-使用语言模型与文档结合使用-1.2\">5.1.3 Combining a Language Model with Documents</a></span></li></ul></li><li><span><a href=\"#5.2-如何回答我们文档的相关问题\" data-toc-modified-id=\"5.2-如何回答我们文档的相关问题-2\">5.2 Answering Questions about Our Documents</a></span><ul class=\"toc-item\"><li><span><a href=\"#5.2.1-不同类型的chain链\" data-toc-modified-id=\"5.2.1-不同类型的chain链-2.1\">5.2.1 Different Types of Chains</a></span></li></ul></li></ul></div>"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "52824b89-532a-4e54-87e9-1410813cd39e",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"\n",
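"This chapter uses LangChain to build a vector database for answering questions over, or about, documents. Given text extracted from PDF files, web pages, or a company's internal document collection, we use an LLM to answer questions about the content of those documents."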
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "4aac484b",
|
||||
"metadata": {
|
||||
"height": 30
|
||||
},
|
||||
"source": [
|
||||
"\n",
|
||||
"\n",
"Install LangChain and set the OPENAI_API_KEY for ChatGPT\n",
"\n",
"* Install langchain\n",
|
||||
"\n",
|
||||
"```\n",
|
||||
"pip install langchain\n",
|
||||
"```\n",
"* Install docarray\n",
|
||||
"\n",
|
||||
"```\n",
|
||||
"pip install docarray\n",
|
||||
"```\n",
"* Set the API key environment variable\n",
|
||||
"\n",
|
||||
"```\n",
|
||||
"export OPENAI_API_KEY='api-key'\n",
|
||||
"\n",
|
||||
"```"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 2,
|
||||
"id": "b7ed03ed-1322-49e3-b2a2-33e94fb592ef",
|
||||
"metadata": {
|
||||
"height": 81,
|
||||
"tags": []
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"import os\n",
|
||||
"\n",
|
||||
"from dotenv import load_dotenv, find_dotenv\n",
"_ = load_dotenv(find_dotenv())  # load environment variables from the .env file"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 52,
|
||||
"id": "af8c3c96",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"'\\n\\n人工智能是一项极具前景的技术,它的发展正在改变人类的生活方式,带来了无数的便利,也被认为是未来发展的重要标志。人工智能的发展让许多复杂的任务变得更加容易,更高效的完成,节省了大量的时间和精力,为人类发展带来了极大的帮助。'"
|
||||
]
|
||||
},
|
||||
"execution_count": 52,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"from langchain.llms import OpenAI\n",
|
||||
"\n",
|
||||
"llm = OpenAI(model_name=\"text-davinci-003\",max_tokens=1024)\n",
|
||||
"llm(\"怎么评价人工智能\")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "8cb7a7ec",
|
||||
"metadata": {
|
||||
"height": 30
|
||||
},
|
||||
"source": [
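"## 5.1 Importing the Embedding Model and Vector Store Components\n",
"We use the DocArray in-memory search vector store. Because it is an in-memory vector store, it does not need a connection to an external database."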
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 3,
|
||||
"id": "974acf8e-8f88-42de-88f8-40a82cb58e8b",
|
||||
"metadata": {
|
||||
"height": 98
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
"from langchain.chains import RetrievalQA  # retrieval QA chain: runs retrieval over documents\n",
"from langchain.chat_models import ChatOpenAI  # OpenAI chat model\n",
"from langchain.document_loaders import CSVLoader  # document loader for CSV files\n",
"from langchain.vectorstores import DocArrayInMemorySearch  # in-memory vector store\n",
"from IPython.display import display, Markdown  # utilities for displaying output in Jupyter"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 4,
|
||||
"id": "7249846e",
|
||||
"metadata": {
|
||||
"height": 75
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
"# Read the file\n",
|
||||
"file = 'OutdoorClothingCatalog_1000.csv'\n",
|
||||
"loader = CSVLoader(file_path=file)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 24,
|
||||
"id": "7724f00e",
|
||||
"metadata": {
|
||||
"height": 30
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/html": [
|
||||
"<div>\n",
|
||||
"<style scoped>\n",
|
||||
" .dataframe tbody tr th:only-of-type {\n",
|
||||
" vertical-align: middle;\n",
|
||||
" }\n",
|
||||
"\n",
|
||||
" .dataframe tbody tr th {\n",
|
||||
" vertical-align: top;\n",
|
||||
" }\n",
|
||||
"\n",
|
||||
" .dataframe thead th {\n",
|
||||
" text-align: right;\n",
|
||||
" }\n",
|
||||
"</style>\n",
|
||||
"<table border=\"1\" class=\"dataframe\">\n",
|
||||
" <thead>\n",
|
||||
" <tr style=\"text-align: right;\">\n",
|
||||
" <th></th>\n",
|
||||
" <th>0</th>\n",
|
||||
" <th>1</th>\n",
|
||||
" <th>2</th>\n",
|
||||
" </tr>\n",
|
||||
" </thead>\n",
|
||||
" <tbody>\n",
|
||||
" <tr>\n",
|
||||
" <th>0</th>\n",
|
||||
" <td>NaN</td>\n",
|
||||
" <td>name</td>\n",
|
||||
" <td>description</td>\n",
|
||||
" </tr>\n",
|
||||
" <tr>\n",
|
||||
" <th>1</th>\n",
|
||||
" <td>0.0</td>\n",
|
||||
" <td>Women's Campside Oxfords</td>\n",
|
||||
" <td>This ultracomfortable lace-to-toe Oxford boast...</td>\n",
|
||||
" </tr>\n",
|
||||
" <tr>\n",
|
||||
" <th>2</th>\n",
|
||||
" <td>1.0</td>\n",
|
||||
" <td>Recycled Waterhog Dog Mat, Chevron Weave</td>\n",
|
||||
" <td>Protect your floors from spills and splashing ...</td>\n",
|
||||
" </tr>\n",
|
||||
" <tr>\n",
|
||||
" <th>3</th>\n",
|
||||
" <td>2.0</td>\n",
|
||||
" <td>Infant and Toddler Girls' Coastal Chill Swimsu...</td>\n",
|
||||
" <td>She'll love the bright colors, ruffles and exc...</td>\n",
|
||||
" </tr>\n",
|
||||
" <tr>\n",
|
||||
" <th>4</th>\n",
|
||||
" <td>3.0</td>\n",
|
||||
" <td>Refresh Swimwear, V-Neck Tankini Contrasts</td>\n",
|
||||
" <td>Whether you're going for a swim or heading out...</td>\n",
|
||||
" </tr>\n",
|
||||
" <tr>\n",
|
||||
" <th>...</th>\n",
|
||||
" <td>...</td>\n",
|
||||
" <td>...</td>\n",
|
||||
" <td>...</td>\n",
|
||||
" </tr>\n",
|
||||
" <tr>\n",
|
||||
" <th>996</th>\n",
|
||||
" <td>995.0</td>\n",
|
||||
" <td>Men's Classic Denim, Standard Fit</td>\n",
|
||||
" <td>Crafted from premium denim that will last wash...</td>\n",
|
||||
" </tr>\n",
|
||||
" <tr>\n",
|
||||
" <th>997</th>\n",
|
||||
" <td>996.0</td>\n",
|
||||
" <td>CozyPrint Sweater Fleece Pullover</td>\n",
|
||||
" <td>The ultimate sweater fleece - made from superi...</td>\n",
|
||||
" </tr>\n",
|
||||
" <tr>\n",
|
||||
" <th>998</th>\n",
|
||||
" <td>997.0</td>\n",
|
||||
" <td>Women's NRS Endurance Spray Paddling Pants</td>\n",
|
||||
" <td>These comfortable and affordable splash paddli...</td>\n",
|
||||
" </tr>\n",
|
||||
" <tr>\n",
|
||||
" <th>999</th>\n",
|
||||
" <td>998.0</td>\n",
|
||||
" <td>Women's Stop Flies Hoodie</td>\n",
|
||||
" <td>This great-looking hoodie uses No Fly Zone Tec...</td>\n",
|
||||
" </tr>\n",
|
||||
" <tr>\n",
|
||||
" <th>1000</th>\n",
|
||||
" <td>999.0</td>\n",
|
||||
" <td>Modern Utility Bag</td>\n",
|
||||
" <td>This US-made crossbody bag is built with the s...</td>\n",
|
||||
" </tr>\n",
|
||||
" </tbody>\n",
|
||||
"</table>\n",
|
||||
"<p>1001 rows × 3 columns</p>\n",
|
||||
"</div>"
|
||||
],
|
||||
"text/plain": [
|
||||
" 0 1 \n",
|
||||
"0 NaN name \\\n",
|
||||
"1 0.0 Women's Campside Oxfords \n",
|
||||
"2 1.0 Recycled Waterhog Dog Mat, Chevron Weave \n",
|
||||
"3 2.0 Infant and Toddler Girls' Coastal Chill Swimsu... \n",
|
||||
"4 3.0 Refresh Swimwear, V-Neck Tankini Contrasts \n",
|
||||
"... ... ... \n",
|
||||
"996 995.0 Men's Classic Denim, Standard Fit \n",
|
||||
"997 996.0 CozyPrint Sweater Fleece Pullover \n",
|
||||
"998 997.0 Women's NRS Endurance Spray Paddling Pants \n",
|
||||
"999 998.0 Women's Stop Flies Hoodie \n",
|
||||
"1000 999.0 Modern Utility Bag \n",
|
||||
"\n",
|
||||
" 2 \n",
|
||||
"0 description \n",
|
||||
"1 This ultracomfortable lace-to-toe Oxford boast... \n",
|
||||
"2 Protect your floors from spills and splashing ... \n",
|
||||
"3 She'll love the bright colors, ruffles and exc... \n",
|
||||
"4 Whether you're going for a swim or heading out... \n",
|
||||
"... ... \n",
|
||||
"996 Crafted from premium denim that will last wash... \n",
|
||||
"997 The ultimate sweater fleece - made from superi... \n",
|
||||
"998 These comfortable and affordable splash paddli... \n",
|
||||
"999 This great-looking hoodie uses No Fly Zone Tec... \n",
|
||||
"1000 This US-made crossbody bag is built with the s... \n",
|
||||
"\n",
|
||||
"[1001 rows x 3 columns]"
|
||||
]
|
||||
},
|
||||
"execution_count": 24,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
"# Inspect the data\n",
|
||||
"import pandas as pd\n",
|
||||
"data = pd.read_csv(file,header=None)\n",
|
||||
"data"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "3bd6422c",
|
||||
"metadata": {},
|
||||
"source": [
"We are given a CSV file of outdoor clothing, which we will use together with a language model."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "2963fc63",
|
||||
"metadata": {},
|
||||
"source": [
"### 5.1.2 Creating the Vector Store\n",
"We import an index, namely the vector store index creator."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 25,
|
||||
"id": "5bfaba30",
|
||||
"metadata": {
|
||||
"height": 30
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
"from langchain.indexes import VectorstoreIndexCreator  # import the vector store index creator"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "9e200726",
|
||||
"metadata": {
|
||||
"height": 64
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"'''\n",
"Specify the vector store class; once the creator is built, call from_loaders with a list of document loaders.\n",
|
||||
"'''\n",
|
||||
"\n",
|
||||
"index = VectorstoreIndexCreator(\n",
|
||||
" vectorstore_cls=DocArrayInMemorySearch\n",
|
||||
").from_loaders([loader])"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 9,
|
||||
"id": "34562d81",
|
||||
"metadata": {
|
||||
"height": 47
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"query =\"Please list all your shirts with sun protection \\\n",
|
||||
"in a table in markdown and summarize each one.\""
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 21,
|
||||
"id": "cfd0cc37",
|
||||
"metadata": {
|
||||
"height": 30
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
"response = index.query(query)  # query the index with the question to produce a response"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 23,
|
||||
"id": "ae21f1ff",
|
||||
"metadata": {
|
||||
"height": 30,
|
||||
"scrolled": true
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/markdown": [
|
||||
"\n",
|
||||
"\n",
|
||||
"| Name | Description |\n",
|
||||
"| --- | --- |\n",
|
||||
"| Men's Tropical Plaid Short-Sleeve Shirt | UPF 50+ rated, 100% polyester, wrinkle-resistant, front and back cape venting, two front bellows pockets |\n",
|
||||
"| Men's Plaid Tropic Shirt, Short-Sleeve | UPF 50+ rated, 52% polyester and 48% nylon, machine washable and dryable, front and back cape venting, two front bellows pockets |\n",
|
||||
"| Men's TropicVibe Shirt, Short-Sleeve | UPF 50+ rated, 71% Nylon, 29% Polyester, 100% Polyester knit mesh, machine wash and dry, front and back cape venting, two front bellows pockets |\n",
|
||||
"| Sun Shield Shirt by | UPF 50+ rated, 78% nylon, 22% Lycra Xtra Life fiber, handwash, line dry, wicks moisture, fits comfortably over swimsuit, abrasion resistant |\n",
|
||||
"\n",
|
||||
"All four shirts provide UPF 50+ sun protection, blocking 98% of the sun's harmful rays. The Men's Tropical Plaid Short-Sleeve Shirt is made of 100% polyester and is wrinkle-resistant"
|
||||
],
|
||||
"text/plain": [
|
||||
"<IPython.core.display.Markdown object>"
|
||||
]
|
||||
},
|
||||
"metadata": {},
|
||||
"output_type": "display_data"
|
||||
}
|
||||
],
|
||||
"source": [
"display(Markdown(response))  # display the response returned by the query"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "eb74cc79",
|
||||
"metadata": {},
|
||||
"source": [
"We get back a Markdown table with the name and description of every shirt with sun protection, along with a nice short summary from the language model."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "dd34e50e",
|
||||
"metadata": {},
|
||||
"source": [
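"### 5.1.3 Combining a Language Model with Documents\n",
"We want to use a language model together with many of our documents, but a language model can only examine a few thousand words at a time. If we have very large documents, how can the model answer questions about everything in them? The answer is embeddings and vector stores.\n",
"* Embedding \n",
"An embedding is a numerical representation of the meaning of a piece of text. Pieces of text with similar content have similar vectors, which lets us compare pieces of text in vector space.\n",
"* Vector database \n",
"A vector database is a way to store the vector representations created in the previous step. We build it by populating it with chunks of text from the incoming documents.\n",
"When we get a large incoming document, we first split it into smaller chunks, because we may not be able to pass the whole document to the language model. We then embed each chunk and store it in the vector database. This is what creating an index means.\n",
"\n",
"At run time, we use the index to find the pieces of text most relevant to an incoming query: we embed the query, compare it with all the vectors in the vector database, and pick the n most similar. These are then passed to the language model to get the final answer."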
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 26,
|
||||
"id": "631396c6",
|
||||
"metadata": {
|
||||
"height": 30
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
"# Create a document loader for the CSV file and load the documents\n",
|
||||
"loader = CSVLoader(file_path=file)\n",
|
||||
"docs = loader.load()"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 27,
|
||||
"id": "4a977f44",
|
||||
"metadata": {
|
||||
"height": 30
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"Document(page_content=\": 0\\nname: Women's Campside Oxfords\\ndescription: This ultracomfortable lace-to-toe Oxford boasts a super-soft canvas, thick cushioning, and quality construction for a broken-in feel from the first time you put them on. \\n\\nSize & Fit: Order regular shoe size. For half sizes not offered, order up to next whole size. \\n\\nSpecs: Approx. weight: 1 lb.1 oz. per pair. \\n\\nConstruction: Soft canvas material for a broken-in feel and look. Comfortable EVA innersole with Cleansport NXT® antimicrobial odor control. Vintage hunt, fish and camping motif on innersole. Moderate arch contour of innersole. EVA foam midsole for cushioning and support. Chain-tread-inspired molded rubber outsole with modified chain-tread pattern. Imported. \\n\\nQuestions? Please contact us for any inquiries.\", metadata={'source': 'OutdoorClothingCatalog_1000.csv', 'row': 0})"
|
||||
]
|
||||
},
|
||||
"execution_count": 27,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
"docs[0]  # inspect a single document: each document corresponds to one row of the CSV"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 31,
|
||||
"id": "e875693a",
|
||||
"metadata": {
|
||||
"height": 47
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"'''\n",
"Because these documents are already very small, we do not need to do any chunking here and can embed them directly.\n",
"'''\n",
"\n",
"from langchain.embeddings import OpenAIEmbeddings  # to create the embeddings we use OpenAI's embedding class\n",
"embeddings = OpenAIEmbeddings()  # initialize"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 32,
|
||||
"id": "779bec75",
|
||||
"metadata": {
|
||||
"height": 30
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
"embed = embeddings.embed_query(\"Hi my name is Harrison\")  # use embed_query on the embeddings object to embed a specific piece of text"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 33,
|
||||
"id": "699aaaf9",
|
||||
"metadata": {
|
||||
"height": 30
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"1536\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
"print(len(embed))  # the embedding has more than a thousand elements"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 34,
|
||||
"id": "9d00d346",
|
||||
"metadata": {
|
||||
"height": 30
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"[-0.021933607757091522, 0.006697045173496008, -0.01819835603237152, -0.039113257080316544, -0.014060650952160358]\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
"print(embed[:5])  # each element is a number; together they form the overall numerical representation of the text"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 35,
|
||||
"id": "27ad0bb0",
|
||||
"metadata": {
|
||||
"height": 81
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"'''\n",
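"Create embeddings for all the text we just loaded and store them in a vector store, using the vector store's from_documents method.\n",
"The method takes a list of documents and an embedding object, and builds the overall vector store.\n",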
|
||||
"'''\n",
|
||||
"db = DocArrayInMemorySearch.from_documents(\n",
|
||||
" docs, \n",
|
||||
" embeddings\n",
|
||||
")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 36,
|
||||
"id": "0329bfd5",
|
||||
"metadata": {
|
||||
"height": 30
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"query = \"Please suggest a shirt with sunblocking\""
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 37,
|
||||
"id": "7909c6b7",
|
||||
"metadata": {
|
||||
"height": 30
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
"docs = db.similarity_search(query)  # find text similar to the incoming query; similarity_search returns a list of documents"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 38,
|
||||
"id": "43321853",
|
||||
"metadata": {
|
||||
"height": 30
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"4"
|
||||
]
|
||||
},
|
||||
"execution_count": 38,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
"len(docs)  # it returned four documents"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 39,
|
||||
"id": "6eba90b5",
|
||||
"metadata": {
|
||||
"height": 30
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"Document(page_content=': 255\\nname: Sun Shield Shirt by\\ndescription: \"Block the sun, not the fun – our high-performance sun shirt is guaranteed to protect from harmful UV rays. \\n\\nSize & Fit: Slightly Fitted: Softly shapes the body. Falls at hip.\\n\\nFabric & Care: 78% nylon, 22% Lycra Xtra Life fiber. UPF 50+ rated – the highest rated sun protection possible. Handwash, line dry.\\n\\nAdditional Features: Wicks moisture for quick-drying comfort. Fits comfortably over your favorite swimsuit. Abrasion resistant for season after season of wear. Imported.\\n\\nSun Protection That Won\\'t Wear Off\\nOur high-performance fabric provides SPF 50+ sun protection, blocking 98% of the sun\\'s harmful rays. This fabric is recommended by The Skin Cancer Foundation as an effective UV protectant.', metadata={'source': 'OutdoorClothingCatalog_1000.csv', 'row': 255})"
|
||||
]
|
||||
},
|
||||
"execution_count": 39,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
"docs[0]  # the first document is indeed a shirt with sun protection"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "fe41b36f",
|
||||
"metadata": {},
|
||||
"source": [
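"## 5.2 Answering Questions about Our Documents\n",
"First, we create a retriever from this vector store. A retriever is a generic interface that can be backed by any method that takes a query and returns documents. Next, because we want to generate text and return a natural-language response, we also need a language model.\n"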
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 40,
|
||||
"id": "c0c3596e",
|
||||
"metadata": {
|
||||
"height": 30
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
"retriever = db.as_retriever()  # create a retriever, a generic interface over the vector store"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 55,
|
||||
"id": "0625f5e8",
|
||||
"metadata": {
|
||||
"height": 47
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"llm = ChatOpenAI(temperature = 0.0,max_tokens=1024) #导入语言模型\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 43,
|
||||
"id": "a573f58a",
|
||||
"metadata": {
|
||||
"height": 47
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"qdocs = \"\".join([docs[i].page_content for i in range(len(docs))]) # 将合并文档中的所有页面内容到一个变量中\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "14682d95",
|
||||
"metadata": {
|
||||
"height": 64
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"response = llm.call_as_llm(f\"{qdocs} Question: Please list all your \\\n",
|
||||
"shirts with sun protection in a table in markdown and summarize each one.\") #列出所有具有防晒功能的衬衫并在Markdown表格中总结每个衬衫的语言模型\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 28,
|
||||
"id": "8bba545b",
|
||||
"metadata": {
|
||||
"height": 30
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/markdown": [
|
||||
"| Name | Description |\n",
|
||||
"| --- | --- |\n",
|
||||
"| Sun Shield Shirt | High-performance sun shirt with UPF 50+ sun protection, moisture-wicking, and abrasion-resistant fabric. Recommended by The Skin Cancer Foundation. |\n",
|
||||
"| Men's Plaid Tropic Shirt | Ultracomfortable shirt with UPF 50+ sun protection, wrinkle-free fabric, and front/back cape venting. Made with 52% polyester and 48% nylon. |\n",
|
||||
"| Men's TropicVibe Shirt | Men's sun-protection shirt with built-in UPF 50+ and front/back cape venting. Made with 71% nylon and 29% polyester. |\n",
|
||||
"| Men's Tropical Plaid Short-Sleeve Shirt | Lightest hot-weather shirt with UPF 50+ sun protection, front/back cape venting, and two front bellows pockets. Made with 100% polyester and is wrinkle-resistant. |\n",
|
||||
"\n",
|
||||
"All of these shirts provide UPF 50+ sun protection, blocking 98% of the sun's harmful rays. They are made with high-performance fabrics that are moisture-wicking, wrinkle-resistant, and abrasion-resistant. The Men's Plaid Tropic Shirt and Men's Tropical Plaid Short-Sleeve Shirt both have front/back cape venting for added breathability. The Sun Shield Shirt is recommended by The Skin Cancer Foundation as an effective UV protectant."
|
||||
],
|
||||
"text/plain": [
|
||||
"<IPython.core.display.Markdown object>"
|
||||
]
|
||||
},
|
||||
"metadata": {},
|
||||
"output_type": "display_data"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"display(Markdown(response))"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "12f042e7",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"在此处打印响应,我们可以看到我们得到了一个表格,正如我们所要求的那样"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 56,
|
||||
"id": "32c94d22",
|
||||
"metadata": {
|
||||
"height": 115
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"''' \n",
|
||||
"通过LangChain链封装起来\n",
|
||||
"创建一个检索QA链,对检索到的文档进行问题回答,要创建这样的链,我们将传入几个不同的东西\n",
|
||||
"1、语言模型,在最后进行文本生成\n",
|
||||
"2、传入链类型,这里使用stuff,将所有文档塞入上下文并对语言模型进行一次调用\n",
|
||||
"3、传入一个检索器\n",
|
||||
"'''\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"qa_stuff = RetrievalQA.from_chain_type(\n",
|
||||
" llm=llm, \n",
|
||||
" chain_type=\"stuff\", \n",
|
||||
" retriever=retriever, \n",
|
||||
" verbose=True\n",
|
||||
")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 46,
|
||||
"id": "e4769316",
|
||||
"metadata": {
|
||||
"height": 47
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"query = \"Please list all your shirts with sun protection in a table \\\n",
|
||||
"in markdown and summarize each one.\"#创建一个查询并在此查询上运行链"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "1fc3c2f3",
|
||||
"metadata": {
|
||||
"height": 30
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"response = qa_stuff.run(query)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 58,
|
||||
"id": "fba1a5db",
|
||||
"metadata": {
|
||||
"height": 30
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/markdown": [
|
||||
"\n",
|
||||
"\n",
|
||||
"| Name | Description |\n",
|
||||
"| --- | --- |\n",
|
||||
"| Men's Tropical Plaid Short-Sleeve Shirt | UPF 50+ rated, 100% polyester, wrinkle-resistant, front and back cape venting, two front bellows pockets |\n",
|
||||
"| Men's Plaid Tropic Shirt, Short-Sleeve | UPF 50+ rated, 52% polyester and 48% nylon, machine washable and dryable, front and back cape venting, two front bellows pockets |\n",
|
||||
"| Men's TropicVibe Shirt, Short-Sleeve | UPF 50+ rated, 71% Nylon, 29% Polyester, 100% Polyester knit mesh, machine wash and dry, front and back cape venting, two front bellows pockets |\n",
|
||||
"| Sun Shield Shirt by | UPF 50+ rated, 78% nylon, 22% Lycra Xtra Life fiber, handwash, line dry, wicks moisture, fits comfortably over swimsuit, abrasion resistant |\n",
|
||||
"\n",
|
||||
"All four shirts provide UPF 50+ sun protection, blocking 98% of the sun's harmful rays. The Men's Tropical Plaid Short-Sleeve Shirt is made of 100% polyester and is wrinkle-resistant"
|
||||
],
|
||||
"text/plain": [
|
||||
"<IPython.core.display.Markdown object>"
|
||||
]
|
||||
},
|
||||
"metadata": {},
|
||||
"output_type": "display_data"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"display(Markdown(response))#使用 display 和 markdown 显示它"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "e28c5657",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"这两个方式返回相同的结果"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "44f1fa38",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### 5.2.1 不同类型的chain链\n",
|
||||
"想在许多不同类型的块上执行相同类型的问答,该怎么办?之前的实验中只返回了4个文档,如果有多个文档,那么我们可以使用几种不同的方法\n",
|
||||
"* Map Reduce \n",
|
||||
"将所有块与问题一起传递给语言模型,获取回复,使用另一个语言模型调用将所有单独的回复总结成最终答案,它可以在任意数量的文档上运行。可以并行处理单个问题,同时也需要更多的调用。它将所有文档视为独立的\n",
|
||||
"* Refine \n",
|
||||
"用于循环许多文档,际上是迭代的,建立在先前文档的答案之上,非常适合前后因果信息并随时间逐步构建答案,依赖于先前调用的结果。它通常需要更长的时间,并且基本上需要与Map Reduce一样多的调用\n",
|
||||
"* Map Re-rank \n",
|
||||
"对每个文档进行单个语言模型调用,要求它返回一个分数,选择最高分,这依赖于语言模型知道分数应该是什么,需要告诉它,如果它与文档相关,则应该是高分,并在那里精细调整说明,可以批量处理它们相对较快,但是更加昂贵\n",
|
||||
"* Stuff \n",
|
||||
"将所有内容组合成一个文档"
|
||||
]
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3 (ipykernel)",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.9.12"
|
||||
},
|
||||
"toc": {
|
||||
"base_numbering": 1,
|
||||
"nav_menu": {},
|
||||
"number_sections": false,
|
||||
"sideBar": true,
|
||||
"skip_h1_title": false,
|
||||
"title_cell": "Table of Contents",
|
||||
"title_sidebar": "Contents",
|
||||
"toc_cell": false,
|
||||
"toc_position": {},
|
||||
"toc_section_display": true,
|
||||
"toc_window_display": true
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 5
|
||||
}
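The four chain types described in section 5.2.1 can be sketched in plain Python to make their call patterns concrete. This is only an illustrative sketch and not LangChain's implementation: `CountingLLM`, the four strategy functions, and the prompt wording are hypothetical names and text introduced here, and the "model" is a stub that only counts how often it is called.

```python
from typing import Callable, List, Tuple


class CountingLLM:
    """Hypothetical stand-in for a language model; it only counts calls."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self, prompt: str) -> str:
        self.calls += 1
        return f"answer#{self.calls}"


def stuff(llm: Callable[[str], str], docs: List[str], question: str) -> str:
    # Stuff: put every document into one prompt and call the model once.
    context = "\n\n".join(docs)
    return llm(f"{context}\n\nQuestion: {question}")


def map_reduce(llm: Callable[[str], str], docs: List[str], question: str) -> str:
    # Map: one independent call per document (these could run in parallel).
    partial = [llm(f"{doc}\n\nQuestion: {question}") for doc in docs]
    # Reduce: one extra call that combines the individual answers.
    combined = "\n".join(partial)
    return llm(f"{combined}\n\nCombine the answers above. Question: {question}")


def refine(llm: Callable[[str], str], docs: List[str], question: str) -> str:
    # Refine: sequential; each call sees the answer built so far.
    answer = ""
    for doc in docs:
        answer = llm(
            f"Existing answer: {answer}\n\nNew context: {doc}\n\nQuestion: {question}"
        )
    return answer


def map_rerank(
    scored_llm: Callable[[str], Tuple[str, float]], docs: List[str], question: str
) -> str:
    # Map Re-rank: one call per document returning (answer, score);
    # keep the answer with the highest score.
    candidates = [scored_llm(f"{doc}\n\nQuestion: {question}") for doc in docs]
    best_answer, _ = max(candidates, key=lambda pair: pair[1])
    return best_answer


if __name__ == "__main__":
    docs = ["Sun Shield Shirt", "Men's Plaid Tropic Shirt", "Men's TropicVibe Shirt"]
    for strategy in (stuff, map_reduce, refine):
        llm = CountingLLM()
        strategy(llm, docs, "Which shirts have sun protection?")
        print(strategy.__name__, "->", llm.calls, "call(s)")
```

With three documents, the stub is called once for `stuff`, four times for `map_reduce` (three map calls plus one reduce call), and three times for `refine`, which matches the trade-offs listed above.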
|
||||
@ -0,0 +1,19 @@
|
||||
# Chapter 8 Summary
|
||||
|
||||
|
||||
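This short course covered a range of practical LangChain applications, including processing customer reviews, answering questions over documents, and using an LLM to decide when to call an external tool (such as a website) to answer complex questions.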
|
||||
|
||||
**👍🏻 LangChain is powerful**
|
||||
|
||||
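Building applications like these used to take weeks. Now, with very little code, LangChain lets us build them efficiently. LangChain has become a powerful paradigm for developing LLM applications, and we hope you will embrace this tool and actively explore more and broader use cases.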
|
||||
|
||||
**🌈 Different combinations, more possibilities**
|
||||
|
||||
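What else can LangChain help us do? It can answer questions over CSV files, query SQL databases, and interact with APIs. Many such examples are built by combining chains with different prompts and output parsers.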
|
||||
|
||||
**💪🏻 Set off and explore the new world**
|
||||
|
||||
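Many thanks to everyone in the community who has contributed, whether by improving the documentation, making it easier for others to get started, or building new chains that open up a whole new world.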
|
||||
|
||||
If you have not done so yet, open your computer, run `pip install langchain`, and go use LangChain to build something amazing.
|
||||
|
||||
@ -1,64 +0,0 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "87f7cfaa",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# 8. 总结\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"本次简短课程涵盖了一系列LangChain的应用实践,包括处理顾客评论和基于文档回答问题,以及通过LLM判断何时求助外部工具 (如网站) 来回答复杂问题。\n",
|
||||
"\n",
|
||||
"**👍🏻 LangChain如此强大**\n",
|
||||
"\n",
|
||||
"构建这类应用曾经需要耗费数周时间,而现在只需要非常少的代码,就可以通过LangChain高效构建所需的应用程序。LangChain已成为开发大模型应用的有力范式,希望大家拥抱这个强大工具,积极探索更多更广泛的应用场景。\n",
|
||||
"\n",
|
||||
"**🌈 不同组合, 更多可能性**\n",
|
||||
"\n",
|
||||
"LangChain还可以协助我们做什么呢:基于CSV文件回答问题、查询sql数据库、与api交互,有很多例子通过Chain以及不同的提示(Prompts)和输出解析器(output parsers)组合得以实现。\n",
|
||||
"\n",
|
||||
"**💪🏻 出发 去探索新世界吧**\n",
|
||||
"\n",
|
||||
"因此非常感谢社区中做出贡献的每一个人,无论是协助文档的改进,还是让其他人更容易上手,还是构建新的Chain打开一个全新的世界。\n",
|
||||
"\n",
|
||||
"如果你还没有这样做,快去打开电脑,运行 pip install LangChain,然后去使用LangChain、搭建惊艳的应用吧~\n",
|
||||
"\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3 (ipykernel)",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.9.12"
|
||||
},
|
||||
"toc": {
|
||||
"base_numbering": 1,
|
||||
"nav_menu": {},
|
||||
"number_sections": false,
|
||||
"sideBar": true,
|
||||
"skip_h1_title": false,
|
||||
"title_cell": "Table of Contents",
|
||||
"title_sidebar": "Contents",
|
||||
"toc_cell": false,
|
||||
"toc_position": {},
|
||||
"toc_section_display": true,
|
||||
"toc_window_display": true
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 5
|
||||
}
|
||||
|