Add table of contents, improve images, and complete information in some chapters

gaoliye
2023-07-15 17:25:44 +08:00
parent 34a4b9c42c
commit 17225f37a5
15 changed files with 42 additions and 7134 deletions

@@ -10,7 +10,7 @@
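Using ChatGPT involves more than a single Prompt or a single model call; this course shares best practices for building complex applications with LLMs.
Using the construction of a customer service assistant as an example, this course chains calls to the language model with different Prompts the specific Prompt chosen depends on the output of the previous call and sometimes information also has to be looked up from external sources.
Using the construction of a customer service assistant as an example, this course chains calls to the language model with different Prompts; the specific Prompt chosen depends on the output of the previous call, and sometimes information also has to be looked up from external sources.
Around this theme, the course walks through the internal building steps of an application and shares best practices for evaluating and continuously improving the system over the long term.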

@@ -6,7 +6,7 @@
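### 📚 Course Review
This course covered in detail how LLMs work, including the details of the tokenizer, methods for evaluating the quality and safety of user input, using chain of thought as a prompt, splitting tasks with chained prompts, and checking outputs before returning them to the user.
This course covered in detail how LLMs work, including the details of the tokenizer, methods for evaluating the quality and safety of user input, using chain of thought as a Prompt, splitting tasks with chained Prompts, and checking outputs before returning them to the user.
The course also covered how to evaluate the system's long-term performance in order to monitor and improve it.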

File diff suppressed because one or more lines are too long

@@ -1,477 +0,0 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# Chapter 5 Processing Inputs: Chain of Thought Reasoning"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
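"In this chapter, we focus on processing inputs, that is, generating useful output through a series of steps.\n",
"\n",
"Sometimes a model needs to reason in detail before answering a specific question. If you have taken our earlier courses, you will have seen many such examples. A model may also make reasoning errors by rushing to a conclusion. We can therefore reframe the query to ask for a series of relevant reasoning steps before the final answer, so that the model can think about the problem longer and more deeply.\n",
"\n",
"This strategy of asking the model to reason through a problem step by step is usually called chain of thought reasoning."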
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## 1. Environment Setup\n",
"### 1.1 Load the API key and the required Python libraries\n",
"In this course, we provide some code to help you load your OpenAI API key."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import openai\n",
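"# Import the third-party library\n",
"\n",
"openai.api_key = \"sk-...\"\n",
"# Set the API key; replace it with your own\n",
"\n",
"# The commented-out lines below show a safer configuration based on an environment variable. They are for reference only and are not used again later.\n",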
"# import openai\n",
"# import os\n",
"# OPENAI_API_KEY = os.environ.get(\"OPENAI_API_KEY\")\n",
"# openai.api_key = OPENAI_API_KEY"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
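"def get_completion_from_messages(messages,\n",
"                                 model=\"gpt-3.5-turbo\",\n",
"                                 temperature=0,\n",
"                                 max_tokens=500):\n",
"    '''\n",
"    Wrap a call to the OpenAI GPT-3.5 chat model.\n",
"\n",
"    Parameters:\n",
"    messages: a list of messages; each message is a dict with a role and content. The role is 'system', 'user', or 'assistant', and the content is that role's message.\n",
"    model: the model to call; defaults to gpt-3.5-turbo (ChatGPT). Users with beta access can choose gpt-4.\n",
"    temperature: controls the randomness of the output; the default of 0 makes the output highly deterministic, and higher values make it more random.\n",
"    max_tokens: the maximum number of tokens the model may generate.\n",
"    '''\n",
"    # openai.ChatCompletion is the interface of openai package versions before 1.0\n",
"    response = openai.ChatCompletion.create(\n",
"        model=model,\n",
"        messages=messages,\n",
"        temperature=temperature,  # randomness of the model output\n",
"        max_tokens=max_tokens,  # maximum number of output tokens\n",
"    )\n",
"    return response.choices[0].message[\"content\"]"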
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2. Chain-of-Thought Prompting"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"Here we ask the model to reason through the answer step by step before reaching a conclusion.\n"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"delimiter = \"####\"\n",
"system_message = f\"\"\"\n",
"Follow these steps to answer the customer queries.\n",
"The customer query will be delimited with four hashtags,\\\n",
"i.e. {delimiter}. \n",
"\n",
"Step 1:{delimiter} First decide whether the user is \\\n",
"asking a question about a specific product or products. \\\n",
"Product category doesn't count. \n",
"\n",
"Step 2:{delimiter} If the user is asking about \\\n",
"specific products, identify whether \\\n",
"the products are in the following list.\n",
"All available products: \n",
"1. Product: TechPro Ultrabook\n",
" Category: Computers and Laptops\n",
" Brand: TechPro\n",
" Model Number: TP-UB100\n",
" Warranty: 1 year\n",
" Rating: 4.5\n",
" Features: 13.3-inch display, 8GB RAM, 256GB SSD, Intel Core i5 processor\n",
" Description: A sleek and lightweight ultrabook for everyday use.\n",
" Price: $799.99\n",
"\n",
"2. Product: BlueWave Gaming Laptop\n",
" Category: Computers and Laptops\n",
" Brand: BlueWave\n",
" Model Number: BW-GL200\n",
" Warranty: 2 years\n",
" Rating: 4.7\n",
" Features: 15.6-inch display, 16GB RAM, 512GB SSD, NVIDIA GeForce RTX 3060\n",
" Description: A high-performance gaming laptop for an immersive experience.\n",
" Price: $1199.99\n",
"\n",
"3. Product: PowerLite Convertible\n",
" Category: Computers and Laptops\n",
" Brand: PowerLite\n",
" Model Number: PL-CV300\n",
" Warranty: 1 year\n",
" Rating: 4.3\n",
" Features: 14-inch touchscreen, 8GB RAM, 256GB SSD, 360-degree hinge\n",
" Description: A versatile convertible laptop with a responsive touchscreen.\n",
" Price: $699.99\n",
"\n",
"4. Product: TechPro Desktop\n",
" Category: Computers and Laptops\n",
" Brand: TechPro\n",
" Model Number: TP-DT500\n",
" Warranty: 1 year\n",
" Rating: 4.4\n",
" Features: Intel Core i7 processor, 16GB RAM, 1TB HDD, NVIDIA GeForce GTX 1660\n",
" Description: A powerful desktop computer for work and play.\n",
" Price: $999.99\n",
"\n",
"5. Product: BlueWave Chromebook\n",
" Category: Computers and Laptops\n",
" Brand: BlueWave\n",
" Model Number: BW-CB100\n",
" Warranty: 1 year\n",
" Rating: 4.1\n",
" Features: 11.6-inch display, 4GB RAM, 32GB eMMC, Chrome OS\n",
" Description: A compact and affordable Chromebook for everyday tasks.\n",
" Price: $249.99\n",
"\n",
"Step 3:{delimiter} If the message contains products \\\n",
"in the list above, list any assumptions that the \\\n",
"user is making in their \\\n",
"message e.g. that Laptop X is bigger than \\\n",
"Laptop Y, or that Laptop Z has a 2 year warranty.\n",
"\n",
"Step 4:{delimiter} If the user made any assumptions, \\\n",
"figure out whether the assumption is true based on your \\\n",
"product information. \n",
"\n",
"Step 5:{delimiter} First, politely correct the \\\n",
"customer's incorrect assumptions if applicable. \\\n",
"Only mention or reference products in the list of \\\n",
"5 available products, as these are the only 5 \\\n",
"products that the store sells. \\\n",
"Answer the customer in a friendly tone.\n",
"\n",
"Use the following format:\n",
"Step 1:{delimiter} <step 1 reasoning>\n",
"Step 2:{delimiter} <step 2 reasoning>\n",
"Step 3:{delimiter} <step 3 reasoning>\n",
"Step 4:{delimiter} <step 4 reasoning>\n",
"Response to user:{delimiter} <response to customer>\n",
"\n",
"Make sure to include {delimiter} to separate every step.\n",
"\"\"\""
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"delimiter = \"####\"\n",
"system_message = f\"\"\"\n",
"请按照以下步骤回答客户的查询。客户的查询将以四个井号(#)分隔,即 {delimiter}。\n",
"\n",
"步骤 1:{delimiter} 首先确定用户是否正在询问有关特定产品或产品的问题。产品类别不计入范围。\n",
"\n",
"步骤 2:{delimiter} 如果用户询问特定产品,请确认产品是否在以下列表中。所有可用产品:\n",
"\n",
"产品:TechPro 超极本\n",
"类别:计算机和笔记本电脑\n",
"品牌:TechPro\n",
"型号:TP-UB100\n",
"保修期:1 年\n",
"评分:4.5\n",
"特点:13.3 英寸显示屏,8GB RAM,256GB SSD,Intel Core i5 处理器\n",
"描述:一款适用于日常使用的时尚轻便的超极本。\n",
"价格:$799.99\n",
"\n",
"产品:BlueWave 游戏笔记本电脑\n",
"类别:计算机和笔记本电脑\n",
"品牌:BlueWave\n",
"型号:BW-GL200\n",
"保修期:2 年\n",
"评分:4.7\n",
"特点:15.6 英寸显示屏,16GB RAM,512GB SSD,NVIDIA GeForce RTX 3060\n",
"描述:一款高性能的游戏笔记本电脑,提供沉浸式体验。\n",
"价格:$1199.99\n",
"\n",
"产品:PowerLite 可转换笔记本电脑\n",
"类别:计算机和笔记本电脑\n",
"品牌:PowerLite\n",
"型号:PL-CV300\n",
"保修期:1 年\n",
"评分:4.3\n",
"特点:14 英寸触摸屏,8GB RAM,256GB SSD,360 度铰链\n",
"描述:一款多功能可转换笔记本电脑,具有响应触摸屏。\n",
"价格:$699.99\n",
"\n",
"产品:TechPro 台式电脑\n",
"类别:计算机和笔记本电脑\n",
"品牌:TechPro\n",
"型号:TP-DT500\n",
"保修期:1 年\n",
"评分:4.4\n",
"特点:Intel Core i7 处理器,16GB RAM,1TB HDD,NVIDIA GeForce GTX 1660\n",
"描述:一款功能强大的台式电脑,适用于工作和娱乐。\n",
"价格:$999.99\n",
"\n",
"产品:BlueWave Chromebook\n",
"类别:计算机和笔记本电脑\n",
"品牌:BlueWave\n",
"型号:BW-CB100\n",
"保修期:1 年\n",
"评分:4.1\n",
"特点:11.6 英寸显示屏,4GB RAM,32GB eMMC,Chrome OS\n",
"描述:一款紧凑而价格实惠的 Chromebook,适用于日常任务。\n",
"价格:$249.99\n",
"\n",
"步骤 3:{delimiter} 如果消息中包含上述列表中的产品,请列出用户在消息中做出的任何假设,例如笔记本电脑 X 比笔记本电脑 Y 大,或者笔记本电脑 Z 有 2 年保修期。\n",
"\n",
"步骤 4:{delimiter} 如果用户做出了任何假设,请根据产品信息确定假设是否正确。\n",
"\n",
"步骤 5:{delimiter} 如果用户有任何错误的假设,请先礼貌地纠正客户的错误假设(如果适用)。只提及或引用可用产品列表中的产品,因为这是商店销售的唯一五款产品。以友好的口吻回答客户。\n",
"\n",
"使用以下格式回答问题:\n",
"步骤 1:{delimiter} <步骤 1 的推理>\n",
"步骤 2:{delimiter} <步骤 2 的推理>\n",
"步骤 3:{delimiter} <步骤 3 的推理>\n",
"步骤 4:{delimiter} <步骤 4 的推理>\n",
"回复客户:{delimiter} <回复客户的内容>\n",
"\n",
"请确保在每个步骤之间使用 {delimiter} 进行分隔。\n",
"\"\"\""
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Step 1:#### The user is asking a question about two specific products, the BlueWave Chromebook and the TechPro Desktop.\n",
"Step 2:#### The prices of the two products are as follows:\n",
"- BlueWave Chromebook: $249.99\n",
"- TechPro Desktop: $999.99\n",
"Step 3:#### The user is assuming that the BlueWave Chromebook is more expensive than the TechPro Desktop.\n",
"Step 4:#### The assumption is incorrect. The TechPro Desktop is actually more expensive than the BlueWave Chromebook.\n",
"Response to user:#### The BlueWave Chromebook is actually less expensive than the TechPro Desktop. The BlueWave Chromebook costs $249.99 while the TechPro Desktop costs $999.99.\n"
]
}
],
"source": [
"user_message = f\"\"\"\n",
"by how much is the BlueWave Chromebook more expensive \\\n",
"than the TechPro Desktop\"\"\"\n",
"\n",
"messages = [ \n",
"{'role':'system', \n",
" 'content': system_message}, \n",
"{'role':'user', \n",
" 'content': f\"{delimiter}{user_message}{delimiter}\"}, \n",
"] \n",
"\n",
"response = get_completion_from_messages(messages)\n",
"print(response)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"步骤 1:#### 确认用户正在询问有关特定产品的问题。\n",
"\n",
"步骤 2:#### 用户询问 BlueWave Chromebook 和 TechPro 台式电脑之间的价格差异。\n",
"\n",
"步骤 3:#### 用户假设 BlueWave Chromebook 的价格高于 TechPro 台式电脑。\n",
"\n",
"步骤 4:#### 用户的假设是正确的。BlueWave Chromebook 的价格为 $249.99,而 TechPro 台式电脑的价格为 $999.99,因此 BlueWave Chromebook 的价格比 TechPro 台式电脑低 $750。\n",
"\n",
"回复客户:#### BlueWave Chromebook 比 TechPro 台式电脑便宜 $750。\n"
]
}
],
"source": [
"user_message = f\"\"\"BlueWave Chromebook 比 TechPro 台式电脑贵多少?\"\"\"\n",
"\n",
"messages = [ \n",
"{'role':'system', \n",
" 'content': system_message}, \n",
"{'role':'user', \n",
" 'content': f\"{delimiter}{user_message}{delimiter}\"}, \n",
"] \n",
"\n",
"response = get_completion_from_messages(messages)\n",
"print(response)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Step 1:#### The user is asking if the store sells TVs.\n",
"Step 2:#### The list of available products does not include any TVs.\n",
"Response to user:#### I'm sorry, but we do not sell TVs at this store. Our available products include computers and laptops.\n"
]
}
],
"source": [
"user_message = f\"\"\"\n",
"do you sell tvs\"\"\"\n",
"messages = [ \n",
"{'role':'system', \n",
" 'content': system_message}, \n",
"{'role':'user', \n",
" 'content': f\"{delimiter}{user_message}{delimiter}\"}, \n",
"] \n",
"response = get_completion_from_messages(messages)\n",
"print(response)"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"步骤 1:#### 首先确定用户是否正在询问有关特定产品或产品的问题。产品类别不计入范围。\n",
"\n",
"步骤 2:#### 如果用户询问特定产品,请确认产品是否在以下列表中。所有可用产品:\n",
"\n",
"我们很抱歉,我们商店不出售电视机。\n",
"\n",
"步骤 3:#### 如果消息中包含上述列表中的产品,请列出用户在消息中做出的任何假设,例如笔记本电脑 X 比笔记本电脑 Y 大,或者笔记本电脑 Z 有 2 年保修期。\n",
"\n",
"N/A\n",
"\n",
"步骤 4:#### 如果用户做出了任何假设,请根据产品信息确定假设是否正确。\n",
"\n",
"N/A\n",
"\n",
"回复客户:#### 我们很抱歉,我们商店不出售电视机。\n"
]
}
],
"source": [
"user_message = f\"\"\"你有电视机么\"\"\"\n",
"messages = [ \n",
"{'role':'system', \n",
" 'content': system_message}, \n",
"{'role':'user', \n",
" 'content': f\"{delimiter}{user_message}{delimiter}\"}, \n",
"] \n",
"response = get_completion_from_messages(messages)\n",
"print(response)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## 3. Inner Monologue"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
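"For some applications, the model's reasoning process may not be suitable to share with the user. In a tutoring application, for example, we may want to encourage students to work out the answer themselves, but the model's reasoning about a student's solution could give the answer away.\n",
"\n",
"Inner monologue is a strategy for mitigating this; it is a more advanced way of hiding the model's reasoning process.\n",
"\n",
"The idea is to have the model put the parts of its output that should stay hidden from the user into a structured format that is easy to parse. Before the output is shown to the user, it is transformed so that only part of it is visible."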
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"我们很抱歉,我们商店不出售电视机。\n"
]
}
],
"source": [
"try:\n",
" final_response = response.split(delimiter)[-1].strip()\n",
"except Exception as e:\n",
" final_response = \"Sorry, I'm having trouble right now, please try asking another question.\"\n",
" \n",
"print(final_response)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.11"
},
"orig_nbformat": 4
},
"nbformat": 4,
"nbformat_minor": 2
}
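
The inner-monologue cell at the end of the notebook above keeps only the text after the last `####` delimiter. One caveat: `str.split` does not raise when the delimiter is absent, so a response that ignores the requested format would be returned in full, reasoning included. The following is a minimal sketch of a stricter extraction, assuming the model was asked to use the `Step N:#### ...` format from the system message; the names `extract_final_response` and `FALLBACK` are hypothetical and do not appear in the notebook.

```python
DELIMITER = "####"
FALLBACK = "Sorry, I'm having trouble right now, please try asking another question."


def extract_final_response(response, delimiter=DELIMITER):
    """Return only the text after the last delimiter, hiding the reasoning steps.

    Falls back to a generic message when the delimiter is missing or the
    final segment is empty, so raw reasoning is never shown to the user.
    """
    if not isinstance(response, str) or delimiter not in response:
        return FALLBACK
    final = response.split(delimiter)[-1].strip()
    return final or FALLBACK


# Sample text copied from the notebook's recorded English output.
sample = (
    "Step 1:#### The user is asking if the store sells TVs.\n"
    "Step 2:#### The list of available products does not include any TVs.\n"
    "Response to user:#### I'm sorry, but we do not sell TVs at this store."
)
print(extract_final_response(sample))
```

Returning the fallback on a missing delimiter is a design choice for this sketch: it trades an occasional unhelpful reply for the guarantee that intermediate reasoning stays hidden.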

File diff suppressed because one or more lines are too long

@@ -0,0 +1,31 @@
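import json
import os


def add_toc(ipynb_file):
    """Rebuild the table of contents in the first cell of a notebook."""
    # Notebooks are stored as UTF-8 JSON; close the file as soon as it is read.
    with open(ipynb_file, "r", encoding="utf-8") as f:
        notebook = json.load(f)

    toc = ["\n"]
    for cell in notebook["cells"]:
        if cell["cell_type"] == "markdown" and len(cell["source"]) > 0:
            first_line = cell["source"][0].strip("\n")
            if first_line.startswith("#"):
                # Heading level is the number of leading '#' characters.
                level = len(first_line.split()[0])
                # Only level-2 and level-3 headings go into the TOC.
                if 1 < level <= 3:
                    name = " ".join(first_line.split(" ")[1:])
                    tag = "-".join(first_line.split(" ")[1:])
                    tab = " " * (level - 2)
                    toc.append(f" {tab}- [{name}](#{tag})\n")

    # Keep only the title line of the first cell, then append the new TOC.
    notebook["cells"][0]["source"] = notebook["cells"][0]["source"][0:1]
    notebook["cells"][0]["source"].extend(toc)

    with open(ipynb_file, "w", encoding="utf-8") as f:
        f.write(json.dumps(notebook))


# Process every notebook in the current directory whose name starts with a digit.
for file in os.listdir("."):
    if file.endswith("ipynb") and file[0].isdigit():
        print(file)
        add_toc(file)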
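
The core of `add_toc` is the mapping from one heading line to one table-of-contents entry. As a rough illustration, that step can be isolated into a pure function that is easy to check without touching any files. The function name `toc_entry` and the two-space nesting indent are assumptions made for this sketch; the anchor is built by joining the heading words with hyphens, as the script does, which may not match the anchors every Markdown renderer generates.

```python
def toc_entry(heading):
    """Return a Markdown TOC line for a level-2 or level-3 heading, else None."""
    parts = heading.strip().split()
    # The first token must consist only of '#' characters to count as a heading.
    if not parts or set(parts[0]) != {"#"}:
        return None
    level = len(parts[0])
    if not 2 <= level <= 3:
        return None
    name = " ".join(parts[1:])
    tag = "-".join(parts[1:])
    indent = "  " * (level - 2)  # assumed nesting indent for this sketch
    return f"{indent}- [{name}](#{tag})"


for line in ["# Title", "## 1. Setup", "### 1.1 Load the API key", "plain text"]:
    print(repr(toc_entry(line)))
```

Level-1 headings and non-heading lines yield `None`, mirroring the script's `l<=3 and l>1` filter.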

BIN
figures/chat-format.png Normal file

Binary file not shown. Size: 314 KiB

BIN
figures/tokens.png Normal file

Binary file not shown. Size: 412 KiB