{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "63651c26",
   "metadata": {},
   "source": [
    "Chapter 3: Evaluating Inputs (Classification)"
   ]
  },
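  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "b12f80c9",
   "metadata": {},
   "source": [
    "In this section we focus on the task of evaluating inputs, which is important for ensuring the quality and safety of the system.\n",
    "\n",
    "For tasks that need many independent sets of instructions to handle different cases, it helps to first classify the type of query and then use that classification to decide which instructions to apply.\n",
    "\n",
    "This can be done by defining fixed categories and hard-coding the instructions relevant to handling tasks in each category.\n",
    "\n",
    "For example, when building a customer service assistant, it can be important to first classify the query and then choose the instructions based on that classification.\n",
    "\n",
    "If a user asks to close their account, you might supply one set of secondary instructions, whereas if the user asks about a specific product, you might add extra product information instead.\n"
   ]
  },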
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "87d9de1d",
   "metadata": {},
   "source": [
    "## Setup\n",
    "Load the API key and wrap the API call in a helper function."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "55ee24ab",
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "import openai\n",
    "from dotenv import load_dotenv, find_dotenv\n",
    "_ = load_dotenv(find_dotenv()) # read local .env file\n",
    "openai.api_key = os.environ['OPENAI_API_KEY']\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "0318b89e",
   "metadata": {},
   "outputs": [],
   "source": [
    "def get_completion_from_messages(messages,\n",
    "                                 model=\"gpt-3.5-turbo\",\n",
    "                                 temperature=0,\n",
    "                                 max_tokens=500):\n",
    "    \"\"\"Send a list of chat messages and return the reply text.\n",
    "\n",
    "    Uses the legacy openai.ChatCompletion interface (openai<1.0).\n",
    "    \"\"\"\n",
    "    response = openai.ChatCompletion.create(\n",
    "        model=model,\n",
    "        messages=messages,\n",
    "        temperature=temperature,  # 0 keeps the output as repeatable as possible\n",
    "        max_tokens=max_tokens,    # upper bound on the length of the reply\n",
    "    )\n",
    "    return response.choices[0].message[\"content\"]"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "f2b55807",
   "metadata": {},
   "source": [
    "#### Classifying user instructions"
   ]
  },
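  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "c3216166",
   "metadata": {},
   "source": [
    "Here we have our system message, which is the instruction for the overall system, and we are using a delimiter: ####.\n",
    "\n",
    "A delimiter is just a way to separate different parts of an instruction or output, and it helps the model tell those parts apart.\n",
    "\n",
    "So for this example we will use #### as the delimiter.\n",
    "\n",
    "It is a convenient delimiter because it is actually represented as a single token."
   ]
  },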
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "3b406ba8",
   "metadata": {},
   "outputs": [],
   "source": [
    "delimiter = \"####\""
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "049d0d82",
   "metadata": {},
   "source": [
    "This is our system message, which instructs the model as follows."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "29e2d170",
   "metadata": {},
   "outputs": [],
   "source": [
    "system_message = f\"\"\"\n",
    "You will be provided with customer service queries. \\\n",
    "The customer service query will be delimited with \\\n",
    "{delimiter} characters.\n",
    "Classify each query into a primary category \\\n",
    "and a secondary category. \n",
    "Provide your output in json format with the \\\n",
    "keys: primary and secondary.\n",
    "\n",
    "Primary categories: Billing, Technical Support, \\\n",
    "Account Management, or General Inquiry.\n",
    "\n",
    "Billing secondary categories:\n",
    "Unsubscribe or upgrade\n",
    "Add a payment method\n",
    "Explanation for charge\n",
    "Dispute a charge\n",
    "\n",
    "Technical Support secondary categories:\n",
    "General troubleshooting\n",
    "Device compatibility\n",
    "Software updates\n",
    "\n",
    "Account Management secondary categories:\n",
    "Password reset\n",
    "Update personal information\n",
    "Close account\n",
    "Account security\n",
    "\n",
    "General Inquiry secondary categories:\n",
    "Product information\n",
    "Pricing\n",
    "Feedback\n",
    "Speak to a human\n",
    "\n",
    "\"\"\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "61f4b474",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Chinese-language version of the prompt (replaces the English system_message)\n",
    "system_message = f\"\"\"\n",
    "你将获得客户服务查询。\n",
    "每个客户服务查询都将用{delimiter}字符分隔。\n",
    "将每个查询分类到一个主要类别和一个次要类别中。\n",
    "以JSON格式提供你的输出,包含以下键:primary和secondary。\n",
    "\n",
    "主要类别:计费(Billing)、技术支持(Technical Support)、账户管理(Account Management)或一般咨询(General Inquiry)。\n",
    "\n",
    "计费次要类别:\n",
    "取消订阅或升级(Unsubscribe or upgrade)\n",
    "添加付款方式(Add a payment method)\n",
    "收费解释(Explanation for charge)\n",
    "争议费用(Dispute a charge)\n",
    "\n",
    "技术支持次要类别:\n",
    "常规故障排除(General troubleshooting)\n",
    "设备兼容性(Device compatibility)\n",
    "软件更新(Software updates)\n",
    "\n",
    "账户管理次要类别:\n",
    "重置密码(Password reset)\n",
    "更新个人信息(Update personal information)\n",
    "关闭账户(Close account)\n",
    "账户安全(Account security)\n",
    "\n",
    "一般咨询次要类别:\n",
    "产品信息(Product information)\n",
    "定价(Pricing)\n",
    "反馈(Feedback)\n",
    "与人工对话(Speak to a human)\n",
    "\n",
    "\"\"\""
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "e6a932ce",
   "metadata": {},
   "source": [
    "Now let's look at an example user message. We will use the following."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "id": "2b2df0bf",
   "metadata": {},
   "outputs": [],
   "source": [
    "user_message = f\"\"\"\\\n",
    "I want you to delete my profile and all of my user data\"\"\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "3b8070bf",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Chinese-language version of the same user message\n",
    "user_message = f\"\"\"\\\n",
    "我希望你删除我的个人资料和所有用户数据。\"\"\""
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "3a2c1cf0",
   "metadata": {},
   "source": [
    "Format this into a message list made up of the system message and the user message, with the user message wrapped in \"####\" delimiters.\n",
    "\n",
    "Let's consider, as a human, what this sentence means: \"I want you to delete my profile.\"\n",
    "\n",
    "It looks like it belongs to the \"Account Management\" category, probably under \"Close account\"."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "6e2b9049",
   "metadata": {},
   "outputs": [],
   "source": [
    "messages = [ \n",
    "{'role':'system', \n",
    " 'content': system_message}, \n",
    "{'role':'user', \n",
    " 'content': f\"{delimiter}{user_message}{delimiter}\"}, \n",
    "]"
   ]
  },
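  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "4b295207",
   "metadata": {},
   "source": [
    "Let's see what the model thinks.\n",
    "\n",
    "The model's classification is \"Account Management\" as \"primary\" and \"Close account\" as \"secondary\". The recorded output shows the Chinese labels because the Chinese version of the prompt was the one in effect.\n",
    "\n",
    "The benefit of requesting structured output such as JSON is that you can easily read it into an object, for example a dictionary in Python or an equivalent type in another language, and pass it as input to a subsequent step."
   ]
  },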
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "77328388",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{\n",
      " \"primary\": \"账户管理\",\n",
      " \"secondary\": \"关闭账户\"\n",
      "}\n"
     ]
    }
   ],
   "source": [
    "response = get_completion_from_messages(messages)\n",
    "print(response)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "2f6b353b",
   "metadata": {},
   "source": [
    "Here is another user message: \"Tell me more about your flat screen TVs\"\n",
    "\n",
    "We build the same message list, get the model's response, and print it.\n",
    "\n",
    "Here is our second classification, and it looks correct."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "id": "edf8fbe9",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{\n",
      " \"primary\": \"General Inquiry\",\n",
      " \"secondary\": \"Product information\"\n",
      "}\n"
     ]
    }
   ],
   "source": [
    "user_message = f\"\"\"\\\n",
    "Tell me more about your flat screen tvs\"\"\"\n",
    "messages = [ \n",
    "{'role':'system', \n",
    " 'content': system_message}, \n",
    "{'role':'user', \n",
    " 'content': f\"{delimiter}{user_message}{delimiter}\"}, \n",
    "] \n",
    "response = get_completion_from_messages(messages)\n",
    "print(response)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "id": "f1d738e1",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "以下是针对平板电脑的一般咨询:\n",
      "\n",
      "{\n",
      " \"primary\": \"General Inquiry\",\n",
      " \"secondary\": \"Product information\"\n",
      "}\n",
      "\n",
      "如果您有任何特定的问题或需要更详细的信息,请告诉我,我会尽力回答。\n"
     ]
    }
   ],
   "source": [
    "# Chinese-language variant of the query (it asks about tablets)\n",
    "user_message = f\"\"\"\\\n",
    "告诉我更多有关你们的平板电脑的信息\"\"\"\n",
    "messages = [ \n",
    "{'role':'system', \n",
    " 'content': system_message}, \n",
    "{'role':'user', \n",
    " 'content': f\"{delimiter}{user_message}{delimiter}\"}, \n",
    "] \n",
    "response = get_completion_from_messages(messages)\n",
    "print(response)"
   ]
  },
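  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "8f87f68d",
   "metadata": {},
   "source": [
    "So overall, based on how the customer inquiry is classified, we can now provide a more specific set of instructions for handling the next steps.\n",
    "\n",
    "In this case we might add extra information about TVs, whereas in other cases we might want to provide a link for closing the account or something similar.\n",
    "\n",
    "We will learn more about different ways of processing inputs in later videos.\n",
    "\n",
    "In the next video, we will explore more ways of evaluating inputs, in particular ways to make sure users are using the system responsibly."
   ]
  }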
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.11"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}