prompt-engineering-for-deve…/content/Building Systems with the ChatGPT API/3.Classification.ipynb
2023-06-03 23:17:06 +08:00

{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"id": "63651c26",
"metadata": {},
"source": [
"Chapter 3. Evaluating Inputs: Classification"
]
},
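{
"attachments": {},
"cell_type": "markdown",
"id": "b12f80c9",
"metadata": {},
"source": [
"In this section we focus on tasks that evaluate inputs, which is important for ensuring the quality and safety of the system.\n",
"\n",
"For tasks in which many independent sets of instructions are needed to handle different cases, it helps to first classify the type of query and then use that classification to decide which instructions to use.\n",
"\n",
"This can be achieved by defining fixed categories and hard-coding the instructions that are relevant for handling tasks in each category.\n",
"\n",
"For example, when building a customer service assistant, the query type determines which instructions the assistant should follow next.\n",
"\n",
"If a user asks to close their account, you might supply one set of secondary instructions, whereas if the user asks about a specific product, you might add extra product information instead.\n"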
]
},
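{
"attachments": {},
"cell_type": "markdown",
"id": "a3c1f0e1",
"metadata": {},
"source": [
"The classify-then-route idea described above can be sketched as a lookup from a fixed category to hard-coded instructions. This is a minimal illustration only: `CATEGORY_INSTRUCTIONS`, `DEFAULT_INSTRUCTIONS`, `instructions_for`, and the instruction texts are hypothetical and are not part of the original course material."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a3c1f0e2",
"metadata": {},
"outputs": [],
"source": [
"# Hypothetical mapping from (primary, secondary) category to extra instructions.\n",
"CATEGORY_INSTRUCTIONS = {\n",
"    ('Account Management', 'Close account'):\n",
"        'Confirm the request and share the account-closure link.',\n",
"    ('General Inquiry', 'Product information'):\n",
"        'Include relevant product details in the answer.',\n",
"}\n",
"DEFAULT_INSTRUCTIONS = 'Answer politely and ask a clarifying question if needed.'\n",
"\n",
"def instructions_for(primary, secondary):\n",
"    # Fall back to a generic instruction for category pairs that are not listed.\n",
"    return CATEGORY_INSTRUCTIONS.get((primary, secondary), DEFAULT_INSTRUCTIONS)\n",
"\n",
"print(instructions_for('Account Management', 'Close account'))"
]
},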
{
"attachments": {},
"cell_type": "markdown",
"id": "87d9de1d",
"metadata": {},
"source": [
"## Setup\n",
"Load the API key and wrap the API call in a helper function."
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "55ee24ab",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import openai\n",
"from dotenv import load_dotenv, find_dotenv\n",
"_ = load_dotenv(find_dotenv()) # read local .env file\n",
"openai.api_key = os.environ['OPENAI_API_KEY']\n"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "0318b89e",
"metadata": {},
"outputs": [],
"source": [
"# Note: openai.ChatCompletion is the interface of openai-python versions before 1.0.\n",
"def get_completion_from_messages(messages, \n",
" model=\"gpt-3.5-turbo\", \n",
" temperature=0, \n",
" max_tokens=500):\n",
" response = openai.ChatCompletion.create(\n",
" model=model,\n",
" messages=messages,\n",
" temperature=temperature, \n",
" max_tokens=max_tokens,\n",
" )\n",
" return response.choices[0].message[\"content\"]"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "f2b55807",
"metadata": {},
"source": [
"#### Classifying user instructions"
]
},
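{
"attachments": {},
"cell_type": "markdown",
"id": "c3216166",
"metadata": {},
"source": [
"Here is our system message, which guides the overall system, and we use the delimiter `####`.\n",
"\n",
"A delimiter is simply a way to separate different parts of an instruction or output; it helps the model identify the different sections.\n",
"\n",
"So for this example, we will use `####` as the delimiter.\n",
"\n",
"It is a good delimiter because it is represented as a single token."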
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "3b406ba8",
"metadata": {},
"outputs": [],
"source": [
"delimiter = \"####\""
]
},
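{
"attachments": {},
"cell_type": "markdown",
"id": "b7d2e4a1",
"metadata": {},
"source": [
"One way to check how the delimiter is tokenized is to count its tokens with the `tiktoken` package. This is a sketch that assumes `tiktoken` is installed and can load the encoding for `gpt-3.5-turbo`; it is not part of the original notebook."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b7d2e4a2",
"metadata": {},
"outputs": [],
"source": [
"import tiktoken\n",
"\n",
"# Assumption: tiktoken is installed (pip install tiktoken).\n",
"encoding = tiktoken.encoding_for_model('gpt-3.5-turbo')\n",
"tokens = encoding.encode('####')\n",
"# Print the number of tokens and the token ids for the delimiter.\n",
"print(len(tokens), tokens)"
]
},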
{
"attachments": {},
"cell_type": "markdown",
"id": "049d0d82",
"metadata": {},
"source": [
"This is our system message; it asks the model the following."
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "29e2d170",
"metadata": {},
"outputs": [],
"source": [
"system_message = f\"\"\"\n",
"You will be provided with customer service queries. \\\n",
"The customer service query will be delimited with \\\n",
"{delimiter} characters.\n",
"Classify each query into a primary category \\\n",
"and a secondary category. \n",
"Provide your output in json format with the \\\n",
"keys: primary and secondary.\n",
"\n",
"Primary categories: Billing, Technical Support, \\\n",
"Account Management, or General Inquiry.\n",
"\n",
"Billing secondary categories:\n",
"Unsubscribe or upgrade\n",
"Add a payment method\n",
"Explanation for charge\n",
"Dispute a charge\n",
"\n",
"Technical Support secondary categories:\n",
"General troubleshooting\n",
"Device compatibility\n",
"Software updates\n",
"\n",
"Account Management secondary categories:\n",
"Password reset\n",
"Update personal information\n",
"Close account\n",
"Account security\n",
"\n",
"General Inquiry secondary categories:\n",
"Product information\n",
"Pricing\n",
"Feedback\n",
"Speak to a human\n",
"\n",
"\"\"\""
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "61f4b474",
"metadata": {},
"outputs": [],
"source": [
"# Chinese-language version of the prompt (overrides the English system_message above)\n",
"system_message = f\"\"\"\n",
"你将获得客户服务查询。\n",
"每个客户服务查询都将用{delimiter}字符分隔。\n",
"将每个查询分类到一个主要类别和一个次要类别中。\n",
"以JSON格式提供你的输出,包含以下键:primary和secondary。\n",
"\n",
"主要类别:计费(Billing)、技术支持(Technical Support)、账户管理(Account Management)或一般咨询(General Inquiry)。\n",
"\n",
"计费次要类别:\n",
"取消订阅或升级(Unsubscribe or upgrade)\n",
"添加付款方式(Add a payment method)\n",
"收费解释(Explanation for charge)\n",
"争议费用(Dispute a charge)\n",
"\n",
"技术支持次要类别:\n",
"常规故障排除(General troubleshooting)\n",
"设备兼容性(Device compatibility)\n",
"软件更新(Software updates)\n",
"\n",
"账户管理次要类别:\n",
"重置密码(Password reset)\n",
"更新个人信息(Update personal information)\n",
"关闭账户(Close account)\n",
"账户安全(Account security)\n",
"\n",
"一般咨询次要类别:\n",
"产品信息(Product information)\n",
"定价(Pricing)\n",
"反馈(Feedback)\n",
"与人工对话(Speak to a human)\n",
"\n",
"\"\"\""
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "e6a932ce",
"metadata": {},
"source": [
"Now let's look at an example user message. We will use the following."
]
},
{
"cell_type": "code",
"execution_count": 26,
"id": "2b2df0bf",
"metadata": {},
"outputs": [],
"source": [
"user_message = f\"\"\"\\\n",
"I want you to delete my profile and all of my user data\"\"\""
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "3b8070bf",
"metadata": {},
"outputs": [],
"source": [
"user_message = f\"\"\"\\\n",
"我希望你删除我的个人资料和所有用户数据。\"\"\""
]
},
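{
"attachments": {},
"cell_type": "markdown",
"id": "3a2c1cf0",
"metadata": {},
"source": [
"Format this as a message list consisting of the system message and the user message, with the user message wrapped in `####` delimiters.\n",
"\n",
"Let's think, as humans, about what this sentence means: \"I want you to delete my profile.\"\n",
"\n",
"It looks like it belongs to the \"Account Management\" category, probably under \"Close account\"."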
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "6e2b9049",
"metadata": {},
"outputs": [],
"source": [
"messages = [ \n",
"{'role':'system', \n",
" 'content': system_message}, \n",
"{'role':'user', \n",
" 'content': f\"{delimiter}{user_message}{delimiter}\"}, \n",
"]"
]
},
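{
"attachments": {},
"cell_type": "markdown",
"id": "4b295207",
"metadata": {},
"source": [
"Let's see what the model thinks.\n",
"\n",
"The model classifies the query as \"Account Management\" for `primary` and \"Close account\" for `secondary`. With the Chinese prompt used here, the labels are returned in Chinese (账户管理 / 关闭账户).\n",
"\n",
"The advantage of requesting structured output such as JSON is that you can easily read it into an object, for example a dictionary in Python or an equivalent type in another language, and pass it as input to a subsequent step."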
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "77328388",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{\n",
" \"primary\": \"账户管理\",\n",
" \"secondary\": \"关闭账户\"\n",
"}\n"
]
}
],
"source": [
"response = get_completion_from_messages(messages)\n",
"print(response)"
]
},
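{
"attachments": {},
"cell_type": "markdown",
"id": "c9e5a7b1",
"metadata": {},
"source": [
"Because the reply is JSON, it can be loaded into a Python dictionary with the standard `json` module. The following is a minimal sketch that uses a hard-coded sample reply in place of a live API call; `sample_response` and `ALLOWED_PRIMARY` are illustrative names, and the allowed set simply mirrors the primary categories listed in the English system message."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c9e5a7b2",
"metadata": {},
"outputs": [],
"source": [
"import json\n",
"\n",
"# Hard-coded sample reply standing in for a live API response (illustrative only).\n",
"sample_response = '{\"primary\": \"Account Management\", \"secondary\": \"Close account\"}'\n",
"ALLOWED_PRIMARY = {'Billing', 'Technical Support', 'Account Management', 'General Inquiry'}\n",
"\n",
"result = json.loads(sample_response)  # raises json.JSONDecodeError on malformed input\n",
"if result.get('primary') not in ALLOWED_PRIMARY:\n",
"    raise ValueError('Unexpected primary category: ' + repr(result.get('primary')))\n",
"print(result['primary'], '/', result['secondary'])"
]
},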
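{
"attachments": {},
"cell_type": "markdown",
"id": "2f6b353b",
"metadata": {},
"source": [
"Here is another user message: \"Tell me more about your flat screen TVs.\"\n",
"\n",
"We build the same message list, get the model's response, and print it.\n",
"\n",
"This is our second classification, and it looks correct."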
]
},
{
"cell_type": "code",
"execution_count": 31,
"id": "edf8fbe9",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{\n",
" \"primary\": \"General Inquiry\",\n",
" \"secondary\": \"Product information\"\n",
"}\n"
]
}
],
"source": [
"user_message = f\"\"\"\\\n",
"Tell me more about your flat screen tvs\"\"\"\n",
"messages = [ \n",
"{'role':'system', \n",
" 'content': system_message}, \n",
"{'role':'user', \n",
" 'content': f\"{delimiter}{user_message}{delimiter}\"}, \n",
"] \n",
"response = get_completion_from_messages(messages)\n",
"print(response)"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "f1d738e1",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"以下是针对平板电脑的一般咨询:\n",
"\n",
"{\n",
" \"primary\": \"General Inquiry\",\n",
" \"secondary\": \"Product information\"\n",
"}\n",
"\n",
"如果您有任何特定的问题或需要更详细的信息,请告诉我,我会尽力回答。\n"
]
}
],
"source": [
"user_message = f\"\"\"\\\n",
"告诉我更多有关你们的平板电脑的信息\"\"\"\n",
"messages = [ \n",
"{'role':'system', \n",
" 'content': system_message}, \n",
"{'role':'user', \n",
" 'content': f\"{delimiter}{user_message}{delimiter}\"}, \n",
"] \n",
"response = get_completion_from_messages(messages)\n",
"print(response)"
]
},
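{
"attachments": {},
"cell_type": "markdown",
"id": "d4f6b8c1",
"metadata": {},
"source": [
"As the last output shows, the model may wrap the JSON in extra text. One possible way to handle that, sketched here under the assumption that the reply contains a single JSON object, is to cut out the text between the first `{` and the last `}` before parsing. `extract_json` and `noisy_reply` are hypothetical names and are not part of the original notebook."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d4f6b8c2",
"metadata": {},
"outputs": [],
"source": [
"import json\n",
"\n",
"def extract_json(text):\n",
"    # Take the span from the first '{' to the last '}' and parse it as JSON.\n",
"    start, end = text.find('{'), text.rfind('}')\n",
"    if start == -1 or end < start:\n",
"        raise ValueError('No JSON object found in the reply')\n",
"    return json.loads(text[start:end + 1])\n",
"\n",
"# Illustrative reply with text before and after the JSON object.\n",
"noisy_reply = 'Sure. {\"primary\": \"General Inquiry\", \"secondary\": \"Product information\"} Anything else?'\n",
"print(extract_json(noisy_reply))"
]
},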
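{
"attachments": {},
"cell_type": "markdown",
"id": "8f87f68d",
"metadata": {},
"source": [
"So, in general, based on the classification of the customer inquiry, we can now provide a more specific set of instructions to handle the next steps.\n",
"\n",
"In this case we might add extra information about TVs, whereas in other cases we might want to provide a link for closing the account or something similar.\n",
"\n",
"We will learn more about different ways to process inputs in later videos.\n",
"\n",
"In the next video, we will look at more ways to evaluate inputs, in particular ways to ensure that users use the system responsibly."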
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.11"
}
},
"nbformat": 4,
"nbformat_minor": 5
}