prompt-engineering-for-deve…/docs/content/C2 Building Systems with the ChatGPT API/3.评估输入-分类 Classification.ipynb

{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "63651c26",
   "metadata": {},
   "source": [
    "# 第三章 评估输入——分类\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b12f80c9",
   "metadata": {},
   "source": [
    "在本章中，我们将重点探讨评估输入任务的重要性，这关乎到整个系统的质量和安全性。\n",
    "\n",
    "在处理不同情况下的多个独立指令集的任务时，首先对查询类型进行分类，并以此为基础确定要使用哪些指令，具有诸多优势。这可以通过定义固定类别和硬编码与处理特定类别任务相关的指令来实现。例如，在构建客户服务助手时，对查询类型进行分类并根据分类确定要使用的指令可能非常关键。具体来说，如果用户要求关闭其账户，那么二级指令可能是添加有关如何关闭账户的额外说明；如果用户询问特定产品信息，则二级指令可能会提供更多的产品信息。\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "3b406ba8",
   "metadata": {},
   "outputs": [],
   "source": [
    "delimiter = \"####\""
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a22cb6b3",
   "metadata": {},
   "source": [
    "在这个例子中，我们使用系统消息（system_message）作为整个系统的全局指导，并选择使用 “#” 作为分隔符。**`分隔符是用来区分指令或输出中不同部分的工具`**，它可以帮助模型更好地识别各个部分，从而提高系统在执行特定任务时的准确性和效率。 “#” 也是一个理想的分隔符，因为它可以被视为一个单独的 token 。"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "049d0d82",
   "metadata": {},
   "source": [
    "这是我们定义的系统消息，我们正在以下面的方式询问模型。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "61f4b474",
   "metadata": {},
   "outputs": [],
   "source": [
    "system_message = f\"\"\"\n",
    "你将获得客户服务查询。\n",
    "每个客户服务查询都将用{delimiter}字符分隔。\n",
    "将每个查询分类到一个主要类别和一个次要类别中。\n",
    "以 JSON 格式提供你的输出，包含以下键：primary 和 secondary。\n",
    "\n",
    "主要类别：计费（Billing）、技术支持（Technical Support）、账户管理（Account Management）或一般咨询（General Inquiry）。\n",
    "\n",
    "计费次要类别：\n",
    "取消订阅或升级（Unsubscribe or upgrade）\n",
    "添加付款方式（Add a payment method）\n",
    "收费解释（Explanation for charge）\n",
    "争议费用（Dispute a charge）\n",
    "\n",
    "技术支持次要类别：\n",
    "常规故障排除（General troubleshooting）\n",
    "设备兼容性（Device compatibility）\n",
    "软件更新（Software updates）\n",
    "\n",
    "账户管理次要类别：\n",
    "重置密码（Password reset）\n",
    "更新个人信息（Update personal information）\n",
    "关闭账户（Close account）\n",
    "账户安全（Account security）\n",
    "\n",
    "一般咨询次要类别：\n",
    "产品信息（Product information）\n",
    "定价（Pricing）\n",
    "反馈（Feedback）\n",
    "与人工对话（Speak to a human）\n",
    "\n",
    "\"\"\""
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e6a932ce",
   "metadata": {},
   "source": [
    "了解了系统消息后，现在让我们来看一个用户消息（user message）的例子。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "3b8070bf",
   "metadata": {},
   "outputs": [],
   "source": [
    "user_message = f\"\"\"\\ \n",
    "我希望你删除我的个人资料和所有用户数据。\"\"\""
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3a2c1cf0",
   "metadata": {},
   "source": [
    "首先，将这个用户消息格式化为一个消息列表，并将系统消息和用户消息之间使用 \"####\" 进行分隔。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "6e2b9049",
   "metadata": {},
   "outputs": [],
   "source": [
    "messages =  [  \n",
    "{'role':'system', \n",
    " 'content': system_message},    \n",
    "{'role':'user', \n",
    " 'content': f\"{delimiter}{user_message}{delimiter}\"},  \n",
    "]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4b295207",
   "metadata": {},
   "source": [
    "如果让你来判断，下面这句话属于哪个类别：\"我想让您删除我的个人资料。我们思考一下，这句话似乎看上去属于“账户管理（Account Management）”或者属于“关闭账户（Close account）”。 \n",
    "\n",
    "让我们看看模型是如何思考的："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "77328388",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{\n",
      "  \"primary\": \"账户管理\",\n",
      "  \"secondary\": \"关闭账户\"\n",
      "}\n"
     ]
    }
   ],
   "source": [
    "from tool import get_completion_from_messages\n",
    "\n",
    "response = get_completion_from_messages(messages)\n",
    "print(response)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1513835e",
   "metadata": {},
   "source": [
    "模型的分类是将“账户管理”作为 “primary” ，“关闭账户”作为 “secondary” 。\n",
    "\n",
    "请求结构化输出（如 JSON ）的好处是，您可以轻松地将其读入某个对象中，例如 Python 中的字典。如果您使用其他语言，也可以转换为其他对象，然后输入到后续步骤中。"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2f6b353b",
   "metadata": {},
   "source": [
    "下面让我们再看一个例子：\n",
    "```\n",
    "用户消息: “告诉我更多关于你们的平板电脑的信息”\n",
    "```\n",
    "我们运用相同的消息列表来获取模型的响应，然后打印出来。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "f1d738e1",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{\n",
      "  \"primary\": \"一般咨询\",\n",
      "  \"secondary\": \"产品信息\"\n",
      "}\n"
     ]
    }
   ],
   "source": [
    "user_message = f\"\"\"\\\n",
    "告诉我更多有关你们的平板电脑的信息\"\"\"\n",
    "messages =  [  \n",
    "{'role':'system', \n",
    " 'content': system_message},    \n",
    "{'role':'user', \n",
    " 'content': f\"{delimiter}{user_message}{delimiter}\"},  \n",
    "] \n",
    "response = get_completion_from_messages(messages)\n",
    "print(response)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8f87f68d",
   "metadata": {},
   "source": [
    "这里返回了另一个分类结果，并且看起来似乎是正确的。因此，根据客户咨询的分类，我们现在可以提供一套更具体的指令来处理后续步骤。在这种情况下，我们可能会添加关于平板电脑的额外信息，而在其他情况下，我们可能希望提供关闭账户的链接或类似的内容。这里返回了另一个分类结果，并且看起来应该是正确的。\n",
    "\n",
    "在下一章中，我们将探讨更多关于评估输入的方法，特别是如何确保用户以负责任的方式使用系统。"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "74b4b957",
   "metadata": {},
   "source": [
    "## 英文版"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "79667ca0",
   "metadata": {},
   "outputs": [],
   "source": [
    "system_message = f\"\"\"\n",
    "You will be provided with customer service queries. \\\n",
    "The customer service query will be delimited with \\\n",
    "{delimiter} characters.\n",
    "Classify each query into a primary category \\\n",
    "and a secondary category. \n",
    "Provide your output in json format with the \\\n",
    "keys: primary and secondary.\n",
    "\n",
    "Primary categories: Billing, Technical Support, \\\n",
    "Account Management, or General Inquiry.\n",
    "\n",
    "Billing secondary categories:\n",
    "Unsubscribe or upgrade\n",
    "Add a payment method\n",
    "Explanation for charge\n",
    "Dispute a charge\n",
    "\n",
    "Technical Support secondary categories:\n",
    "General troubleshooting\n",
    "Device compatibility\n",
    "Software updates\n",
    "\n",
    "Account Management secondary categories:\n",
    "Password reset\n",
    "Update personal information\n",
    "Close account\n",
    "Account security\n",
    "\n",
    "General Inquiry secondary categories:\n",
    "Product information\n",
    "Pricing\n",
    "Feedback\n",
    "Speak to a human\n",
    "\n",
    "\"\"\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "id": "30a0f506",
   "metadata": {},
   "outputs": [],
   "source": [
    "user_message = f\"\"\"\\ \n",
    "I want you to delete my profile and all of my user data\"\"\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "id": "3233bd04",
   "metadata": {},
   "outputs": [],
   "source": [
    "messages =  [  \n",
    "{'role':'system', \n",
    " 'content': system_message},    \n",
    "{'role':'user', \n",
    " 'content': f\"{delimiter}{user_message}{delimiter}\"},  \n",
    "]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "id": "da52d0b2",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{\n",
      "  \"primary\": \"Account Management\",\n",
      "  \"secondary\": \"Close account\"\n",
      "}\n"
     ]
    }
   ],
   "source": [
    "response = get_completion_from_messages(messages)\n",
    "print(response)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "id": "92e1e647",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{\n",
      "  \"primary\": \"General Inquiry\",\n",
      "  \"secondary\": \"Product information\"\n",
      "}\n"
     ]
    }
   ],
   "source": [
    "user_message = f\"\"\"\\\n",
    "Tell me more about your flat screen tvs\"\"\"\n",
    "messages =  [  \n",
    "{'role':'system', \n",
    " 'content': system_message},    \n",
    "{'role':'user', \n",
    " 'content': f\"{delimiter}{user_message}{delimiter}\"},  \n",
    "] \n",
    "response = get_completion_from_messages(messages)\n",
    "print(response)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}