{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "63651c26",
   "metadata": {},
   "source": [
    "Chapter 3: Evaluating Inputs (Classification)"
   ]
  },
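  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "b12f80c9",
   "metadata": {},
   "source": [
    "In this section we focus on evaluating inputs, a task that matters for the quality and safety of the system.\n",
    "\n",
    "For tasks in which many independent sets of instructions are needed to handle different cases, it helps to first classify the type of query and then use that classification to decide which instructions to apply.\n",
    "\n",
    "This can be done by defining fixed categories and hard-coding the instructions relevant to handling tasks in each category.\n",
    "\n",
    "When building a customer service assistant, for example, classifying the query first tells the system which instructions to use next.\n",
    "\n",
    "If a user asks to close their account, you might supply one set of secondary instructions; if the user asks about a specific product, you might instead add product information.\n"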
   ]
  },
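  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "d41a9c07",
   "metadata": {},
   "source": [
    "A minimal sketch of that idea: a dictionary maps each fixed category to hard-coded instructions. The instruction texts and the names `INSTRUCTIONS_BY_CATEGORY`, `DEFAULT_INSTRUCTIONS`, and `instructions_for` are illustrative assumptions, not part of the course code."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e52b0d18",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Hypothetical mapping from (primary, secondary) category to extra instructions.\n",
    "INSTRUCTIONS_BY_CATEGORY = {\n",
    "    ('Account Management', 'Close account'): 'Explain how to close the account.',\n",
    "    ('General Inquiry', 'Product information'): 'Add details about the product.',\n",
    "}\n",
    "DEFAULT_INSTRUCTIONS = 'Answer politely and ask a clarifying question if needed.'\n",
    "\n",
    "def instructions_for(primary, secondary):\n",
    "    # Fall back to a generic instruction for categories that are not mapped.\n",
    "    return INSTRUCTIONS_BY_CATEGORY.get((primary, secondary), DEFAULT_INSTRUCTIONS)\n",
    "\n",
    "print(instructions_for('Account Management', 'Close account'))"
   ]
  },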
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "87d9de1d",
   "metadata": {},
   "source": [
    "## Setup\n",
    "Load the API key and wrap the API call in a helper function."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "55ee24ab",
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "import openai\n",
    "from dotenv import load_dotenv, find_dotenv\n",
    "_ = load_dotenv(find_dotenv()) # read local .env file\n",
    "openai.api_key = os.environ['OPENAI_API_KEY']\n"
   ]
  },
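  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "0318b89e",
   "metadata": {},
   "outputs": [],
   "source": [
    "def get_completion_from_messages(messages,\n",
    "                                 model=\"gpt-3.5-turbo\",\n",
    "                                 temperature=0,\n",
    "                                 max_tokens=500):\n",
    "    # Uses the legacy (pre-1.0) openai.ChatCompletion interface.\n",
    "    # temperature=0 keeps the output as deterministic as possible;\n",
    "    # max_tokens caps the length of the reply.\n",
    "    response = openai.ChatCompletion.create(\n",
    "        model=model,\n",
    "        messages=messages,\n",
    "        temperature=temperature,\n",
    "        max_tokens=max_tokens,\n",
    "    )\n",
    "    # Return only the text of the first choice.\n",
    "    return response.choices[0].message[\"content\"]"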
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "f2b55807",
   "metadata": {},
   "source": [
    "#### Classifying user instructions"
   ]
  },
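  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "c3216166",
   "metadata": {},
   "source": [
    "Here is our system message, which sets the guidance for the whole system. It uses `####` as a delimiter.\n",
    "\n",
    "A delimiter is just a way of separating different parts of an instruction or output; it helps the model tell the sections apart.\n",
    "\n",
    "For this example we use `####` as the delimiter.\n",
    "\n",
    "It is a good choice because it is represented as a single token."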
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "3b406ba8",
   "metadata": {},
   "outputs": [],
   "source": [
    "delimiter = \"####\""
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "049d0d82",
   "metadata": {},
   "source": [
    "This is our system message; it asks the model to do the following."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "29e2d170",
   "metadata": {},
   "outputs": [],
   "source": [
    "system_message = f\"\"\"\n",
    "You will be provided with customer service queries. \\\n",
    "The customer service query will be delimited with \\\n",
    "{delimiter} characters.\n",
    "Classify each query into a primary category \\\n",
    "and a secondary category. \n",
    "Provide your output in json format with the \\\n",
    "keys: primary and secondary.\n",
    "\n",
    "Primary categories: Billing, Technical Support, \\\n",
    "Account Management, or General Inquiry.\n",
    "\n",
    "Billing secondary categories:\n",
    "Unsubscribe or upgrade\n",
    "Add a payment method\n",
    "Explanation for charge\n",
    "Dispute a charge\n",
    "\n",
    "Technical Support secondary categories:\n",
    "General troubleshooting\n",
    "Device compatibility\n",
    "Software updates\n",
    "\n",
    "Account Management secondary categories:\n",
    "Password reset\n",
    "Update personal information\n",
    "Close account\n",
    "Account security\n",
    "\n",
    "General Inquiry secondary categories:\n",
    "Product information\n",
    "Pricing\n",
    "Feedback\n",
    "Speak to a human\n",
    "\n",
    "\"\"\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "61f4b474",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Chinese-language version of the prompt (overrides the English one above)\n",
    "system_message = f\"\"\"\n",
    "你将获得客户服务查询。\n",
    "每个客户服务查询都将用{delimiter}字符分隔。\n",
    "将每个查询分类到一个主要类别和一个次要类别中。\n",
    "以JSON格式提供你的输出，包含以下键：primary和secondary。\n",
    "\n",
    "主要类别：计费（Billing）、技术支持（Technical Support）、账户管理（Account Management）或一般咨询（General Inquiry）。\n",
    "\n",
    "计费次要类别：\n",
    "取消订阅或升级（Unsubscribe or upgrade）\n",
    "添加付款方式（Add a payment method）\n",
    "收费解释（Explanation for charge）\n",
    "争议费用（Dispute a charge）\n",
    "\n",
    "技术支持次要类别：\n",
    "常规故障排除（General troubleshooting）\n",
    "设备兼容性（Device compatibility）\n",
    "软件更新（Software updates）\n",
    "\n",
    "账户管理次要类别：\n",
    "重置密码（Password reset）\n",
    "更新个人信息（Update personal information）\n",
    "关闭账户（Close account）\n",
    "账户安全（Account security）\n",
    "\n",
    "一般咨询次要类别：\n",
    "产品信息（Product information）\n",
    "定价（Pricing）\n",
    "反馈（Feedback）\n",
    "与人工对话（Speak to a human）\n",
    "\n",
    "\"\"\""
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "e6a932ce",
   "metadata": {},
   "source": [
    "Now let's look at an example user message. We will use the following."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "id": "2b2df0bf",
   "metadata": {},
   "outputs": [],
   "source": [
    "user_message = f\"\"\"\\\n",
    "I want you to delete my profile and all of my user data\"\"\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "3b8070bf",
   "metadata": {},
   "outputs": [],
   "source": [
    "user_message = f\"\"\"\\\n",
    "我希望你删除我的个人资料和所有用户数据。\"\"\""
   ]
  },
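  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "3a2c1cf0",
   "metadata": {},
   "source": [
    "We format this into a message list, wrapping the user message in the `####` delimiter so it is set apart from the system message.\n",
    "\n",
    "As humans, let's think about what this sentence means: \"I want you to delete my profile.\"\n",
    "\n",
    "It looks like it belongs to the \"Account Management\" category, perhaps under \"Close account\"."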
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "6e2b9049",
   "metadata": {},
   "outputs": [],
   "source": [
    "messages =  [  \n",
    "{'role':'system', \n",
    " 'content': system_message},    \n",
    "{'role':'user', \n",
    " 'content': f\"{delimiter}{user_message}{delimiter}\"},  \n",
    "]"
   ]
  },
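  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "4b295207",
   "metadata": {},
   "source": [
    "Let's see what the model thinks.\n",
    "\n",
    "The model classifies the query with \"Account Management\" as `primary` and \"Close account\" as `secondary`. With the Chinese-language prompt active, the output shows the Chinese labels 账户管理 (Account Management) and 关闭账户 (Close account).\n",
    "\n",
    "The benefit of requesting structured output such as JSON is that you can easily read it into an object, for example a dictionary in Python or an equivalent object in another language, and pass it as input to a subsequent step."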
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "77328388",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{\n",
      "    \"primary\": \"账户管理\",\n",
      "    \"secondary\": \"关闭账户\"\n",
      "}\n"
     ]
    }
   ],
   "source": [
    "response = get_completion_from_messages(messages)\n",
    "print(response)"
   ]
  },
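  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "f63c1e29",
   "metadata": {},
   "source": [
    "Following the note on structured output above, here is a minimal sketch of reading such a reply into a Python dictionary with the standard-library `json` module. The `sample_response` string is a hard-coded stand-in that mirrors the output format, not a live API result, and `parse_classification` is a hypothetical helper name."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a74d2f3a",
   "metadata": {},
   "outputs": [],
   "source": [
    "import json\n",
    "\n",
    "# Hard-coded stand-in for a model reply (an assumption for illustration).\n",
    "sample_response = '{\"primary\": \"Account Management\", \"secondary\": \"Close account\"}'\n",
    "\n",
    "def parse_classification(text):\n",
    "    # json.loads raises json.JSONDecodeError if the text is not valid JSON.\n",
    "    data = json.loads(text)\n",
    "    return data['primary'], data['secondary']\n",
    "\n",
    "primary, secondary = parse_classification(sample_response)\n",
    "print(primary, '/', secondary)"
   ]
  },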
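  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "2f6b353b",
   "metadata": {},
   "source": [
    "Here is another user message: \"Tell me more about your flat screen TVs\"\n",
    "\n",
    "We build the same message list, get the model's response, and print it.\n",
    "\n",
    "Here is our second classification, and it looks correct."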
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "id": "edf8fbe9",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{\n",
      "  \"primary\": \"General Inquiry\",\n",
      "  \"secondary\": \"Product information\"\n",
      "}\n"
     ]
    }
   ],
   "source": [
    "user_message = f\"\"\"\\\n",
    "Tell me more about your flat screen tvs\"\"\"\n",
    "messages =  [  \n",
    "{'role':'system', \n",
    " 'content': system_message},    \n",
    "{'role':'user', \n",
    " 'content': f\"{delimiter}{user_message}{delimiter}\"},  \n",
    "] \n",
    "response = get_completion_from_messages(messages)\n",
    "print(response)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "id": "f1d738e1",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "以下是针对平板电脑的一般咨询：\n",
      "\n",
      "{\n",
      "  \"primary\": \"General Inquiry\",\n",
      "  \"secondary\": \"Product information\"\n",
      "}\n",
      "\n",
      "如果您有任何特定的问题或需要更详细的信息，请告诉我，我会尽力回答。\n"
     ]
    }
   ],
   "source": [
    "user_message = f\"\"\"\\\n",
    "告诉我更多有关你们的平板电脑的信息\"\"\"\n",
    "messages =  [  \n",
    "{'role':'system', \n",
    " 'content': system_message},    \n",
    "{'role':'user', \n",
    " 'content': f\"{delimiter}{user_message}{delimiter}\"},  \n",
    "] \n",
    "response = get_completion_from_messages(messages)\n",
    "print(response)"
   ]
  },
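  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "b85e3a4b",
   "metadata": {},
   "source": [
    "The recorded reply above wraps the JSON object in extra prose, so passing it straight to `json.loads` would fail. One possible workaround, sketched here under the assumption that the reply contains a single JSON object, is to slice from the first `{` to the last `}` before parsing. `extract_json` is a hypothetical helper, and `chatty_reply` is a hard-coded stand-in modeled on that output."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c96f4b5c",
   "metadata": {},
   "outputs": [],
   "source": [
    "import json\n",
    "\n",
    "def extract_json(text):\n",
    "    # Slice from the first '{' to the last '}' and parse only that span.\n",
    "    start = text.find('{')\n",
    "    end = text.rfind('}')\n",
    "    if start == -1 or end < start:\n",
    "        raise ValueError('no JSON object found in reply')\n",
    "    return json.loads(text[start:end + 1])\n",
    "\n",
    "# Hard-coded stand-in for a reply with prose around the JSON (an assumption).\n",
    "chatty_reply = 'Here is the classification: {\"primary\": \"General Inquiry\", \"secondary\": \"Product information\"} Let me know if you need more.'\n",
    "\n",
    "result = extract_json(chatty_reply)\n",
    "print(result['primary'], '/', result['secondary'])"
   ]
  },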
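  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "8f87f68d",
   "metadata": {},
   "source": [
    "So overall, based on the classification of the customer query, we can now supply a more specific set of instructions for the next steps.\n",
    "\n",
    "In this case we might add extra information about TVs, whereas in another case we might want to provide a link for closing the account or something similar.\n",
    "\n",
    "We will learn more about different ways of processing inputs in later videos.\n",
    "\n",
    "In the next video, we will look at more ways of evaluating inputs, in particular ways of ensuring that people use the system responsibly."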
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.11"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}