diff --git a/docs/content/C2 Building Systems with the ChatGPT API/7.检查结果 Check Outputs.ipynb b/docs/content/C2 Building Systems with the ChatGPT API/7.检查结果 Check Outputs.ipynb index 46d5493..2792745 100644 --- a/docs/content/C2 Building Systems with the ChatGPT API/7.检查结果 Check Outputs.ipynb +++ b/docs/content/C2 Building Systems with the ChatGPT API/7.检查结果 Check Outputs.ipynb @@ -1 +1 @@ -{"cells":[{"attachments":{},"cell_type":"markdown","id":"f99b8a44","metadata":{},"source":["# 第七章 检查结果\n","\n"]},{"cell_type":"markdown","id":"d8822242","metadata":{},"source":["在本章中,我们将重点如何检查系统生成的输出。在向用户展示输出之前,检查输出的质量、相关性和安全性对于确保提供的回应非常重要,无论是在自动化流程中还是其他场景中。我们将学习如何使用审查 API 来评估输出,并探讨如何使用额外的 Prompt 来提升模型在展示输出之前的质量评估。"]},{"attachments":{},"cell_type":"markdown","id":"59f69c2e","metadata":{},"source":["## 一、检查有害内容\n","主要就是 Moderation API 的使用"]},{"cell_type":"code","execution_count":3,"id":"943f5396","metadata":{},"outputs":[{"name":"stdout","output_type":"stream","text":["{\n"," \"categories\": {\n"," \"harassment\": false,\n"," \"harassment/threatening\": false,\n"," \"hate\": false,\n"," \"hate/threatening\": false,\n"," \"self-harm\": false,\n"," \"self-harm/instructions\": false,\n"," \"self-harm/intent\": false,\n"," \"sexual\": false,\n"," \"sexual/minors\": false,\n"," \"violence\": false,\n"," \"violence/graphic\": false\n"," },\n"," \"category_scores\": {\n"," \"harassment\": 4.2861907e-07,\n"," \"harassment/threatening\": 5.9538485e-09,\n"," \"hate\": 2.079682e-07,\n"," \"hate/threatening\": 5.6982725e-09,\n"," \"self-harm\": 2.3966843e-08,\n"," \"self-harm/instructions\": 1.5763412e-08,\n"," \"self-harm/intent\": 5.042827e-09,\n"," \"sexual\": 2.6989035e-06,\n"," \"sexual/minors\": 1.1349888e-06,\n"," \"violence\": 1.2788286e-06,\n"," \"violence/graphic\": 2.6259923e-07\n"," },\n"," \"flagged\": false\n","}\n"]}],"source":["import openai\n","from tool import get_completion_from_messages\n","\n","final_response_to_customer = f\"\"\"\n","SmartX ProPhone 有一个 6.1 英寸的显示屏,128GB 
存储、\\\n","1200 万像素的双摄像头,以及 5G。FotoSnap 单反相机\\\n","有一个 2420 万像素的传感器,1080p 视频,3 英寸 LCD 和\\\n","可更换的镜头。我们有各种电视,包括 CineView 4K 电视,\\\n","55 英寸显示屏,4K 分辨率、HDR,以及智能电视功能。\\\n","我们也有 SoundMax 家庭影院系统,具有 5.1 声道,\\\n","1000W 输出,无线重低音扬声器和蓝牙。关于这些产品或\\\n","我们提供的任何其他产品您是否有任何具体问题?\n","\"\"\"\n","# Moderation 是 OpenAI 的内容审核函数,用于检测这段内容的危害含量\n","response = openai.Moderation.create(\n"," input=final_response_to_customer\n",")\n","moderation_output = response[\"results\"][0]\n","print(moderation_output)"]},{"cell_type":"markdown","id":"b1f1399a","metadata":{},"source":["正如您所见,这个输出没有被标记,并且在所有类别中都获得了非常低的分数,说明给定的回应是合理的。\n","\n","总的来说,检查输出也是非常重要的。例如,如果您正在为敏感的受众创建一个聊天机器人,您可以使用更低的阈值来标记输出。一般来说,如果审查输出表明内容被标记,您可以采取适当的行动,例如回应一个备用答案或生成一个新的回应。\n","\n","请注意,随着我们改进模型,它们也越来越不太可能返回任何有害的输出。\n","\n","另一种检查输出的方法是询问模型本身生成的结果是否令人满意,是否符合您所定义的标准。这可以通过将生成的输出作为输入的一部分提供给模型,并要求它评估输出的质量来实现。您可以以多种方式进行这样的操作。让我们看一个例子。"]},{"attachments":{},"cell_type":"markdown","id":"f57f8dad","metadata":{},"source":["## 二、检查是否符合产品信息"]},{"cell_type":"code","execution_count":4,"id":"552e3d8c","metadata":{},"outputs":[{"name":"stdout","output_type":"stream","text":["Y\n"]}],"source":["# 这是一段电子产品相关的信息\n","system_message = f\"\"\"\n","您是一个助理,用于评估客服代理的回复是否充分回答了客户问题,\\\n","并验证助理从产品信息中引用的所有事实是否正确。 \n","产品信息、用户和客服代理的信息将使用三个反引号(即 ```)\\\n","进行分隔。 \n","请以 Y 或 N 的字符形式进行回复,不要包含标点符号:\\\n","Y - 如果输出充分回答了问题并且回复正确地使用了产品信息\\\n","N - 其他情况。\n","\n","仅输出单个字母。\n","\"\"\"\n","\n","#这是顾客的提问\n","customer_message = f\"\"\"\n","告诉我有关 smartx pro 手机\\\n","和 fotosnap 相机(单反相机)的信息。\\\n","还有您电视的信息。\n","\"\"\"\n","product_information = \"\"\"{ \"name\": \"SmartX ProPhone\", \"category\": \"Smartphones and Accessories\", \"brand\": \"SmartX\", \"model_number\": \"SX-PP10\", \"warranty\": \"1 year\", \"rating\": 4.6, \"features\": [ \"6.1-inch display\", \"128GB storage\", \"12MP dual camera\", \"5G\" ], \"description\": \"A powerful smartphone with advanced camera features.\", \"price\": 899.99 } { \"name\": \"FotoSnap DSLR Camera\", \"category\": \"Cameras and 
Camcorders\", \"brand\": \"FotoSnap\", \"model_number\": \"FS-DSLR200\", \"warranty\": \"1 year\", \"rating\": 4.7, \"features\": [ \"24.2MP sensor\", \"1080p video\", \"3-inch LCD\", \"Interchangeable lenses\" ], \"description\": \"Capture stunning photos and videos with this versatile DSLR camera.\", \"price\": 599.99 } { \"name\": \"CineView 4K TV\", \"category\": \"Televisions and Home Theater Systems\", \"brand\": \"CineView\", \"model_number\": \"CV-4K55\", \"warranty\": \"2 years\", \"rating\": 4.8, \"features\": [ \"55-inch display\", \"4K resolution\", \"HDR\", \"Smart TV\" ], \"description\": \"A stunning 4K TV with vibrant colors and smart features.\", \"price\": 599.99 } { \"name\": \"SoundMax Home Theater\", \"category\": \"Televisions and Home Theater Systems\", \"brand\": \"SoundMax\", \"model_number\": \"SM-HT100\", \"warranty\": \"1 year\", \"rating\": 4.4, \"features\": [ \"5.1 channel\", \"1000W output\", \"Wireless subwoofer\", \"Bluetooth\" ], \"description\": \"A powerful home theater system for an immersive audio experience.\", \"price\": 399.99 } { \"name\": \"CineView 8K TV\", \"category\": \"Televisions and Home Theater Systems\", \"brand\": \"CineView\", \"model_number\": \"CV-8K65\", \"warranty\": \"2 years\", \"rating\": 4.9, \"features\": [ \"65-inch display\", \"8K resolution\", \"HDR\", \"Smart TV\" ], \"description\": \"Experience the future of television with this stunning 8K TV.\", \"price\": 2999.99 } { \"name\": \"SoundMax Soundbar\", \"category\": \"Televisions and Home Theater Systems\", \"brand\": \"SoundMax\", \"model_number\": \"SM-SB50\", \"warranty\": \"1 year\", \"rating\": 4.3, \"features\": [ \"2.1 channel\", \"300W output\", \"Wireless subwoofer\", \"Bluetooth\" ], \"description\": \"Upgrade your TV's audio with this sleek and powerful soundbar.\", \"price\": 199.99 } { \"name\": \"CineView OLED TV\", \"category\": \"Televisions and Home Theater Systems\", \"brand\": \"CineView\", \"model_number\": \"CV-OLED55\", 
\"warranty\": \"2 years\", \"rating\": 4.7, \"features\": [ \"55-inch display\", \"4K resolution\", \"HDR\", \"Smart TV\" ], \"description\": \"Experience true blacks and vibrant colors with this OLED TV.\", \"price\": 1499.99 }\"\"\"\n","\n","q_a_pair = f\"\"\"\n","顾客的信息: ```{customer_message}```\n","产品信息: ```{product_information}```\n","代理的回复: ```{final_response_to_customer}```\n","\n","回复是否正确使用了检索的信息?\n","回复是否充分地回答了问题?\n","\n","输出 Y 或 N\n","\"\"\"\n","#判断相关性\n","messages = [\n"," {'role': 'system', 'content': system_message},\n"," {'role': 'user', 'content': q_a_pair}\n","]\n","\n","response = get_completion_from_messages(messages, max_tokens=1)\n","print(response)"]},{"cell_type":"code","execution_count":5,"id":"afb1b82f","metadata":{},"outputs":[{"name":"stdout","output_type":"stream","text":["Y\n"]}],"source":["another_response = \"生活就像一盒巧克力\"\n","q_a_pair = f\"\"\"\n","顾客的信息: ```{customer_message}```\n","产品信息: ```{product_information}```\n","代理的回复: ```{final_response_to_customer}```\n","\n","回复是否正确使用了检索的信息?\n","回复是否充分地回答了问题?\n","\n","输出 Y 或 N\n","\"\"\"\n","messages = [\n"," {'role': 'system', 'content': system_message},\n"," {'role': 'user', 'content': q_a_pair}\n","]\n","\n","response = get_completion_from_messages(messages)\n","print(response)"]},{"cell_type":"markdown","id":"51dd8979","metadata":{},"source":["因此,您可以看到,模型能够提供关于生成输出质量的反馈。您可以利用这个反馈来决定是否展示输出给用户或生成新的回应。甚至可以尝试为每个用户查询生成多个模型回应,然后选择最佳的回应展示给用户。因此,您有多种尝试的方式。\n","\n","总的来说,使用审查 API 来检查输出是一个不错的做法。但是,我认为在大部分情况下这可能是不必要的,尤其是当您使用更先进的模型,例如 GPT-4 时。\n","\n","事实上,我们并没有看到很多人在实际生产环境中采取这种做法。这也会增加系统的延迟和成本,因为您必须等待额外的调用,还需要额外的 tokens。如果您的应用或产品的错误率只有 0.0000001%,那么或许您可以尝试这种方法。但总的来说,我们不建议您在实际应用中采用这种方式。\n","\n","在下一章中,我们将把我们在评估输入部分、处理部分和检查输出中学到的所有内容结合起来,构建一个端到端的系统。\n","\n"]},{"cell_type":"markdown","id":"19bb0780","metadata":{},"source":["## 三、英文版"]},{"cell_type":"markdown","id":"690f32f2","metadata":{},"source":["**1.1 
检查有害信息**"]},{"cell_type":"code","execution_count":6,"id":"b4175302","metadata":{},"outputs":[{"name":"stdout","output_type":"stream","text":["{\n"," \"categories\": {\n"," \"harassment\": false,\n"," \"harassment/threatening\": false,\n"," \"hate\": false,\n"," \"hate/threatening\": false,\n"," \"self-harm\": false,\n"," \"self-harm/instructions\": false,\n"," \"self-harm/intent\": false,\n"," \"sexual\": false,\n"," \"sexual/minors\": false,\n"," \"violence\": false,\n"," \"violence/graphic\": false\n"," },\n"," \"category_scores\": {\n"," \"harassment\": 3.4429521e-09,\n"," \"harassment/threatening\": 9.538529e-10,\n"," \"hate\": 6.0008998e-09,\n"," \"hate/threatening\": 3.5339007e-10,\n"," \"self-harm\": 5.6997046e-10,\n"," \"self-harm/instructions\": 3.864466e-08,\n"," \"self-harm/intent\": 9.3394e-10,\n"," \"sexual\": 2.2777907e-07,\n"," \"sexual/minors\": 2.6869095e-08,\n"," \"violence\": 3.5471032e-07,\n"," \"violence/graphic\": 7.8637696e-10\n"," },\n"," \"flagged\": false\n","}\n"]}],"source":["final_response_to_customer = f\"\"\"\n","The SmartX ProPhone has a 6.1-inch display, 128GB storage, \\\n","12MP dual camera, and 5G. The FotoSnap DSLR Camera \\\n","has a 24.2MP sensor, 1080p video, 3-inch LCD, and \\\n","interchangeable lenses. We have a variety of TVs, including \\\n","the CineView 4K TV with a 55-inch display, 4K resolution, \\\n","HDR, and smart TV features. We also have the SoundMax \\\n","Home Theater system with 5.1 channel, 1000W output, wireless \\\n","subwoofer, and Bluetooth. 
Do you have any specific questions \\\n","about these products or any other products we offer?\n","\"\"\"\n","\n","\n","response = openai.Moderation.create(\n"," input=final_response_to_customer\n",")\n","moderation_output = response[\"results\"][0]\n","print(moderation_output)"]},{"cell_type":"markdown","id":"4a7fb209","metadata":{},"source":["**2.1 检查是否符合产品信息**"]},{"cell_type":"code","execution_count":7,"id":"7859ffed","metadata":{},"outputs":[{"name":"stdout","output_type":"stream","text":["Y\n"]}],"source":["# 这是一段电子产品相关的信息\n","system_message = f\"\"\"\n","You are an assistant that evaluates whether \\\n","customer service agent responses sufficiently \\\n","answer customer questions, and also validates that \\\n","all the facts the assistant cites from the product \\\n","information are correct.\n","The product information and user and customer \\\n","service agent messages will be delimited by \\\n","3 backticks, i.e. ```.\n","Respond with a Y or N character, with no punctuation:\n","Y - if the output sufficiently answers the question \\\n","AND the response correctly uses product information\n","N - otherwise\n","\n","Output a single letter only.\n","\"\"\"\n","\n","#这是顾客的提问\n","customer_message = f\"\"\"\n","tell me about the smartx pro phone and \\\n","the fotosnap camera, the dslr one. 
\\\n","Also tell me about your tvs\"\"\"\n","product_information = \"\"\"{ \"name\": \"SmartX ProPhone\", \"category\": \"Smartphones and Accessories\", \"brand\": \"SmartX\", \"model_number\": \"SX-PP10\", \"warranty\": \"1 year\", \"rating\": 4.6, \"features\": [ \"6.1-inch display\", \"128GB storage\", \"12MP dual camera\", \"5G\" ], \"description\": \"A powerful smartphone with advanced camera features.\", \"price\": 899.99 } { \"name\": \"FotoSnap DSLR Camera\", \"category\": \"Cameras and Camcorders\", \"brand\": \"FotoSnap\", \"model_number\": \"FS-DSLR200\", \"warranty\": \"1 year\", \"rating\": 4.7, \"features\": [ \"24.2MP sensor\", \"1080p video\", \"3-inch LCD\", \"Interchangeable lenses\" ], \"description\": \"Capture stunning photos and videos with this versatile DSLR camera.\", \"price\": 599.99 } { \"name\": \"CineView 4K TV\", \"category\": \"Televisions and Home Theater Systems\", \"brand\": \"CineView\", \"model_number\": \"CV-4K55\", \"warranty\": \"2 years\", \"rating\": 4.8, \"features\": [ \"55-inch display\", \"4K resolution\", \"HDR\", \"Smart TV\" ], \"description\": \"A stunning 4K TV with vibrant colors and smart features.\", \"price\": 599.99 } { \"name\": \"SoundMax Home Theater\", \"category\": \"Televisions and Home Theater Systems\", \"brand\": \"SoundMax\", \"model_number\": \"SM-HT100\", \"warranty\": \"1 year\", \"rating\": 4.4, \"features\": [ \"5.1 channel\", \"1000W output\", \"Wireless subwoofer\", \"Bluetooth\" ], \"description\": \"A powerful home theater system for an immersive audio experience.\", \"price\": 399.99 } { \"name\": \"CineView 8K TV\", \"category\": \"Televisions and Home Theater Systems\", \"brand\": \"CineView\", \"model_number\": \"CV-8K65\", \"warranty\": \"2 years\", \"rating\": 4.9, \"features\": [ \"65-inch display\", \"8K resolution\", \"HDR\", \"Smart TV\" ], \"description\": \"Experience the future of television with this stunning 8K TV.\", \"price\": 2999.99 } { \"name\": \"SoundMax Soundbar\", 
\"category\": \"Televisions and Home Theater Systems\", \"brand\": \"SoundMax\", \"model_number\": \"SM-SB50\", \"warranty\": \"1 year\", \"rating\": 4.3, \"features\": [ \"2.1 channel\", \"300W output\", \"Wireless subwoofer\", \"Bluetooth\" ], \"description\": \"Upgrade your TV's audio with this sleek and powerful soundbar.\", \"price\": 199.99 } { \"name\": \"CineView OLED TV\", \"category\": \"Televisions and Home Theater Systems\", \"brand\": \"CineView\", \"model_number\": \"CV-OLED55\", \"warranty\": \"2 years\", \"rating\": 4.7, \"features\": [ \"55-inch display\", \"4K resolution\", \"HDR\", \"Smart TV\" ], \"description\": \"Experience true blacks and vibrant colors with this OLED TV.\", \"price\": 1499.99 }\"\"\"\n","\n","q_a_pair = f\"\"\"\n","Customer message: ```{customer_message}```\n","Product information: ```{product_information}```\n","Agent response: ```{final_response_to_customer}```\n","\n","Does the response use the retrieved information correctly?\n","Does the response sufficiently answer the question?\n","\n","Output Y or N\n","\"\"\"\n","#判断相关性\n","messages = [\n"," {'role': 'system', 'content': system_message},\n"," {'role': 'user', 'content': q_a_pair}\n","]\n","\n","response = get_completion_from_messages(messages, max_tokens=1)\n","print(response)"]},{"cell_type":"code","execution_count":8,"id":"544aeabd","metadata":{},"outputs":[{"name":"stdout","output_type":"stream","text":["N\n"]}],"source":["another_response = \"life is like a box of chocolates\"\n","q_a_pair = f\"\"\"\n","Customer message: ```{customer_message}```\n","Product information: ```{product_information}```\n","Agent response: ```{another_response}```\n","\n","Does the response use the retrieved information correctly?\n","Does the response sufficiently answer the question?\n","\n","Output Y or N\n","\"\"\"\n","messages = [\n"," {'role': 'system', 'content': system_message},\n"," {'role': 'user', 'content': q_a_pair}\n","]\n","\n","response = 
get_completion_from_messages(messages)\n","print(response)"]}],"metadata":{"kernelspec":{"display_name":"Python 3.9.6 64-bit","language":"python","name":"python3"},"language_info":{"codemirror_mode":{"name":"ipython","version":3},"file_extension":".py","mimetype":"text/x-python","name":"python","nbconvert_exporter":"python","pygments_lexer":"ipython3","version":"3.10.11"},"vscode":{"interpreter":{"hash":"31f2aee4e71d21fbe5cf8b01ff0e069b9275f58929596ceb00d14d90e3e16cd6"}}},"nbformat":4,"nbformat_minor":5} +{"cells":[{"attachments":{},"cell_type":"markdown","id":"f99b8a44","metadata":{},"source":["# 第七章 检查结果\n","\n"]},{"cell_type":"markdown","id":"d8822242","metadata":{},"source":["随着我们深入本书的学习,本章将引领你了解如何评估系统生成的输出。在任何场景中,无论是自动化流程还是其他环境,我们都必须确保在向用户展示输出之前,对其质量、相关性和安全性进行严格的检查,以保证我们提供的反馈是准确和适用的。我们将学习如何运用审查(Moderation) API 来对输出进行评估,并深入探讨如何通过额外的 Prompt 提升模型在展示输出之前的质量评估。"]},{"attachments":{},"cell_type":"markdown","id":"59f69c2e","metadata":{},"source":["## 一、检查有害内容\n","主要就是 Moderation API 的使用"]},{"cell_type":"code","execution_count":3,"id":"943f5396","metadata":{},"outputs":[{"name":"stdout","output_type":"stream","text":["{\n"," \"categories\": {\n"," \"harassment\": false,\n"," \"harassment/threatening\": false,\n"," \"hate\": false,\n"," \"hate/threatening\": false,\n"," \"self-harm\": false,\n"," \"self-harm/instructions\": false,\n"," \"self-harm/intent\": false,\n"," \"sexual\": false,\n"," \"sexual/minors\": false,\n"," \"violence\": false,\n"," \"violence/graphic\": false\n"," },\n"," \"category_scores\": {\n"," \"harassment\": 4.2861907e-07,\n"," \"harassment/threatening\": 5.9538485e-09,\n"," \"hate\": 2.079682e-07,\n"," \"hate/threatening\": 5.6982725e-09,\n"," \"self-harm\": 2.3966843e-08,\n"," \"self-harm/instructions\": 1.5763412e-08,\n"," \"self-harm/intent\": 5.042827e-09,\n"," \"sexual\": 2.6989035e-06,\n"," \"sexual/minors\": 1.1349888e-06,\n"," \"violence\": 1.2788286e-06,\n"," \"violence/graphic\": 2.6259923e-07\n"," },\n"," \"flagged\": 
false\n","}\n"]}],"source":["import openai\n","from tool import get_completion_from_messages\n","\n","final_response_to_customer = f\"\"\"\n","SmartX ProPhone 有一个 6.1 英寸的显示屏,128GB 存储、\\\n","1200 万像素的双摄像头,以及 5G。FotoSnap 单反相机\\\n","有一个 2420 万像素的传感器,1080p 视频,3 英寸 LCD 和\\\n","可更换的镜头。我们有各种电视,包括 CineView 4K 电视,\\\n","55 英寸显示屏,4K 分辨率、HDR,以及智能电视功能。\\\n","我们也有 SoundMax 家庭影院系统,具有 5.1 声道,\\\n","1000W 输出,无线重低音扬声器和蓝牙。关于这些产品或\\\n","我们提供的任何其他产品您是否有任何具体问题?\n","\"\"\"\n","# Moderation 是 OpenAI 的内容审核函数,旨在评估并检测文本内容中的潜在风险。\n","response = openai.Moderation.create(\n"," input=final_response_to_customer\n",")\n","moderation_output = response[\"results\"][0]\n","print(moderation_output)"]},{"cell_type":"markdown","id":"b1f1399a","metadata":{},"source":["如你所见,这个输出没有被标记为任何特定类别,并且在所有类别中都获得了非常低的得分,说明给出的结果评判是合理的。\n","\n","总体来说,检查输出的质量同样是十分重要的。例如,如果你正在为一个对内容有特定敏感度的受众构建一个聊天机器人,你可以设定更低的阈值来标记可能存在问题的输出。通常情况下,如果审查结果显示某些内容被标记,你可以采取适当的措施,比如提供一个替代答案或生成一个新的响应。\n","\n","值得注意的是,随着我们对模型的持续改进,它们越来越不太可能产生有害的输出。\n","\n","检查输出质量的另一种方法是向模型询问其自身生成的结果是否满意,是否达到了你所设定的标准。这可以通过将生成的输出作为输入的一部分再次提供给模型,并要求它对输出的质量进行评估。这种操作可以通过多种方式完成。接下来,我们将通过一个例子来展示这种方法。"]},{"attachments":{},"cell_type":"markdown","id":"f57f8dad","metadata":{},"source":["## 二、检查是否符合产品信息"]},{"cell_type":"code","execution_count":4,"id":"552e3d8c","metadata":{},"outputs":[{"name":"stdout","output_type":"stream","text":["Y\n"]}],"source":["# 这是一段电子产品相关的信息\n","system_message = f\"\"\"\n","您是一个助理,用于评估客服代理的回复是否充分回答了客户问题,\\\n","并验证助理从产品信息中引用的所有事实是否正确。 \n","产品信息、用户和客服代理的信息将使用三个反引号(即 ```)\\\n","进行分隔。 \n","请以 Y 或 N 的字符形式进行回复,不要包含标点符号:\\\n","Y - 如果输出充分回答了问题并且回复正确地使用了产品信息\\\n","N - 其他情况。\n","\n","仅输出单个字母。\n","\"\"\"\n","\n","#这是顾客的提问\n","customer_message = f\"\"\"\n","告诉我有关 smartx pro 手机\\\n","和 fotosnap 相机(单反相机)的信息。\\\n","还有您电视的信息。\n","\"\"\"\n","product_information = \"\"\"{ \"name\": \"SmartX ProPhone\", \"category\": \"Smartphones and Accessories\", \"brand\": \"SmartX\", \"model_number\": \"SX-PP10\", \"warranty\": \"1 year\", \"rating\": 4.6, \"features\": [ 
\"6.1-inch display\", \"128GB storage\", \"12MP dual camera\", \"5G\" ], \"description\": \"A powerful smartphone with advanced camera features.\", \"price\": 899.99 } { \"name\": \"FotoSnap DSLR Camera\", \"category\": \"Cameras and Camcorders\", \"brand\": \"FotoSnap\", \"model_number\": \"FS-DSLR200\", \"warranty\": \"1 year\", \"rating\": 4.7, \"features\": [ \"24.2MP sensor\", \"1080p video\", \"3-inch LCD\", \"Interchangeable lenses\" ], \"description\": \"Capture stunning photos and videos with this versatile DSLR camera.\", \"price\": 599.99 } { \"name\": \"CineView 4K TV\", \"category\": \"Televisions and Home Theater Systems\", \"brand\": \"CineView\", \"model_number\": \"CV-4K55\", \"warranty\": \"2 years\", \"rating\": 4.8, \"features\": [ \"55-inch display\", \"4K resolution\", \"HDR\", \"Smart TV\" ], \"description\": \"A stunning 4K TV with vibrant colors and smart features.\", \"price\": 599.99 } { \"name\": \"SoundMax Home Theater\", \"category\": \"Televisions and Home Theater Systems\", \"brand\": \"SoundMax\", \"model_number\": \"SM-HT100\", \"warranty\": \"1 year\", \"rating\": 4.4, \"features\": [ \"5.1 channel\", \"1000W output\", \"Wireless subwoofer\", \"Bluetooth\" ], \"description\": \"A powerful home theater system for an immersive audio experience.\", \"price\": 399.99 } { \"name\": \"CineView 8K TV\", \"category\": \"Televisions and Home Theater Systems\", \"brand\": \"CineView\", \"model_number\": \"CV-8K65\", \"warranty\": \"2 years\", \"rating\": 4.9, \"features\": [ \"65-inch display\", \"8K resolution\", \"HDR\", \"Smart TV\" ], \"description\": \"Experience the future of television with this stunning 8K TV.\", \"price\": 2999.99 } { \"name\": \"SoundMax Soundbar\", \"category\": \"Televisions and Home Theater Systems\", \"brand\": \"SoundMax\", \"model_number\": \"SM-SB50\", \"warranty\": \"1 year\", \"rating\": 4.3, \"features\": [ \"2.1 channel\", \"300W output\", \"Wireless subwoofer\", \"Bluetooth\" ], \"description\": 
\"Upgrade your TV's audio with this sleek and powerful soundbar.\", \"price\": 199.99 } { \"name\": \"CineView OLED TV\", \"category\": \"Televisions and Home Theater Systems\", \"brand\": \"CineView\", \"model_number\": \"CV-OLED55\", \"warranty\": \"2 years\", \"rating\": 4.7, \"features\": [ \"55-inch display\", \"4K resolution\", \"HDR\", \"Smart TV\" ], \"description\": \"Experience true blacks and vibrant colors with this OLED TV.\", \"price\": 1499.99 }\"\"\"\n","\n","q_a_pair = f\"\"\"\n","顾客的信息: ```{customer_message}```\n","产品信息: ```{product_information}```\n","代理的回复: ```{final_response_to_customer}```\n","\n","回复是否正确使用了检索的信息?\n","回复是否充分地回答了问题?\n","\n","输出 Y 或 N\n","\"\"\"\n","#判断相关性\n","messages = [\n"," {'role': 'system', 'content': system_message},\n"," {'role': 'user', 'content': q_a_pair}\n","]\n","\n","response = get_completion_from_messages(messages, max_tokens=1)\n","print(response)"]},{"cell_type":"code","execution_count":5,"id":"afb1b82f","metadata":{},"outputs":[{"name":"stdout","output_type":"stream","text":["N\n"]}],"source":["another_response = \"生活就像一盒巧克力\"\n","q_a_pair = f\"\"\"\n","顾客的信息: ```{customer_message}```\n","产品信息: ```{product_information}```\n","代理的回复: ```{another_response}```\n","\n","回复是否正确使用了检索的信息?\n","回复是否充分地回答了问题?\n","\n","输出 Y 或 N\n","\"\"\"\n","messages = [\n"," {'role': 'system', 'content': system_message},\n"," {'role': 'user', 'content': q_a_pair}\n","]\n","\n","response = get_completion_from_messages(messages)\n","print(response)"]},{"cell_type":"markdown","id":"51dd8979","metadata":{},"source":["因此,你可以看到,模型具有提供生成输出质量反馈的能力。你可以使用这种反馈来决定是否将输出展示给用户,或是生成新的回应。你甚至可以尝试为每个用户查询生成多个模型回应,然后从中挑选出最佳的回应呈现给用户。所以,你有多种可能的尝试方式。\n","\n","总的来说,借助审查 API 来检查输出是一个可取的策略。但在我看来,这在大多数情况下可能是不必要的,特别是当你使用更先进的模型,比如 GPT-4 。\n","\n","实际上,在真实生产环境中,我们并未看到很多人采取这种方式。这种做法也会增加系统的延迟和成本,因为你需要等待额外的 API 调用,并且需要额外的 token 
。如果你的应用或产品要求错误率低至 0.0000001%,那么你或许可以尝试这种策略。但总的来说,我们并不建议在实际应用中使用这种方式。\n","\n","在接下来的章节中,我们将把在评估输入、处理输入以及检查输出中所学到的知识整合起来,构建一个端到端的系统。"]},{"cell_type":"markdown","id":"19bb0780","metadata":{},"source":["## 三、英文版"]},{"cell_type":"markdown","id":"690f32f2","metadata":{},"source":["**1.1 检查有害信息**"]},{"cell_type":"code","execution_count":6,"id":"b4175302","metadata":{},"outputs":[{"name":"stdout","output_type":"stream","text":["{\n"," \"categories\": {\n"," \"harassment\": false,\n"," \"harassment/threatening\": false,\n"," \"hate\": false,\n"," \"hate/threatening\": false,\n"," \"self-harm\": false,\n"," \"self-harm/instructions\": false,\n"," \"self-harm/intent\": false,\n"," \"sexual\": false,\n"," \"sexual/minors\": false,\n"," \"violence\": false,\n"," \"violence/graphic\": false\n"," },\n"," \"category_scores\": {\n"," \"harassment\": 3.4429521e-09,\n"," \"harassment/threatening\": 9.538529e-10,\n"," \"hate\": 6.0008998e-09,\n"," \"hate/threatening\": 3.5339007e-10,\n"," \"self-harm\": 5.6997046e-10,\n"," \"self-harm/instructions\": 3.864466e-08,\n"," \"self-harm/intent\": 9.3394e-10,\n"," \"sexual\": 2.2777907e-07,\n"," \"sexual/minors\": 2.6869095e-08,\n"," \"violence\": 3.5471032e-07,\n"," \"violence/graphic\": 7.8637696e-10\n"," },\n"," \"flagged\": false\n","}\n"]}],"source":["final_response_to_customer = f\"\"\"\n","The SmartX ProPhone has a 6.1-inch display, 128GB storage, \\\n","12MP dual camera, and 5G. The FotoSnap DSLR Camera \\\n","has a 24.2MP sensor, 1080p video, 3-inch LCD, and \\\n","interchangeable lenses. We have a variety of TVs, including \\\n","the CineView 4K TV with a 55-inch display, 4K resolution, \\\n","HDR, and smart TV features. We also have the SoundMax \\\n","Home Theater system with 5.1 channel, 1000W output, wireless \\\n","subwoofer, and Bluetooth. 
Do you have any specific questions \\\n","about these products or any other products we offer?\n","\"\"\"\n","\n","\n","response = openai.Moderation.create(\n"," input=final_response_to_customer\n",")\n","moderation_output = response[\"results\"][0]\n","print(moderation_output)"]},{"cell_type":"markdown","id":"4a7fb209","metadata":{},"source":["**2.1 检查是否符合产品信息**"]},{"cell_type":"code","execution_count":7,"id":"7859ffed","metadata":{},"outputs":[{"name":"stdout","output_type":"stream","text":["Y\n"]}],"source":["# 这是一段电子产品相关的信息\n","system_message = f\"\"\"\n","You are an assistant that evaluates whether \\\n","customer service agent responses sufficiently \\\n","answer customer questions, and also validates that \\\n","all the facts the assistant cites from the product \\\n","information are correct.\n","The product information and user and customer \\\n","service agent messages will be delimited by \\\n","3 backticks, i.e. ```.\n","Respond with a Y or N character, with no punctuation:\n","Y - if the output sufficiently answers the question \\\n","AND the response correctly uses product information\n","N - otherwise\n","\n","Output a single letter only.\n","\"\"\"\n","\n","#这是顾客的提问\n","customer_message = f\"\"\"\n","tell me about the smartx pro phone and \\\n","the fotosnap camera, the dslr one. 
\\\n","Also tell me about your tvs\"\"\"\n","product_information = \"\"\"{ \"name\": \"SmartX ProPhone\", \"category\": \"Smartphones and Accessories\", \"brand\": \"SmartX\", \"model_number\": \"SX-PP10\", \"warranty\": \"1 year\", \"rating\": 4.6, \"features\": [ \"6.1-inch display\", \"128GB storage\", \"12MP dual camera\", \"5G\" ], \"description\": \"A powerful smartphone with advanced camera features.\", \"price\": 899.99 } { \"name\": \"FotoSnap DSLR Camera\", \"category\": \"Cameras and Camcorders\", \"brand\": \"FotoSnap\", \"model_number\": \"FS-DSLR200\", \"warranty\": \"1 year\", \"rating\": 4.7, \"features\": [ \"24.2MP sensor\", \"1080p video\", \"3-inch LCD\", \"Interchangeable lenses\" ], \"description\": \"Capture stunning photos and videos with this versatile DSLR camera.\", \"price\": 599.99 } { \"name\": \"CineView 4K TV\", \"category\": \"Televisions and Home Theater Systems\", \"brand\": \"CineView\", \"model_number\": \"CV-4K55\", \"warranty\": \"2 years\", \"rating\": 4.8, \"features\": [ \"55-inch display\", \"4K resolution\", \"HDR\", \"Smart TV\" ], \"description\": \"A stunning 4K TV with vibrant colors and smart features.\", \"price\": 599.99 } { \"name\": \"SoundMax Home Theater\", \"category\": \"Televisions and Home Theater Systems\", \"brand\": \"SoundMax\", \"model_number\": \"SM-HT100\", \"warranty\": \"1 year\", \"rating\": 4.4, \"features\": [ \"5.1 channel\", \"1000W output\", \"Wireless subwoofer\", \"Bluetooth\" ], \"description\": \"A powerful home theater system for an immersive audio experience.\", \"price\": 399.99 } { \"name\": \"CineView 8K TV\", \"category\": \"Televisions and Home Theater Systems\", \"brand\": \"CineView\", \"model_number\": \"CV-8K65\", \"warranty\": \"2 years\", \"rating\": 4.9, \"features\": [ \"65-inch display\", \"8K resolution\", \"HDR\", \"Smart TV\" ], \"description\": \"Experience the future of television with this stunning 8K TV.\", \"price\": 2999.99 } { \"name\": \"SoundMax Soundbar\", 
\"category\": \"Televisions and Home Theater Systems\", \"brand\": \"SoundMax\", \"model_number\": \"SM-SB50\", \"warranty\": \"1 year\", \"rating\": 4.3, \"features\": [ \"2.1 channel\", \"300W output\", \"Wireless subwoofer\", \"Bluetooth\" ], \"description\": \"Upgrade your TV's audio with this sleek and powerful soundbar.\", \"price\": 199.99 } { \"name\": \"CineView OLED TV\", \"category\": \"Televisions and Home Theater Systems\", \"brand\": \"CineView\", \"model_number\": \"CV-OLED55\", \"warranty\": \"2 years\", \"rating\": 4.7, \"features\": [ \"55-inch display\", \"4K resolution\", \"HDR\", \"Smart TV\" ], \"description\": \"Experience true blacks and vibrant colors with this OLED TV.\", \"price\": 1499.99 }\"\"\"\n","\n","q_a_pair = f\"\"\"\n","Customer message: ```{customer_message}```\n","Product information: ```{product_information}```\n","Agent response: ```{final_response_to_customer}```\n","\n","Does the response use the retrieved information correctly?\n","Does the response sufficiently answer the question?\n","\n","Output Y or N\n","\"\"\"\n","#判断相关性\n","messages = [\n"," {'role': 'system', 'content': system_message},\n"," {'role': 'user', 'content': q_a_pair}\n","]\n","\n","response = get_completion_from_messages(messages, max_tokens=1)\n","print(response)"]},{"cell_type":"code","execution_count":8,"id":"544aeabd","metadata":{},"outputs":[{"name":"stdout","output_type":"stream","text":["N\n"]}],"source":["another_response = \"life is like a box of chocolates\"\n","q_a_pair = f\"\"\"\n","Customer message: ```{customer_message}```\n","Product information: ```{product_information}```\n","Agent response: ```{another_response}```\n","\n","Does the response use the retrieved information correctly?\n","Does the response sufficiently answer the question?\n","\n","Output Y or N\n","\"\"\"\n","messages = [\n"," {'role': 'system', 'content': system_message},\n"," {'role': 'user', 'content': q_a_pair}\n","]\n","\n","response = 
get_completion_from_messages(messages)\n","print(response)"]}],"metadata":{"kernelspec":{"display_name":"Python 3.9.6 64-bit","language":"python","name":"python3"},"language_info":{"codemirror_mode":{"name":"ipython","version":3},"file_extension":".py","mimetype":"text/x-python","name":"python","nbconvert_exporter":"python","pygments_lexer":"ipython3","version":"3.10.11"},"vscode":{"interpreter":{"hash":"31f2aee4e71d21fbe5cf8b01ff0e069b9275f58929596ceb00d14d90e3e16cd6"}}},"nbformat":4,"nbformat_minor":5}