{
"cells": [
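{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# Chapter 7: Building an End-to-End Question-Answering System with Evaluation"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"In this lesson we build an end-to-end question-answering system with evaluation. It combines material from several earlier lessons and adds an evaluation step.\n",
"\n",
"First, we check the input to see whether it passes the Moderation API.\n",
"\n",
"Second, if the input is not flagged, we extract the list of products it mentions.\n",
"\n",
"Third, if any products are found, we look them up.\n",
"\n",
"Fourth, we use the model to answer the user's question.\n",
"\n",
"Finally, we run the answer through the Moderation API.\n",
"\n",
"If the answer is not flagged, we return it to the user."
]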
},
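{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"The control flow described above can be sketched without any API calls. This is a minimal sketch, not the notebook's implementation: `run_pipeline` and the four callables passed to it are hypothetical stand-ins for the moderation check, product extraction, product lookup, and answer generation, and the tiny catalog is toy data.\n",
"\n",
"```python\n",
"def run_pipeline(user_input, is_flagged, extract_products, lookup, answer):\n",
"    # Step 1: reject flagged input before spending any model calls on it.\n",
"    if is_flagged(user_input):\n",
"        return 'Sorry, we cannot process this request.'\n",
"    # Steps 2-3: extract product names, then look them up only if any were found.\n",
"    products = extract_products(user_input)\n",
"    info = lookup(products) if products else ''\n",
"    # Step 4: draft an answer from the question plus the retrieved information.\n",
"    reply = answer(user_input, info)\n",
"    # Step 5: moderate the draft and return it only if it is not flagged.\n",
"    if is_flagged(reply):\n",
"        return 'Sorry, we cannot provide this information.'\n",
"    return reply\n",
"\n",
"\n",
"# Toy stand-ins (assumptions) so the sketch runs without an API key.\n",
"catalog = {'tv': 'CineView 4K TV, 55-inch'}\n",
"demo_reply = run_pipeline(\n",
"    'tell me about your tv',\n",
"    is_flagged=lambda text: 'forbidden' in text,\n",
"    extract_products=lambda text: [name for name in catalog if name in text],\n",
"    lookup=lambda names: '; '.join(catalog[name] for name in names),\n",
"    answer=lambda question, info: 'We carry: ' + info,\n",
")\n",
"print(demo_reply)\n",
"```\n",
"\n",
"Passing the steps in as functions keeps the early-exit logic easy to test before real API calls are wired in."
]
},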
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"Environment setup"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
],
"source": [
"# Configure the OpenAI API key\n",
"import os\n",
"import openai\n",
"import sys\n",
"sys.path.append('../..')\n",
"# Utility module that uses English prompts\n",
"import utils_en\n",
"# Utility module that uses Chinese prompts\n",
"import utils_zh\n",
"\n",
"import panel as pn  # used for the graphical interface\n",
"pn.extension()\n",
"\n",
"from dotenv import load_dotenv, find_dotenv\n",
"_ = load_dotenv(find_dotenv())  # read the local .env file\n",
"\n",
"openai.api_key = os.environ['OPENAI_API_KEY']"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"# Wrapper around the OpenAI GPT-3.5 chat endpoint.\n",
"# Note: openai.ChatCompletion is the interface of the pre-1.0 openai package.\n",
"def get_completion_from_messages(messages, model=\"gpt-3.5-turbo\", temperature=0, max_tokens=500):\n",
"    response = openai.ChatCompletion.create(\n",
"        model=model,\n",
"        messages=messages,\n",
"        temperature=temperature,\n",
"        max_tokens=max_tokens,\n",
"    )\n",
"    return response.choices[0].message[\"content\"]"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"A function that implements end-to-end question answering"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
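{
"name": "stdout",
"output_type": "stream",
"text": [
"Step 1: Input passed the Moderation check\n",
"Step 2: Extracted the list of products\n",
"Step 3: Looked up the extracted products\n",
"Step 4: Generated the response to the user\n",
"Step 5: Response passed the Moderation check\n",
"Step 6: Model evaluated the response\n",
"Step 7: Model approved the response.\n",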
"The SmartX ProPhone is a powerful smartphone with a 6.1-inch display, 128GB storage, 12MP dual camera, and 5G capabilities. The FotoSnap DSLR Camera is a versatile camera with a 24.2MP sensor, 1080p video, 3-inch LCD, and interchangeable lenses. As for our TVs, we have a range of options including the CineView 4K TV with a 55-inch display, 4K resolution, HDR, and smart TV capabilities, the CineView 8K TV with a 65-inch display, 8K resolution, HDR, and smart TV capabilities, and the CineView OLED TV with a 55-inch display, 4K resolution, HDR, and smart TV capabilities. Do you have any specific questions about these products or would you like me to recommend a product based on your needs?\n"
]
}
],
"source": [
"# Process one user message end to end\n",
"def process_user_message(user_input, all_messages, debug=True):\n",
"    # user_input   : the user's input\n",
"    # all_messages : the conversation history\n",
"    # debug        : whether to print progress; on by default\n",
"\n",
"    # Delimiter\n",
"    delimiter = \"```\"\n",
"\n",
"    # Step 1: use OpenAI's Moderation API to check whether the user input is flagged as harmful\n",
"    response = openai.Moderation.create(input=user_input)\n",
"    moderation_output = response[\"results\"][0]\n",
"\n",
"    # The Moderation API flagged the input\n",
"    if moderation_output[\"flagged\"]:\n",
"        if debug: print(\"Step 1: Input rejected by Moderation\")\n",
"        # Return a tuple so callers can always unpack (response, history)\n",
"        return \"Sorry, we cannot process this request.\", all_messages\n",
"\n",
"    # In debug mode, print progress as each step completes\n",
"    if debug: print(\"Step 1: Input passed the Moderation check\")\n",
"\n",
"    # Step 2: extract the products and their categories, using a helper that wraps the approach from earlier lessons\n",
"    category_and_product_response = utils_en.find_category_and_product_only(user_input, utils_en.get_products_and_category())\n",
"    #print(category_and_product_response)\n",
"    # Convert the extracted string into a list\n",
"    category_and_product_list = utils_en.read_string_to_list(category_and_product_response)\n",
"    #print(category_and_product_list)\n",
"\n",
"    if debug: print(\"Step 2: Extracted the list of products\")\n",
"\n",
"    # Step 3: look up information for the extracted products\n",
"    product_information = utils_en.generate_output_string(category_and_product_list)\n",
"    if debug: print(\"Step 3: Looked up the extracted products\")\n",
"\n",
"    # Step 4: generate an answer from that information\n",
"    system_message = f\"\"\"\n",
"    You are a customer service assistant for a large electronic store. \\\n",
"    Respond in a friendly and helpful tone, with concise answers. \\\n",
"    Make sure to ask the user relevant follow-up questions.\n",
"    \"\"\"\n",
"    # Build the messages for this turn\n",
"    messages = [\n",
"        {'role': 'system', 'content': system_message},\n",
"        {'role': 'user', 'content': f\"{delimiter}{user_input}{delimiter}\"},\n",
"        {'role': 'assistant', 'content': f\"Relevant product information:\\n{product_information}\"}\n",
"    ]\n",
"    # Get the GPT-3.5 response\n",
"    # Prepending all_messages gives the model the earlier turns (multi-turn conversation)\n",
"    final_response = get_completion_from_messages(all_messages + messages)\n",
"    if debug: print(\"Step 4: Generated the response to the user\")\n",
"    # Add this turn to the history\n",
"    all_messages = all_messages + messages[1:]\n",
"\n",
"    # Step 5: use the Moderation API to check whether the output is flagged\n",
"    response = openai.Moderation.create(input=final_response)\n",
"    moderation_output = response[\"results\"][0]\n",
"\n",
"    # The output was flagged\n",
"    if moderation_output[\"flagged\"]:\n",
"        if debug: print(\"Step 5: Response rejected by Moderation\")\n",
"        return \"Sorry, we cannot provide this information.\", all_messages\n",
"\n",
"    if debug: print(\"Step 5: Response passed the Moderation check\")\n",
"\n",
"    # Step 6: ask the model whether the response answers the user's question well\n",
"    user_message = f\"\"\"\n",
"    Customer message: {delimiter}{user_input}{delimiter}\n",
"    Agent response: {delimiter}{final_response}{delimiter}\n",
"\n",
"    Does the response sufficiently answer the question?\n",
"    Respond with Y if it does and N if it does not. Output only that single letter.\n",
"    \"\"\"\n",
"    messages = [\n",
"        {'role': 'system', 'content': system_message},\n",
"        {'role': 'user', 'content': user_message}\n",
"    ]\n",
"    # Ask the model to evaluate the response\n",
"    evaluation_response = get_completion_from_messages(messages)\n",
"    if debug: print(\"Step 6: Model evaluated the response\")\n",
"\n",
"    # Step 7: if the evaluation is Y, return the response; if it is N, hand off to a human agent\n",
"    if evaluation_response.strip().upper().startswith(\"Y\"):  # also accepts \"Yes\"\n",
"        if debug: print(\"Step 7: Model approved the response.\")\n",
"        return final_response, all_messages\n",
"    else:\n",
"        if debug: print(\"Step 7: Model did not approve the response.\")\n",
"        neg_str = \"I'm sorry, I cannot provide the information you need. I will transfer you to a human customer service representative for further help.\"\n",
"        return neg_str, all_messages\n",
"\n",
"user_input = \"tell me about the smartx pro phone and the fotosnap camera, the dslr one. Also what tell me about your tvs\"\n",
"response, _ = process_user_message(user_input, [])\n",
"print(response)"
]
},
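{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"Step 7 turns the evaluator's free-text reply into a decision. The sketch below shows one stricter way to do that, assuming the evaluator was asked to answer with Y or N; `parse_verdict` is a hypothetical helper and is not part of `utils_en` or `utils_zh`.\n",
"\n",
"```python\n",
"def parse_verdict(evaluation_response):\n",
"    # Return True for Y/Yes, False for N/No, and None when the reply is unclear.\n",
"    text = evaluation_response.strip().upper()\n",
"    if text.startswith('Y'):\n",
"        return True\n",
"    if text.startswith('N'):\n",
"        return False\n",
"    return None\n",
"\n",
"\n",
"for reply in ['Y', ' yes, it does.', 'N', 'It depends']:\n",
"    print(repr(reply), parse_verdict(reply))\n",
"```\n",
"\n",
"Returning `None` for an unclear reply lets the caller route that case to a human agent instead of guessing."
]
},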
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
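{
"name": "stdout",
"output_type": "stream",
"text": [
"Step 1: Input passed the Moderation check\n",
"Step 2: Extracted the list of products\n",
"Step 3: Looked up the extracted products\n",
"Step 4: Generated the response to the user\n",
"Step 5: Response passed the Moderation check\n",
"Step 6: Model evaluated the response\n",
"Step 7: Model approved the response.\n",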
"关于SmartX ProPhone和FotoSnap相机的信息\n",
"\n",
"SmartX ProPhone是一款功能强大的智能手机具有6.1英寸的显示屏128GB的存储空间12MP的双摄像头和5G网络。售价为899.99美元。\n",
"\n",
"FotoSnap相机系列包括DSLR相机、无反相机和即时相机。DSLR相机具有24.2MP传感器、1080p视频、3英寸LCD和可更换镜头。无反相机具有20.1MP传感器、4K视频、3英寸触摸屏和可更换镜头。即时相机可以即时打印照片具有内置闪光灯、自拍镜和电池供电。售价分别为599.99美元、799.99美元和69.99美元。\n",
"\n",
"关于我们的电视:\n",
"\n",
"我们有多种电视可供选择包括CineView 4K电视、CineView 8K电视和CineView OLED电视。CineView 4K电视具有55英寸的显示屏、4K分辨率、HDR和智能电视功能。CineView 8K电视具有65英寸的显示屏、8K分辨率、HDR和智能电视功能。CineView OLED电视具有55英寸的显示屏、4K分辨率、HDR和智能电视功能。我们还提供SoundMax家庭影院和SoundMax声音栏以提供更好的音频体验。售价从199.99美元到2999.99美元不等保修期为1年或2年。\n"
]
}
],
"source": [
"'''\n",
"Chinese-prompt version.\n",
"Note: the model handles Chinese less reliably than English, so the Chinese prompts may fail at random; rerun the cell if that happens. Readers are welcome to explore more stable Chinese prompts.\n",
"'''\n",
"# Process one user message end to end (Chinese prompts)\n",
"def process_user_message_ch(user_input, all_messages, debug=True):\n",
"    # user_input   : the user's input\n",
"    # all_messages : the conversation history\n",
"    # debug        : whether to print progress; on by default\n",
"\n",
"    # Delimiter\n",
"    delimiter = \"```\"\n",
"\n",
"    # Step 1: use OpenAI's Moderation API to check whether the user input is flagged as harmful\n",
"    response = openai.Moderation.create(input=user_input)\n",
"    moderation_output = response[\"results\"][0]\n",
"\n",
"    # The Moderation API flagged the input\n",
"    if moderation_output[\"flagged\"]:\n",
"        if debug: print(\"Step 1: Input rejected by Moderation\")\n",
"        # Return a tuple so callers can always unpack (response, history)\n",
"        return \"抱歉,您的请求不合规\", all_messages\n",
"\n",
"    # In debug mode, print progress as each step completes\n",
"    if debug: print(\"Step 1: Input passed the Moderation check\")\n",
"\n",
"    # Step 2: extract the products and their categories, using a helper that wraps the approach from earlier lessons\n",
"    category_and_product_response = utils_zh.find_category_and_product_only(user_input, utils_zh.get_products_and_category())\n",
"    #print(category_and_product_response)\n",
"    # Convert the extracted string into a list\n",
"    category_and_product_list = utils_zh.read_string_to_list(category_and_product_response)\n",
"    #print(category_and_product_list)\n",
"\n",
"    if debug: print(\"Step 2: Extracted the list of products\")\n",
"\n",
"    # Step 3: look up information for the extracted products\n",
"    product_information = utils_zh.generate_output_string(category_and_product_list)\n",
"    if debug: print(\"Step 3: Looked up the extracted products\")\n",
"\n",
"    # Step 4: generate an answer from that information (the prompt is intentionally in Chinese)\n",
"    system_message = f\"\"\"\n",
"    您是一家大型电子商店的客户服务助理。\\\n",
"    请以友好和乐于助人的语气回答问题,并提供简洁明了的答案。\\\n",
"    请确保向用户提出相关的后续问题。\n",
"    \"\"\"\n",
"    # Build the messages for this turn\n",
"    messages = [\n",
"        {'role': 'system', 'content': system_message},\n",
"        {'role': 'user', 'content': f\"{delimiter}{user_input}{delimiter}\"},\n",
"        {'role': 'assistant', 'content': f\"相关商品信息:\\n{product_information}\"}\n",
"    ]\n",
"    # Get the GPT-3.5 response\n",
"    # Prepending all_messages gives the model the earlier turns (multi-turn conversation)\n",
"    final_response = get_completion_from_messages(all_messages + messages)\n",
"    if debug: print(\"Step 4: Generated the response to the user\")\n",
"    # Add this turn to the history\n",
"    all_messages = all_messages + messages[1:]\n",
"\n",
"    # Step 5: use the Moderation API to check whether the output is flagged\n",
"    response = openai.Moderation.create(input=final_response)\n",
"    moderation_output = response[\"results\"][0]\n",
"\n",
"    # The output was flagged\n",
"    if moderation_output[\"flagged\"]:\n",
"        if debug: print(\"Step 5: Response rejected by Moderation\")\n",
"        return \"抱歉,我们不能提供该信息\", all_messages\n",
"\n",
"    if debug: print(\"Step 5: Response passed the Moderation check\")\n",
"\n",
"    # Step 6: ask the model whether the response answers the user's question well\n",
"    user_message = f\"\"\"\n",
"    用户信息: {delimiter}{user_input}{delimiter}\n",
"    代理回复: {delimiter}{final_response}{delimiter}\n",
"\n",
"    回复是否足够回答问题\n",
"    如果足够,回答 Y\n",
"    如果不足够,回答 N\n",
"    仅回答上述字母即可\n",
"    \"\"\"\n",
"    # print(final_response)\n",
"    messages = [\n",
"        {'role': 'system', 'content': system_message},\n",
"        {'role': 'user', 'content': user_message}\n",
"    ]\n",
"    # Ask the model to evaluate the response\n",
"    evaluation_response = get_completion_from_messages(messages)\n",
"    # print(evaluation_response)\n",
"    if debug: print(\"Step 6: Model evaluated the response\")\n",
"\n",
"    # Step 7: if the evaluation is Y, return the response; if it is N, hand off to a human agent\n",
"    if evaluation_response.strip().upper().startswith(\"Y\"):  # also accepts \"Yes\"\n",
"        if debug: print(\"Step 7: Model approved the response.\")\n",
"        return final_response, all_messages\n",
"    else:\n",
"        if debug: print(\"Step 7: Model did not approve the response.\")\n",
"        neg_str = \"很抱歉,我无法提供您所需的信息。我将为您转接到一位人工客服代表以获取进一步帮助。\"\n",
"        return neg_str, all_messages\n",
"\n",
"user_input = \"请告诉我关于smartx pro phone和the fotosnap camera的信息。另外请告诉我关于你们的tvs的情况。\"\n",
"response, _ = process_user_message_ch(user_input, [])\n",
"print(response)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"Building a graphical interface"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [],
"source": [
"def collect_messages_en(debug=False):\n",
"    user_input = inp.value_input\n",
"    if debug: print(f\"User Input = {user_input}\")\n",
"    if user_input == \"\":\n",
"        return\n",
"    inp.value = ''\n",
"    global context\n",
"    # Call process_user_message (English prompts)\n",
"    response, context = process_user_message(user_input, context, debug=False)\n",
"    context.append({'role':'assistant', 'content':f\"{response}\"})\n",
"    panels.append(\n",
"        pn.Row('User:', pn.pane.Markdown(user_input, width=600)))\n",
"    panels.append(\n",
"        pn.Row('Assistant:', pn.pane.Markdown(response, width=600, style={'background-color': '#F6F6F6'})))\n",
"\n",
"    return pn.Column(*panels)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Chinese-prompt version\n",
"def collect_messages_ch(debug=False):\n",
"    user_input = inp.value_input\n",
"    if debug: print(f\"User Input = {user_input}\")\n",
"    if user_input == \"\":\n",
"        return\n",
"    inp.value = ''\n",
"    global context\n",
"    # Call process_user_message_ch (Chinese prompts)\n",
"    response, context = process_user_message_ch(user_input, context, debug=False)\n",
"    context.append({'role':'assistant', 'content':f\"{response}\"})\n",
"    panels.append(\n",
"        pn.Row('User:', pn.pane.Markdown(user_input, width=600)))\n",
"    panels.append(\n",
"        pn.Row('Assistant:', pn.pane.Markdown(response, width=600, style={'background-color': '#F6F6F6'})))\n",
"\n",
"    return pn.Column(*panels)"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Column\n",
"    [0] TextInput(placeholder='Enter text here…')\n",
"    [1] Row\n",
"        [0] Button(name='Service Assistant')\n",
"    [2] ParamFunction(function, _pane=Str, height=300, loading_indicator=True)"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"panels = []  # collected display rows\n",
"\n",
"# System message\n",
"context = [ {'role':'system', 'content':\"You are Service Assistant\"} ]\n",
"\n",
"inp = pn.widgets.TextInput( placeholder='Enter text here…')\n",
"button_conversation = pn.widgets.Button(name=\"Service Assistant\")\n",
"\n",
"# Bind the English-prompt handler; use collect_messages_ch for the Chinese-prompt version\n",
"interactive_conversation = pn.bind(collect_messages_en, button_conversation)\n",
"\n",
"dashboard = pn.Column(\n",
"    inp,\n",
"    pn.Row(button_conversation),\n",
"    pn.panel(interactive_conversation, loading_indicator=True, height=300),\n",
")\n",
"\n",
"dashboard"
]
},
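{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"By monitoring the system's quality on a larger set of inputs, you can revise individual steps and improve overall performance.\n",
"\n",
"We may find that the prompts for some steps could be better, that some steps are not needed at all, or that a better retrieval method exists.\n",
"\n",
"We will discuss this further in the next video."
]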
}
],
"metadata": {
"kernelspec": {
"display_name": "zyh_gpt",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.11"
},
"orig_nbformat": 4
},
"nbformat": 4,
"nbformat_minor": 2
}