743 lines
23 KiB
Plaintext
743 lines
23 KiB
Plaintext
{
|
||
"cells": [
|
||
{
|
||
"attachments": {},
|
||
"cell_type": "markdown",
|
||
"id": "b58204ea",
|
||
"metadata": {},
|
||
"source": [
|
||
"# 第四章 文本概括"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "a190d6a1",
|
||
"metadata": {},
|
||
"source": [
|
||
"<div class=\"toc\">\n",
|
||
" <ul class=\"toc-item\">\n",
|
||
" <li><span><a href=\"#一引言\" data-toc-modified-id=\"一、引言\">一、引言</a></span></li>\n",
|
||
" <li>\n",
|
||
" <span><a href=\"#二单一文本概括\" data-toc-modified-id=\"二、单一文本概括实验\">二、单一文本概括实验</a></span>\n",
|
||
" <ul class=\"toc-item\">\n",
|
||
" <li><span><a href=\"#21-限制输出文本长度\" data-toc-modified-id=\"2.1 限制输出文本长度\">2.1 限制输出文本长度</a></span></li> \n",
|
||
" <li><span><a href=\"#22-设置关键角度侧重\" data-toc-modified-id=\"2.2 设置关键角度侧重\">2.2 设置关键角度侧重</a></span></li>\n",
|
||
" <li><span><a href=\"#23-关键信息提取\" data-toc-modified-id=\"2.3 关键信息提取\">2.3 关键信息提取</a></span></li>\n",
|
||
" </ul>\n",
|
||
" </li>\n",
|
||
" <li><span><a href=\"#三同时概括多条文本\" data-toc-modified-id=\"三、同时概括多条文本\">三、同时概括多条文本</a></span></li>\n",
|
||
" </ul>\n",
|
||
"</div>"
|
||
]
|
||
},
|
||
{
|
||
"attachments": {},
|
||
"cell_type": "markdown",
|
||
"id": "b70ad003",
|
||
"metadata": {},
|
||
"source": [
|
||
"## 一、引言"
|
||
]
|
||
},
|
||
{
|
||
"attachments": {},
|
||
"cell_type": "markdown",
|
||
"id": "12fa9ea4",
|
||
"metadata": {},
|
||
"source": [
|
||
"当今世界上文本信息浩如烟海,我们很难拥有足够的时间去阅读所有想了解的东西。但欣喜的是,目前LLM在文本概括任务上展现了强大的水准,也已经有不少团队将概括功能实现在多种应用中。\n",
|
||
"\n",
|
||
"本章节将介绍如何使用编程的方式,调用API接口来实现“文本概括”功能。"
|
||
]
|
||
},
|
||
{
|
||
"attachments": {},
|
||
"cell_type": "markdown",
|
||
"id": "1de4fd1e",
|
||
"metadata": {},
|
||
"source": [
|
||
"首先,我们需要引入 OpenAI 包,加载 API 密钥,定义 getCompletion 函数。"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"id": "1b4bfa7f",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"import openai\n",
|
||
"# 导入第三方库\n",
|
||
"\n",
|
||
"openai.api_key = \"sk-...\"\n",
|
||
"# 设置 API_KEY, 请替换成您自己的 API_KEY\n"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 1,
|
||
"id": "9f679f1f",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"def get_completion(prompt, model=\"gpt-3.5-turbo\"): \n",
|
||
" messages = [{\"role\": \"user\", \"content\": prompt}]\n",
|
||
" response = openai.ChatCompletion.create(\n",
|
||
" model=model,\n",
|
||
" messages=messages,\n",
|
||
" temperature=0, # 值越低则输出文本随机性越低\n",
|
||
" )\n",
|
||
" return response.choices[0].message[\"content\"]"
|
||
]
|
||
},
|
||
{
|
||
"attachments": {},
|
||
"cell_type": "markdown",
|
||
"id": "9cca835b",
|
||
"metadata": {},
|
||
"source": [
|
||
"## 二、单一文本概括"
|
||
]
|
||
},
|
||
{
|
||
"attachments": {},
|
||
"cell_type": "markdown",
|
||
"id": "0c1e1b92",
|
||
"metadata": {},
|
||
"source": [
|
||
"以商品评论的总结任务为例:对于电商平台来说,网站上往往存在着海量的商品评论,这些评论反映了所有客户的想法。如果我们拥有一个工具去概括这些海量、冗长的评论,便能够快速地浏览更多评论,洞悉客户的偏好,从而指导平台与商家提供更优质的服务。"
|
||
]
|
||
},
|
||
{
|
||
"attachments": {},
|
||
"cell_type": "markdown",
|
||
"id": "9dc2e2bc",
|
||
"metadata": {},
|
||
"source": [
|
||
"**输入文本**"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 3,
|
||
"id": "4d9c0eeb",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"prod_review = \"\"\"\n",
|
||
"Got this panda plush toy for my daughter's birthday, \\\n",
|
||
"who loves it and takes it everywhere. It's soft and \\ \n",
|
||
"super cute, and its face has a friendly look. It's \\ \n",
|
||
"a bit small for what I paid though. I think there \\ \n",
|
||
"might be other options that are bigger for the \\ \n",
|
||
"same price. It arrived a day earlier than expected, \\ \n",
|
||
"so I got to play with it myself before I gave it \\ \n",
|
||
"to her.\n",
|
||
"\"\"\""
|
||
]
|
||
},
|
||
{
|
||
"attachments": {},
|
||
"cell_type": "markdown",
|
||
"id": "aad5bd2a",
|
||
"metadata": {},
|
||
"source": [
|
||
"**输入文本(中文翻译)**"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 4,
|
||
"id": "43b5dd25",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"prod_review_zh = \"\"\"\n",
|
||
"这个熊猫公仔是我给女儿的生日礼物,她很喜欢,去哪都带着。\n",
|
||
"公仔很软,超级可爱,面部表情也很和善。但是相比于价钱来说,\n",
|
||
"它有点小,我感觉在别的地方用同样的价钱能买到更大的。\n",
|
||
"快递比预期提前了一天到货,所以在送给女儿之前,我自己玩了会。\n",
|
||
"\"\"\""
|
||
]
|
||
},
|
||
{
|
||
"attachments": {},
|
||
"cell_type": "markdown",
|
||
"id": "662c9cd2",
|
||
"metadata": {},
|
||
"source": [
|
||
"### 2.1 限制输出文本长度"
|
||
]
|
||
},
|
||
{
|
||
"attachments": {},
|
||
"cell_type": "markdown",
|
||
"id": "a6d10814",
|
||
"metadata": {},
|
||
"source": [
|
||
"我们尝试限制文本长度为最多30词。"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 4,
|
||
"id": "02208fbc",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"Soft and cute panda plush toy loved by daughter, but a bit small for the price. Arrived early.\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"prompt = f\"\"\"\n",
|
||
"Your task is to generate a short summary of a product \\\n",
|
||
"review from an ecommerce site. \n",
|
||
"\n",
|
||
"Summarize the review below, delimited by triple \n",
|
||
"backticks, in at most 30 words. \n",
|
||
"\n",
|
||
"Review: ```{prod_review}```\n",
|
||
"\"\"\"\n",
|
||
"\n",
|
||
"response = get_completion(prompt)\n",
|
||
"print(response)"
|
||
]
|
||
},
|
||
{
|
||
"attachments": {},
|
||
"cell_type": "markdown",
|
||
"id": "0df0eb90",
|
||
"metadata": {},
|
||
"source": [
|
||
"中文翻译版本"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 5,
|
||
"id": "bf4b39f9",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"可爱软熊猫公仔,女儿喜欢,面部表情和善,但价钱有点小贵,快递提前一天到货。\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"prompt = f\"\"\"\n",
|
||
"您的任务是从电子商务网站上生成一个产品评论的简短摘要。\n",
|
||
"\n",
|
||
"请对三个反引号之间的评论文本进行概括,最多30个词汇。\n",
|
||
"\n",
|
||
"评论: ```{prod_review_zh}```\n",
|
||
"\"\"\"\n",
|
||
"\n",
|
||
"response = get_completion(prompt)\n",
|
||
"print(response)"
|
||
]
|
||
},
|
||
{
|
||
"attachments": {},
|
||
"cell_type": "markdown",
|
||
"id": "e9ab145e",
|
||
"metadata": {},
|
||
"source": [
|
||
"### 2.2 设置关键角度侧重"
|
||
]
|
||
},
|
||
{
|
||
"attachments": {},
|
||
"cell_type": "markdown",
|
||
"id": "f84d0123",
|
||
"metadata": {},
|
||
"source": [
|
||
"有时,针对不同的业务,我们对文本的侧重会有所不同。例如对于商品评论文本,物流会更关心运输时效,商家更加关心价格与商品质量,平台更关心整体服务体验。\n",
|
||
"\n",
|
||
"我们可以通过增加Prompt提示,来体现对于某个特定角度的侧重。"
|
||
]
|
||
},
|
||
{
|
||
"attachments": {},
|
||
"cell_type": "markdown",
|
||
"id": "d6f8509a",
|
||
"metadata": {},
|
||
"source": [
|
||
"### 2.2.1 侧重于快递服务"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 6,
|
||
"id": "9d8a32a6",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"The panda plush toy arrived a day earlier than expected, but the customer felt it was a bit small for the price paid.\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"prompt = f\"\"\"\n",
|
||
"Your task is to generate a short summary of a product \\\n",
|
||
"review from an ecommerce site to give feedback to the \\\n",
|
||
"Shipping deparmtment. \n",
|
||
"\n",
|
||
"Summarize the review below, delimited by triple \n",
|
||
"backticks, in at most 30 words, and focusing on any aspects \\\n",
|
||
"that mention shipping and delivery of the product. \n",
|
||
"\n",
|
||
"Review: ```{prod_review}```\n",
|
||
"\"\"\"\n",
|
||
"\n",
|
||
"response = get_completion(prompt)\n",
|
||
"print(response)"
|
||
]
|
||
},
|
||
{
|
||
"attachments": {},
|
||
"cell_type": "markdown",
|
||
"id": "0bd4243a",
|
||
"metadata": {},
|
||
"source": [
|
||
"中文翻译版本"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 8,
|
||
"id": "80636c3e",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"快递提前到货,熊猫公仔软可爱,但有点小,价钱不太划算。\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"prompt = f\"\"\"\n",
|
||
"您的任务是从电子商务网站上生成一个产品评论的简短摘要。\n",
|
||
"\n",
|
||
"请对三个反引号之间的评论文本进行概括,最多30个词汇,并且聚焦在产品运输上。\n",
|
||
"\n",
|
||
"评论: ```{prod_review_zh}```\n",
|
||
"\"\"\"\n",
|
||
"\n",
|
||
"response = get_completion(prompt)\n",
|
||
"print(response)"
|
||
]
|
||
},
|
||
{
|
||
"attachments": {},
|
||
"cell_type": "markdown",
|
||
"id": "76c97fea",
|
||
"metadata": {},
|
||
"source": [
|
||
"可以看到,输出结果以“快递提前一天到货”开头,体现了对于快递效率的侧重。"
|
||
]
|
||
},
|
||
{
|
||
"attachments": {},
|
||
"cell_type": "markdown",
|
||
"id": "83275907",
|
||
"metadata": {},
|
||
"source": [
|
||
"### 2.2.2 侧重于价格与质量"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 9,
|
||
"id": "767f252c",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"The panda plush toy is soft, cute, and loved by the recipient, but the price may be too high for its size compared to other options.\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"prompt = f\"\"\"\n",
|
||
"Your task is to generate a short summary of a product \\\n",
|
||
"review from an ecommerce site to give feedback to the \\\n",
|
||
"pricing deparmtment, responsible for determining the \\\n",
|
||
"price of the product. \n",
|
||
"\n",
|
||
"Summarize the review below, delimited by triple \n",
|
||
"backticks, in at most 30 words, and focusing on any aspects \\\n",
|
||
"that are relevant to the price and perceived value. \n",
|
||
"\n",
|
||
"Review: ```{prod_review}```\n",
|
||
"\"\"\"\n",
|
||
"\n",
|
||
"response = get_completion(prompt)\n",
|
||
"print(response)"
|
||
]
|
||
},
|
||
{
|
||
"attachments": {},
|
||
"cell_type": "markdown",
|
||
"id": "cf54fac4",
|
||
"metadata": {},
|
||
"source": [
|
||
"中文翻译版本"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 12,
|
||
"id": "728d6c57",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"可爱软熊猫公仔,面部表情友好,但价钱有点高,尺寸较小。快递提前一天到货。\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"prompt = f\"\"\"\n",
|
||
"您的任务是从电子商务网站上生成一个产品评论的简短摘要。\n",
|
||
"\n",
|
||
"请对三个反引号之间的评论文本进行概括,最多30个词汇,并且聚焦在产品价格和质量上。\n",
|
||
"\n",
|
||
"评论: ```{prod_review_zh}```\n",
|
||
"\"\"\"\n",
|
||
"\n",
|
||
"response = get_completion(prompt)\n",
|
||
"print(response)"
|
||
]
|
||
},
|
||
{
|
||
"attachments": {},
|
||
"cell_type": "markdown",
|
||
"id": "972dbb1b",
|
||
"metadata": {},
|
||
"source": [
|
||
"可以看到,输出结果以“质量好、价格小贵、尺寸小”开头,体现了对于产品价格与质量的侧重。"
|
||
]
|
||
},
|
||
{
|
||
"attachments": {},
|
||
"cell_type": "markdown",
|
||
"id": "b3ed53d2",
|
||
"metadata": {},
|
||
"source": [
|
||
"### 2.3 关键信息提取"
|
||
]
|
||
},
|
||
{
|
||
"attachments": {},
|
||
"cell_type": "markdown",
|
||
"id": "ba6f5c25",
|
||
"metadata": {},
|
||
"source": [
|
||
"在2.2节中,虽然我们通过添加关键角度侧重的 Prompt ,使得文本摘要更侧重于某一特定方面,但是可以发现,结果中也会保留一些其他信息,如偏重价格与质量角度的概括中仍保留了“快递提前到货”的信息。如果我们只想要提取某一角度的信息,并过滤掉其他所有信息,则可以要求 LLM 进行“文本提取( Extract )”而非“概括( Summarize )”"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 13,
|
||
"id": "2d60dc58",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"\"The product arrived a day earlier than expected.\"\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"prompt = f\"\"\"\n",
|
||
"Your task is to extract relevant information from \\ \n",
|
||
"a product review from an ecommerce site to give \\\n",
|
||
"feedback to the Shipping department. \n",
|
||
"\n",
|
||
"From the review below, delimited by triple quotes \\\n",
|
||
"extract the information relevant to shipping and \\ \n",
|
||
"delivery. Limit to 30 words. \n",
|
||
"\n",
|
||
"Review: ```{prod_review}```\n",
|
||
"\"\"\"\n",
|
||
"\n",
|
||
"response = get_completion(prompt)\n",
|
||
"print(response)"
|
||
]
|
||
},
|
||
{
|
||
"attachments": {},
|
||
"cell_type": "markdown",
|
||
"id": "0339b877",
|
||
"metadata": {},
|
||
"source": [
|
||
"中文翻译版本"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 19,
|
||
"id": "c845ccab",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"快递比预期提前了一天到货。\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"prompt = f\"\"\"\n",
|
||
"您的任务是从电子商务网站上的产品评论中提取相关信息。\n",
|
||
"\n",
|
||
"请从以下三个反引号之间的评论文本中提取产品运输相关的信息,最多30个词汇。\n",
|
||
"\n",
|
||
"评论: ```{prod_review_zh}```\n",
|
||
"\"\"\"\n",
|
||
"\n",
|
||
"response = get_completion(prompt)\n",
|
||
"print(response)"
|
||
]
|
||
},
|
||
{
|
||
"attachments": {},
|
||
"cell_type": "markdown",
|
||
"id": "50498a2b",
|
||
"metadata": {},
|
||
"source": [
|
||
"## 三、同时概括多条文本"
|
||
]
|
||
},
|
||
{
|
||
"attachments": {},
|
||
"cell_type": "markdown",
|
||
"id": "a291541a",
|
||
"metadata": {},
|
||
"source": [
|
||
"在实际的工作流中,我们往往有许许多多的评论文本,以下示例将多条用户评价放进列表,并利用 ```for``` 循环,使用文本概括(Summarize)提示词,将评价概括至小于 20 词,并按顺序打印。当然,在实际生产中,对于不同规模的评论文本,除了使用 ```for``` 循环以外,还可能需要考虑整合评论、分布式等方法提升运算效率。您可以搭建主控面板,来总结大量用户评论,来方便您或他人快速浏览,还可以点击查看原评论。这样您能高效掌握顾客的所有想法。"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 5,
|
||
"id": "ee7caa78",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"review_1 = prod_review \n",
|
||
"\n",
|
||
"# review for a standing lamp\n",
|
||
"review_2 = \"\"\"\n",
|
||
"Needed a nice lamp for my bedroom, and this one \\\n",
|
||
"had additional storage and not too high of a price \\\n",
|
||
"point. Got it fast - arrived in 2 days. The string \\\n",
|
||
"to the lamp broke during the transit and the company \\\n",
|
||
"happily sent over a new one. Came within a few days \\\n",
|
||
"as well. It was easy to put together. Then I had a \\\n",
|
||
"missing part, so I contacted their support and they \\\n",
|
||
"very quickly got me the missing piece! Seems to me \\\n",
|
||
"to be a great company that cares about their customers \\\n",
|
||
"and products. \n",
|
||
"\"\"\"\n",
|
||
"\n",
|
||
"# review for an electric toothbrush\n",
|
||
"review_3 = \"\"\"\n",
|
||
"My dental hygienist recommended an electric toothbrush, \\\n",
|
||
"which is why I got this. The battery life seems to be \\\n",
|
||
"pretty impressive so far. After initial charging and \\\n",
|
||
"leaving the charger plugged in for the first week to \\\n",
|
||
"condition the battery, I've unplugged the charger and \\\n",
|
||
"been using it for twice daily brushing for the last \\\n",
|
||
"3 weeks all on the same charge. But the toothbrush head \\\n",
|
||
"is too small. I’ve seen baby toothbrushes bigger than \\\n",
|
||
"this one. I wish the head was bigger with different \\\n",
|
||
"length bristles to get between teeth better because \\\n",
|
||
"this one doesn’t. Overall if you can get this one \\\n",
|
||
"around the $50 mark, it's a good deal. The manufactuer's \\\n",
|
||
"replacements heads are pretty expensive, but you can \\\n",
|
||
"get generic ones that're more reasonably priced. This \\\n",
|
||
"toothbrush makes me feel like I've been to the dentist \\\n",
|
||
"every day. My teeth feel sparkly clean! \n",
|
||
"\"\"\"\n",
|
||
"\n",
|
||
"# review for a blender\n",
|
||
"review_4 = \"\"\"\n",
|
||
"So, they still had the 17 piece system on seasonal \\\n",
|
||
"sale for around $49 in the month of November, about \\\n",
|
||
"half off, but for some reason (call it price gouging) \\\n",
|
||
"around the second week of December the prices all went \\\n",
|
||
"up to about anywhere from between $70-$89 for the same \\\n",
|
||
"system. And the 11 piece system went up around $10 or \\\n",
|
||
"so in price also from the earlier sale price of $29. \\\n",
|
||
"So it looks okay, but if you look at the base, the part \\\n",
|
||
"where the blade locks into place doesn’t look as good \\\n",
|
||
"as in previous editions from a few years ago, but I \\\n",
|
||
"plan to be very gentle with it (example, I crush \\\n",
|
||
"very hard items like beans, ice, rice, etc. in the \\\n",
|
||
"blender first then pulverize them in the serving size \\\n",
|
||
"I want in the blender then switch to the whipping \\\n",
|
||
"blade for a finer flour, and use the cross cutting blade \\\n",
|
||
"first when making smoothies, then use the flat blade \\\n",
|
||
"if I need them finer/less pulpy). Special tip when making \\\n",
|
||
"smoothies, finely cut and freeze the fruits and \\\n",
|
||
"vegetables (if using spinach-lightly stew soften the \\\n",
|
||
"spinach then freeze until ready for use-and if making \\\n",
|
||
"sorbet, use a small to medium sized food processor) \\\n",
|
||
"that you plan to use that way you can avoid adding so \\\n",
|
||
"much ice if at all-when making your smoothie. \\\n",
|
||
"After about a year, the motor was making a funny noise. \\\n",
|
||
"I called customer service but the warranty expired \\\n",
|
||
"already, so I had to buy another one. FYI: The overall \\\n",
|
||
"quality has gone done in these types of products, so \\\n",
|
||
"they are kind of counting on brand recognition and \\\n",
|
||
"consumer loyalty to maintain sales. Got it in about \\\n",
|
||
"two days.\n",
|
||
"\"\"\"\n",
|
||
"\n",
|
||
"reviews = [review_1, review_2, review_3, review_4]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 10,
|
||
"id": "9d1aa5ac",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"0 Soft and cute panda plush toy loved by daughter, but a bit small for the price. Arrived early. \n",
|
||
"\n",
|
||
"1 Affordable lamp with storage, fast shipping, and excellent customer service. Easy to assemble and missing parts were quickly replaced. \n",
|
||
"\n",
|
||
"2 Good battery life, small toothbrush head, but effective cleaning. Good deal if bought around $50. \n",
|
||
"\n",
|
||
"3 The product was on sale for $49 in November, but the price increased to $70-$89 in December. The base doesn't look as good as previous editions, but the reviewer plans to be gentle with it. A special tip for making smoothies is to freeze the fruits and vegetables beforehand. The motor made a funny noise after a year, and the warranty had expired. Overall quality has decreased. \n",
|
||
"\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"for i in range(len(reviews)):\n",
|
||
" prompt = f\"\"\"\n",
|
||
" Your task is to generate a short summary of a product \\\n",
|
||
" review from an ecommerce site. \n",
|
||
"\n",
|
||
" Summarize the review below, delimited by triple \\\n",
|
||
" backticks in at most 20 words. \n",
|
||
"\n",
|
||
" Review: ```{reviews[i]}```\n",
|
||
" \"\"\"\n",
|
||
" response = get_completion(prompt)\n",
|
||
" print(i, response, \"\\n\")"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"id": "eb878522",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"for i in range(len(reviews)):\n",
|
||
" prompt = f\"\"\"\n",
|
||
" 你的任务是从电子商务网站上的产品评论中提取相关信息。\n",
|
||
"\n",
|
||
" 请对三个反引号之间的评论文本进行概括,最多20个词汇。\n",
|
||
"\n",
|
||
" 评论文本: ```{reviews[i]}```\n",
|
||
" \"\"\"\n",
|
||
" response = get_completion(prompt)\n",
|
||
" print(i, response, \"\\n\")\n"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "d757b389",
|
||
"metadata": {},
|
||
"source": [
|
||
"0 概括:可爱的熊猫毛绒玩具,质量好,送货快,但有点小。 \n",
|
||
"\n",
|
||
"1 这个评论是关于一款具有额外储存空间的床头灯,价格适中。客户对公司的服务和产品表示满意。 \n",
|
||
"\n",
|
||
"2 评论概括:电动牙刷电池寿命长,但刷头太小,需要更长的刷毛。价格合理,使用后牙齿感觉干净。 \n",
|
||
"\n",
|
||
"3 评论概括:产品价格在12月份上涨,质量不如以前,但交付速度快。 "
|
||
]
|
||
}
|
||
],
|
||
"metadata": {
|
||
"kernelspec": {
|
||
"display_name": "Python 3",
|
||
"language": "python",
|
||
"name": "python3"
|
||
},
|
||
"language_info": {
|
||
"codemirror_mode": {
|
||
"name": "ipython",
|
||
"version": 3
|
||
},
|
||
"file_extension": ".py",
|
||
"mimetype": "text/x-python",
|
||
"name": "python",
|
||
"nbconvert_exporter": "python",
|
||
"pygments_lexer": "ipython3",
|
||
"version": "3.8.13"
|
||
},
|
||
"latex_envs": {
|
||
"LaTeX_envs_menu_present": true,
|
||
"autoclose": false,
|
||
"autocomplete": true,
|
||
"bibliofile": "biblio.bib",
|
||
"cite_by": "apalike",
|
||
"current_citInitial": 1,
|
||
"eqLabelWithNumbers": true,
|
||
"eqNumInitial": 1,
|
||
"hotkeys": {
|
||
"equation": "Ctrl-E",
|
||
"itemize": "Ctrl-I"
|
||
},
|
||
"labels_anchors": false,
|
||
"latex_user_defs": false,
|
||
"report_style_numbering": false,
|
||
"user_envs_cfg": false
|
||
},
|
||
"toc": {
|
||
"base_numbering": 1,
|
||
"nav_menu": {},
|
||
"number_sections": true,
|
||
"sideBar": true,
|
||
"skip_h1_title": false,
|
||
"title_cell": "Table of Contents",
|
||
"title_sidebar": "Contents",
|
||
"toc_cell": false,
|
||
"toc_position": {},
|
||
"toc_section_display": true,
|
||
"toc_window_display": true
|
||
}
|
||
},
|
||
"nbformat": 4,
|
||
"nbformat_minor": 5
|
||
}
|