From 6186bee0ef1783d1c4897914016263e7b326a923 Mon Sep 17 00:00:00 2001
From: Ikko Eltociear Ashimine
Date: Fri, 9 Jun 2023 00:43:20 +0900
Subject: [PATCH] =?UTF-8?q?Update=203.=E5=AD=98=E5=82=A8=20.ipynb?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

HuggingFace -> Hugging Face
---
 content/LangChain for LLM Application Development/3.存储 .ipynb | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/content/LangChain for LLM Application Development/3.存储 .ipynb b/content/LangChain for LLM Application Development/3.存储 .ipynb
index 6df6fa0..72f5059 100644
--- a/content/LangChain for LLM Application Development/3.存储 .ipynb
+++ b/content/LangChain for LLM Application Development/3.存储 .ipynb
@@ -1198,7 +1198,7 @@
 "ChatGPT使用一种基于字节对编码(Byte Pair Encoding,BPE)的方法来进行tokenization(将输入文本拆分为token)。 \n",
 "BPE是一种常见的tokenization技术,它将输入文本分割成较小的子词单元。 \n",
 "\n",
-"OpenAI在其官方GitHub上公开了一个最新的开源Python库:tiktoken,这个库主要是用来计算tokens数量的。相比较HuggingFace的tokenizer,其速度提升了好几倍 \n",
+"OpenAI在其官方GitHub上公开了一个最新的开源Python库:tiktoken,这个库主要是用来计算tokens数量的。相比较Hugging Face的tokenizer,其速度提升了好几倍 \n",
 "\n",
 "具体token计算方式,特别是汉字和英文单词的token区别,参考 \n"
 ]