diff --git a/README.md b/README.md
index 7c5d780..a18d32a 100644
--- a/README.md
+++ b/README.md
@@ -1,4 +1,4 @@
-# chinese-poetry
+# chinese-poetry | [Quan Song Ci (全宋词): crawling process and data analysis](http://jackeygao.io/words/crawl-ci.html)
 
 ![Classical Chinese poetry](https://raw.githubusercontent.com/jackeyGao/chinese-poetry/master/images/full-tang-poetry.png "Classical Chinese poetry")
 
@@ -18,6 +18,18 @@
 
 ![Song shi authors ranked by number of works](https://raw.githubusercontent.com/jackeyGao/chinese-poetry/master/images/song_author_topK.png "Song shi authors ranked by number of works")
 
+## Song ci authors ranked by number of works
+
+![Song ci authors ranked by number of works](https://raw.githubusercontent.com/jackeyGao/chinese-poetry/master/images/ci_author_topK.png "Song ci authors ranked by number of works")
+
+## Most frequent words in Song ci
+
+![Most frequent words in Song ci](https://raw.githubusercontent.com/jackeyGao/chinese-poetry/master/images/ci_words_topK.png)
+
+## Most popular cipai (tune names) in the Northern and Southern Song
+
+![Most popular cipai (tune names) in the Northern and Southern Song](https://raw.githubusercontent.com/jackeyGao/chinese-poetry/master/images/ci_rhythmic_topK.png)
+
 ## Data distribution format
 
 
@@ -61,3 +73,5 @@
 
 Currently only the Quan Tang Shi (全唐诗) records are included; some classical poetry from after the Tang has not been collected. Classical poetry is a treasure of Chinese culture, so if you have a reliable data source, PRs adding it are welcome.
 
+## License
+[MIT](https://zh.wikipedia.org/wiki/MIT%E8%A8%B1%E5%8F%AF%E8%AD%89) license.
diff --git a/ci/README.md b/ci/README.md
new file mode 100644
index 0000000..6005782
--- /dev/null
+++ b/ci/README.md
@@ -0,0 +1,27 @@
+Quan Song Ci (全宋词)
+---
+
+
+```sql
+sqlite> .tables
+ci ciauthor
+sqlite> select count(1) from ci;
+21050
+sqlite> select count(1) from ciauthor;
+1564
+sqlite> select * from ci limit 1;
+1|导引|和岘|气和玉烛,睿化著鸿明。
+缇管一阳生。
+郊盛礼燔柴毕,旋轸凤凰城。
+森罗仪卫振华缨。
+载路溢欢声。
+皇图大业超前古,垂象泰阶平。
+岁时丰衍,九土乐升平。
+睹寰海澄清。
+道高尧舜垂衣治,日月并文明。
+sqlite> select * from ciauthor limit 1;
+1|苏轼|苏轼:(1037-1101)北宋文学家、书画家...
+```
+
+
+
diff --git a/ci/ci.db b/ci/ci.db
new file mode 100644
index 0000000..bfca7ee
Binary files /dev/null and b/ci/ci.db differ
diff --git a/images/ci_author_topK.png b/images/ci_author_topK.png
new file mode 100644
index 0000000..71b288f
Binary files /dev/null and b/images/ci_author_topK.png differ
diff --git a/images/ci_rhythmic_topK.png b/images/ci_rhythmic_topK.png
new file mode 100644
index 0000000..2b08b12
Binary files /dev/null and b/images/ci_rhythmic_topK.png differ
diff --git a/images/ci_words_topK.png b/images/ci_words_topK.png
new file mode 100644
index 0000000..24387cc
Binary files /dev/null and b/images/ci_words_topK.png differ
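
Reviewer note: the sqlite session in `ci/README.md` shows only the two table names (`ci`, `ciauthor`) and one sample row from each, so it may help to sketch how a top-K ranking by author or tune name could be queried from Python. This is a minimal sketch, not the method used to produce the `*_topK.png` charts. Every column name (`id`, `rhythmic`, `author`, `content`, `name`, `description`) is an assumption inferred from the sample rows, and the in-memory rows are placeholders rather than data from `ci/ci.db`.

```python
import sqlite3


def open_demo_db():
    """Build a tiny in-memory stand-in for ci/ci.db (column names are assumed)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE ci (id INTEGER PRIMARY KEY, rhythmic TEXT, author TEXT, content TEXT);
        CREATE TABLE ciauthor (id INTEGER PRIMARY KEY, name TEXT, description TEXT);
        """
    )
    # Placeholder rows, not real data from the repository.
    conn.executemany(
        "INSERT INTO ci (rhythmic, author, content) VALUES (?, ?, ?)",
        [
            ("tune_x", "author_a", "placeholder text 1"),
            ("tune_x", "author_a", "placeholder text 2"),
            ("tune_y", "author_a", "placeholder text 3"),
            ("tune_x", "author_b", "placeholder text 4"),
        ],
    )
    conn.executemany(
        "INSERT INTO ciauthor (name, description) VALUES (?, ?)",
        [("author_a", "placeholder"), ("author_b", "placeholder")],
    )
    return conn


def top_k(conn, column, k=10):
    """Return the k most frequent values of `column` in ci, with their counts."""
    # Column names cannot be bound as SQL parameters, so whitelist them.
    if column not in {"author", "rhythmic"}:
        raise ValueError(f"unsupported column: {column}")
    query = (
        f"SELECT {column}, COUNT(*) AS n FROM ci "
        f"GROUP BY {column} ORDER BY n DESC, {column} ASC LIMIT ?"
    )
    return conn.execute(query, (k,)).fetchall()


conn = open_demo_db()
print(top_k(conn, "author", k=2))    # ranking by number of works per author
print(top_k(conn, "rhythmic", k=2))  # ranking by tune name
```

To try this against the real file, `sqlite3.connect("ci/ci.db")` would replace `open_demo_db()`, after checking the actual column names with `.schema ci` in the sqlite shell.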