Merge pull request #90 from madefu/master

Yong Mao
2019-05-27 09:42:42 +08:00
committed by GitHub
23 changed files with 124 additions and 1 deletions

2
.gitignore vendored

@@ -102,3 +102,5 @@ venv.bak/
 # mypy
 .mypy_cache/
+/.idea
+/datasets/*.bak


@@ -31,4 +31,4 @@ y_pred = regressor.predict(X_test)
 # regression evaluation
 from sklearn.metrics import r2_score
-print(r2_score(Y_test,y_pred))
+print(r2_score(Y_test, y_pred))
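
For readers unfamiliar with the metric printed in this hunk, here is a minimal sketch of the basic R² formula, 1 - SS_res / SS_tot. The helper name `r_squared` is hypothetical and not part of this PR; scikit-learn's `r2_score` also handles sample weights, multiple outputs, and constant targets, which this sketch ignores.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (basic form only)."""
    mean_y = sum(y_true) / len(y_true)
    # Residual sum of squares: error left over after the prediction
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# A perfect prediction scores 1.0; predicting the mean everywhere scores 0.0
print(r_squared([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))
print(r_squared([1.0, 2.0, 3.0], [2.0, 2.0, 2.0]))
```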

13
Code/KafkaProducer.py Normal file

@@ -0,0 +1,13 @@
#!/usr/bin/python
from kafka import KafkaProducer

# Bootstrap brokers of the Kafka cluster
kafkaHosts = [
    "kafka01.paas.longfor.sit:9092",
    "kafka02.paas.longfor.sit:9092",
    "kafka03.paas.longfor.sit:9092",
]
producer = KafkaProducer(bootstrap_servers=kafkaHosts)

# Queue 20 test messages on the topic
for _ in range(20):
    producer.send("testapplog_plm-prototype", b"Hello....")

# Block until every queued message has been sent
producer.flush()
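
The producer script above sends messages without checking whether they were delivered. As a possible follow-up, here is a minimal sketch that waits for each send to be acknowledged. It relies on kafka-python's `send()` returning a future whose `get(timeout=...)` blocks until the broker responds or raises on failure. The helper name `send_and_confirm` and its parameters are hypothetical and not part of this PR.

```python
def send_and_confirm(producer, topic, payloads, timeout=10):
    """Send every payload, flush, and return one delivery result per message.

    `producer` is assumed to behave like kafka.KafkaProducer:
    send(topic, value) returns a future, and future.get(timeout=...)
    returns delivery metadata or raises if the send failed.
    """
    # Queue all messages first so the client can batch them
    futures = [producer.send(topic, payload) for payload in payloads]
    # Push anything still buffered out to the brokers
    producer.flush()
    # Resolve each future; a failed delivery raises here instead of passing silently
    return [future.get(timeout=timeout) for future in futures]
```

With a real producer this would be called as `send_and_confirm(producer, "testapplog_plm-prototype", [b"Hello...."] * 20)`.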

24
Code/TestKafka.py Normal file

@@ -0,0 +1,24 @@
#!/usr/bin/python
from kafka import KafkaConsumer

kafkaHosts = [
    "kafka01.paas.longfor.sit:9092",
    "kafka02.paas.longfor.sit:9092",
    "kafka03.paas.longfor.sit:9092",
]
'''
earliest
    If a partition has a committed offset, consume from that offset; if it has none, consume from the beginning.
latest
    If a partition has a committed offset, consume from that offset; if it has none, consume only data newly produced to that partition.
none
    If every partition of the topic has a committed offset, consume from after that offset; if any partition has no committed offset, raise an exception.
'''
consumer = KafkaConsumer(
    bootstrap_servers=kafkaHosts, group_id='mdf_group', auto_offset_reset='latest')
consumer.subscribe(["testapplog_plm-prototype"])
for msg in consumer:
    print(msg.value)
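
The offset-reset rules described in the docstring above can be illustrated with a small pure-Python sketch for a single partition. It only models the decision described there and does not talk to Kafka. The function name `resolve_start_offset` and its parameters are hypothetical and not part of this PR or of kafka-python.

```python
def resolve_start_offset(policy, committed, log_start, log_end):
    """Pick the offset a consumer starts from on one partition.

    policy    -- 'earliest', 'latest', or 'none'
    committed -- the group's committed offset, or None if there is none
    log_start -- oldest offset still available in the partition
    log_end   -- offset the next produced message will get
    """
    # A committed offset always wins, whatever the policy is
    if committed is not None:
        return committed
    if policy == "earliest":
        return log_start  # replay the partition from the beginning
    if policy == "latest":
        return log_end    # skip history, read only new messages
    if policy == "none":
        raise LookupError("no committed offset for this partition")
    raise ValueError(f"unknown policy: {policy!r}")


# No committed offset: 'earliest' starts at 0, 'latest' at the log end
print(resolve_start_offset("earliest", None, 0, 42))
print(resolve_start_offset("latest", None, 0, 42))
# A committed offset is used regardless of policy
print(resolve_start_offset("latest", 17, 0, 42))
```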


@@ -0,0 +1,11 @@
Age,Salary
44,72000
27,48000
30,54000
38,61000
40,78000
35,58000
35,52000
48,79000
50,83000
37,67000

6
Code/my/LinerTest.py Normal file

@@ -0,0 +1,6 @@
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# Load the age/salary dataset and show its first row
dataset = pd.read_csv('Data_age_salary.csv')
print(dataset.iloc[:1])
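
The script above only loads the CSV. Assuming the intent (suggested by the file name) is a simple linear fit of Salary on Age, a minimal standard-library sketch could look like the following. The rows are copied from the Age,Salary file in this diff; the helper `fit_line` is hypothetical and stands in for a library regressor such as scikit-learn's `LinearRegression`.

```python
import csv
import io

# Rows copied from the Age,Salary CSV added in this PR
CSV_TEXT = """Age,Salary
44,72000
27,48000
30,54000
38,61000
40,78000
35,58000
35,52000
48,79000
50,83000
37,67000
"""


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the covariance of x and y divided by the variance of x
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The fitted line always passes through the point of means
    intercept = mean_y - slope * mean_x
    return slope, intercept


rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
ages = [float(row["Age"]) for row in rows]
salaries = [float(row["Salary"]) for row in rows]
slope, intercept = fit_line(ages, salaries)
print(f"Salary ~ {intercept:.1f} + {slope:.1f} * Age")
```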


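@@ -0,0 +1,64 @@
# 14 Cheat Sheets to Get You Going with Python Data Science
Translated from DataCamp's cheat sheets; if you are interested, the [original English versions](https://www.datacamp.com/community/data-science-cheatsheets) are available here.
You are welcome to scan the QR code and follow my official account, **呆鸟的Python数据分析**. There is not much content yet, but I will keep adding more.
![呆鸟的Python数据分析](https://upload-images.jianshu.io/upload_images/3240514-6a4349fff3b8b019.jpg?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)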
Part 1: [Python Basics series](https://www.jianshu.com/p/4574d95755db)
* [Python Data Science Cheat Sheet - Python Basics](https://github.com/jaystone776/python-data-science-cheatsheet/blob/master/Python数据科学速查表%20-%20Python%20基础.pdf)
![Python Data Science Cheat Sheet - Python Basics](https://upload-images.jianshu.io/upload_images/3240514-a4268e27432411b0.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)
* [Python Data Science Cheat Sheet - Importing Data](https://github.com/jaystone776/python-data-science-cheatsheet/blob/master/Python数据科学速查表%20-%20导入数据.pdf)
![Python Data Science Cheat Sheet - Importing Data](https://upload-images.jianshu.io/upload_images/3240514-d2b38f138e127333.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)
* [Python Data Science Cheat Sheet - Jupyter Notebook](https://github.com/jaystone776/python-data-science-cheatsheet/blob/master/Python%E6%95%B0%E6%8D%AE%E7%A7%91%E5%AD%A6%E9%80%9F%E6%9F%A5%E8%A1%A8%20-%20Jupyter%20Notebook.pdf)
![Python Data Science Cheat Sheet - Jupyter Notebook](https://upload-images.jianshu.io/upload_images/3240514-111052b862105515.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)
Part 2: [Data Processing series: NumPy, Pandas, and SciPy](https://www.jianshu.com/p/8d51642dfa26)
* [Python Data Science Cheat Sheet - NumPy Basics](https://github.com/jaystone776/python-data-science-cheatsheet/blob/master/Python数据科学速查表%20-%20Numpy%20基础.pdf)
![Python Data Science Cheat Sheet - NumPy Basics](https://upload-images.jianshu.io/upload_images/3240514-811ca5b168b84a75.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)
* [Python Data Science Cheat Sheet - Pandas Basics](https://github.com/jaystone776/python-data-science-cheatsheet/blob/master/Python数据科学速查表%20-%20Pandas%20基础.pdf)
![Python Data Science Cheat Sheet - Pandas Basics](https://upload-images.jianshu.io/upload_images/3240514-0711ea1cb1d1fcee.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)
* [Python Data Science Cheat Sheet - Advanced Pandas](https://github.com/jaystone776/python-data-science-cheatsheet/blob/master/Python数据科学速查表%20-%20Pandas%20进阶.pdf)
![Python Data Science Cheat Sheet - Advanced Pandas](https://upload-images.jianshu.io/upload_images/3240514-475cfa9174303f60.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)
* [Python Data Science Cheat Sheet - SciPy](https://github.com/jaystone776/python-data-science-cheatsheet/blob/master/Python%E6%95%B0%E6%8D%AE%E7%A7%91%E5%AD%A6%E9%80%9F%E6%9F%A5%E8%A1%A8%20-%20SciPy.pdf)
![Python Data Science Cheat Sheet - SciPy](https://upload-images.jianshu.io/upload_images/3240514-fc9fa2b69a820041.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)
Part 3: [Visualization series: Matplotlib, Bokeh, Seaborn](https://www.jianshu.com/p/7e186d43d7f1)
* [Python Data Science Cheat Sheet - Matplotlib](https://github.com/jaystone776/python-data-science-cheatsheet/blob/master/Python数据科学速查表%20-%20Matplotlib%20绘图.pdf)
![Python Data Science Cheat Sheet - Matplotlib](https://upload-images.jianshu.io/upload_images/3240514-9c57bcf33113cf98.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)
* [Python Data Science Cheat Sheet - Bokeh](https://github.com/jaystone776/python-data-science-cheatsheet/blob/master/Python数据科学速查表%20-%20Bokeh.pdf)
![Python Data Science Cheat Sheet - Bokeh](https://upload-images.jianshu.io/upload_images/3240514-d87990f4a3f7f0ae.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)
* [Python Data Science Cheat Sheet - Seaborn](https://github.com/jaystone776/python-data-science-cheatsheet/blob/master/Python%E6%95%B0%E6%8D%AE%E7%A7%91%E5%AD%A6%E9%80%9F%E6%9F%A5%E8%A1%A8%20-%20Seaborn.pdf)
![Python Data Science Cheat Sheet - Seaborn](https://upload-images.jianshu.io/upload_images/3240514-1c2f56b96f841421.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)
Part 4: [Machine Learning series: Keras, Scikit-learn](https://www.jianshu.com/p/cba49ff5fc97)
* [Python Data Science Cheat Sheet - Keras](https://github.com/jaystone776/python-data-science-cheatsheet/blob/master/Python数据科学速查表%20-%20Keras.pdf)
![Python Data Science Cheat Sheet - Keras](https://upload-images.jianshu.io/upload_images/3240514-ea9657c97b5e0d19.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)
* [Python Data Science Cheat Sheet - Scikit-learn](https://github.com/jaystone776/python-data-science-cheatsheet/blob/master/Python%E6%95%B0%E6%8D%AE%E7%A7%91%E5%AD%A6%E9%80%9F%E6%9F%A5%E8%A1%A8%20-%20Scikit-Learn.pdf)
![Python Data Science Cheat Sheet - Scikit-learn](https://upload-images.jianshu.io/upload_images/3240514-b8cbecb3a9194b71.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)
Part 5: [PySpark series: SQL and RDD](https://www.jianshu.com/p/7dea578c56d8)
* [Python Data Science Cheat Sheet - Spark SQL Basics](https://github.com/jaystone776/python-data-science-cheatsheet/blob/master/Python%E6%95%B0%E6%8D%AE%E7%A7%91%E5%AD%A6%E9%80%9F%E6%9F%A5%E8%A1%A8%20-%20Spark%20SQL%20%E5%9F%BA%E7%A1%80.pdf)
![pyspark.png](https://upload-images.jianshu.io/upload_images/3240514-32909e63b67c0837.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)
* [Python Data Science Cheat Sheet - Spark RDD Basics](https://github.com/jaystone776/python-data-science-cheatsheet/blob/master/Python%E6%95%B0%E6%8D%AE%E7%A7%91%E5%AD%A6%E9%80%9F%E6%9F%A5%E8%A1%A8%20-%20Spark%20RDD%20%E5%9F%BA%E7%A1%80.pdf)
![RDD.png](https://upload-images.jianshu.io/upload_images/3240514-43308e4ca89d8699.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)
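If you like this article, please follow my Jianshu collection **[呆鸟的Python数据分析](https://www.jianshu.com/c/38980843c0f2)**.
Thanks to the **Python爱好者社区** official account from **天善智能** for its continued support. I strongly recommend it as well: it was my guide when I first started learning Python data analysis. You are welcome to follow it!
![Python爱好者社区](https://upload-images.jianshu.io/upload_images/3240514-29c87337342b0904.jpg?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)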


@@ -24,3 +24,6 @@ Hours,Scores
 3.8,35
 6.9,76
 7.8,86
+2.1,93
+2.2,93
+2.5,93