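This post shares a way to read only selected columns of a CSV file with pandas. I hope it is a useful reference.
After following a tutorial on reading the first few rows of a CSV file, I immediately wondered whether the same could be done for the first few columns. After a number of attempts I found one way to do it.
The reason I wanted to read only the leading columns is that a CSV file I had on hand happened to contain several trailing columns with no usable data, yet they were always present. The original data looks like this: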
GreydeMac-mini:chapter06 greyzhang$ cat data.csv
1,name_01,coment_01,,,,
2,name_02,coment_02,,,,
3,name_03,coment_03,,,,
4,name_04,coment_04,,,,
5,name_05,coment_05,,,,
6,name_06,coment_06,,,,
7,name_07,coment_07,,,,
8,name_08,coment_08,,,,
9,name_09,coment_09,,,,
10,name_10,coment_10,,,,
11,name_11,coment_11,,,,
12,name_12,coment_12,,,,
13,name_13,coment_13,,,,
14,name_14,coment_14,,,,
15,name_15,coment_15,,,,
16,name_16,coment_16,,,,
17,name_17,coment_17,,,,
18,name_18,coment_18,,,,
19,name_19,coment_19,,,,
20,name_20,coment_20,,,,
21,name_21,coment_21,,,,
If pandas reads all of the data, printing it gives the following result:
In [41]: data = pd.read_csv('data.csv')
In [42]: data
Out[42]:
     1  name_01  coment_01  Unnamed: 3  Unnamed: 4  Unnamed: 5  Unnamed: 6
0    2  name_02  coment_02         NaN         NaN         NaN         NaN
1    3  name_03  coment_03         NaN         NaN         NaN         NaN
2    4  name_04  coment_04         NaN         NaN         NaN         NaN
3    5  name_05  coment_05         NaN         NaN         NaN         NaN
4    6  name_06  coment_06         NaN         NaN         NaN         NaN
5    7  name_07  coment_07         NaN         NaN         NaN         NaN
6    8  name_08  coment_08         NaN         NaN         NaN         NaN
7    9  name_09  coment_09         NaN         NaN         NaN         NaN
8   10  name_10  coment_10         NaN         NaN         NaN         NaN
9   11  name_11  coment_11         NaN         NaN         NaN         NaN
10  12  name_12  coment_12         NaN         NaN         NaN         NaN
11  13  name_13  coment_13         NaN         NaN         NaN         NaN
12  14  name_14  coment_14         NaN         NaN         NaN         NaN
13  15  name_15  coment_15         NaN         NaN         NaN         NaN
14  16  name_16  coment_16         NaN         NaN         NaN         NaN
15  17  name_17  coment_17         NaN         NaN         NaN         NaN
16  18  name_18  coment_18         NaN         NaN         NaN         NaN
17  19  name_19  coment_19         NaN         NaN         NaN         NaN
18  20  name_20  coment_20         NaN         NaN         NaN         NaN
19  21  name_21  coment_21         NaN         NaN         NaN         NaN
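Although this is not really an obstacle while learning, after spending a long time in a terminal I tend to prefer a cleaner look. The usecols parameter of read_csv reduces this clutter to some extent.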
In [45]: data = pd.read_csv('data.csv',usecols=[0,1,2,3])
In [46]: data
Out[46]:
     1  name_01  coment_01  Unnamed: 3
0    2  name_02  coment_02         NaN
1    3  name_03  coment_03         NaN
2    4  name_04  coment_04         NaN
3    5  name_05  coment_05         NaN
4    6  name_06  coment_06         NaN
5    7  name_07  coment_07         NaN
6    8  name_08  coment_08         NaN
7    9  name_09  coment_09         NaN
8   10  name_10  coment_10         NaN
9   11  name_11  coment_11         NaN
10  12  name_12  coment_12         NaN
11  13  name_13  coment_13         NaN
12  14  name_14  coment_14         NaN
13  15  name_15  coment_15         NaN
14  16  name_16  coment_16         NaN
15  17  name_17  coment_17         NaN
16  18  name_18  coment_18         NaN
17  19  name_19  coment_19         NaN
18  20  name_20  coment_20         NaN
19  21  name_21  coment_21         NaN
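For readers who want to reproduce this without the original file, here is a minimal sketch. It assumes a three-row in-memory sample (the hypothetical `SAMPLE` string, wrapped in `io.StringIO`) in place of data.csv; only `pd.read_csv` and its `usecols` parameter come from the post itself.

```python
import io

import pandas as pd

# Hypothetical stand-in for data.csv: three rows in the same layout,
# each with three real fields followed by four empty ones.
SAMPLE = (
    "1,name_01,coment_01,,,,\n"
    "2,name_02,coment_02,,,,\n"
    "3,name_03,coment_03,,,,\n"
)

# Reading everything: the first line is taken as the header, and the
# empty trailing fields become "Unnamed: N" columns filled with NaN.
full = pd.read_csv(io.StringIO(SAMPLE))

# Reading only the columns at positions 0..3 (zero-based).
trimmed = pd.read_csv(io.StringIO(SAMPLE), usecols=[0, 1, 2, 3])

print(full.shape)
print(trimmed.columns.tolist())
```

The positions passed to usecols are zero-based, so `[0, 1, 2, 3]` keeps the three data columns plus the first empty one.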
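To make the "boundary" of the data visible, the read above deliberately kept the first of the empty columns. In normal use we would probably want to drop that last column as well; to do so, simply remove its index from the usecols argument.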
In [47]: data = pd.read_csv('data.csv',usecols=[0,1,2])
In [48]: data
Out[48]:
     1  name_01  coment_01
0    2  name_02  coment_02
1    3  name_03  coment_03
2    4  name_04  coment_04
3    5  name_05  coment_05
4    6  name_06  coment_06
5    7  name_07  coment_07
6    8  name_08  coment_08
7    9  name_09  coment_09
8   10  name_10  coment_10
9   11  name_11  coment_11
10  12  name_12  coment_12
11  13  name_13  coment_13
12  14  name_14  coment_14
13  15  name_15  coment_15
14  16  name_16  coment_16
15  17  name_17  coment_17
16  18  name_18  coment_18
17  19  name_19  coment_19
18  20  name_20  coment_20
19  21  name_21  coment_21
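One thing worth noticing in the output above is that the first line of the file (1, name_01, coment_01) has been used as the column header, so only 20 of the 21 rows appear as data. If the file really has no header row, a possible variation is to pass `header=None` together with usecols. The sketch below assumes that situation; the in-memory `SAMPLE` string and the column names `id`, `name` and `comment` are made up for illustration and are not part of the original post.

```python
import io

import pandas as pd

# Hypothetical stand-in for data.csv (same layout, three rows only).
SAMPLE = (
    "1,name_01,coment_01,,,,\n"
    "2,name_02,coment_02,,,,\n"
    "3,name_03,coment_03,,,,\n"
)

# header=None tells pandas that the file has no header line, so the
# first row is kept as data; usecols still selects by position.
raw = pd.read_csv(io.StringIO(SAMPLE), header=None, usecols=[0, 1, 2])

# Illustrative column names (an assumption, not from the original file).
raw.columns = ["id", "name", "comment"]

print(raw)
```

With this variation all three sample rows survive as data, and the columns carry readable names instead of values borrowed from the first row.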