
Python Pandas: output dataframe to csv with integers

Reposted. Author: IT老高. Updated: 2023-10-28 22:15:39

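I have a pandas.DataFrame that I want to export to a CSV file. However, pandas seems to write some of the values as float instead of int. I could not find how to change this behavior.

Building the data frame: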

import pandas

df = pandas.DataFrame(columns=['a','b','c','d'], index=['x','y','z'], dtype=int)
x = pandas.Series([10,10,10], index=['a','b','d'], dtype=int)
y = pandas.Series([1,5,2,3], index=['a','b','c','d'], dtype=int)
z = pandas.Series([1,2,3,4], index=['a','b','c','d'], dtype=int)
df.loc['x']=x; df.loc['y']=y; df.loc['z']=z

Viewing it:

>>> df
    a   b    c   d
x  10  10  NaN  10
y   1   5    2   3
z   1   2    3   4

Exporting it:

>>> df.to_csv('test.csv', sep='\t', na_rep='0', dtype=int)
>>> for l in open('test.csv'): print(l.strip('\n'))
	a	b	c	d
x	10.0	10.0	0	10.0
y	1	5	2	3
z	1	2	3	4

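Why do the tens all have a trailing .0?

Of course, I could add the following function to my pipeline to reconvert the whole CSV file, but that seems unnecessary: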

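def lines_as_integer(path):
    """Yield the lines of a whitespace-separated table with values cast to int."""
    with open(path) as handle:
        # Pass the header line through unchanged.
        yield next(handle)
        for line in handle:
            fields = line.split()
            label = fields[0]
            # Go through float first so that text such as '10.0' parses.
            values = (int(float(value)) for value in fields[1:])
            yield label + '\t' + '\t'.join(map(str, values)) + '\n'

# path_table_float and path_table_int are the input and output paths.
with open(path_table_int, 'w') as handle:
    handle.writelines(lines_as_integer(path_table_float))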

Best answer

This is a "gotcha" in pandas (Support for integer NA): integer columns that contain NaN are converted to float.

This trade-off is made largely for memory and performance reasons, and also so that the resulting Series continues to be “numeric”. One possibility is to use dtype=object arrays instead.
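
As a minimal sketch of that behavior and two common ways around it, the example below rebuilds the question's table directly from a dict (a simplification of the original construction) and writes to an in-memory buffer instead of test.csv. The helper name to_tsv is hypothetical, and the nullable "Int64" dtype assumes pandas 0.24 or newer.

```python
import io

import numpy as np
import pandas as pd

# Rebuild the table from the question; the missing cell is NaN.
df = pd.DataFrame(
    {"a": [10, 1, 1], "b": [10, 5, 2], "c": [np.nan, 2, 3], "d": [10, 3, 4]},
    index=["x", "y", "z"],
)

# Column "c" holds a NaN, so pandas stores it as float64.
print(df.dtypes)

# Option 1: fill the gaps first, then cast every column to a plain integer dtype.
filled = df.fillna(0).astype(int)

# Option 2: keep the gaps as missing values with the nullable "Int64" dtype
# and let na_rep substitute a placeholder at export time.
nullable = df.astype("Int64")


def to_tsv(frame, **kwargs):
    """Hypothetical helper: return the tab-separated output as a string."""
    buffer = io.StringIO()
    frame.to_csv(buffer, sep="\t", **kwargs)
    return buffer.getvalue()


csv_filled = to_tsv(filled)
csv_nullable = to_tsv(nullable, na_rep="0")
print(csv_filled)
```

Either variant should write 10 rather than 10.0. Filling first is the simpler choice when 0 is an acceptable stand-in for missing values, as the na_rep='0' in the question suggests; the nullable dtype is worth considering when the gaps need to stay distinguishable from real zeros inside the DataFrame.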

For more on Python Pandas: output dataframe to csv with integers, see the similar question on Stack Overflow: https://stackoverflow.com/questions/17092671/
