
Python - "too many indices for array" error when saving a DataFrame to CSV

Reposted. Author: 太空宇宙. Updated: 2023-11-04 08:37:56

I am trying to save a DataFrame to CSV and I get a "too many indices for array" error. The code I use to save it is:

df.to_csv('CCS_Matrix.csv')

The DataFrame looks like this:

   Var10  Var100  Var101
0      0       1       1
1      0       0       1
2      0       1       0

The dataset has 250 columns and about 10 million rows.

The dtypes of the DataFrame are:

Var10     int64
Var100    int64
Var101    int64
etc.

All 250 columns have the same dtype.

Here is the full output of the error message:

---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-16-37cbe55e6c0d> in <module>()
----> 1 df.to_csv('CCS_Matrix.csv', encoding='utf-8')

~/anaconda3/lib/python3.6/site-packages/pandas/core/frame.py in to_csv(self, path_or_buf, sep, na_rep, float_format, columns, header, index, index_label, mode, encoding, compression, quoting, quotechar, line_terminator, chunksize, tupleize_cols, date_format, doublequote, escapechar, decimal)
1401 doublequote=doublequote,
1402 escapechar=escapechar, decimal=decimal)
-> 1403 formatter.save()
1404
1405 if path_or_buf is None:

~/anaconda3/lib/python3.6/site-packages/pandas/io/formats/format.py in save(self)
1590 self.writer = csv.writer(f, **writer_kwargs)
1591
-> 1592 self._save()
1593
1594 finally:

~/anaconda3/lib/python3.6/site-packages/pandas/io/formats/format.py in _save(self)
1691 break
1692
-> 1693 self._save_chunk(start_i, end_i)
1694
1695 def _save_chunk(self, start_i, end_i):

~/anaconda3/lib/python3.6/site-packages/pandas/io/formats/format.py in _save_chunk(self, start_i, end_i)
1705 decimal=self.decimal,
1706 date_format=self.date_format,
-> 1707 quoting=self.quoting)
1708
1709 for col_loc, col in zip(b.mgr_locs, d):

~/anaconda3/lib/python3.6/site-packages/pandas/core/internals.py in to_native_types(self, slicer, na_rep, quoting, **kwargs)
611 values = self.values
612 if slicer is not None:
--> 613 values = values[:, slicer]
614 mask = isnull(values)
615

~/anaconda3/lib/python3.6/site-packages/pandas/core/sparse/array.py in __getitem__(self, key)
417 return self._get_val_at(key)
418 elif isinstance(key, tuple):
--> 419 data_slice = self.values[key]
420 else:
421 if isinstance(key, SparseArray):

IndexError: too many indices for array

Best answer

Can you print type(df)? I noticed this problem with SparseDataFrames here.
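A minimal sketch of that check. On the pandas version in the traceback, type(df) alone would report SparseDataFrame. This sketch assumes pandas 1.0 or later, where that class no longer exists and sparse data shows up in the column dtypes instead. The helper name describe_frame and the small frame are hypothetical, for illustration only.

```python
import pandas as pd


def describe_frame(df):
    """Return the frame's class name and the names of any sparse columns."""
    sparse_cols = [
        col for col, dtype in df.dtypes.items()
        if isinstance(dtype, pd.SparseDtype)
    ]
    return type(df).__name__, sparse_cols


# Hypothetical small frame standing in for the 250-column one in the question.
df = pd.DataFrame({
    "Var10": pd.arrays.SparseArray([0, 0, 0]),  # sparse column
    "Var100": [1, 0, 1],                        # ordinary dense column
})

kind, sparse_cols = describe_frame(df)
print(kind, sparse_cols)
```

If the list of sparse columns is non-empty, the frame holds sparse data even though df.dtypes may look like plain integers at a glance.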

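I was able to work around the problem by calling .to_dense() on the SparseDataFrame, which produces a regular DataFrame. After that, saving worked fine. This is obviously not ideal for memory reasons, but at least it works in the short term.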
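One way to soften the memory cost is to densify only a slice of rows at a time while writing. The sketch below is an assumption on my part, not the answerer's method: it targets pandas 1.0 or later (sparse columns rather than SparseDataFrame), and the helper names densify and to_csv_in_dense_chunks, the chunk size, and the sample frame are all hypothetical.

```python
import io

import pandas as pd


def densify(df):
    """Return a copy of df with every sparse column converted to dense."""
    out = df.copy()
    for col, dtype in df.dtypes.items():
        if isinstance(dtype, pd.SparseDtype):
            out[col] = df[col].sparse.to_dense()
    return out


def to_csv_in_dense_chunks(df, handle, chunk_rows=100_000):
    """Write df as CSV to an open text handle, densifying chunk_rows rows at a time."""
    for start in range(0, len(df), chunk_rows):
        chunk = densify(df.iloc[start:start + chunk_rows])
        # Emit the header only once, with the first chunk.
        chunk.to_csv(handle, header=(start == 0))


# Hypothetical small frame; every column is sparse, as in the question's case.
df = pd.DataFrame({
    "Var10": pd.arrays.SparseArray([0, 0, 0]),
    "Var100": pd.arrays.SparseArray([1, 0, 1]),
    "Var101": pd.arrays.SparseArray([1, 1, 0]),
})

buf = io.StringIO()  # stands in for open("CCS_Matrix.csv", "w", newline="")
to_csv_in_dense_chunks(df, buf, chunk_rows=2)
print(buf.getvalue())
```

Only one chunk is dense at any moment, so peak memory depends on chunk_rows rather than on the full 10 million rows. The best chunk size would need to be tuned to the machine.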

The pandas team has responded and confirmed that this is indeed a bug.

Regarding Python - "too many indices for array" error when saving a DataFrame to CSV, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/47342718/
