python - Filtering out "empty array" values in a Pandas DataFrame

Reposted by: 太空宇宙. Updated: 2023-11-04 07:51:30

Suppose I have a DataFrame d in which one column holds Python arrays (lists) as values.

>>> import pandas as pd
>>> d = pd.DataFrame([['foo', ['bar']], ['biz', []]], columns=['a', 'b'])
>>> print(d)

     a      b
0  foo  [bar]
1  biz     []

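Now I want to filter out the rows that have an empty array.

I have tried several variations, but none has worked so far:

Trying to test the column as a "truthy" value: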

>>> d[d['b']]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/myname/.local/lib/python2.7/site-packages/pandas/core/frame.py", line 2682, in __getitem__
    return self._getitem_array(key)
  File "/home/myname/.local/lib/python2.7/site-packages/pandas/core/frame.py", line 2726, in _getitem_array
    indexer = self.loc._convert_to_indexer(key, axis=1)
  File "/home/myname/.local/lib/python2.7/site-packages/pandas/core/indexing.py", line 1314, in _convert_to_indexer
    indexer = check = labels.get_indexer(objarr)
  File "/home/myname/.local/lib/python2.7/site-packages/pandas/core/indexes/base.py", line 3259, in get_indexer
    indexer = self._engine.get_indexer(target._ndarray_values)
  File "pandas/_libs/index.pyx", line 301, in pandas._libs.index.IndexEngine.get_indexer
  File "pandas/_libs/hashtable_class_helper.pxi", line 1544, in pandas._libs.hashtable.PyObjectHashTable.lookup
TypeError: unhashable type: 'list'
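
A note on why this fails: a key that is a Series but not a boolean one is treated as a list of column labels, which is why the traceback ends in a label lookup on the lists themselves. As a minimal sketch, assuming the column holds only lists, a truthiness test can be turned into a boolean mask first (the names `mask` and `filtered` are illustrative and not part of the question):

```python
import pandas as pd

d = pd.DataFrame([['foo', ['bar']], ['biz', []]], columns=['a', 'b'])

# bool([]) is False and bool(['bar']) is True, so mapping bool over the
# column yields one True/False per row instead of a Series of lists.
mask = d['b'].map(bool)

# A boolean Series is interpreted as a row filter.
filtered = d[mask]
print(filtered)
```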

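Trying an explicit length check. It seems that len() is being applied to the Series itself rather than to the individual values in it.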

>>> d[ len(d['b']) > 0 ]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/myname/.local/lib/python2.7/site-packages/pandas/core/frame.py", line 2688, in __getitem__
    return self._getitem_column(key)
  File "/home/myname/.local/lib/python2.7/site-packages/pandas/core/frame.py", line 2695, in _getitem_column
    return self._get_item_cache(key)
  File "/home/myname/.local/lib/python2.7/site-packages/pandas/core/generic.py", line 2489, in _get_item_cache
    values = self._data.get(item)
  File "/home/myname/.local/lib/python2.7/site-packages/pandas/core/internals.py", line 4115, in get
    loc = self.items.get_loc(item)
  File "/home/myname/.local/lib/python2.7/site-packages/pandas/core/indexes/base.py", line 3080, in get_loc
    return self._engine.get_loc(self._maybe_cast_indexer(key))
  File "pandas/_libs/index.pyx", line 140, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 162, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 1492, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 1500, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: True
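
That suspicion can be checked in isolation. The sketch below (variable names are illustrative) shows that the built-in len() returns the number of rows, so the indexing expression reduces to d[True], a lookup for a column labelled True. Mapping len over the column is one possible way to get a per-row length instead.

```python
import pandas as pd

d = pd.DataFrame([['foo', ['bar']], ['biz', []]], columns=['a', 'b'])

# len() on a Series counts its rows, so this is a single integer.
n_rows = len(d['b'])
# The comparison is therefore one plain bool, and d[True] would look
# for a column labelled True.
key = n_rows > 0

# Applying len to each element gives one length per row.
lengths = d['b'].map(len)
filtered = d[lengths > 0]
print(filtered)
```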

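Comparing directly against an empty array, the way we might compare against an empty string (incidentally, this does work if the values are strings instead of arrays).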

>>> d[ d['b'] == [] ]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/myname/.local/lib/python2.7/site-packages/pandas/core/ops.py", line 1283, in wrapper
    res = na_op(values, other)
  File "/home/myname/.local/lib/python2.7/site-packages/pandas/core/ops.py", line 1143, in na_op
    result = _comp_method_OBJECT_ARRAY(op, x, y)
  File "/home/myname/.local/lib/python2.7/site-packages/pandas/core/ops.py", line 1120, in _comp_method_OBJECT_ARRAY
    result = libops.vec_compare(x, y, op)
  File "pandas/_libs/ops.pyx", line 128, in pandas._libs.ops.vec_compare
ValueError: Arrays were different lengths: 2 vs 0

Best answer

Use the string accessor .str to check the length of the lists in a pandas Series:

d[d.b.str.len() > 0]

Output:

     a      b
0  foo  [bar]
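
For reference, here is a self-contained sketch of the answer as a script. The second frame `d2`, with a missing value in the list column, is a hypothetical case added for illustration and does not appear in the original question. It shows that .str.len() gives NaN for a missing value, so the `> 0` comparison drops that row as well.

```python
import pandas as pd

d = pd.DataFrame([['foo', ['bar']], ['biz', []]], columns=['a', 'b'])

# .str.len() computes the length of each element, lists included.
non_empty = d[d['b'].str.len() > 0]
print(non_empty)

# Hypothetical extra case: a missing value in the list column.
d2 = pd.DataFrame({'a': ['foo', 'biz', 'baz'], 'b': [['bar'], [], None]})
# The length of a missing value is NaN, and NaN > 0 is False,
# so the row with None is filtered out along with the empty list.
non_empty2 = d2[d2['b'].str.len() > 0]
print(non_empty2)
```

If rows with missing values should be kept or handled differently, the mask would need an explicit rule for them before filtering.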

Regarding python - Filtering out "empty array" values in a Pandas DataFrame, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/54046741/
