python - Pandas bug: __setitem__() doesn't recognize dictionary values as a list of column names


Reposted by: 行者123 Updated: 2023-12-03 17:18:52


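Edit: It looks like this is a potential bug in pandas. See this GitHub issue. @NicMoetsch noticed that the unexpected behavior when assigning with dictionary values is related to differences between the DataFrame's __setitem__() and __getitem__().

In my earlier code, I renamed some columns with a dictionary: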

cols_dict = {
     'Long_column_Name': 'first_column',
     'Other_Long_Column_Name': 'second_column',
     'AnotherLongColName': 'third_column'
}
for key, val in cols_dict.items():
    df.rename(columns={key: val}, inplace=True)

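(I know the loop is not needed here; in my actual code I have to search the columns of each DataFrame in a list of DataFrames for substring matches of the dictionary keys.)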
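
For illustration only, the substring-matching rename described above could be sketched roughly as follows. The helper name build_rename_map, the sample frames, and the "first matching key wins" rule are assumptions made for this sketch, not the asker's actual code.

```python
import pandas as pd

cols_dict = {
    'Long_column_Name': 'first_column',
    'Other_Long_Column_Name': 'second_column',
}

# Hypothetical sample data: column names that merely contain the keys.
frames = [
    pd.DataFrame({'Long_column_Name (2020)': [1, 2], 'unrelated': [3, 4]}),
    pd.DataFrame({'x Other_Long_Column_Name': [5, 6]}),
]


def build_rename_map(columns, patterns):
    """Map each column that contains a key of `patterns` to its new name."""
    mapping = {}
    for col in columns:
        for key, new_name in patterns.items():
            if key in str(col):
                mapping[col] = new_name
                break  # assumption: the first matching key wins
    return mapping


# One rename() call per frame instead of one call per key.
renamed = [
    frame.rename(columns=build_rename_map(frame.columns, cols_dict))
    for frame in frames
]
```

Columns without a match (such as 'unrelated' here) are left unchanged.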
Later I used applymap() to do some cleanup, indexing with the dictionary values, and it worked fine:

pibs[cols_dict.values()].applymap(
    lambda x: np.nan if ':' in str(x) else x
)

But when I try to assign the slice back to itself, I get a KeyError (full error message here).

pibs[cols_dict.values()] = pibs[cols_dict.values()].applymap(
    lambda x: np.nan if ':' in str(x) else x
)

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
~/.local/lib/python3.7/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   3079             try:
-> 3080                 return self._engine.get_loc(casted_key)
   3081             except KeyError as err:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
KeyError: dict_values(['first_column', 'second_column', 'third_column'])

If I convert the dictionary values to a list, the code runs fine:

pibs[list(cols_dict.values())] = ...
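
A self-contained version of this workaround might look like the sketch below. The sample data is invented for illustration, and the cleanup is applied per column with Series.map (applymap() is deprecated in recent pandas releases); the key point is only that the same list of names is used on both sides of the assignment.

```python
import numpy as np
import pandas as pd

cols_dict = {
    'Long_column_Name': 'first_column',
    'Other_Long_Column_Name': 'second_column',
    'AnotherLongColName': 'third_column',
}

# Hypothetical data, already renamed to the short column names.
pibs = pd.DataFrame({
    'first_column': ['1', 'a:b', '3'],
    'second_column': ['x', 'y', 'z:'],
    'third_column': [1, 2, 3],
})


def clean(x):
    """Replace any value whose string form contains ':' with NaN."""
    return np.nan if ':' in str(x) else x


# Materialize the dict view once and reuse it for lookup and assignment.
cols = list(cols_dict.values())
pibs[cols] = pibs[cols].apply(lambda col: col.map(clean))
```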

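So I suppose I am just wondering why I can slice with the dictionary values and run applymap() on the result, but when I turn around and try to assign the result back to the DataFrame, I cannot slice with the dictionary values.
Put simply: why does pandas recognize cols_dict.values() as a list of column names when it is used for indexing, but not when it is used for indexed assignment?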

Best answer

The problem does not seem to be related to applymap(), since aneroid's example, which does not use applymap():

import pandas as pd

cols_dict = {
     'Long_column_Name': 'first_column',
     'Other_Long_Column_Name': 'second_column',
     'AnotherLongColName': 'third_column'
}

df = pd.DataFrame({'Long_column_Name': range(3),
                   'Other_Long_Column_Name': range(3, 6),
                   'AnotherLongColName': range(15, 10, -2),
})
df.rename(columns=cols_dict, inplace=True)

df[cols_dict.values()] = df[cols_dict.values()]

produces the same error.
Evidently it is the assignment part, not the operation part, that fails, because

df = df[cols_dict.values()]

works fine.
Trying different DataFrame combinations shows that the 3 in the error message

ValueError: Wrong number of items passed 3, placement implies 1

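comes from the number of columns in the value being assigned, not from the three column names in the key, because trying to assign a four-column DataFrame raises a different error: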

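df2 = pd.DataFrame({'Long_column_Name': range(3),
                    'Other_Long_Column_Name': range(3, 6),
                    'AnotherLongColName': range(15, 10, -2),
                    'ShtClNm': range(10, 13)})
# assignment implied by the surrounding text: assign the four-column frame
df[cols_dict.values()] = df2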

yields

ValueError: Wrong number of items passed 4, placement implies 1

So I tried assigning just one column, so that in theory only 1 item is passed, and that works without raising an error.

df[cols_dict.values()] = df2['Long_column_Name']

However, the result is not what was expected:

df
   first_column  second_column  third_column  (first_column, second_column, third_column)
0             0              3            15                                            0
1             1              4            13                                            1
2             2              5            11                                            2

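So to me, what seems to be happening is that pandas does not recognize the cols_dict.values() passed to df[...] = as a list of column names, but rather as the name of a single new column (first_column, second_column, third_column).
That is why it tries to fill that new column with the values passed to the assignment. Because you passed several (3) columns to be assigned to that one new column, it breaks.
When you use list(), as in df[list(cols_dict.values())] =, it works fine, because pandas then recognizes that a list of columns was passed.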
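
As a small plain-Python aside (no pandas involved), the sketch below shows why a dict_values view is easy to mistake for a single key: it is iterable but is neither a list nor a sequence, and it is hashable, which matches the traceback above, where the whole view was looked up in a hash table. Whether these properties are exactly what pandas checks internally is an assumption here.

```python
from collections.abc import Iterable, Sequence

cols_dict = {
    'Long_column_Name': 'first_column',
    'Other_Long_Column_Name': 'second_column',
    'AnotherLongColName': 'third_column',
}

values = cols_dict.values()

is_list = isinstance(values, list)          # a view, not a list
is_sequence = isinstance(values, Sequence)  # no positional indexing
is_iterable = isinstance(values, Iterable)  # but it can be looped over

# A hashable object can be used as a single dictionary key or column label.
try:
    hash(values)
    is_hashable = True
except TypeError:
    is_hashable = False

# list() turns the view into an unambiguous list of labels.
as_list = list(values)
```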
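Digging deeper into pandas DataFrames, I think I have found the problem.
As far as I understand, pandas uses __setitem__() for assignment and __getitem__() for lookup. Both functions use convert_to_index_sliceable(), defined here. convert_to_index_sliceable() returns a slice if whatever you passed is sliceable, and None if it is not.
Both __getitem__() and __setitem__() first check whether convert_to_index_sliceable() returns None, but if it does not return None, they differ. __getitem__() converts the indexer to np.intp, which is numpy's indexing data type, before returning the slice, as shown here: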

        # Do we have a slicer (on rows)?
        indexer = convert_to_index_sliceable(self, key)
        if indexer is not None:
            if isinstance(indexer, np.ndarray):
                indexer = lib.maybe_indices_to_slice(
                    indexer.astype(np.intp, copy=False), len(self)
                )
            # either we have a slice or we have a string that can be converted
            #  to a slice for partial-string date indexing
            return self._slice(indexer, axis=0)

__setitem__(), on the other hand, returns immediately:

        # see if we can slice the rows
        indexer = convert_to_index_sliceable(self, key)
        if indexer is not None:
            # either we have a slice or we have a string that can be converted
            #  to a slice for partial-string date indexing
            return self._setitem_slice(indexer, value)

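Assuming no unnecessary code was added to __getitem__(), I think __setitem__() must be missing that code, since the comments before both return statements say exactly the same thing about what indexer could be.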
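
To make the general idea of asymmetric lookup and assignment concrete, here is a deliberately simplified toy class. ToyFrame is hypothetical and is not pandas code; it only mimics the observed symptom, where lookup expands any iterable of labels while assignment expands real lists only and treats everything else as one label.

```python
from collections.abc import Iterable


class ToyFrame:
    """Hypothetical stand-in for a DataFrame; not how pandas is implemented."""

    def __init__(self, columns):
        self._columns = dict(columns)

    def __getitem__(self, key):
        # Lookup: any non-string iterable is treated as a list of labels.
        if isinstance(key, Iterable) and not isinstance(key, str):
            return {label: self._columns[label] for label in key}
        return self._columns[key]

    def __setitem__(self, key, value):
        # Assignment: only a real list is expanded into several labels.
        if isinstance(key, list):
            for label in key:
                self._columns[label] = value[label]
        else:
            # Anything else (including a dict view) becomes ONE new label.
            self._columns[key] = value


cols_dict = {'a': 'first_column', 'b': 'second_column', 'c': 'third_column'}
frame = ToyFrame({
    'first_column': [0, 1, 2],
    'second_column': [3, 4, 5],
    'third_column': [15, 13, 11],
})

selected = frame[cols_dict.values()]         # lookup works with the view
frame[cols_dict.values()] = selected         # adds a single extra "column"
count_after_view = len(frame._columns)

frame[list(cols_dict.values())] = selected   # a list updates existing labels
count_after_list = len(frame._columns)
```

In this toy, assigning through the view grows the frame from three labels to four, while assigning through a list leaves the count unchanged.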
I will open a GitHub issue asking whether this is the intended behavior.

Regarding python - Pandas bug: __setitem__() doesn't recognize dictionary values as a list of column names, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/66961614/
