
python - DataFrame.apply with str.extract raises an error, even though the function works on each column Series

Reposted. Author: 行者123. Updated: 2023-11-28 21:31:08

For this example DataFrame: df = pd.DataFrame([['A-3', 'B-4'], ['C-box', 'D1-go']])

Calling extract on a single column, as a Series, works fine:

df.iloc[:, 0].str.extract('-(.+)')
df.iloc[:, 1].str.extract('-(.+)')

It also works along the other axis:

df.iloc[0, :].str.extract('-(.+)')
df.iloc[1, :].str.extract('-(.+)')

So I expected apply to work (by applying extract to each column):

df.apply(lambda s: s.str.extract('-(.+)'), axis=0)

But it raises this error:

Traceback (most recent call last):
  File "C:\ProgramData\Miniconda3\envs\py3\lib\site-packages\IPython\core\interactiveshell.py", line 3325, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-588-70b1808d5457>", line 2, in <module>
    df.apply(lambda s: s.str.extract('-(.+)'))
  File "C:\ProgramData\Miniconda3\envs\py3\lib\site-packages\pandas\core\frame.py", line 6487, in apply
    return op.get_result()
  File "C:\ProgramData\Miniconda3\envs\py3\lib\site-packages\pandas\core\apply.py", line 151, in get_result
    return self.apply_standard()
  File "C:\ProgramData\Miniconda3\envs\py3\lib\site-packages\pandas\core\apply.py", line 260, in apply_standard
    return self.wrap_results()
  File "C:\ProgramData\Miniconda3\envs\py3\lib\site-packages\pandas\core\apply.py", line 308, in wrap_results
    return self.wrap_results_for_axis()
  File "C:\ProgramData\Miniconda3\envs\py3\lib\site-packages\pandas\core\apply.py", line 340, in wrap_results_for_axis
    result = self.obj._constructor(data=results)
  File "C:\ProgramData\Miniconda3\envs\py3\lib\site-packages\pandas\core\frame.py", line 392, in __init__
    mgr = init_dict(data, index, columns, dtype=dtype)
  File "C:\ProgramData\Miniconda3\envs\py3\lib\site-packages\pandas\core\internals\construction.py", line 212, in init_dict
    return arrays_to_mgr(arrays, data_names, index, columns, dtype=dtype)
  File "C:\ProgramData\Miniconda3\envs\py3\lib\site-packages\pandas\core\internals\construction.py", line 51, in arrays_to_mgr
    index = extract_index(arrays)
  File "C:\ProgramData\Miniconda3\envs\py3\lib\site-packages\pandas\core\internals\construction.py", line 308, in extract_index
    raise ValueError('If using all scalar values, you must pass'
ValueError: If using all scalar values, you must pass an index

Using axis=1 gives an unexpected result: a Series in which each row is itself a Series:

Out[2]: 
0 0
0 3
1 4
1 0
0 box
1 go
dtype: object
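
One likely explanation, sketched here under the assumption of a pandas version in which the `expand` parameter of `str.extract` defaults to `True`: each call returns a one-column DataFrame rather than a Series, so apply receives DataFrames and cannot reassemble them column by column. The variable name `per_column` is only illustrative.

```python
import pandas as pd

df = pd.DataFrame([['A-3', 'B-4'], ['C-box', 'D1-go']])

# Assumption: expand defaults to True in the installed pandas version.
# In that case str.extract returns a DataFrame with one column per
# capture group, even when the pattern has a single group.
per_column = df.iloc[:, 0].str.extract('-(.+)')

print(type(per_column).__name__)  # DataFrame
print(per_column.shape)           # (2, 1)
```

This would also account for the axis=1 output above, where each element of the resulting Series is a small table rather than a scalar.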

I'm using apply because I thought it would give the fastest execution time, but I'm open to other suggestions.

Best answer

You can use split instead.

df.apply(lambda s: s.str.split('-', expand=True)[1])

Out[1]:
     0   1
0    3   4
1  box  go
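
Another option, offered as a sketch and not as part of the original answer: keep `str.extract` but pass `expand=False`. With a single capture group, each call then returns a Series, which apply can combine into a DataFrame. The variable name `result` is only illustrative.

```python
import pandas as pd

df = pd.DataFrame([['A-3', 'B-4'], ['C-box', 'D1-go']])

# expand=False with one capture group yields a Series per column,
# so apply can stitch the per-column results back into a DataFrame.
result = df.apply(lambda s: s.str.extract('-(.+)', expand=False))
print(result)
```

The two approaches can differ when a value contains more than one hyphen: `split('-', expand=True)[1]` keeps only the text between the first and second hyphen, while the pattern `'-(.+)'` captures everything after the first hyphen.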

About python - DataFrame.apply with str.extract raises an error, even though the function works on each column Series: we found a similar question on Stack Overflow: https://stackoverflow.com/questions/58794687/
