
python - Loading a DataFrame whose columns span multiple lines with pd.read_clipboard

Reposted. Author: 太空宇宙. Updated: 2023-11-03 11:42:43

Given this dataset from another question:

                                       user                             item  \
0  b80344d063b5ccb3212f76538f3d9e43d87dca9e          The Cove - Jack Johnson
1  b80344d063b5ccb3212f76538f3d9e43d87dca9e  Entre Dos Aguas - Paco De Lucia
2  b80344d063b5ccb3212f76538f3d9e43d87dca9e            Stronger - Kanye West
3  b80344d063b5ccb3212f76538f3d9e43d87dca9e    Constellations - Jack Johnson
4  b80344d063b5ccb3212f76538f3d9e43d87dca9e      Learn To Fly - Foo Fighters

   rating
0       1
1       2
2       1
3       1
4       1

Is there a way to load data like this in the expected format without manually moving everything onto the same lines?

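Best answer

One approach is to split the clipboard text on \n\n (the blank line between the wrapped column groups), build a separate DataFrame from each piece, and then concatenate the pieces column-wise, i.e.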

# Bit of code from https://stackoverflow.com/questions/45740537/copying-multiindex-dataframes-with-pd-read-clipboard

from io import StringIO

import pandas as pd
from pandas.io.clipboard import clipboard_get


def read_clipboard_split(index_names_row=None, **kwargs):
    # index_names_row is kept from the borrowed snippet but is not used here.
    encoding = kwargs.pop('encoding', 'utf-8')

    # only utf-8 is valid for passed value because that's what clipboard
    # supports
    if encoding is not None and encoding.lower().replace('-', '') != 'utf8':
        raise NotImplementedError(
            'reading from clipboard only supports utf-8 encoding')

    data = clipboard_get()
    # Each blank-line-separated block holds one group of wrapped columns.
    items = data.split("\n\n")
    k = []
    for i in items:
        # Parse each block as fixed-width text; extra kwargs go to read_fwf.
        k.append(pd.read_fwf(StringIO(i), **kwargs))
    # Put the blocks side by side to rebuild the wide frame.
    df = pd.concat(k, axis=1)
    return df


read_clipboard_split()

Sample run, with this text copied to the clipboard:

     user                       \
0  b80344d063b5ccb3212f76538f3d9e43d87dca9e
1  b80344d063b5ccb3212f76538f3d9e43d87dca9e
2  b80344d063b5ccb3212f76538f3d9e43d87dca9e
3  b80344d063b5ccb3212f76538f3d9e43d87dca9e
4  b80344d063b5ccb3212f76538f3d9e43d87dca9e

   rating
0       1
1       2
2       1
3       1
4       1

Output:

   Unnamed: 0              user                       \  Unnamed: 0  rating
0           0  b80344d063b5ccb3212f76538f3d9e43d87dca9e           0       1
1           1  b80344d063b5ccb3212f76538f3d9e43d87dca9e           1       2
2           2  b80344d063b5ccb3212f76538f3d9e43d87dca9e           2       1
3           3  b80344d063b5ccb3212f76538f3d9e43d87dca9e           3       1
4           4  b80344d063b5ccb3212f76538f3d9e43d87dca9e           4       1

Regarding "python - Loading a DataFrame whose columns span multiple lines with pd.read_clipboard", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/45883042/
