
python - Transforming a pandas DataFrame with multiple variables

Reposted. Author: 太空宇宙. Updated: 2023-11-04 04:49:27

I am working with my pandas DataFrame, but I still have not figured out how to solve this problem.

I have the following pandas object:

import pandas as pd

pandaFile = pd.DataFrame([
    {'var1': 'Restaurant A', 'var2': '4.5', 'var3': ['AA', 'BB', 'CC'],
     'var4': ['User1', 'User2', 'User3'],
     'var5': ['Review 1', 'Review 2', 'Review 3']},
    {'var1': 'Restaurant B', 'var2': '5.0', 'var3': ['AA', 'BB', 'CC'],
     'var4': ['User1', 'User2', 'User3'],
     'var5': ['Review 1', 'Review 2', 'Review 3']},
])
print(pandaFile)

It looks like this:

           var1  var2          var3                   var4                            var5
0  Restaurant A   4.5  [AA, BB, CC]  [User1, User2, User3]  [Review 1, Review 2, Review 3]
1  Restaurant B   5.0  [AA, BB, CC]  [User1, User2, User3]  [Review 1, Review 2, Review 3]

I want to get the following output:

           var1  var2          var3   var4      var5
0  Restaurant A   4.5  [AA, BB, CC]  User1  Review 1
1  Restaurant A   4.5  [AA, BB, CC]  User2  Review 2
2  Restaurant A   4.5  [AA, BB, CC]  User3  Review 3
3  Restaurant B   5.0  [AA, BB, CC]  User1  Review 1
4  Restaurant B   5.0  [AA, BB, CC]  User2  Review 2
5  Restaurant B   5.0  [AA, BB, CC]  User3  Review 3

But I get the following output instead:

            var1  var2          var3   var4      var5
 0  Restaurant A   4.5  [AA, BB, CC]  User1  Review 1
 1  Restaurant A   4.5  [AA, BB, CC]  User1  Review 2
 2  Restaurant A   4.5  [AA, BB, CC]  User1  Review 3
 3  Restaurant A   4.5  [AA, BB, CC]  User2  Review 1
 4  Restaurant A   4.5  [AA, BB, CC]  User2  Review 2
 5  Restaurant A   4.5  [AA, BB, CC]  User2  Review 3
 6  Restaurant A   4.5  [AA, BB, CC]  User3  Review 1
 7  Restaurant A   4.5  [AA, BB, CC]  User3  Review 2
 8  Restaurant A   4.5  [AA, BB, CC]  User3  Review 3
 9  Restaurant B   5.0  [AA, BB, CC]  User1  Review 1
10  Restaurant B   5.0  [AA, BB, CC]  User1  Review 2
11  Restaurant B   5.0  [AA, BB, CC]  User1  Review 3
12  Restaurant B   5.0  [AA, BB, CC]  User2  Review 1
13  Restaurant B   5.0  [AA, BB, CC]  User2  Review 2
14  Restaurant B   5.0  [AA, BB, CC]  User2  Review 3
15  Restaurant B   5.0  [AA, BB, CC]  User3  Review 1
16  Restaurant B   5.0  [AA, BB, CC]  User3  Review 2
17  Restaurant B   5.0  [AA, BB, CC]  User3  Review 3

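Getting a row for every combination of user and review is wrong; each user should appear only with their own review.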

I tried to solve the problem with the following code:

import numpy as np

mva_cols = ['var4', 'var5']
counter = 0

# Each pass expands one list column, repeating all other columns to match.
for x in zip(mva_cols):
    pandaFile = pd.DataFrame({
        col: np.repeat(pandaFile[col].values,
                       pandaFile[mva_cols[counter]].str.len())
        for col in pandaFile.columns.difference([mva_cols[counter]])
    }).assign(**{
        mva_cols[counter]: np.concatenate(pandaFile[mva_cols[counter]].values)
    })[pandaFile.columns.tolist()]
    counter = counter + 1
    print(counter)
print(str(pandaFile).encode('utf-8'))
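
For reference, the loop above expands one list column per pass, so the row count goes from 2 to 6 after var4 and from 6 to 18 after var5. The following is only a minimal sketch of a single-pass variant of the same np.repeat / np.concatenate idea. It assumes that the lists in var4 and var5 have the same length within each row; the names lengths, expanded and single_pass are illustrative and do not come from the original post.

```python
import numpy as np
import pandas as pd

# Same sample data as in the question.
pandaFile = pd.DataFrame([
    {'var1': 'Restaurant A', 'var2': '4.5', 'var3': ['AA', 'BB', 'CC'],
     'var4': ['User1', 'User2', 'User3'],
     'var5': ['Review 1', 'Review 2', 'Review 3']},
    {'var1': 'Restaurant B', 'var2': '5.0', 'var3': ['AA', 'BB', 'CC'],
     'var4': ['User1', 'User2', 'User3'],
     'var5': ['Review 1', 'Review 2', 'Review 3']},
])

mva_cols = ['var4', 'var5']

# Measure the list lengths once, before anything is expanded.
# Assumption: var4 and var5 have equal lengths in every row.
lengths = pandaFile[mva_cols[0]].str.len()

# Repeat every non-expanded column once per list element ...
expanded = {col: np.repeat(pandaFile[col].values, lengths)
            for col in pandaFile.columns.difference(mva_cols)}

# ... and flatten each list column, so var4 and var5 stay aligned.
for col in mva_cols:
    expanded[col] = np.concatenate(pandaFile[col].values)

# Restore the original column order.
single_pass = pd.DataFrame(expanded)[pandaFile.columns.tolist()]
print(single_pass)
```

Because both list columns are flattened against the same set of lengths, each user stays on the same row as the matching review instead of being paired with all of them.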

Best answer

Alternatively, you can try:

# df is the DataFrame from the question (called pandaFile there)
new_df=df.reindex(df.index.repeat(df.var5.str.len()))
new_df.assign(var4=df.var4.sum(),var5=df.var5.sum())
Out[1022]:
           var1  var2          var3   var4      var5
0  Restaurant A   4.5  [AA, BB, CC]  User1  Review 1
0  Restaurant A   4.5  [AA, BB, CC]  User2  Review 2
0  Restaurant A   4.5  [AA, BB, CC]  User3  Review 3
1  Restaurant B   5.0  [AA, BB, CC]  User1  Review 1
1  Restaurant B   5.0  [AA, BB, CC]  User2  Review 2
1  Restaurant B   5.0  [AA, BB, CC]  User3  Review 3
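
Note that the output above keeps the repeated index values 0 and 1, whereas the output requested in the question is numbered 0 to 5. As a possible alternative that is not part of the original answer, newer pandas versions (1.3 and later) can explode several list columns together with DataFrame.explode. The sketch below assumes that the lists in var4 and var5 have the same length within each row, which explode requires; the name result is illustrative.

```python
import pandas as pd

# Same sample data as in the question.
df = pd.DataFrame([
    {'var1': 'Restaurant A', 'var2': '4.5', 'var3': ['AA', 'BB', 'CC'],
     'var4': ['User1', 'User2', 'User3'],
     'var5': ['Review 1', 'Review 2', 'Review 3']},
    {'var1': 'Restaurant B', 'var2': '5.0', 'var3': ['AA', 'BB', 'CC'],
     'var4': ['User1', 'User2', 'User3'],
     'var5': ['Review 1', 'Review 2', 'Review 3']},
])

# Explode both list columns in lockstep (multi-column explode needs pandas >= 1.3).
# var3 is not listed, so it is left as a list in every row.
# ignore_index=True renumbers the rows 0..5 instead of repeating 0 and 1.
result = df.explode(['var4', 'var5'], ignore_index=True)
print(result)
```

On older pandas versions, calling reset_index(drop=True) on the answer's result should give the same 0 to 5 numbering.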

Regarding python - Transforming a pandas DataFrame with multiple variables, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/48732021/
