python - How do I iterate over a Pandas pivot table? (A multi-index DataFrame?)

Reposted by: 太空宇宙. Updated: 2023-11-04 05:55:23

I have a pivot table that I want to iterate over so I can store it in a database.

                                           age  weekly_income
category_weekly_income category_age
High income            Middle aged   45.527721   15015.463667
                       Old           70.041456   14998.104486
                       Young         14.995210   15003.750822
Low income             Middle aged   45.548155    1497.228548
                       Old           70.049987    1505.655319
                       Young         15.013538    1501.718198
Middle income          Middle aged   45.516583    6514.830294
                       Old           69.977657    6494.626962
                       Young         15.020688    6487.661554

I have played with reshape, melt, various for loops, syntax stabs in the dark, chains of stack, unstack, reset_index, and so on. The closest I have come is:

crosstab[1:2].age

With this I can pull out the value of a single cell, but I cannot get the values of the index.

Best answer

There is no need to iterate over the DataFrame: pandas already provides a method for writing a DataFrame to SQL, DataFrame.to_sql(...).
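As a rough illustration of the to_sql route, here is a minimal sketch. It assumes an in-memory SQLite database, a hypothetical table name `income_by_age`, and a small stand-in for the pivot table with rounded values; none of these come from the original answer.

```python
import sqlite3

import pandas as pd

# Small stand-in for the pivot table in the question (values rounded).
index = pd.MultiIndex.from_tuples(
    [("High income", "Old"), ("High income", "Young"), ("Low income", "Old")],
    names=["category_weekly_income", "category_age"],
)
pivot = pd.DataFrame(
    {"age": [70.04, 15.00, 70.05], "weekly_income": [14998.10, 15003.75, 1505.66]},
    index=index,
)

# Assumption: an in-memory SQLite database and a made-up table name.
conn = sqlite3.connect(":memory:")

# index=True (the default) writes each MultiIndex level as its own column,
# named after the index level.
pivot.to_sql("income_by_age", conn, index=True, if_exists="replace")

rows = conn.execute(
    "SELECT category_weekly_income, category_age, age, weekly_income "
    "FROM income_by_age ORDER BY rowid"
).fetchall()
print(rows)
conn.close()
```

With other databases, to_sql is normally given an SQLAlchemy engine or connection rather than a raw DBAPI connection.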

Alternatively, if you want to insert the data into the database manually, you can use pandas' to_csv(), for example:

I have a df like this:

df
                     A         B
first second                    
bar   one     0.826425 -1.126757
      two     0.682297  0.875014
baz   one    -1.714757 -0.436622
      two    -0.366858  0.341702
foo   one    -1.068390 -1.074582
      two     0.863934  0.043367
qux   one    -0.510881  0.215230
      two     0.760373  0.274389

# set header=False and index=True to include the MultiIndex levels from the pivot
print(df.to_csv(header=False, index=True))

bar,one,0.8264252111679552,-1.1267570930327846
bar,two,0.6822970851678805,0.8750144682657339
baz,one,-1.7147570530422946,-0.43662238320911956
baz,two,-0.3668584476904599,0.341701643567155
foo,one,-1.068390451744478,-1.0745823278191735
foo,two,0.8639343368644695,0.043366628502542914
qux,one,-0.5108806384876237,0.21522973766619563
qux,two,0.7603733646419842,0.2743886250125428

This gives you a clean comma-separated format that is easy to use in a SQL execute query, for example:

data = []
# splitlines() also copes with '\r\n' line endings and leaves no trailing empty string
for line in df.to_csv(header=False, index=True).splitlines():
    if line:
        data.append(tuple(line.split(',')))

data

[('bar', 'one', '0.8264252111679552', '-1.1267570930327846'),
 ('bar', 'two', '0.6822970851678805', '0.8750144682657339'),
 ('baz', 'one', '-1.7147570530422946', '-0.43662238320911956'),
 ('baz', 'two', '-0.3668584476904599', '0.341701643567155'),
 ('foo', 'one', '-1.068390451744478', '-1.0745823278191735'),
 ('foo', 'two', '0.8639343368644695', '0.043366628502542914'),
 ('qux', 'one', '-0.5108806384876237', '0.21522973766619563'),
 ('qux', 'two', '0.7603733646419842', '0.2743886250125428')]
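Note that the CSV round trip turns every value into a string and would split incorrectly on any label that contains a comma. As a possible alternative, the sketch below reads the index values directly, which is what the question was originally after. The sample values are made up for illustration; only the index names `first`/`second` and the columns `A`/`B` mirror the example above.

```python
import pandas as pd

# Illustrative data only: same shape as the example df, made-up values.
index = pd.MultiIndex.from_product(
    [["bar", "baz"], ["one", "two"]], names=["first", "second"]
)
df = pd.DataFrame(
    {"A": [0.5, 1.5, -2.0, 0.25], "B": [1.0, -1.0, 3.0, 4.5]}, index=index
)

# Iterating directly: with a MultiIndex, each row label is a tuple of the
# level values, so it can be unpacked in the for statement.
for (first, second), row in df.iterrows():
    print(first, second, row["A"], row["B"])

# reset_index() turns the index levels into ordinary columns, and
# itertuples(index=False, name=None) then yields plain tuples whose
# numeric values stay numeric instead of becoming strings.
data = list(df.reset_index().itertuples(index=False, name=None))
print(data)
```

The resulting `data` list has the same shape as the one built from the CSV text, so it can be passed to executemany in the same way.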

Then all that is left is to run an executemany:

...
stmt = "INSERT INTO table (first, second, A, B) VALUES (%s, %s, %s, %s)"
cursor.executemany(stmt, data)
...
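The placeholder style depends on the database driver: `%s` is used by drivers such as psycopg2 and MySQLdb, while the standard-library sqlite3 module uses `?`. A self-contained sketch with sqlite3 might look like the following; the in-memory database, the table name `pivot_data`, and the sample rows are assumptions for illustration.

```python
import sqlite3

# Illustrative rows in the same (first, second, A, B) layout as above.
data = [
    ("bar", "one", 0.5, 1.0),
    ("bar", "two", 1.5, -1.0),
]

# Assumption: an in-memory SQLite database and a made-up table name.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE pivot_data ("first" TEXT, "second" TEXT, "A" REAL, "B" REAL)'
)

# sqlite3 uses "?" placeholders; the driver binds the values itself,
# so nothing is formatted into the SQL string by hand.
stmt = 'INSERT INTO pivot_data ("first", "second", "A", "B") VALUES (?, ?, ?, ?)'
with conn:  # commits on success, rolls back if an exception is raised
    conn.executemany(stmt, data)

inserted = conn.execute("SELECT COUNT(*) FROM pivot_data").fetchone()[0]
print(inserted)
```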

Hope this helps.

This post is based on a question about iterating over a Pandas pivot table (a multi-index DataFrame) found on Stack Overflow: https://stackoverflow.com/questions/28097319/
