python - '%s not in index' % objarr[mask]

Reposted. Author: 行者123. Updated: 2023-11-30 22:35:32

I have the following code:

##Overall reported expertise men vs women

import sys, re
import numpy as np
import smtplib
import matplotlib.pyplot as plt
from random import randint
import csv
import pylab as pl
import math
import pandas as pd
from pandas.tools.plotting import scatter_matrix
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('-inVar', '--x', help = 'independent variable')

if len(sys.argv) == 1:
    parser.print_help()
    sys.exit(1)

args = parser.parse_args()

##Manipulating data so it can be graphed more easily
df1 = pd.read_csv('atc17-pcinfo.csv')
df1['Gender'] = df1['Gender'].replace(['M'], int(1))
df1['Gender'] = df1['Gender'].replace(['F'], int(2))
df1['Gender'] = df1['Gender'].convert_objects(convert_numeric = True)

x = df1['Gender']
y = df1['topic: Big data infrastructure']
print list(df1)

ax = df1.plot.scatter(x = x, y = y)
labels = [item.get_text() for item in ax.get_xticklabels()]
labels[1] = 'M'
labels[6] = 'F'
ax.set_xticklabels(labels)
#ax.set_title(y + ' vs. ' + x, fontsize=20)
plt.xlabel(x, fontsize=16)
plt.ylabel(y, fontsize=16)
plt.show()

I am trying to create a scatter plot from the data in df1. After manipulating the data, I have:

 Gender    topic: Big data infrastructure
0 2 NaN
1 1 -1
2 1 -1
3 1 -1
4 2 1
5 1 NaN
6 1 NaN
7 1 NaN
8 1 -2
9 1 1
10 2 1
11 1 NaN
12 1 1
13 1 -1
14 1 1
15 1 NaN
16 1 NaN
17 1 NaN
18 1 -1
19 1 -2
20 2 1
21 1 NaN
22 1 NaN
23 2 2
24 1 -2
25 2 2
26 1 NaN
27 1 2
28 1 1
29 1 NaN
30 1 2
31 1 NaN
32 1 NaN
33 2 2
34 1 2

But I get this error:

     KeyError('%s not in index' % objarr[mask])
KeyError: '[ nan -1. -1. -1. 1. nan nan nan -2. 1. 1. nan 1. -1. 1.\n nan nan nan -1. -2. 1. nan nan 2. -2. 2. nan 2. 1. nan\n 2. nan nan 2. 2.] not in index'

Can anyone help me figure out why? I have looked at a few sources, but I don't see how my case relates to their examples.

Best answer

I think you have two problems:

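The first is that you are misusing the x and y parameters of the scatter method. They expect the names of the desired columns, not the actual values! So the call should look like this: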

ax = df1.plot.scatter(x = "Gender", y = "topic: Big data infrastructure")
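
To see why passing the values triggers a KeyError, here is a minimal sketch. It assumes that scatter resolves x and y by looking them up as column labels, roughly like df[key]; the small DataFrame and the helper lookup_fails are made up for illustration and are not part of the original question.

```python
import pandas as pd

# Hypothetical miniature of the question's data
df = pd.DataFrame({
    "Gender": [2, 1, 1, 2],
    "topic: Big data infrastructure": [float("nan"), -1.0, -1.0, 1.0],
})


def lookup_fails(frame, key):
    """Return True if selecting `key` as column label(s) raises KeyError."""
    try:
        frame[key]
    except KeyError:
        return True
    return False


# Passing the column's values: pandas treats each value as a column label,
# finds none of them among the columns, and raises KeyError.
values_as_key = lookup_fails(df, df["topic: Big data infrastructure"])

# Passing the column's name: the lookup succeeds.
name_as_key = lookup_fails(df, "topic: Big data infrastructure")

print(values_as_key, name_as_key)
```

The values in the error message of the question (nan, -1., 1., ...) are exactly the contents of the column that was passed as y, which fits this explanation.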

Your second problem is that you did not convert the "Big data" column to numeric values the way you did for the "Gender" column.

This should do the job:

df1['topic: Big data infrastructure'] = df1['topic: Big data infrastructure'].convert_objects(convert_numeric = True)
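
Note that convert_objects was deprecated and later removed in pandas 1.0. On a newer pandas, pd.to_numeric with errors="coerce" is one possible replacement. The sketch below uses a made-up Series in place of the real column:

```python
import pandas as pd

# Hypothetical column that was read in as strings (object dtype)
raw = pd.Series(["-1", "1", "n/a", "2"], dtype=object,
                name="topic: Big data infrastructure")

# errors="coerce" turns anything that cannot be parsed into NaN
numeric = pd.to_numeric(raw, errors="coerce")

print(numeric.dtype)
print(numeric.isna().sum())
```

Applied to the question's DataFrame, this would read df1[col] = pd.to_numeric(df1[col], errors="coerce") with col set to the column name.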

Since you will use the column names a lot while working with the DataFrame, I suggest giving them shorter, easier names.

Here is a working example:

import pandas as pd

# Read your copied df, saved as test.csv
df1 = pd.read_csv("test.csv", sep=",")

# Rename the columns for easier work
df1.columns = ["Gender", "Big_Data"]

# Convert strings into floats/integers
# (convert_objects was removed in pandas 1.0; on newer versions use
#  df1 = df1.apply(pd.to_numeric, errors="coerce") instead)
df1 = df1.convert_objects(convert_numeric=True)

# Create the figure by selecting the desired columns as input x and y
ax = df1.plot.scatter(x="Gender", y="Big_Data")
fig = ax.get_figure()
fig.savefig('its_working.png')
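
For readers on a current pandas, the same steps might look like the sketch below. It is only an illustration under stated assumptions: the inline CSV text is a made-up excerpt standing in for atc17-pcinfo.csv, the Agg backend is chosen so the script runs without a display, and the output goes to a temporary directory. It also sets the M/F tick labels that the question was trying to add.

```python
import io
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # headless backend, so no display is needed
import pandas as pd

# Hypothetical excerpt standing in for atc17-pcinfo.csv
csv_text = """Gender,topic: Big data infrastructure
F,
M,-1
M,-1
F,1
M,2
"""

df = pd.read_csv(io.StringIO(csv_text))
df.columns = ["Gender", "Big_Data"]

# Map the gender codes to numbers and force the score column to numeric
df["Gender"] = df["Gender"].map({"M": 1, "F": 2})
df["Big_Data"] = pd.to_numeric(df["Big_Data"], errors="coerce")

# Drop rows with a missing score so only real data points are plotted
plot_df = df.dropna(subset=["Big_Data"])

# Pass column names, not the column values
ax = plot_df.plot.scatter(x="Gender", y="Big_Data")
ax.set_xticks([1, 2])
ax.set_xticklabels(["M", "F"])

out_path = os.path.join(tempfile.mkdtemp(), "scatter.png")
ax.get_figure().savefig(out_path)
```

Setting the tick positions explicitly before the labels avoids having to guess label indices such as labels[1] and labels[6] in the original code.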

Regarding python - '%s not in index' % objarr[mask], a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/44549755/
