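I am reading multiple .csv files as pandas DataFrames with the same shape. For some index labels, some values are zero. For each index I want the selected values to keep the same shape across the DataFrames: wherever one DataFrame has a zero, set the same position to zero in the others, then delete the zeros so that all rows end up with the same length: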
import numpy as np
import pandas as pd

a = pd.read_csv("path_a", index_col=0)
b = pd.read_csv("path_b", index_col=0)
c = pd.read_csv("path_c", index_col=0)
print(a, "\n", b, "\n", c)

X = a.shape[0]        # number of rows
d = a.index.values    # row labels (ISO3 codes)
a = np.array(a)
b = np.array(b)
c = np.array(c)
for i in range(X):
    xdata = a[i]
    xdata1 = b[i]
    xdata2 = c[i]
    # propagate zeros: a position that is zero in any row becomes zero in all three
    xdata = np.where(xdata2 == 0, 0, xdata)
    xdata1 = np.where(xdata2 == 0, 0, xdata1)
    xdata1 = np.where(xdata == 0, 0, xdata1)
    xdata2 = np.where(xdata == 0, 0, xdata2)
    xdata = np.where(xdata1 == 0, 0, xdata)
    xdata2 = np.where(xdata1 == 0, 0, xdata2)
    # delete the zeroed positions so the three rows keep the same length
    indexX = np.argwhere(xdata == 0)
    index1X = np.argwhere(xdata1 == 0)
    index2X = np.argwhere(xdata2 == 0)
    xdata = np.delete(xdata, indexX)
    xdata1 = np.delete(xdata1, index1X)
    xdata2 = np.delete(xdata2, index2X)
    print(d[i], "\n", xdata, "\n", xdata1, "\n", xdata2)
     1980  1985  1990  1995  2000  2005  2010
ISO3
AFG   0.0   0.0   3.8   0.0   0.0   9.8   0.0
AGO   2.0   0.0   3.0   4.0   0.0   0.0   0.0
ALB   0.0   0.2   0.5   0.2   1.3   1.6   2.7
AND   0.0   0.0   0.0   0.0   0.0   0.0   0.0
ARE   0.7   0.8   0.9   1.7   2.3   2.7   3.0
ARG   3.1   6.7   5.3  15.1  17.2  18.2  18.7
ARM   0.4   0.5   0.5   0.5   0.4   1.2   1.3

     1980  1985  1990  1995  2000  2005  2010
ISO3
AFG   2.5   0.0   0.0   4.7   0.0   0.0   0.0
AGO  13.1  14.9  15.8  16.4  16.9  17.6  18.1
ALB   1.4   1.5   1.6   1.6   1.6   1.6   1.7
AND   0.2   0.2   0.2   0.2   0.1   0.4   0.6
ARE   0.0   0.0   0.0   0.0   0.0   0.0   0.0
ARG   1.8   1.8   1.7   1.8   1.8   1.9   1.9
ARM   1.8   1.8   1.7   0.0   1.8   1.9   1.5

     1980  1985  1990  1995  2000  2005  2010
ISO3
AFG   0.0   0.0   0.0   0.0   0.0   0.0   0.0
AGO   0.0   0.0   4.7   5.8   6.0   0.0   0.0
ALB   0.0   0.2   0.5   0.2   1.3   1.6   2.7
AND   1.4   1.8   2.3   3.7   0.0   0.0   5.4
ARE   0.7   0.8   0.9   1.7   2.3   2.7   3.0
ARG   3.1   6.7   5.3  15.1  17.2  18.2  18.7
ARM   0.4   0.5   0.5   0.5   0.4   1.2   1.3

AFG
[]
[]
[]
AGO
[ 3. 4.]
[ 15.8 16.4]
[ 4.7 5.8]
ALB
[ 0.2 0.5 0.2 1.3 1.6 2.7]
[ 1.5 1.6 1.6 1.6 1.6 1.7]
[ 0.2 0.5 0.2 1.3 1.6 2.7]
AND
[]
[]
[]
ARE
[]
[]
[]
ARG
[ 3.1 6.7 5.3 15.1 17.2 18.2 18.7]
[ 1.8 1.8 1.7 1.8 1.8 1.9 1.9]
[ 3.1 6.7 5.3 15.1 17.2 18.2 18.7]
ARM
[ 0.4 0.5 0.5 0.4 1.2 1.3]
[ 1.8 1.8 1.7 1.8 1.9 1.5]
[ 0.4 0.5 0.5 0.4 1.2 1.3]
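This code works, but it is a trial-and-error approach and is inefficient when the data is large. Can you suggest a more efficient approach, and how to select the data according to the index with the minimum length?
Best answer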
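One idea is to multiply all 3 arrays and test whether the product is not 0; you can also loop over the 3 arrays in a list L1. The selection logic changes as well: index each row with the boolean mask to keep the values where it is True, instead of using np.argwhere and np.delete: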
L = np.array(a.shape)
X = L[0]
d = a.index.values
a = np.array(a)
b = np.array(b)
c = np.array(c)
m = (a * b * c) != 0
L1 = [a, b, c]
for i in range(0, X):
    for arr in L1:
        xdata = arr[i][m[i]]
        print(xdata)
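As a self-contained illustration of the masking idea, here is a minimal sketch on two rows. The frames below are hypothetical sample data loosely based on the AGO and ALB rows of the question, and the `selected` dictionary is an illustrative name, not part of the original answer. The mask is built from comparisons rather than a product; that is a swapped-in variant, chosen because the product of very small floats can underflow to 0.0. It assumes pandas 0.24+ for `to_numpy`.

```python
import numpy as np
import pandas as pd

years = [1980, 1985, 1990, 1995]
labels = ["AGO", "ALB"]

# Hypothetical sample frames (not read from the question's CSV files)
a = pd.DataFrame([[2.0, 0.0, 3.0, 4.0], [0.0, 0.2, 0.5, 0.2]], index=labels, columns=years)
b = pd.DataFrame([[13.1, 14.9, 15.8, 16.4], [1.4, 1.5, 1.6, 1.6]], index=labels, columns=years)
c = pd.DataFrame([[0.0, 0.0, 4.7, 5.8], [0.0, 0.2, 0.5, 0.2]], index=labels, columns=years)

# True only where all three frames are non-zero at the same position
mask = ((a != 0) & (b != 0) & (c != 0)).to_numpy()

# Convert each frame once, outside the loop
arrays = [df.to_numpy() for df in (a, b, c)]

# For every row label, keep only the positions where the mask is True
selected = {}
for i, key in enumerate(a.index):
    selected[key] = [arr[i][mask[i]] for arr in arrays]

for key, rows in selected.items():
    print(key, *rows, sep="\n")
```

Because the same mask row indexes all three arrays, the three selected rows for a label always have equal length.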
If you use pandas 0.24+, a better way to convert to NumPy arrays is to_numpy:
L = np.array(a.shape)
X = L[0]
d = a.index.to_numpy()
a = a.to_numpy()
b = b.to_numpy()
c = c.to_numpy()
m = (a * b * c) != 0
L1 = [a, b, c]
for i in range(0, X):
    for arr in L1:
        xdata = arr[i][m[i]]
        print(xdata)
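Edit: to collect the selected rows of each index into a single 2-D array, append them to a list and stack it with np.vstack: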
L = np.array(a.shape)
X = L[0]
d = a.index.to_numpy()
a = a.to_numpy()
b = b.to_numpy()
c = c.to_numpy()
m = (a * b * c) != 0
L1 = [a, b, c]
for i in range(0, X):
    out = []
    for arr in L1:
        xdata = arr[i][m[i]]
        out.append(xdata)
    data = np.vstack(out)
    print(data)
[]
[[ 3. 4. ]
[15.8 16.4]
[ 4.7 5.8]]
[[0.2 0.5 0.2 1.3 1.6 2.7]
[1.5 1.6 1.6 1.6 1.6 1.7]
[0.2 0.5 0.2 1.3 1.6 2.7]]
[]
[]
[[ 3.1 6.7 5.3 15.1 17.2 18.2 18.7]
[ 1.8 1.8 1.7 1.8 1.8 1.9 1.9]
[ 3.1 6.7 5.3 15.1 17.2 18.2 18.7]]
[[0.4 0.5 0.5 0.4 1.2 1.3]
[1.8 1.8 1.7 1.8 1.9 1.5]
[0.4 0.5 0.5 0.4 1.2 1.3]]
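If the per-index blocks need to be reused rather than printed, one possible arrangement is to stack the arrays once and return a dictionary keyed by the row label. The sketch below assumes all arrays share one shape; the helper name `select_common_nonzero` and the small sample arrays are hypothetical, not from the original answer.

```python
import numpy as np

def select_common_nonzero(arrays, labels):
    """Return {label: 2-D block} keeping only columns non-zero in every array."""
    stacked = np.stack(arrays)          # shape (n_arrays, n_rows, n_cols)
    mask = (stacked != 0).all(axis=0)   # shape (n_rows, n_cols)
    # stacked[:, i, :] is the i-th row of every array; the mask then picks columns
    return {label: stacked[:, i, :][:, mask[i]] for i, label in enumerate(labels)}

# Hypothetical sample data: two rows, three columns per array
a = np.array([[0.0, 0.0, 3.8], [2.0, 0.0, 3.0]])
b = np.array([[2.5, 0.0, 0.0], [13.1, 14.9, 15.8]])
c = np.array([[0.0, 0.0, 0.0], [0.0, 0.0, 4.7]])

result = select_common_nonzero([a, b, c], ["AFG", "AGO"])
for label, block in result.items():
    print(label, block.shape)
    print(block)
```

A label whose rows share no non-zero position maps to an empty block with one row per input array and zero columns, which matches the `[]` lines in the output above.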
Regarding "python - How to convert selected data to the same length (shape)", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/59464135/