
Python PCA - projection into lower dimensional space

Reposted. Author: 行者123. Updated: 2023-11-28 22:37:13

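I am trying to implement PCA, and it works well as far as intermediate results such as eigenvalues and eigenvectors are concerned. However, when I try to project the data (3 dimensions) into the 2D principal component space, the result is wrong. I have spent a lot of time comparing my code with other implementations, such as: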

http://sebastianraschka.com/Articles/2014_pca_step_by_step.html

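After a long time I have made no progress and cannot find the mistake. Since the intermediate results are correct, I assume the problem is a simple coding error. Thanks in advance to anyone who actually reads this question, and even more to those who give helpful comments or answers.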

My code is as follows:

import numpy as np

class PCA():
    def __init__(self, X):
        #center the data
        X = X - X.mean(axis=0)
        #calculate covariance matrix based on X where data points are represented in rows
        C = np.cov(X, rowvar=False)
        #get eigenvectors and eigenvalues
        d,u = np.linalg.eigh(C)
        #sort both eigenvectors and eigenvalues descending regarding the eigenvalue
        #the output of np.linalg.eigh is sorted ascending, therefore both are turned around to reach a descending order
        self.U = np.asarray(u).T[::-1]
        self.D = d[::-1]

    # **problem starts here**

    def project(self, X, m):
        #use the top m eigenvectors with the highest eigenvalues for the transformation matrix
        Z = np.dot(X,np.asmatrix(self.U[:m]).T)
        return Z

The result of my code is:

 myresult
([[ 0.03463706, -2.65447128],
[-1.52656731, 0.20025725],
[-3.82672364, 0.88865609],
[ 2.22969475, 0.05126909],
[-1.56296316, -2.22932369],
[ 1.59059825, 0.63988429],
[ 0.62786254, -0.61449831],
[ 0.59657118, 0.51004927]])

The correct result, e.g. as returned by sklearn.decomposition.PCA:
([[ 0.26424835, -2.25344912],
[-1.29695602, 0.60127941],
[-3.59711235, 1.28967825],
[ 2.45930604, 0.45229125],
[-1.33335186, -1.82830153],
[ 1.82020954, 1.04090645],
[ 0.85747383, -0.21347615],
[ 0.82618248, 0.91107143]])

The input is defined as follows:
X = np.array([
[-2.133268233289599,0.903819474847349,2.217823388231679,-0.444779660856219,-0.661480010318842,-0.163814281248453,-0.608167714051449, 0.949391996219125],
[-1.273486742804804,-1.270450725314960,-2.873297536940942, 1.819616794091556,-2.617784834189455, 1.706200163080549,0.196983250752276,0.501491995499840],
[-0.935406638147949,0.298594472836292,1.520579082270122,-1.390457671168661,-1.180253547776717,-0.194988736923602,-0.645052874385757,-1.400566775105519]]).T

Best answer

Before projecting the data onto the new basis, you need to center it by subtracting the mean:

mu = X.mean(0)
C = np.cov(X - mu, rowvar=False)
d, u = np.linalg.eigh(C)
U = u.T[::-1]
Z = np.dot(X - mu, U[:2].T)

print(Z)
# [[ 0.26424835 -2.25344912]
# [-1.29695602 0.60127941]
# [-3.59711235 1.28967825]
# [ 2.45930604 0.45229125]
# [-1.33335186 -1.82830153]
# [ 1.82020954 1.04090645]
# [ 0.85747383 -0.21347615]
# [ 0.82618248 0.91107143]]
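
Applied to the class from the question, the same fix could look like the sketch below. In the original `__init__`, the line `X = X - X.mean(axis=0)` only rebinds the local variable, so `project` still receives uncentered data. Storing the mean on the instance avoids that. The attribute name `mean_` is my own choice and not part of the original code, and the `np.asmatrix` call is dropped here because a plain array is enough for the dot product. The last lines illustrate, under these assumptions, that skipping the centering shifts every projected row by the same constant vector, which matches the difference between the two result tables in the question.

```python
import numpy as np


class PCA:
    """Eigendecomposition-based PCA that remembers the training mean."""

    def __init__(self, X):
        # Keep the column means so project() can center new data the same way.
        # (mean_ is an illustrative attribute name, not from the original code.)
        self.mean_ = X.mean(axis=0)
        # Covariance matrix with data points in rows.
        C = np.cov(X - self.mean_, rowvar=False)
        # eigh returns eigenvalues in ascending order, so reverse both outputs.
        d, u = np.linalg.eigh(C)
        self.U = u.T[::-1]  # rows are principal axes, largest variance first
        self.D = d[::-1]

    def project(self, X, m):
        # Center with the stored mean, then project onto the top m axes.
        return np.dot(X - self.mean_, self.U[:m].T)


X = np.array([
    [-2.133268233289599, 0.903819474847349, 2.217823388231679, -0.444779660856219,
     -0.661480010318842, -0.163814281248453, -0.608167714051449, 0.949391996219125],
    [-1.273486742804804, -1.270450725314960, -2.873297536940942, 1.819616794091556,
     -2.617784834189455, 1.706200163080549, 0.196983250752276, 0.501491995499840],
    [-0.935406638147949, 0.298594472836292, 1.520579082270122, -1.390457671168661,
     -1.180253547776717, -0.194988736923602, -0.645052874385757, -1.400566775105519]]).T

pca = PCA(X)
Z = pca.project(X, 2)

# Projection without centering, as in the question's project() method.
Z_uncentered = np.dot(X, pca.U[:2].T)
# Every row of the difference is the same vector: the projected mean.
offset = Z_uncentered - Z

print(Z)
print(offset[0])
```

The sign of each eigenvector is not fixed by the decomposition, so individual columns of `Z` may come out negated relative to another implementation without the result being wrong.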

Regarding Python PCA - projection into lower dimensional space, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/36771525/
