r - Converting categorical variables to numeric in R

Reposted. Author: 行者123. Updated: 2023-12-01 21:34:25

I have a huge dataset with many categorical variables. Here is what it looks like:

> M=data.frame(Type_peau,PEAU_CORPS,SENSIBILITE,IMPERFECTIONS,BRILLANCE ,GRAIN_PEAU,RIDES_VISAGE,ALLERGIES,MAINS,
+ INTERET_ALIM_NATURELLE,INTERET_ORIGINE_GEO,INTERET_VACANCES,INTERET_COMPOSITION,DataQuest1,Priorite2,
+ Priorite1,DataQuest4,Age,Nbre_gift,w,Nbre_achat)
> # check whether there are any missing values
> str(M)
'data.frame': 836 obs. of 21 variables:
$ Type_peau : Factor w/ 5 levels "","Grasse","Mixte",..: 3 4 5 3 4 3 3 3 2 3 ...
$ PEAU_CORPS : Factor w/ 4 levels "","Normale","Sèche",..: 2 3 3 2 2 2 3 2 3 2 ...
$ SENSIBILITE : Factor w/ 4 levels "","Aucune","Fréquente",..: 4 4 4 2 4 3 4 2 4 4 ...
$ IMPERFECTIONS : Factor w/ 4 levels "","Fréquente",..: 3 4 3 4 3 2 3 4 3 3 ...
$ BRILLANCE : Factor w/ 4 levels "","Aucune","Partout",..: 4 2 2 4 4 4 4 4 3 4 ...
$ GRAIN_PEAU : Factor w/ 4 levels "","Dilaté","Fin",..: 4 4 4 2 4 2 4 4 2 4 ...
$ RIDES_VISAGE : Factor w/ 4 levels "","Aucune","Très visibles",..: 2 2 2 4 4 2 4 2 4 2 ...
$ ALLERGIES : Factor w/ 4 levels "","Non","Oui",..: 2 2 2 2 2 2 2 2 2 2 ...
$ MAINS : Factor w/ 4 levels "","Moites","Normales",..: 3 4 4 3 3 3 3 4 4 4 ...
$ INTERET_ALIM_NATURELLE: Factor w/ 4 levels "","Beaucoup",..: 2 4 4 4 2 2 2 4 4 2 ...
$ INTERET_ORIGINE_GEO : Factor w/ 5 levels "","Beaucoup",..: 2 4 2 5 2 2 2 2 2 2 ...
$ INTERET_VACANCES : Factor w/ 6 levels "","À la mer",..: 3 4 2 2 3 2 3 2 3 2 ...
$ INTERET_COMPOSITION : Factor w/ 4 levels "","Beaucoup",..: 2 2 2 4 2 2 2 2 4 2 ...
$ DataQuest1 : Factor w/ 4 levels "-20","20-30",..: 4 3 4 4 4 3 3 2 3 2 ...
$ Priorite2 : Factor w/ 7 levels "éclatante","hydratée",..: 3 1 3 4 3 2 7 1 4 6 ...
$ Priorite1 : Factor w/ 7 levels "éclatante","hydratée",..: 4 6 1 5 1 6 1 2 6 4 ...
$ DataQuest4 : Factor w/ 2 levels "nature","urbain": 2 2 2 2 2 1 2 2 2 2 ...
$ Age : int 32 37 23 44 33 30 43 43 60 31 ...
$ Nbre_gift : int 1 4 1 1 2 1 1 1 1 1 ...
$ w : num 0.25 0.25 0.5 0.25 0.5 0 0 0 0 0.75 ...
$ Nbre_achat : int 3 4 7 3 6 9 22 13 7 16 ...

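I need to convert all of the categorical variables to numbers automatically. For example, the variable Type_peau currently looks like this: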

 head(Type_peau)
[1] Mixte Normale Sèche Mixte Normale Mixte
Levels: Grasse Mixte Normale Sèche

I want it to look like this:

head(Type_peau)
[1] 2 3 4 2 3 2
Levels: 1 2 3 4

How can I do this automatically for all of the categorical variables?

Best answer

You can use unclass() to display the numeric codes behind a factor variable:

Type_peau<-as.factor(c("Mixte","Normale","Sèche","Mixte","Normale","Mixte"))
Type_peau
unclass(Type_peau)
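As a small sketch of what those codes mean: each code is the position of the value in levels(), and as.factor() on a character vector uses the sorted unique values as levels. The vector below is the one from the answer; the names `codes`, `full_levels`, and `codes_full` are introduced here for illustration only.

```r
Type_peau <- as.factor(c("Mixte", "Normale", "Sèche", "Mixte", "Normale", "Mixte"))

levels(Type_peau)               # labels that the integer codes index into
codes <- as.integer(Type_peau)  # plain integer codes, without the "levels" attribute
codes                           # 1 2 3 1 2 1: "Grasse" does not occur, so "Mixte" is level 1

# Declaring the full level set (as in the question's head() output) reproduces 2 3 4 2 3 2
full_levels <- c("Grasse", "Mixte", "Normale", "Sèche")
codes_full <- as.integer(factor(as.character(Type_peau), levels = full_levels))
codes_full
```

The codes depend entirely on the level set, so two factors with different levels will assign different numbers to the same label.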

To do this for all of the categorical variables, you can use sapply():

must_convert <- sapply(M, is.factor)                     # logical vector: TRUE for each column that is a factor
M2 <- sapply(M[, must_convert, drop = FALSE], unclass)   # integer matrix holding the level codes of the factor columns
out <- cbind(M[, !must_convert, drop = FALSE], M2)       # data.frame with all variables; the converted columns come last
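If the original column order matters, a possible variant is to replace the factor columns in place instead of binding them back on. This is only a sketch: `toy` is a hypothetical three-column stand-in for M, and `is_fac` and `toy_num` are names made up for the example.

```r
# Hypothetical toy data standing in for M
toy <- data.frame(
  skin = factor(c("Mixte", "Normale", "Grasse")),
  age  = c(32L, 37L, 23L),
  area = factor(c("urbain", "nature", "urbain"))
)

is_fac <- vapply(toy, is.factor, logical(1))        # which columns are factors
toy_num <- toy                                      # work on a copy, keep the original intact
toy_num[is_fac] <- lapply(toy[is_fac], as.integer)  # overwrite only the factor columns with their codes

str(toy_num)  # still a data.frame, columns in the original order
```

Using lapply() with list-style indexing keeps the result a data.frame and leaves the non-factor columns untouched.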

Edit: A5C1D2H2I1M1N2O1R2T1's solution does it in one step:

out<-data.matrix(M)

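This only works as described if your data.frame contains no character variables: in the R version this answer was written for, such columns are turned into NA. The current data.matrix() documentation states that character columns are first converted to factors and then to integers, so check how your R version behaves.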
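To avoid relying on how data.matrix() treats character columns, one option is to convert them to factors explicitly before the call. The sketch below assumes a hypothetical data frame `toy` with one character column; `is_chr` is likewise a name invented for the example.

```r
# Hypothetical toy data: one factor, one character and one integer column
toy <- data.frame(
  skin = factor(c("Mixte", "Normale", "Grasse")),
  note = c("b", "a", "b"),
  age  = c(32L, 37L, 23L),
  stringsAsFactors = FALSE
)

is_chr <- vapply(toy, is.character, logical(1))  # which columns are character
toy[is_chr] <- lapply(toy[is_chr], factor)       # make the conversion explicit

out <- data.matrix(toy)                          # every column is now factor or numeric
out
```

Note that the result is a matrix rather than a data.frame, so all columns share a single numeric type.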

On the topic of r - converting categorical variables to numeric in R, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/47922184/
