
R (arules): Convert a data frame to transactions and remove NA

Reposted. Author: 行者123. Updated: 2023-12-02 20:42:14

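I have a data frame. My goal is to convert it into transaction data so that I can run market basket analysis with the arules package in R. I looked online for ways to convert a data frame into transactions, for example (How to prep transaction data into basket for arules) and (Transform csv into transactions for arules), but the results I get are different.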

Input (df)

structure(list(Transaction_ID = c("A001", "A002", "A003", "A004", "A005", "A006"), 
Fruits = c(NA, "Apple", "Orange", NA, "Pear", "Grape"),
Vegetables = c(NA, NA, NA, "Potato", NA, "Yam"),
Personal = c("ToothP", "ToothP", NA, "ToothB", "ToothB", NA),
Drink = c("Coff", NA, "Coff", "Milk", "Milk", "Coff"),
Other = c(NA, NA, NA, NA, "Promo", NA)),
.Names = c("Transaction_ID", "Fruits", "Vegetables", "Personal", "Drink", "Other"),
class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -6L))

Below is the structure of my data frame

Transaction_ID  Fruits  Vegetables  Personal  Drink  Other
A001 NA NA ToothP Coff NA
A002 Apple NA ToothP NA NA
A003 Orange NA NA Coff NA
A004 NA Potato ToothB Milk NA
A005 Pear NA ToothB Milk Promo
A006 Grape Yam NA Coff NA

Class of each column

sapply(df, class)
Transaction_ID Fruits Vegetables Personal Drink Other
"character" "character" "character" "character" "character" "character"

Converting the data frame to transaction data

data <- as(split(df[,"Fruits"], df[,"Vegetables"],df[,"Personal"], df[,"Drink"], df[,"Other"]), "transactions")
inspect(data)

The result I got

[1] {NA,NA,ToothP,Coff,NA}
[2] {Apple,NA,ToothP,NA,NA}
[3] {Orange,NA,NA,Coff,NA}
[4] {NA,Potato,ToothB,Milk,NA}
[5] {Pear,NA,ToothB,Milk,Promo}
[6] {Grape,Yam,NA,Coff,NA}

The conversion to transaction data works, but is there a way to remove the NA entries? If they stay in the transaction list, arules treats NA as an item.

Best answer

奥古斯塔里 is right. Here is the complete code, which also takes care of the transaction IDs.

library("arules")
library("dplyr") ### for tbl_df
df <- structure(list(Transaction_ID = c("A001", "A002", "A003", "A004", "A005", "A006"),
Fruits = c(NA, "Apple", "Orange", NA, "Pear", "Grape"),
Vegetables = c(NA, NA, NA, "Potato", NA, "Yam"),
Personal = c("ToothP", "ToothP", NA, "ToothB", "ToothB", NA),
Drink = c("Coff", NA, "Coff", "Milk", "Milk", "Coff"),
Other = c(NA, NA, NA, NA, "Promo", NA)),
.Names = c("Transaction_ID", "Fruits", "Vegetables", "Personal", "Drink", "Other"),
class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -6L))

### remove transaction IDs
tid <- as.character(df[["Transaction_ID"]])
df <- df[,-1]

### make all columns factors
for(i in 1:ncol(df)) df[[i]] <- as.factor(df[[i]])

trans <- as(df, "transactions")

### set transactionIDs
transactionInfo(trans)[["transactionID"]] <- tid

inspect(trans)

items transactionID
[1] {Personal=ToothP,Drink=Coff} A001
[2] {Personal=ToothP} A002
[3] {Drink=Coff} A003
[4] {Vegetables=Potato,Personal=ToothB,Drink=Milk} A004
[5] {Personal=ToothB,Drink=Milk,Other=Promo} A005
[6] {Vegetables=Yam,Drink=Coff} A006
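The output above labels every item as Column=value. If plain item labels are preferred, closer to the sets shown in the question, one possible sketch is to build a list with one character vector per row, drop the NA entries, and coerce that list to transactions. The data frame here is a trimmed copy of the question's data, and the names `item_cols` and `baskets` are illustrative only, not part of the original answer. It also assumes no item appears twice in the same row.

```r
library(arules)

# Trimmed copy of the question's data (first three rows, three item columns)
df <- data.frame(
  Transaction_ID = c("A001", "A002", "A003"),
  Fruits   = c(NA, "Apple", "Orange"),
  Personal = c("ToothP", "ToothP", NA),
  Drink    = c("Coff", NA, "Coff"),
  stringsAsFactors = FALSE
)

# Every column except the ID holds items (illustrative helper name)
item_cols <- setdiff(names(df), "Transaction_ID")

# One character vector of non-missing items per row
baskets <- lapply(seq_len(nrow(df)), function(i) {
  row_items <- unlist(df[i, item_cols], use.names = FALSE)
  row_items[!is.na(row_items)]
})

# List names are used as transaction IDs when coercing to transactions
names(baskets) <- df$Transaction_ID

trans <- as(baskets, "transactions")
inspect(trans)
```

With this approach the column a value came from is lost, so "Coff" under Drink and a hypothetical "Coff" under another column would be merged into one item. The factor-based conversion in the answer keeps them apart.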

Regarding R (arules): converting a data frame to transactions and removing NA, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/45773861/
