I am working with genetic data that looks like this table, but much larger:
ID allele.a allele.b
A 115 90
A 115 90
A 116 90
B 120 82
B 120 82
B 120 82M
# openxlsx provides createWorkbook(), addWorksheet(), writeData(), etc.
library(openxlsx)
library(tidyverse)
# Small data.frame
dframe <- data.frame(ID = c("A", "A", "A", "B", "B", "B"),
                     allele.a = c("115", "115", "116", "120", "120", "120"),
                     allele.b = c("90", "90", "90", "82", "82", "82M"),
                     stringsAsFactors = FALSE)
# Bigger data.frame for speed test
# dframe <- data.frame(ID = rep(letters, each = 30),
#                      allele.a = rep(as.character(round(rnorm(n = 30, mean = 100, sd = 0.3), 0)), 26),
#                      allele.b = rep(as.character(round(rnorm(n = 30, mean = 90, sd = 0.3), 0)), 26),
#                      allele.c = rep(as.character(round(rnorm(n = 30, mean = 80, sd = 0.3), 0)), 26),
#                      allele.d = rep(as.character(round(rnorm(n = 30, mean = 70, sd = 0.3), 0)), 26),
#                      allele.e = rep(as.character(round(rnorm(n = 30, mean = 60, sd = 0.3), 0)), 26),
#                      allele.f = rep(as.character(round(rnorm(n = 30, mean = 50, sd = 0.3), 0)), 26),
#                      allele.g = rep(as.character(round(rnorm(n = 30, mean = 40, sd = 0.3), 0)), 26),
#                      allele.h = rep(as.character(round(rnorm(n = 30, mean = 30, sd = 0.3), 0)), 26),
#                      allele.i = rep(as.character(round(rnorm(n = 30, mean = 20, sd = 0.3), 0)), 26),
#                      allele.j = rep(as.character(round(rnorm(n = 30, mean = 10, sd = 0.3), 0)), 26),
#                      stringsAsFactors = FALSE)
# Create a new Excel workbook ----
wb <- createWorkbook()
# Add a worksheet (named "1"; it is referred to by index below)
addWorksheet(wb, sheetName = "1", gridLines = TRUE)
# Add the data to the worksheet
writeData(wb, sheet = 1, dframe, rowNames = FALSE)
# Create a style to show alleles that do not match the first row.
style_Red_NoMatch <- createStyle(fontColour = "#FFFFFF",      # white text
                                 bgFill = "#CC0000",          # dark red background
                                 textDecoration = c("BOLD"))  # bold text
Groups <- unique(dframe$ID)
start_time <- Sys.time()
# For each unique group,
for (i in seq_along(Groups)) {
  # Print a message telling us where the script is processing in the file.
  print(paste("Formatting unique group ", i, "/", length(Groups), sep = ""))
  # What are the allele values of the *first* individual in the group?
  Allele.values <- dframe %>%
    filter(ID == Groups[i]) %>%
    slice(1) %>%
    select(2:ncol(dframe)) %>%
    as.character()
  # For each column that has allele values in it,
  for (j in seq_along(Allele.values)) {
    # format the rows of this group so that a value that does not match the first value gets the red style
    # (+ 1 on the rows because row 1 of the sheet is the header)
    conditionalFormatting(wb, sheet = 1,
                          style = style_Red_NoMatch,
                          rows = which(dframe$ID == Groups[i]) + 1,
                          cols = 1 + j,
                          rule = paste("<>\"", Allele.values[j], "\"", sep = ""))
  }
}
end_time <- Sys.time()
end_time - start_time
saveWorkbook(wb, "Example.xlsx", overwrite = TRUE)
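Best answer
One improvement I would suggest is to apply conditionalFormatting to whole columns at once, without having to loop over each cell.
Here is one way to do it. A drawback of this approach is that it adds logical TRUE/FALSE columns to the sheet for conditionalFormatting to reference. These columns can be hidden with the setColWidths function.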
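To make the helper columns concrete, here is a minimal sketch on the six-row table from the question. It is not the answer's code: it assumes dplyr 1.0.0 or later (for across()) and builds the TRUE/FALSE flags with a grouped mutate instead of merge(). The `_chk` suffix mirrors the naming used in this answer.

```r
library(dplyr)

# Six-row example from the question
dframe <- data.frame(ID = c("A", "A", "A", "B", "B", "B"),
                     allele.a = c("115", "115", "116", "120", "120", "120"),
                     allele.b = c("90", "90", "90", "82", "82", "82M"),
                     stringsAsFactors = FALSE)

# For every allele column, flag whether each value equals the value in the
# first row of its ID group. In a grouped mutate(), across(everything())
# skips the grouping column, so ID itself gets no flag column.
dframe_chk <- dframe %>%
  group_by(ID) %>%
  mutate(across(everything(), ~ .x == first(.x), .names = "{.col}_chk")) %>%
  ungroup()

print(dframe_chk)
```

The result keeps the original columns first and appends one `_chk` column per allele column, which is the layout the conditional formatting rule relies on. Only the third row of allele.a ("116") and the last row of allele.b ("82M") are flagged FALSE.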
Data
library(openxlsx)
dframe <- data.frame(ID = rep(letters, each = 30),
                     allele.a = rep(as.character(round(rnorm(n = 30, mean = 100, sd = 0.3), 0)), 26),
                     allele.b = rep(as.character(round(rnorm(n = 30, mean = 90, sd = 0.3), 0)), 26),
                     allele.c = rep(as.character(round(rnorm(n = 30, mean = 80, sd = 0.3), 0)), 26),
                     allele.d = rep(as.character(round(rnorm(n = 30, mean = 70, sd = 0.3), 0)), 26),
                     allele.e = rep(as.character(round(rnorm(n = 30, mean = 60, sd = 0.3), 0)), 26),
                     allele.f = rep(as.character(round(rnorm(n = 30, mean = 50, sd = 0.3), 0)), 26),
                     allele.g = rep(as.character(round(rnorm(n = 30, mean = 40, sd = 0.3), 0)), 26),
                     allele.h = rep(as.character(round(rnorm(n = 30, mean = 30, sd = 0.3), 0)), 26),
                     allele.i = rep(as.character(round(rnorm(n = 30, mean = 20, sd = 0.3), 0)), 26),
                     allele.j = rep(as.character(round(rnorm(n = 30, mean = 10, sd = 0.3), 0)), 26),
                     stringsAsFactors = FALSE)
The first part of the script is unchanged.
# Create a new Excel workbook ----
wb <- createWorkbook()
# Add a worksheet (named "1"; it is referred to by index below)
addWorksheet(wb, sheetName = "1", gridLines = TRUE)
# Create a style to show alleles that do not match the first row.
style_Red_NoMatch <- createStyle(fontColour = "#FFFFFF",      # white text
                                 bgFill = "#CC0000",          # dark red background
                                 textDecoration = c("BOLD"))  # bold text
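Next, identify the first row for each ID and merge it back into the original dataset. Then check, looping over each column, whether any cell differs from that first row.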
# Select the first row for each ID, which will be used as the benchmark
first_row <- dframe[!duplicated(dframe$ID), ]
# Create a new data frame with the first_row columns added (suffix "_first")
dframe_chk <- merge(dframe, first_row, by = "ID", all.x = TRUE, suffixes = c("", "_first"))
# Add a TRUE/FALSE column for each allele column showing whether it matches the first row
# ([-1] excludes the ID column)
for (j in names(dframe)[-1]) {
  dframe_chk[, paste0(j, "_chk")] <- dframe_chk[, j] == dframe_chk[, paste0(j, "_first")]
}
# Drop the _first columns when exporting to Excel
cols <- names(dframe_chk)[!grepl("_first", names(dframe_chk))]
# Add the data to the worksheet
writeData(wb, sheet = 1, dframe_chk[, cols], rowNames = FALSE)
# Set up the range for conditional formatting
# The first row of the sheet is the header
row_start <- 2
# Add 1 to cover the full range (as the first row is the header)
row_end <- nrow(dframe) + 1
# The first column is ID
col_start <- 2
# Last column as per the original dataset
col_end <- ncol(dframe)
# Excel label of the first _chk column.
# int2col() converts a column index to a column label, so this also works beyond column Z
rule_col <- int2col(col_end + 1)
# Use the _chk columns in the conditional formatting rule.
# The reference is relative to the top-left cell of the range, so each allele
# column is checked against its own _chk column and each row against itself.
conditionalFormatting(wb, sheet = 1,
                      style = style_Red_NoMatch,
                      rows = row_start:row_end,
                      cols = col_start:col_end,
                      rule = paste0(rule_col, "2 = FALSE"))
# The exported file includes the _chk columns. Hide these columns.
setColWidths(wb, sheet = 1, cols = (col_end + 1):length(cols), hidden = TRUE)
saveWorkbook(wb, "Example2.xlsx", overwrite = TRUE)
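The rule passed to conditionalFormatting names only one cell (the first _chk column in row 2), yet it covers every allele column. That works because Excel treats a relative reference in a conditional formatting formula as relative to the top-left cell of the formatted range. The sketch below only prints which _chk column each allele column ends up reading for the ten-allele test data; `n_alleles` and `mapping` are illustrative names, not part of the answer. It uses int2col() from openxlsx to convert column indices to Excel labels.

```r
library(openxlsx)

n_alleles <- 10L                      # allele.a .. allele.j in the test data
data_cols <- 2:(n_alleles + 1L)       # sheet columns holding the alleles (B..K)
flag_cols <- data_cols + n_alleles    # sheet columns holding the _chk flags

# Pair each allele column with the flag column its rule resolves to
mapping <- data.frame(data_col = int2col(data_cols),
                      flag_col = int2col(flag_cols),
                      stringsAsFactors = FALSE)
print(mapping)

# int2col() keeps working past column Z, where indexing LETTERS would return NA
print(int2col(27))
```

With ten allele columns the flags sit in columns L to U, so the rule written as "L2 = FALSE" is evaluated as "M2 = FALSE" for column C, and so on, shifting down one row at a time as well.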
About "r - Tidyverse/faster solution to conditional formatting with openxlsx in R?": a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/50992957/