R: circlize circos plot - how to plot unconnected areas between sectors with minimal overlap


Reposted by: 行者123  Updated: 2023-12-04 12:12:28

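I have a data frame containing the features shared between 4 groups of patients and cell types. I have many different features, but only a few of them are shared (present in more than 1 group).

I would like to make a circos plot that reflects the few connections between shared features across the patient and cell-type groups, while still giving a sense of how many non-shared features each group has.

The way I picture it, the plot should have 4 sectors (one per patient and cell-type group) with a few connections between them. Each sector's size should reflect the total number of features in the group, and most of that area should not be connected to other groups but left empty.

This is what I have so far, but I don't want a sector dedicated to each feature, only one per patient and cell-type group.

MWE: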

library(circlize)

patients <- c(rep("patient1",20), rep("patient2",10))
cell.types <- c(rep("cell1",12), rep("cell2",8),rep("cell1",6), rep("cell2",4))
features <- c(paste("feature",1:12,sep="_"), paste("feature",9:16,sep="_"), paste("feature",c(1,2,9,10,17,18),sep="_"), paste("feature",c(1,18,19,20),sep="_"))
dat <- data.frame(patient=patients, cell.type=cell.types, feature=features)
dat
dat <- with(dat, table(paste(patient,cell.type,sep='|'), feature))
dat

chordDiagram(as.data.frame(dat), transparency = 0.5)

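EDIT!!

What @m-dz shows in his answer is actually the format I am looking for: 4 sectors for the 4 different patient/cell-type combinations, showing only the connections, while the non-connected features, although not shown, should count towards the size of the sector.

However, I realized that my case is more complicated than the MWE above.

A feature is considered present in 2 patient/cell-type groups not only when it is identical in both groups, but also when it is similar (sequence identity above a threshold). As a result, I have redundancy...

Feature A in patient1-cell1 can be connected to feature A in patient2-cell1, but also to feature B. For patient1-cell1, feature A should be counted only once (unique count) and expand to 2 different features in patient2-cell1.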
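
The counting rule described above can be sketched in base R. This is only an illustration under assumed data: the `links` data frame, its column names and the helper `unique_per_group` are hypothetical, with one row per similarity link between a feature in patient1-cell1 and a feature in patient2-cell1.

```r
# Hypothetical similarity links (illustrative data only):
# feature A in patient1-cell1 is similar to both A and B in patient2-cell1.
links <- data.frame(
  from_group   = c("patient1-cell1", "patient1-cell1"),
  from_feature = c("A", "A"),
  to_group     = c("patient2-cell1", "patient2-cell1"),
  to_feature   = c("A", "B"),
  stringsAsFactors = FALSE
)

# Stack both ends of every link into one long group/feature table.
long <- data.frame(
  group   = c(links$from_group, links$to_group),
  feature = c(links$from_feature, links$to_feature),
  stringsAsFactors = FALSE
)

# Hypothetical helper: count each feature once per group,
# no matter how many links it takes part in.
unique_per_group <- function(group, feature) {
  vapply(split(feature, group), function(f) length(unique(f)), integer(1))
}

counts <- unique_per_group(long$group, long$feature)
counts
```

Here feature A contributes a single unit to patient1-cell1 even though it appears in two links, while patient2-cell1 receives two units (A and B).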

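Please see the example below for a more precise picture of what my actual data look like, and whether the final circos plot can be obtained from it! Thanks!!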

##MWE
#NON OVERLAPPING SETS!

#1: non-shared features
nonshared <- data.frame(patient=c(rep("pat1",20), rep("pat2",10)), cell.type=c(rep("cell1",12), rep("cell2",8),rep("cell1",6), rep("cell2",4)), feature=paste("a",1:30,sep=''))
nonshared

#2: features shared between cell types within same patient
sharedcells <- data.frame(patient=c(rep("pat1",3), rep("pat2",4)), cell.types=c(rep("cell1||cell2",3),rep("cell1||cell2",4)), features=c("b1||b1","b1||b1","b1||b1","b2||b2","b3||b3","b4||b4","b4||b5"))
sharedcells

#3: features shared between patients within same cell types
sharedpats <- data.frame(patients=c(rep("pat1||pat2",2), rep("pat1||pat2",6)), cell.type=c(rep("cell1",2),rep("cell2",6)), features=c("c1||c1","c2||c1","c3||c3","c3||c4","c3||c5","c6||c5","c7||c7","c8||c8"))
sharedpats

#4: features shared between patients and cell types
#4.1: shared across pat1-cell1, pat1-cell2, pat2-cell1, pat2-cell2
sharedall1 <- data.frame(both=c(rep("pat1-cell1||pat1-cell2||pat2-cell1||pat2-cell2",4)), features=c("d1||d1||d1||d1","d2||d2||d2||d3","d4||d4||d3||d3","d5||d5||d5||d5"))
#4.2: shared across pat1-cell1, pat1-cell2, pat2-cell1
sharedall2 <- data.frame(both=c(rep("pat1-cell1||pat1-cell2||pat2-cell1",2)), features=c("d6||d6||d6","d7||d7||d7"))
#4.3: shared across pat1-cell1, pat1-cell2, pat2-cell2
sharedall3 <- data.frame(both="pat1-cell1||pat1-cell2||pat2-cell2", features="d8||d8||d9")
#4.4: shared across pat1-cell1, pat2-cell1, pat2-cell2
sharedall4 <- data.frame(both="pat1-cell1||pat2-cell1||pat2-cell2", features="d10||d10||d9")
#4.5: shared across pat1-cell2, pat2-cell1, pat2-cell2
sharedall5 <- data.frame(both=c(rep("pat1-cell2||pat2-cell1||pat2-cell2",3)), features=c("d11||d11||d11","d12||d13||d13","d12||d14||d14"))
#4.6: shared across pat1-cell1, pat2-cell2
sharedall6 <- data.frame()
#4.7: shared across pat1-cell2, pat2-cell1
sharedall7 <- data.frame(both=c(rep("pat1-cell2||pat2-cell1",2)), features=c("d15||d16","d17||d17"))

sharedall <- rbind(sharedall1, sharedall2, sharedall3, sharedall4, sharedall5, sharedall6, sharedall7)
sharedall
#you see there might be overlaps between the different subsets of sharedall, but not between sharedall, sharedpats, sharedcells, and nonshared

#I NEED A CIRCOS PLOT THAT SHOWS ALL THE CONNECTIONS. THE NON-CONNECTED (nonshared) FEATURES SHOULD NOT BE SHOWN, BUT THEY SHOULD COUNT TOWARDS THE SIZE OF THE SECTOR (CORRESPONDING TO A PATIENT-CELL COMBINATION)

#THE FEATURES SHOULD BE COUNTED UNIQUELY, SO IF THERE ARE ENTRIES LIKE:
#3 pat1||pat2     cell2   c3||c3
#4 pat1||pat2     cell2   c3||c4
#5 pat1||pat2     cell2   c3||c5
#THE FEATURE c3 SHOULD BE COUNTED ONCE FOR pat1, AND EXPAND TO 3 DIFFERENT FEATURES IN pat2

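Best answer

A side note on the expected result: the aim is to create a plot that simply shows how many features are shared, ignoring either the individual features (first plot below) or the overlap between shared features (e.g. in the second plot it looks as if the same features are shared between all groups, which the first plot shows is not true, but what matters here is the ratio of shared features between groups).

The code below produces the following two plots (plot 1 is left in for reference):

All individual features

Simple counts of unique and shared features

One of them should fit the expectations.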
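
Both plots start from a long table with one row per group/feature pair, so the `||`-delimited columns have to be split first. As a minimal base-R sketch of that reshaping step (the answer's own code below uses data.table instead; the helper name `split_pairs` and the toy input are assumptions made for illustration, and the `unique()` call is an addition reflecting the question's request to count each feature once per group):

```r
# Toy input in the same spirit as `sharedpats`: groups and features are
# '||'-delimited, and position k in one column pairs with position k in the other.
toy <- data.frame(
  groups   = c("pat1-cell2||pat2-cell2", "pat1-cell2||pat2-cell2"),
  features = c("c3||c3", "c3||c4"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: expand every row into one row per '||'-separated element.
split_pairs <- function(groups, features) {
  g <- strsplit(groups, "||", fixed = TRUE)
  f <- strsplit(features, "||", fixed = TRUE)
  stopifnot(lengths(g) == lengths(f))  # both columns must split into equal parts
  unique(data.frame(group = unlist(g), feature = unlist(f),
                    stringsAsFactors = FALSE))
}

long <- split_pairs(toy$groups, toy$features)
long
```

`fixed = TRUE` matters because `|` is a regular-expression metacharacter; without it the pattern would not be treated as the literal two-character separator.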

# Prep. data --------------------------------------------------------------

nonshared <- data.frame(patient=c(rep("pat1",20), rep("pat2",10)), cell.type=c(rep("cell1",12), rep("cell2",8),rep("cell1",6), rep("cell2",4)), feature=paste("a",1:30,sep=''))
sharedcells <- data.frame(patient=c(rep("pat1",3), rep("pat2",4)), cell.types=c(rep("cell1||cell2",3),rep("cell1||cell2",4)), features=c("b1||b1","b1||b1","b1||b1","b2||b2","b3||b3","b4||b4","b4||b5"))
sharedpats <- data.frame(patients=c(rep("pat1||pat2",2), rep("pat1||pat2",6)), cell.type=c(rep("cell1",2),rep("cell2",6)), features=c("c1||c1","c2||c1","c3||c3","c3||c4","c3||c5","c6||c5","c7||c7","c8||c8"))
sharedall1 <- data.frame(both=c(rep("pat1-cell1||pat1-cell2||pat2-cell1||pat2-cell2",4)), features=c("d1||d1||d1||d1","d2||d2||d2||d3","d4||d4||d3||d3","d5||d5||d5||d5"))
sharedall2 <- data.frame(both=c(rep("pat1-cell1||pat1-cell2||pat2-cell1",2)), features=c("d6||d6||d6","d7||d7||d7"))
sharedall3 <- data.frame(both="pat1-cell1||pat1-cell2||pat2-cell2", features="d8||d8||d9")
sharedall4 <- data.frame(both="pat1-cell1||pat2-cell1||pat2-cell2", features="d10||d10||d9")
sharedall5 <- data.frame(both=c(rep("pat1-cell2||pat2-cell1||pat2-cell2",3)), features=c("d11||d11||d11","d12||d13||d13","d12||d14||d14"))
sharedall6 <- data.frame()
sharedall7 <- data.frame(both=c(rep("pat1-cell2||pat2-cell1",2)), features=c("d15||d16","d17||d17"))
sharedall <- rbind(sharedall1, sharedall2, sharedall3, sharedall4, sharedall5, sharedall6, sharedall7)

#I NEED A CIRCOS PLOT THAT SHOWS ALL THE CONNECTIONS. THE NON-CONNECTED (nonshared) FEATURES SHOULD NOT BE SHOWN, BUT THEY SHOULD COUNT TOWARDS THE SIZE OF THE SECTOR (CORRESPONDING TO A PATIENT-CELL COMBINATION)

#THE FEATURES SHOULD BE COUNTED UNIQUELY, SO IF THERE ARE ENTRIES LIKE:
#3 pat1||pat2     cell2   c3||c3
#4 pat1||pat2     cell2   c3||c4
#5 pat1||pat2     cell2   c3||c5
#THE FEATURE c3 SHOULD BE COUNTED ONCE FOR pat1, AND EXPAND TO 3 DIFFERENT FEATURES IN pat2



# Start -------------------------------------------------------------------

library(circlize)
library(data.table)
library(magrittr)
library(stringr)
library(RColorBrewer)

# Split and pad with 0 ----------------------------------------------------
fun <- function(x) unlist(tstrsplit(x, split = '||', fixed = TRUE))

nonshared %>% setDT()
sharedcells %>% setDT()
sharedpats %>% setDT()
sharedall %>% setDT()

nonshared <- nonshared[, .(group = paste(patient, cell.type, sep = '-'), feature)][, feature := paste0('a', str_pad(str_extract(feature, '[0-9]+'), 2, 'left', '0'))]
sharedcells <- sharedcells[, lapply(.SD, fun), by = 1:nrow(sharedcells)][, .(group = paste(patient, cell.types, sep = '-'), feature = features)][, feature := paste0('b', str_pad(str_extract(feature, '[0-9]+'), 2, 'left', '0'))]
sharedpats <- sharedpats[, lapply(.SD, fun), by = 1:nrow(sharedpats)][, .(group = paste(patients, cell.type, sep = '-'), feature = features)][, feature := paste0('c', str_pad(str_extract(feature, '[0-9]+'), 2, 'left', '0'))]
sharedall <- sharedall[, lapply(.SD, fun), by = 1:nrow(sharedall)][, .(group = both, feature = features)][, feature := paste0('d', str_pad(str_extract(feature, '[0-9]+'), 2, 'left', '0'))]

dt_split <- rbindlist(
  list(
    nonshared,
    sharedcells,
    sharedpats,
    sharedall
  )
)

# Set key and self join to find shared features ---------------------------
setkey(dt_split, feature)
dt_join <- dt_split[dt_split, .(group, i.group, feature), allow.cartesian = TRUE] %>%
  .[group != i.group, ]

# Create a "sorted key" ---------------------------------------------------
# key := the two group names in alphabetical order, joined by '|'
# (pmin/pmax are vectorised, so no row-by-row sort of .SD is needed)
# To leave only unique combinations of groups and features
dt_join <-
  dt_join[, key := paste(pmin(group, i.group), pmax(group, i.group), sep = '|')] %>%
  setorder(feature, key) %>%
  unique(by = c('key', 'feature')) %>%
  .[, .(
    group_from = i.group,
    group_to = group,
    feature = feature)]

# Rename and key ----------------------------------------------------------

dt_split %>% setnames(old = 'group', new = 'group_from') %>% setkey(group_from, feature)
dt_join %>% setkey(group_from, feature)



# Individual features -----------------------------------------------------

# Features without connections --------------------------------------------

dt_singles <- dt_split[, .(group_from, group_to = group_from, feature)] %>%
  .[, N := .N, by = feature] %>%
  .[!(N > 1 & group_from == group_to), !c('N')]

# Bind all, add some columns etc. -----------------------------------------

dt_bind <- rbind(dt_singles, dt_join) %>% setorder(group_from, feature, group_to)

dt_bind[, ':='(
  group_from_f = paste(group_from, feature, sep = '.'),
  group_to_f = paste(group_to, feature, sep = '.'))]
dt_bind[, feature := NULL]  # feature can be removed

# Colour
dt_bind[, colour := ifelse(group_from_f == group_to_f, "#FFFFFF00", '#00000050')]  # Change first to #FF0000FF to show red blobs

# Prep. sectors -----------------------------------------------------------

sectors_f <- union(dt_bind[, group_from_f], dt_bind[, group_to_f]) %>% sort()

colour_lookup <-
  union(dt_bind[, group_from], dt_bind[, group_to]) %>% sort() %>%
  structure(seq_along(.) + 1, names = .)
sector_colours <- str_replace_all(sectors_f, '\\.[a-d][0-9]+$', '') %>%  # strip the ".<feature>" suffix (literal dot)
  colour_lookup[.]

# Gaps between sectors ----------------------------------------------------

gap_sizes <- c(0.0, 1.0)
gap_degree <-
  sapply(table(names(sector_colours)), function(i) c(rep(gap_sizes[1], i-1), gap_sizes[2])) %>%
  unlist() %>% unname()
# gap_degree <- rep(0, length(sectors_f))  # Or no gap



# Plot! -------------------------------------------------------------------

# Each "sector" is a separate patient/cell/feature combination

circos.par(gap.degree = gap_degree)
circos.initialize(sectors_f, xlim = c(0, 1))
circos.trackPlotRegion(ylim = c(0, 1), track.height = 0.05, bg.col = sector_colours, bg.border = NA)

for(i in 1:nrow(dt_bind)) {
  row_i <- dt_bind[i, ]
  circos.link(
    row_i[['group_from_f']], c(0, 1),
    row_i[['group_to_f']], c(0, 1),
    border = NA, col = row_i[['colour']]
  )
}

# "Feature" labels
circos.trackPlotRegion(track.index = 2, ylim = c(0, 1), panel.fun = function(x, y) {
  sector.index = get.cell.meta.data("sector.index")
  circos.text(0.5, 0.25, sector.index, col = "white", cex = 0.6, facing = "clockwise", niceFacing = TRUE)
}, bg.border = NA)

# "Patient/cell" labels
for(s in names(colour_lookup)) {
  sectors <- sectors_f %>% { .[str_detect(., s)] }
  highlight.sector(
    sector.index = sectors, track.index = 1, col = colour_lookup[s],
    text = s, text.vjust = -1, niceFacing = TRUE)
}

circos.clear()



# counts of unique and shared features ------------------------------------

xlims <- dt_split[, .N, by = group_from][, .(x_from = 0, x_to = N)] %>% as.matrix()
links <- dt_join[, .N, by = .(group_from, group_to)]
colours <- dt_split[, unique(group_from)] %>% structure(seq_along(.) + 1, names = .)

library(circlize)

sectors = names(colours)
circos.par(cell.padding = c(0, 0, 0, 0))
circos.initialize(sectors, xlim = xlims)
circos.trackPlotRegion(ylim = c(0, 1), track.height = 0.05, bg.col = colours, bg.border = NA)

for(i in 1:nrow(links)) {
  link <- links[i, ]
  circos.link(link[[1]], c(0, link[[3]]), link[[2]], c(0, link[[3]]), col = '#00000025', border = NA)
}

# "Patient/cell" labels
for(s in sectors) {
  highlight.sector(
    sector.index = s, track.index = 1, col = colours[s], 
    text = s, text.vjust = -1, niceFacing = TRUE)
}

circos.clear()
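
As a cross-check on the link widths in the second plot, pairwise shared-feature counts can also be computed in base R from any long group/feature table. The sketch below uses made-up data and hypothetical names (`long`, `incidence`, `shared`); it is not part of the original answer.

```r
# Made-up long table: one row per (group, feature) pair.
long <- data.frame(
  group   = c("pat1-cell1", "pat1-cell1", "pat1-cell2", "pat2-cell1", "pat2-cell1"),
  feature = c("f1", "f2", "f1", "f1", "f3"),
  stringsAsFactors = FALSE
)

# Feature-by-group incidence matrix (1 if the group has the feature, else 0).
incidence <- 1 * (unclass(table(long$feature, long$group)) > 0)

# t(incidence) %*% incidence:
#   diagonal     = number of distinct features in each group (sector size),
#   off-diagonal = number of features shared by each pair of groups (link width).
shared <- crossprod(incidence)
shared
```

The diagonal corresponds to the sector lengths passed as `xlim`, and each off-diagonal entry to the width of one link, assuming every feature is counted once per group.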

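Edit: just adding the link from a deleted comment: see this answer for a nice example of labels!

Regarding "R: circlize circos plot - how to plot unconnected areas between sectors with minimal overlap", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/42917349/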
