r - Merging two data frames: specifically merge a selection of columns based on two conditions?

Reposted by: 行者123. Last updated: 2023-12-04 07:24:20

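I have two datasets on the same 2 patients. I want to use the second dataset to add new information to the first, but I can't seem to get the code right.
My first (incomplete) dataset has the patient ID, measurement time (T0 or FU1), year of birth, date of the CT scan, and two outcomes (legs_mass and total_mass):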

library(tidyverse)
library(dplyr)
library(magrittr)
library(lubridate)

df1 <- structure(list(ID = c(115, 115, 370, 370), time = structure(c(1L, 
6L, 1L, 6L), .Label = c("T0", "T1M0", "T1M6", "T1M12", "T2M0", 
"FU1"), class = "factor"), year_of_birth = c(1970, 1970, 1961, 
1961), date_ct = structure(c(16651, 17842, 16651, 18535), class = "Date"), 
    legs_mass = c(9.1, NA, NA, NA), total_mass = c(14.5, NA, 
    NA, NA)), row.names = c(NA, -4L), class = c("tbl_df", "tbl", 
"data.frame"))

# Which gives the following dataframe
df1

# A tibble: 4 x 6
     ID time  year_of_birth date_ct    legs_mass total_mass
  <dbl> <fct>         <dbl> <date>         <dbl>      <dbl>
1   115 T0             1970 2015-08-04       9.1       14.5
2   115 FU1            1970 2018-11-07      NA         NA  
3   370 T0             1961 2015-08-04      NA         NA  
4   370 FU1            1961 2020-09-30      NA         NA

The second dataset supplies the missing values for the legs_mass and total_mass columns:

df2 <- structure(list(ID = c(115, 370), date_ct = structure(c(17842, 
18535), class = "Date"), ctscan_label = c("PXE115_CT_20181107_xxxxx-3.tif", 
"PXE370_CT_20200930_xxxxx-403.tif"), legs_mass = c(956.1, 21.3
), total_mass = c(1015.9, 21.3)), row.names = c(NA, -2L), class = c("tbl_df", 
"tbl", "data.frame"))

# Which gives the following dataframe:
df2

# A tibble: 2 x 5
     ID date_ct    ctscan_label                     legs_mass total_mass
  <dbl> <date>     <chr>                                <dbl>      <dbl>
1   115 2018-11-07 PXE115_CT_20181107_xxxxx-3.tif       956.      1016. 
2   370 2020-09-30 PXE370_CT_20200930_xxxxx-403.tif      21.3       21.3

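What I am trying to do is:

Add the legs_mass and total_mass column values from df2 to df1, matched on ID and date_ct.

Add the new column from df2 (the one not in df1: ctscan_label) to df1, again matched on the CT date and the patient ID.
The final dataset df3 should then look like this: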

df3 <- structure(list(ID = c(115, 115, 370, 370), time = structure(c(1L, 
6L, 1L, 6L), .Label = c("T0", "T1M0", "T1M6", "T1M12", "T2M0", 
"FU1"), class = "factor"), year_of_birth = c(1970, 1970, 1961, 
1961), date_ct = structure(c(16651, 17842, 16651, 18535), class = "Date"), 
    legs_mass = c(9.1, 956.1, NA, 21.3), total_mass = c(14.5, 
    1015.9, NA, 21.3)), row.names = c(NA, -4L), class = c("tbl_df", 
"tbl", "data.frame"))

# Corresponding to the following tibble:
# A tibble: 4 x 6
     ID time  year_of_birth date_ct    legs_mass total_mass
  <dbl> <fct>         <dbl> <date>         <dbl>      <dbl>
1   115 T0             1970 2015-08-04       9.1       14.5
2   115 FU1            1970 2018-11-07     956.      1016. 
3   370 T0             1961 2015-08-04      NA         NA  
4   370 FU1            1961 2020-09-30      21.3       21.3

I have tried merge and rbind from base R, and bind_rows from dplyr, but I can't seem to get it right.
Any help?

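Best answer

You can join the two datasets on both keys and then use coalesce to keep the first non-NA value from each pair of overlapping columns, so a value already present in df1 takes precedence over the one from df2.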

library(dplyr)

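# Join on both keys; overlapping columns come back as *.x (df1) and *.y (df2)
left_join(df1, df2, by = c("ID", "date_ct")) %>%
  # coalesce() keeps the df1 value and falls back to df2 where df1 is NA
  mutate(legs_mass = coalesce(legs_mass.x, legs_mass.y),
         total_mass = coalesce(total_mass.x, total_mass.y)) %>%
  # Drop the suffixed helper columns; ctscan_label is dropped to match df3
  select(-matches("\\.(x|y)$"), -ctscan_label)

#     ID time  year_of_birth date_ct    legs_mass total_mass
#  <dbl> <fct>         <dbl> <date>         <dbl>      <dbl>
#1   115 T0             1970 2015-08-04       9.1       14.5
#2   115 FU1            1970 2018-11-07     956.      1016. 
#3   370 T0             1961 2015-08-04      NA         NA  
#4   370 FU1            1961 2020-09-30      21.3       21.3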
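
The answer above drops ctscan_label so that the result matches df3, although the question also asks for that column to be carried over. The following is a minimal sketch of two variants, not part of the original answer: one keeps ctscan_label after the join, the other uses dplyr's rows_patch() (assuming dplyr 1.0.0 or later), which fills only NA values in the first table. The data is re-created compactly with tibble() from the values printed in the question, and the names df3_with_label and df3_patched are hypothetical.

```r
library(dplyr)

# Compact re-creation of the question's data (same values as the dput output)
df1 <- tibble(
  ID = c(115, 115, 370, 370),
  time = factor(c("T0", "FU1", "T0", "FU1"),
                levels = c("T0", "T1M0", "T1M6", "T1M12", "T2M0", "FU1")),
  year_of_birth = c(1970, 1970, 1961, 1961),
  date_ct = as.Date(c("2015-08-04", "2018-11-07", "2015-08-04", "2020-09-30")),
  legs_mass = c(9.1, NA, NA, NA),
  total_mass = c(14.5, NA, NA, NA)
)

df2 <- tibble(
  ID = c(115, 370),
  date_ct = as.Date(c("2018-11-07", "2020-09-30")),
  ctscan_label = c("PXE115_CT_20181107_xxxxx-3.tif",
                   "PXE370_CT_20200930_xxxxx-403.tif"),
  legs_mass = c(956.1, 21.3),
  total_mass = c(1015.9, 21.3)
)

# Variant 1: same join + coalesce, but ctscan_label is kept.
# Rows without a match in df2 (the T0 scans) get NA in ctscan_label.
df3_with_label <- left_join(df1, df2, by = c("ID", "date_ct")) %>%
  mutate(legs_mass = coalesce(legs_mass.x, legs_mass.y),
         total_mass = coalesce(total_mass.x, total_mass.y)) %>%
  select(-ends_with(".x"), -ends_with(".y"))

# Variant 2: rows_patch() overwrites only NA values in df1.
# Every column of the second table must exist in df1, so ctscan_label
# is removed first; this variant therefore cannot add new columns.
df3_patched <- rows_patch(df1, select(df2, -ctscan_label),
                          by = c("ID", "date_ct"))

df3_with_label
df3_patched
```

Variant 1 is the closer fit when the scan label is needed downstream; variant 2 is shorter when only the existing columns have to be filled in.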

Regarding "r - Merging two data frames: specifically merge a selection of columns based on two conditions?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/68300310/
