
Rolling join on data.table in R

Reposted. Author: 行者123. Updated: 2023-12-03 06:54:18

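I am trying to understand better how rolling joins work, and I am somewhat confused; I hope someone can clear this up for me. To take a concrete example: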

library(data.table)
# Both tables are keyed on (id, t); t is the column that gets rolled.
dt1 <- data.table(id=rep(1:5, 10), t=1:50, val1=1:50, key="id,t")
dt2 <- data.table(id=rep(1:5, 2), t=1:10, val2=1:10, key="id,t")

I expected this to produce a long data.table in which the values from dt2 are rolled:

dt1[dt2,roll=TRUE]

Instead, the right way to do it seems to be:

dt2[dt1,roll=TRUE]

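Could someone explain to me how joins in data.table work, because I am clearly not understanding them correctly? I thought dt1[dt2,roll=TRUE] corresponded to the SQL select * from dt1 right join dt2 on (dt1.id = dt2.id and dt1.t = dt2.t), except with the added LOCF functionality.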

In addition, the documentation says:

X[Y] is a join, looking up X's rows using Y (or Y's key if it has one) 
as an index.

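This makes it sound as if only things in X should be returned, and that the join being done is an inner join, not an outer one. What about the case where roll=T but that particular id does not exist in dt1? After playing around a bit more, I can't work out what value is being placed in that column.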

Best answer

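That quote from the documentation appears to come from FAQ 1.12, "What is the difference between X[Y] and merge(X,Y)?". Did you find the following in ?data.table, and does it help?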

roll Applies to the last join column, generally a date but can be any ordered variable, irregular and including gaps. If roll=TRUE and i's row matches to all but the last x join column, and its value in the last i join column falls in a gap (including after the last observation in x for that group), then the prevailing value in x is rolled forward. This operation is particularly fast using a modified binary search. The operation is also known as last observation carried forward (LOCF). Usually, there should be no duplicates in x's key, the last key column is a date (or time, or datetime) and all the columns of x's key are joined to. A common idiom is to select a contemporaneous regular time series (dts) across a set of identifiers (ids): DT[CJ(ids,dts),roll=TRUE] where DT has a 2-column key (id,date) and CJ stands for cross join.
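
As a rough illustration of the roll description above, here is a minimal sketch on made-up data. The tables x and i and their values are hypothetical, and the on= argument assumes a reasonably recent data.table release. The result has one row per row of i; a row of i with no earlier observation in x for its id gets NA.

```r
library(data.table)

# Hypothetical sparse observations: id 1 at t = 1 and 5, id 2 at t = 3.
x <- data.table(id = c(1L, 1L, 2L), t = c(1L, 5L, 3L), val = c(10, 50, 30))

# Hypothetical lookup rows, some falling before, between, and after x's observations.
i <- data.table(id = c(1L, 1L, 1L, 2L, 2L), t = c(0L, 3L, 9L, 2L, 4L))

# For each row of i, find the matching id in x and roll the prevailing val forward on t.
res <- x[i, roll = TRUE, on = c("id", "t")]
print(res)
```

Here the rows (id 1, t 0) and (id 2, t 2) come before any observation for their id, so nothing can be carried forward and val is NA, while (id 1, t 9) lies after the last observation and still receives the last value.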

rolltolast Like roll but the data is not rolled forward past the last observation within each group defined by the join columns. The value of i must fall in a gap in x but not after the end of the data, for that group defined by all but the last join column. roll and rolltolast may not both be TRUE.
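
A hedged note on rolltolast: as far as I know, later data.table releases deprecated this argument in favor of rollends, so the sketch below uses roll=TRUE with rollends=FALSE as an assumed equivalent. The tables x and i are hypothetical toy data, not taken from the question.

```r
library(data.table)

# Hypothetical sparse observations: id 1 at t = 1 and 5, id 2 at t = 3.
x <- data.table(id = c(1L, 1L, 2L), t = c(1L, 5L, 3L), val = c(10, 50, 30))
i <- data.table(id = c(1L, 1L, 1L, 2L, 2L), t = c(0L, 3L, 9L, 2L, 4L))

# Default roll: the last observation is carried past the end of each group.
past_end <- x[i, roll = TRUE, on = c("id", "t")]

# Assumed rolltolast-like behavior: values roll only inside gaps, not past the end.
in_gaps_only <- x[i, roll = TRUE, rollends = FALSE, on = c("id", "t")]

print(past_end)
print(in_gaps_only)
```

The only row that differs is (id 1, t 9), which lies after the last observation for id 1: it gets the carried value in past_end and NA in in_gaps_only.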

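As for the left/right analogy to SQL joins, I prefer to think about that in the context of FAQ 2.14, "Can you explain further why data.table is inspired by A[B] syntax in base?". That is quite a long answer, so I won't paste it here.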
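
To connect this to the tables in the question, here is a small sketch of what the two join directions return. It rebuilds dt1 and dt2 with the same values, writing the key as a character vector; the names short and long are only for illustration. X[Y] looks up each row of Y in X, so the result has one row per row of Y.

```r
library(data.table)

dt1 <- data.table(id = rep(1:5, 10), t = 1:50, val1 = 1:50, key = c("id", "t"))
dt2 <- data.table(id = rep(1:5, 2), t = 1:10, val2 = 1:10, key = c("id", "t"))

# Looks up dt2's 10 rows in dt1: every (id, t) matches exactly, so nothing rolls.
short <- dt1[dt2, roll = TRUE]

# Looks up dt1's 50 rows in dt2: val2 is carried forward once dt2 runs out of rows per id.
long <- dt2[dt1, roll = TRUE]

print(nrow(short))
print(nrow(long))
print(long[id == 1L])
```

For id 1, dt2 only has observations at t = 1 and t = 6, so every later t from dt1 picks up the val2 observed at t = 6.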

The original question, "Rolling join on data.table in R", is on Stack Overflow: https://stackoverflow.com/questions/12030932/
