How to effectively code MATCH INDEX in R (equivalent to Excel)?

Reposted by: bug小助手 | Updated: 2023-10-24 17:54:25

I have two different Excel spreadsheets (an input file and an output file). Instead of using the INDEX and MATCH functions directly in Excel, I would like to use an R script to look up the values in the matrix in my input file and store them in the correct cells of my output file.

Both my input file and my output file consist of a matrix (columns and rows). However, the matrix in the output file is "transposed", and the names of the columns and rows may be arranged in a different order. Hence, I need a two-dimensional lookup to find the values in the input file and store them in my output file.

Suppose this is my input file (fictitious numbers for illustration):

Suppose this is my output file:

How can I implement the lookup in R so that the values from my input file are entered in the corresponding cells of the output file? I've stored both Excel files as data frames.

Your help is highly valuable. Thank you!


#clearing workspace
rm(list=ls())

# Load required libraries
library(readxl)    # provides read_xlsx()
library(openxlsx)  # provides write.xlsx()

# get username 
username <- Sys.getenv("USER")

# Load input and output Excel files
input_file <- paste0("/Users/", username, "/Downloads/input_file.xlsx", collapse = "")
output_file <- paste0("/Users/", username, "/Desktop/output_file.xlsx", collapse = "")

# Load the input and output matrices
input_matrix <- read_xlsx(input_file, sheet = "KLICKHERE")
output_matrix <- read_xlsx(output_file, sheet = "ENTERHERE")
class(input_matrix)

# Transpose the dataframe
transposed_input_matrix <- t(input_matrix)

# Convert the column names to Date objects
# (strptime codes are %Y, %m, %d; "%YYYY/%mm/%dd" is not a valid format)
colnames(output_matrix) <- as.Date(colnames(output_matrix), format = "%Y/%m/%d")

# Function to perform the two-dimensional lookup
lookup_and_update <- function(transposed_input_matrix, output_matrix) {
  for (i in 1:nrow(output_matrix)) {
    for (j in 1:ncol(output_matrix)) {
      # Get the row and column names in the output matrix
      row_name <- rownames(output_matrix)[i]
      col_name <- colnames(output_matrix)[j]
      
      # Find the corresponding value in the input matrix
      value <- transposed_input_matrix[row_name, col_name]
      
      # Update the value in the output matrix
      output_matrix[i, j] <- value
    }
  }
  return(output_matrix)
}

# Call the lookup function
updated_output_matrix <- lookup_and_update(transposed_input_matrix, output_matrix)

# Save the updated output matrix back to the output Excel file
write.xlsx(updated_output_matrix, output_file, sheetName = "ENTERHERE")

> dput(input_matrix)
structure(list(quarter = structure(c(1640995200, 1648771200, 
1656633600, 1664582400, 1672531200, 1680307200, 1688169600, 1696118400, 
1704067200, 1711929600, 1719792000, 1727740800, 1735689600, 1743465600, 
1751328000, 1759276800, 1767225600, 1775001600, 1782864000, 1790812800, 
1798761600, 1806537600, 1814400000, 1822348800), class = c("POSIXct", 
"POSIXt"), tzone = "UTC"), portugal = c(3.2, 1.2617029893181, 
2.60440314593473, 0.205747170894448, 2.99742239259666, 0.454981287908458, 
0.812500920203167, 3.53979030628357, 2.203045423758, 0.054471200265702, 
2.92803826928382, 0.718964340034683, 1.60951470750129, 5.07871970749977, 
5.69403126006479, 1.22925310502368, 3.66396581660635, 2.37878419177338, 
2.29467033332622, 5.03595630837856, 2.25374064291613, 1.69444882698869, 
4.16205429572283, 4.50132478373478), Switzerland = c(4, 2.38038947850657, 
5.47668679859636, 5.91361388434538, 4.77394394868853, 0.51390066344242, 
5.01921886848812, 2.50248783131655, 4.01832050488102, 5.41622706832583, 
5.30149956216031, 3.16778787833323, 2.199973116468, 5.01366343788224, 
4.29923192879718, 4.74615956273584, 1.28422990972834, 0.284477581237545, 
2.08538425170424, 0.463401565316672, 5.19591972413863, 1.48139690105528, 
3.72116283773825, 2.88215533537597), UK1 = c(3, 5.86873632407074, 
5.00564172969994, 4.53205722786764, 2.21527468771027, 4.01342647825025, 
5.38033314419433, 3.94260225784184, 3.32679878460482, 4.44258374317064, 
0.912140741259649, 3.31029041858673, 3.54577260155724, 5.47399328355281, 
2.87960737852272, 0.333399757849791, 1.68600300552304, 0.761656675816694, 
5.60117991518305, 2.41681043343095, 1.47930439097793, 1.96253624751877, 
2.04852072952451, 3.00458221738878), UK2 = c(-1, 3.35979319893751, 
3.41085866605616, 0.560088392935827, 5.13880709708747, 4.12321867925324, 
0.678575131657537, 5.05445686032681, 1.91810878862458, 1.3819304062605, 
0.80241487254838, 5.88840619656107, 1.4643177661779, 1.30971606465739, 
5.27065656469845, 2.59430512488464, 2.43626303990699, 2.6781401256743, 
2.92798363758913, 3.82250194049481, 3.53273150832144, 2.88313585242345, 
2.2629948322944, 1.45945340574197)), class = c("tbl_df", "tbl", 
"data.frame"), row.names = c(NA, -24L))
>

> dput(output_matrix)
structure(list(c("Portugal", "Switzerland", "UK"), c(NA, NA, 
NA), c(NA, NA, NA), c(NA, NA, NA), c(NA, NA, NA), c(NA, NA, NA
), c(NA, NA, NA), c(NA, NA, NA), c(NA, NA, NA)), class = c("tbl_df", 
"tbl", "data.frame"), row.names = c(NA, -3L), .Names = c(NA_character_, 
NA_character_, NA_character_, NA_character_, NA_character_, NA_character_, 
NA_character_, NA_character_, NA_character_))
>

Comments

Hi TFT, have you imported the Excel files as data frames?

Also, what is transposing your data? Post your code if it's R. It's much easier to keep things in the same state than to transpose the data back and forth.

Hi Mark. Thanks for your quick feedback. Yes, I've imported both my excel files as dataframes and I've transposed my input file. I've just posted my r code in my initial request. Thanks in advance for your help!

great! thanks for doing that! :-) now one more thing you can do that would be great - run dput(input_matrix), and add the results to your question as well

Just done it! :) Thanks for your help and patience!

Recommended answers

Steps:

Turn the quarter into a date (it was a datetime)

Make the data long, turning all of the country columns into their own rows, with the country names put into a column called "country"

Clean the "country" column - if it includes "UK", make it "UK", otherwise, make it title case

Make it wider again, using the quarter dates as column names and the values as the cell values. Because the UK row now has multiple values per quarter, we turn them into a single string using an anonymous function; collapse = "," joins the values with a comma between them.

Write it out to a csv file

Code:

pacman::p_load(tidyverse)

input_matrix |>
  mutate(quarter = as.Date(quarter)) |>
  pivot_longer(-quarter, names_to = "country", values_to = "value") |>
  mutate(country = ifelse(str_detect(country, "UK"), "UK", str_to_title(country))) |>
  pivot_wider(names_from = "quarter", values_from = "value", values_fn = ~paste0(.x, collapse = ",")) |>
  write_csv("output.csv")

Output:

country,2022-01-01,2022-04-01,2022-07-01,2022-10-01,2023-01-01,2023-04-01,2023-07-01,2023-10-01,2024-01-01,2024-04-01,2024-07-01,2024-10-01,2025-01-01,2025-04-01,2025-07-01,2025-10-01,2026-01-01,2026-04-01,2026-07-01,2026-10-01,2027-01-01,2027-04-01,2027-07-01,2027-10-01
Portugal,3.2,1.2617029893181,2.60440314593473,0.205747170894448,2.99742239259666,0.454981287908458,0.812500920203167,3.53979030628357,2.203045423758,0.054471200265702,2.92803826928382,0.718964340034683,1.60951470750129,5.07871970749977,5.69403126006479,1.22925310502368,3.66396581660635,2.37878419177338,2.29467033332622,5.03595630837856,2.25374064291613,1.69444882698869,4.16205429572283,4.50132478373478
Switzerland,4,2.38038947850657,5.47668679859636,5.91361388434538,4.77394394868853,0.51390066344242,5.01921886848812,2.50248783131655,4.01832050488102,5.41622706832583,5.30149956216031,3.16778787833323,2.199973116468,5.01366343788224,4.29923192879718,4.74615956273584,1.28422990972834,0.284477581237545,2.08538425170424,0.463401565316672,5.19591972413863,1.48139690105528,3.72116283773825,2.88215533537597
UK,"3,-1","5.86873632407074,3.35979319893751","5.00564172969994,3.41085866605616","4.53205722786764,0.560088392935827","2.21527468771027,5.13880709708747","4.01342647825025,4.12321867925324","5.38033314419433,0.678575131657537","3.94260225784184,5.05445686032681","3.32679878460482,1.91810878862458","4.44258374317064,1.3819304062605","0.912140741259649,0.80241487254838","3.31029041858673,5.88840619656107","3.54577260155724,1.4643177661779","5.47399328355281,1.30971606465739","2.87960737852272,5.27065656469845","0.333399757849791,2.59430512488464","1.68600300552304,2.43626303990699","0.761656675816694,2.6781401256743","5.60117991518305,2.92798363758913","2.41681043343095,3.82250194049481","1.47930439097793,3.53273150832144","1.96253624751877,2.88313585242345","2.04852072952451,2.2629948322944","3.00458221738878,1.45945340574197"
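
For readers who want the literal INDEX/MATCH pattern rather than a reshape, here is a minimal base R sketch. It uses toy data with hypothetical values, and the names input_df, out_countries and out_quarters are made up for illustration; it is not the code from the answer above. match() plays the role of MATCH, and indexing with the matched positions plays the role of INDEX.

```r
# Toy input: one row per quarter, one column per country (hypothetical values).
input_df <- data.frame(
  quarter     = as.Date(c("2022-01-01", "2022-04-01", "2022-07-01")),
  portugal    = c(3.2, 1.3, 2.6),
  Switzerland = c(4.0, 2.4, 5.5)
)

# Toy output layout: countries as rows, quarters as columns, both reordered.
out_countries <- c("Switzerland", "Portugal")
out_quarters  <- as.Date(c("2022-07-01", "2022-01-01"))

# MATCH: positions of the output labels within the input labels.
# tolower() makes the country match case-insensitive ("portugal" vs "Portugal").
col_idx <- match(tolower(out_countries), tolower(names(input_df)))
row_idx <- match(out_quarters, input_df$quarter)

# INDEX: pick the matched rows and columns in one step, then transpose so that
# countries become rows and quarters become columns.
lookup <- t(as.matrix(input_df[row_idx, col_idx]))
dimnames(lookup) <- list(out_countries, format(out_quarters))
lookup
```

match() returns NA for any label that is missing from the input, so it is worth checking anyNA(col_idx) and anyNA(row_idx) before indexing.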

Maybe you can just transpose your input_matrix into an output_matrix?

For example:

output_df <- data.frame(t(input_matrix))[-1, ]

with some adjustments:

colnames(output_df) <- input_matrix$quarter
output_df$time <- names(input_matrix)[-1]
output_df <- output_df[, c(ncol(output_df), 1:(ncol(output_df) - 1))]

PS: Any instructions on how to combine UK1 and UK2?
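
One caveat with calling t() on the whole data frame: because quarter is a datetime column, the data frame is converted to a character matrix before it is transposed, so the numbers come back as strings. A minimal sketch of one way around this, on toy data with hypothetical values (input_df here is a small stand-in, not the real input_matrix), is to transpose only the numeric columns:

```r
# Toy stand-in for input_matrix (hypothetical values, same column layout).
input_df <- data.frame(
  quarter  = as.POSIXct(c("2022-01-01", "2022-04-01"), tz = "UTC"),
  portugal = c(3.2, 1.3),
  UK1      = c(3.0, 5.9)
)

# Transpose only the numeric columns so the values are not coerced to character.
m <- t(as.matrix(input_df[, -1]))
colnames(m) <- format(as.Date(input_df$quarter))

# Put the former column names into a leading "time" column, as in the answer above.
output_df <- data.frame(
  time = rownames(m), m,
  check.names = FALSE, row.names = NULL, stringsAsFactors = FALSE
)
output_df
```

check.names = FALSE keeps the date strings as column names instead of letting data.frame() rewrite them into syntactic names.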

Comments

Hi Near Lin! Check my chat with the OP! :-)
