
python - Pandas DataFrame to Reticulate results in IndexError

Reposted. Author: 行者123. Updated: 2023-12-01 08:16:38

I have a function written in Python 3.7 that returns a Pandas DataFrame. Example:

import pandas

df = pandas.DataFrame({'foo':[1,2,3], 'bar':['one', 'two', 'three'], 'baz':['apple', 'banana', 'strawberry']})

def returnMyDF():
    return df

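This Python file might be named my_dataframe.py.

Then, in R, I use the Reticulate library together with the Tidyverse to bring the Pandas DataFrame into a tibble.

The code that does this lives in app.R and looks like this: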

library(tidyverse)
library(reticulate)

use_python("C:/ProgramData/Anaconda3", required = TRUE)
source_python("C:/the/path/to/my_dataframe.py")

df = returnMyDF()
glimpse(df)

This returns the following error:

Observations: 3
Error in py_call_impl(callable, dots$args, dots$keywords) :
  IndexError: index 3 is out of bounds for axis 0 with size 3
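
For reference, the wording of this IndexError is NumPy's standard message for a position past the end of an axis. The snippet below is a minimal illustration in plain Python, independent of Reticulate. It does not identify where the conversion goes wrong; it only shows what the message means for an axis of size 3, where the valid positions are 0 through 2. The helper name `read_position` is made up for this illustration.

```python
import numpy

# Axis 0 has size 3, so the valid positions are 0, 1 and 2.
values = numpy.array([1, 2, 3])

def read_position(array, position):
    """Return the element at `position`, or the IndexError text if it is out of range."""
    try:
        return array[position]
    except IndexError as error:
        return str(error)

last_value = read_position(values, 2)     # last valid position
error_message = read_position(values, 3)  # one past the end

print(last_value)
print(error_message)
```

So the error suggests that something asked for a fourth element (position 3) of a three-row object, which is consistent with an off-by-one somewhere in the DataFrame conversion, although that is only a guess from the message.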

Some facts: I found this issue on GitHub (https://github.com/rstudio/reticulate/issues/101), which I thought might be the solution. I updated to the latest version of Reticulate using devtools::install_github("rstudio/reticulate").

Session info:

sessionInfo()
R version 3.5.2 (2018-12-20)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] forcats_0.3.0 stringr_1.4.0 dplyr_0.7.8 purrr_0.3.0 readr_1.3.1 tidyr_0.8.2 tibble_2.0.1
[8] ggplot2_3.1.0 tidyverse_1.2.1 reticulate_1.10

loaded via a namespace (and not attached):
[1] Rcpp_1.0.0 cellranger_1.1.0 pillar_1.3.1 compiler_3.5.2 plyr_1.8.4 bindr_0.1.1
[7] tools_3.5.2 lubridate_1.7.4 jsonlite_1.6 nlme_3.1-137 gtable_0.2.0 lattice_0.20-38
[13] pkgconfig_2.0.2 rlang_0.3.1 Matrix_1.2-15 cli_1.0.1 rstudioapi_0.9.0 yaml_2.2.0
[19] haven_2.0.0 bindrcpp_0.2.2 withr_2.1.2 xml2_1.2.0 httr_1.4.0 hms_0.4.2
[25] generics_0.0.2 grid_3.5.2 tidyselect_0.2.5 glue_1.3.0 R6_2.3.0 readxl_1.2.0
[31] modelr_0.1.3 magrittr_1.5 backports_1.1.3 scales_1.0.0 rvest_0.3.2 assertthat_0.2.0
[37] colorspace_1.4-0 stringi_1.2.4 lazyeval_0.2.1 munsell_0.5.0 broom_0.5.1 crayon_1.3.4

To check whether NumPy works, I can change my_dataframe.py (and adjust app.R accordingly) to return a NumPy array instead. This causes no problems:

import numpy

my_array = numpy.array([42, 2.38, 42])

def returnMyArray():
    return my_array

My question is: how do I bring a Pandas DataFrame into R as its R equivalent?

Best answer

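I'm mostly an R user and have only started to dabble in Python, but one possible solution is to write the Python code in R Markdown, where you can mix Python and R code in the same document. This is a good resource for getting started: https://cran.r-project.org/web/packages/reticulate/vignettes/r_markdown.html

If you are not familiar with R Markdown, I can provide more information about it.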

---
title: "test"
output: html_document
---

```{r setup, include=FALSE}

knitr::opts_chunk$set(echo = TRUE)
# reticulate provides the engine that runs the Python chunks
library(reticulate)

```



```{python}
# Python code: knitr knows this chunk is Python because of the
# chunk header above, "```{python}"

import pandas

df = pandas.DataFrame({'foo':[1,2,3], 'bar':['one', 'two', 'three'], 'baz':['apple', 'banana', 'strawberry']})

print(df)

```



```{r}
# R code
# To refer to Python objects from R code, type py$objectname
df2 <- py$df
class(df2)
# "data.frame": the R equivalent of pandas.DataFrame
```
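
Another possible workaround, sketched here as an assumption rather than something the answer above proposes: have the Python function return plain Python lists keyed by column name instead of the DataFrame itself. Reticulate converts a Python dict to a named R list, so on the R side the result could then be passed to something like `as_tibble()`. The function name `returnMyColumns` is hypothetical, and whether this avoids the IndexError in the asker's setup is untested.

```python
import pandas

df = pandas.DataFrame({
    'foo': [1, 2, 3],
    'bar': ['one', 'two', 'three'],
    'baz': ['apple', 'banana', 'strawberry'],
})

def returnMyColumns():
    # Hypothetical helper: turn each column into a plain Python list,
    # so no pandas or NumPy object has to be converted on the R side.
    return df.to_dict(orient='list')

columns = returnMyColumns()
print(columns)
```

The trade-off is that pandas-specific information such as the index and column dtypes is dropped, which is usually acceptable for a small table like this one.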

Regarding python - Pandas DataFrame to Reticulate results in IndexError, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/54951351/
