
c++ - for loop crashes in Rcpp

Reposted. Author: 行者123. Updated: 2023-11-28 04:54:11

I am trying to replicate the following code in Rcpp (the original pandas code comes from this link: https://engineering.upside.com/a-beginners-guide-to-optimizing-pandas-code-for-speed-c09ef2c6a4d6):

library(data.table)
library(microbenchmark)
deg2rad <- function(deg) {(deg * pi) / (180)}

haversine = function(lat1, lon1, lat2, lon2) {
  MILES = 3959
  lat1 = deg2rad(lat1)
  lon1 = deg2rad(lon1)
  lat2 = deg2rad(lat2)
  lon2 = deg2rad(lon2)
  dlat = lat2 - lat1
  dlon = lon2 - lon1
  a = sin(dlat/2)^2 + cos(lat1) * cos(lat2) * sin(dlon/2)^2
  c = 2 * asin(sqrt(a))
  total_miles = MILES * c
  return(total_miles)
}

# get data from here
download.file("https://raw.githubusercontent.com/sversh/pycon2017-optimizing-pandas/master/new_york_hotels.csv", "new_york_hotels.csv")
nyc_hotels = fread("new_york_hotels.csv", na.strings = c("NA", "N/A",
"NULL"))

summary(microbenchmark({
nyc_hotels[, greater_circle := haversine(40.671, -73.985, latitude,
longitude)]
},times=1000))[,-1]
# min lq mean median uq max neval
# 290.161 318.559 366.6786 329.491 345.0295 4365.697 1000
##########
#version 2 - invoke update differently, no change to function
summary(microbenchmark({
set(nyc_hotels,j="greater_circle",value=haversine(40.671, -73.985,
nyc_hotels[['latitude']], nyc_hotels[['longitude']]))
},times=1000))[,-1]
# min lq mean median uq max neval
# 81.395 89.5985 123.2211 96.1635 103.476 3670.193 1000

I created a file named haversine.cpp in my home directory with the following contents:

#include <Rcpp.h>
#include <iostream>
using namespace Rcpp;


// [[Rcpp::export]]
NumericVector haversine_cpp_fun(double lat1_cpp,double lon1_cpp,NumericVector lat2_cpp,NumericVector lon2_cpp){
double Miles = 3959.0;
int n = lat2_cpp.size();
NumericVector dlat_cpp;
NumericVector dlon_cpp;
NumericVector a_cpp;
NumericVector c_cpp;
NumericVector total_mile_cpp;
lat1_cpp = (lat1_cpp*3.14159)/180.0;
lon1_cpp = (lon1_cpp*3.14159)/180.0;
for (int i=0 ; i<n ; ++i){
lat2_cpp[i] = (lat2_cpp[i]*3.14159)/180.0;
lon2_cpp[i] = (lon2_cpp[i]*3.14159)/180.0;
dlat_cpp[i] = lat2_cpp[i] - lat1_cpp;
dlon_cpp[i] = lon2_cpp[i] - lon1_cpp;
a_cpp[i] = pow(sin(dlat_cpp[i]/2.0),2.0) + cos(lat1_cpp) * cos(lat2_cpp[i]) * pow(sin(dlon_cpp[i]/2.0),2.0);
c_cpp[i] = 2 * asin(sqrt(a_cpp[i]));
total_mile_cpp[i] = Miles * c_cpp[i];
}
return total_mile_cpp;

}
/***R
# Approach 1: Trying to use the set statement from data.table--- fails without giving error. The session just crashes
summary(microbenchmark({
set(nyc_hotels,j="greater_circle",value=haversine_cpp_fun(40.671, -73.985,
nyc_hotels[['latitude']], nyc_hotels[['longitude']]))
},times=1000))[,-1]
# Approach 2: Without using the set statement from data.table and doing thing in a simple way by a simple function call--- again fails without giving error. The R session just crashes again.
microbenchmark({
nyc_hotels[, greater_circle := haversine_cpp_fun(40.671, -73.985, latitude,
longitude)]
})
*/

and called it using sourceCpp:

 sourceCpp('./haversine.cpp')

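It seems to me that something is wrong with the for loop and that this is what causes the crash, but I cannot figure out what it is. I say this because when I did a dry run without the loop, using only the single element at index 0 of each vector, the Rcpp function ran. The only useful link I found was a case where the for loop was not written correctly (Rcpp function crashes), but I have tried everything it suggests and still cannot find the cause of the crash. Please help!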

Best answer

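Your session crashes because you create NumericVector objects of length zero and then try to assign values to them using the unchecked bracket ([i]) notation. If you initialize the NumericVectors with the correct length, your code runs (though I have not checked it for accuracy):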

#include <Rcpp.h>
#include <iostream>
using namespace Rcpp;


// [[Rcpp::export]]
NumericVector haversine_cpp_fun(double lat1_cpp, double lon1_cpp,
NumericVector lat2_cpp, NumericVector lon2_cpp){
double Miles = 3959.0;
int n = lat2_cpp.size();
NumericVector dlat_cpp(n);
NumericVector dlon_cpp(n);
NumericVector a_cpp(n);
NumericVector c_cpp(n);
NumericVector total_mile_cpp(n);
lat1_cpp = (lat1_cpp*3.14159)/180.0;
lon1_cpp = (lon1_cpp*3.14159)/180.0;
for (int i=0 ; i<n ; ++i){
lat2_cpp[i] = (lat2_cpp[i]*3.14159)/180.0;
lon2_cpp[i] = (lon2_cpp[i]*3.14159)/180.0;
dlat_cpp[i] = lat2_cpp[i] - lat1_cpp;
dlon_cpp[i] = lon2_cpp[i] - lon1_cpp;
a_cpp[i] = pow(sin(dlat_cpp[i]/2.0),2.0) + cos(lat1_cpp) * cos(lat2_cpp[i]) * pow(sin(dlon_cpp[i]/2.0),2.0);
c_cpp[i] = 2 * asin(sqrt(a_cpp[i]));
total_mile_cpp[i] = Miles * c_cpp[i];
}
return total_mile_cpp;
}
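
One thing worth checking in the function above is that it overwrites lat2_cpp and lon2_cpp inside the loop. Because an Rcpp NumericVector generally shares memory with the R vector passed in, this may convert the caller's latitude and longitude columns to radians as a side effect, which would matter when the call is repeated inside microbenchmark. The following is a minimal sketch of a variant that leaves its inputs untouched and allocates only the output vector. It is plain C++ with std::vector standing in for NumericVector so that it compiles without R; the name haversine_miles and the constant names are hypothetical, not part of the original answer.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for haversine_cpp_fun: std::vector replaces
// Rcpp::NumericVector so the sketch builds without R.
std::vector<double> haversine_miles(double lat1_deg, double lon1_deg,
                                    const std::vector<double>& lat2_deg,
                                    const std::vector<double>& lon2_deg) {
    // Both coordinate vectors must describe the same set of points.
    assert(lat2_deg.size() == lon2_deg.size());

    const double kPi = std::acos(-1.0);
    const double kDegToRad = kPi / 180.0;
    const double kEarthRadiusMiles = 3959.0;  // same constant as the original code

    const double lat1 = lat1_deg * kDegToRad;
    const double lon1 = lon1_deg * kDegToRad;

    // The output is sized up front; the inputs are only read, never written.
    std::vector<double> miles(lat2_deg.size());
    for (std::size_t i = 0; i < lat2_deg.size(); ++i) {
        const double lat2 = lat2_deg[i] * kDegToRad;
        const double lon2 = lon2_deg[i] * kDegToRad;
        const double s_lat = std::sin((lat2 - lat1) / 2.0);
        const double s_lon = std::sin((lon2 - lon1) / 2.0);
        const double a = s_lat * s_lat
                       + std::cos(lat1) * std::cos(lat2) * s_lon * s_lon;
        miles[i] = kEarthRadiusMiles * 2.0 * std::asin(std::sqrt(a));
    }
    return miles;
}
```

In an Rcpp version the same idea would amount to keeping the converted latitude and longitude in local double variables inside the loop instead of assigning back to lat2_cpp[i] and lon2_cpp[i].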

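A more general note: the bounds-checked .at(i) method makes your code fail more gracefully, because an out-of-range index is reported as an error instead of silently writing past the end of the vector.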
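
To illustrate the difference between checked and unchecked access, here is a small plain C++ sketch that uses std::vector as a stand-in for NumericVector (an assumption made so the example compiles without R). The helper checked_write and the function demo are hypothetical names. With std::vector, .at(i) throws std::out_of_range on a bad index; Rcpp vectors offer an analogous .at(i), and the expectation is that the failure then surfaces as an R error rather than a crashed session.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: writes `value` at index `i` using bounds-checked
// access and reports whether the write succeeded.
bool checked_write(std::vector<double>& v, std::size_t i, double value) {
    try {
        v.at(i) = value;  // throws std::out_of_range when i >= v.size()
        return true;
    } catch (const std::out_of_range&) {
        return false;
    }
}

// Small demonstration of the two cases discussed in the answer.
void demo() {
    std::vector<double> unsized;   // length 0, like `NumericVector dlat_cpp;`
    std::vector<double> sized(3);  // length 3, like `NumericVector dlat_cpp(n);`
    assert(!checked_write(unsized, 0, 1.0));  // detected, no undefined behavior
    assert(checked_write(sized, 0, 1.0));     // in range, write succeeds
}
```

By contrast, `unsized[0] = 1.0` on the zero-length vector would be undefined behavior, which matches the silent crash described in the question.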

Regarding "c++ - for loop crashes in Rcpp", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/47541680/
