gpt4 book ai didi

dataframe - Julia:将 DataFrame 传递给函数会创建一个指向 DataFrame 的指针吗?

转载 作者:行者123 更新时间:2023-12-04 10:50:01 27 4
gpt4 key购买 nike

我有一个函数,可以在其中规范化 DataFrame 的前 N ​​列。我想返回规范化的 DataFrame,但不要管原来的。然而,该函数似乎也会对传递的 DataFrame 进行变异!

using DataFrames

function normalize(input_df::DataFrame, cols::Array{Int})
norm_df = input_df
for i in cols
norm_df[i] = (input_df[i] - minimum(input_df[i])) /
(maximum(input_df[i]) - minimum(input_df[i]))
end
norm_df
end

using RDatasets
iris = dataset("datasets", "iris")
println("original df:\n", head(iris))

norm_df = normalize(iris, [1:4]);
println("should be the same:\n", head(iris))

输出:
original df:
6x5 DataFrame
| Row | SepalLength | SepalWidth | PetalLength | PetalWidth | Species |
|-----|-------------|------------|-------------|------------|----------|
| 1 | 5.1 | 3.5 | 1.4 | 0.2 | "setosa" |
| 2 | 4.9 | 3.0 | 1.4 | 0.2 | "setosa" |
| 3 | 4.7 | 3.2 | 1.3 | 0.2 | "setosa" |
| 4 | 4.6 | 3.1 | 1.5 | 0.2 | "setosa" |
| 5 | 5.0 | 3.6 | 1.4 | 0.2 | "setosa" |
| 6 | 5.4 | 3.9 | 1.7 | 0.4 | "setosa" |

should be the same:
6x5 DataFrame
| Row | SepalLength | SepalWidth | PetalLength | PetalWidth | Species |
|-----|-------------|------------|-------------|------------|----------|
| 1 | 0.222222 | 0.625 | 0.0677966 | 0.0416667 | "setosa" |
| 2 | 0.166667 | 0.416667 | 0.0677966 | 0.0416667 | "setosa" |
| 3 | 0.111111 | 0.5 | 0.0508475 | 0.0416667 | "setosa" |
| 4 | 0.0833333 | 0.458333 | 0.0847458 | 0.0416667 | "setosa" |
| 5 | 0.194444 | 0.666667 | 0.0677966 | 0.0416667 | "setosa" |
| 6 | 0.305556 | 0.791667 | 0.118644 | 0.125 | "setosa" |

最佳答案

Julia 使用一种称为“传递共享”的行为。从文档(强调我的):

Julia function arguments follow a convention sometimes called “pass-by-sharing”, which means that values are not copied when they are passed to functions. Function arguments themselves act as new variable bindings (new locations that can refer to values), but the values they refer to are identical to the passed values. Modifications to mutable values (such as Arrays) made within a function will be visible to the caller. This is the same behavior found in Scheme, most Lisps, Python, Ruby and Perl, among other dynamic languages.


在您的特定情况下,您似乎想要做的是为您的规范化操作创建一个全新且独立的 DataFrame。您可以使用 deepcopy 来做到这一点,例如
norm_df = deepcopy(input_df)
Julia 通常会要求你明确地做这些事情,因为创建一个大型数据帧的独立副本可能在计算上很昂贵,而且 Julia 是一种面向性能的语言。
再次从文档中,注意 copydeepcopy 之间的以下重要区别:

copy(x): Create a shallow copy of x: the outer structure is copied, but not all internal values. For example, copying an array produces a new array with identically-same elements as the original.

deepcopy(x): Create a deep copy of x: everything is copied recursively, resulting in a fully independent object. For example, deep-copying an array produces a new array whose elements are deep-copies of the original elements.

DataFrame 类型类似于数组,因此 deepcopy 是必要的。
一个相关的 SO 问题是 here

关于dataframe - Julia:将 DataFrame 传递给函数会创建一个指向 DataFrame 的指针吗?,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/28037348/

27 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com