
python - How to convert Python pandas to a Julia DataFrame (using PyJulia) and back to Python pandas

Reposted. Author: 行者123. Updated: 2023-12-02 19:15:46

I want to use PyJulia to speed up some parts of my code:

import numpy as np
import julia
import pandas as pd
import random
from julia import Base
from julia import Main
from julia import DataFrames

n = 100000
randomlist = []
for i in range(0, n):
    num = random.randint(1, 100)
    randomlist.append(num)

data = {
    'Score': list(randomlist),
    'ScoreBin': list(np.zeros(n))
}
df = pd.DataFrame(data, columns=['Score', 'ScoreBin'])
Main.dfj = df

Main.eval("""
for i = 1:10
    #println(i)
    if dfj.Score[i] >= 10
        println(dfj.Score[i])
    end
end
"""
)

However, I get the following error message:

JuliaError: Exception 'TypeError: non-boolean (PyObject) used in boolean context' occurred while calling julia code:

In addition, the following command:

Main.eval(""" 
println(dfj.Score[1])
"""
)

gives this output (which does not look like it comes from a Julia DataFrame):

PyObject 84

Is there a way to convert a pandas DataFrame to a Julia DataFrame?

Edit 1

Thanks to @PrzemyslawSzufel's answer, the code below now works:

import numpy as np
import julia
import pandas as pd
import random
import copy
from julia import Base
from julia import Main
from julia import DataFrames
from julia import Pandas
#julia.install(DataFrame)
%load_ext julia.magic

n = 100000
randomlist = []
for i in range(0, n):
    num = random.randint(1, 100)
    randomlist.append(num)

data = {
    'Score': list(randomlist),
    'ScoreBin': list(np.zeros(n))
}
df = pd.DataFrame(data, columns=['Score', 'ScoreBin'])
Main.df = df;

Main.eval("""
dfj = df |> Pandas.DataFrame |> DataFrames.DataFrame;
""")

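However, even though I put a ; at the end of that line, I always get an unwanted long printout of dfj (100000 rows), and it takes about a second. Is there a way to avoid the printout?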

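Also, if I now modify the data frame in Julia (which is much faster than modifying it in Python, and is the whole point of this question) and want to convert it back to Python pandas, I get an error as well: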

Main.eval("""
for i = 1:length(dfj[:, :Score])
    if dfj[i, :Score] > 50
        dfj[i, :ScoreBin] = 1
    end
end
"""
)

dfjpy = pd.DataFrame(Main.dfj)
dfjpy


RuntimeError: Julia exception: MethodError: no method matching iterate(::DataFrames.DataFrame)
Closest candidates are:
iterate(!Matched::Core.SimpleVector) at essentials.jl:568
iterate(!Matched::Core.SimpleVector, !Matched::Any) at essentials.jl:568
iterate(!Matched::ExponentialBackOff) at error.jl:199
...
Stacktrace:
[1] jlwrap_iterator(::DataFrames.DataFrame) at /Users/mymac/.julia/packages/PyCall/zqDXB/src/pyiterator.jl:144
[2] pyjlwrap_getiter(::Ptr{PyCall.PyObject_struct}) at /Users/mymac/.julia/packages/PyCall/zqDXB/src/pyiterator.jl:125

By the way, the command type(dfjpy) gives PyCall.jlwrap as output.

Edit 2

To convert a Julia DataFrame to Python pandas, you first have to convert it to a Julia Pandas DataFrame. Here is the latest working code:

n = 100000
randomlist = []
for i in range(0, n):
    num = random.randint(1, 100)
    randomlist.append(num)

data = {
    'Score': list(randomlist),
    'ScoreBin': list(np.zeros(n))
}
df = pd.DataFrame(data, columns=['Score', 'ScoreBin'])
Main.df = df;

Main.eval("""
dfj = df |> Pandas.DataFrame |> DataFrames.DataFrame;

for i = 1:length(dfj[:, :Score])
    if dfj[i, :Score] > 50
        dfj[i, :ScoreBin] = 1
    end
end

dfjp = dfj |> Pandas.DataFrame;
"""
)

dfjpy = Main.dfjp
dfjpy

Best answer

You need to install Pandas.jl. This library handles your Python pandas data frame on the Julia side, and you can then convert the result to a DataFrames.jl DataFrame.
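If the two Julia packages are not installed yet, one way to add them without leaving Python is through Julia's package manager. The following is only a sketch: it assumes `main` is the `Main` module obtained with `from julia import Main`, and the helper name `install_julia_packages` is made up for illustration. `Pkg.add` is Julia's standard package-installation call.

```python
def install_julia_packages(main, names=("Pandas", "DataFrames")):
    """Install Julia packages through Pkg.add, driven from Python.

    `main` is assumed to behave like PyJulia's `Main` module, that is,
    to expose an `eval(source)` method that runs Julia code.
    Returns the list of Julia statements that were evaluated.
    """
    # Load Julia's package manager first, then add each package by name.
    statements = ["import Pkg"]
    statements += [f'Pkg.add("{name}")' for name in names]
    for statement in statements:
        main.eval(statement)
    return statements
```

With PyJulia set up, `from julia import Main` followed by `install_julia_packages(Main)` would run the installs once; later sessions can skip this step.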

Here is the Julia code (assuming dfj is your Python variable):

import DataFrames
import Pandas
juliandf = dfj |> Pandas.DataFrame |> DataFrames.DataFrame;

Note that the last line can also be written as:

juliandf = DataFrames.DataFrame(Pandas.DataFrame(dfj));

To convert back, Pandas.DataFrame(juliandf) should work.
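Seen from the Python side, the whole round trip could be wrapped roughly as follows. This is a sketch, not a tested recipe: the helper name `roundtrip` is hypothetical, `main` is assumed to be PyJulia's `Main` module, and the variable names `df`, `dfj`, and `dfjp` follow the question. The trailing `nothing` rests on the assumption that `Main.eval` hands the value of the last Julia expression back to Python, so ending the block on `nothing` should avoid the long printout mentioned in the question.

```python
# Julia source for the conversion chain described in the answer.
# Assumption: ending on `nothing` keeps Main.eval from returning
# (and the notebook from displaying) the whole data frame.
ROUNDTRIP_JL = """
import DataFrames
import Pandas
dfj = df |> Pandas.DataFrame |> DataFrames.DataFrame
# ... modify dfj here with ordinary DataFrames.jl code ...
dfjp = dfj |> Pandas.DataFrame
nothing
"""


def roundtrip(df, main, julia_source=ROUNDTRIP_JL):
    """Send `df` to Julia, run the conversion there, and fetch the result.

    `main` is assumed to behave like `Main` from `from julia import Main`:
    attribute assignment defines a Julia global, `eval` runs Julia code,
    and attribute access reads a Julia global back.
    """
    main.df = df              # expose the pandas frame to Julia as `df`
    main.eval(julia_source)   # pandas -> Pandas.jl -> DataFrames.jl -> Pandas.jl
    return main.dfjp          # Pandas.jl wrapper handed back to Python
```

A call would then look like `dfjpy = roundtrip(df, Main)`. Passing `main` in explicitly keeps the helper importable on machines without Julia and makes it easy to substitute a stub when testing.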

Regarding python - How to convert Python pandas to a Julia DataFrame (using PyJulia) and back to Python pandas, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/63731550/
