
dataframe - Joining multiple data frames

Reposted. Author: 行者123. Updated: 2023-12-04 02:28:40

I would like to know whether Julia DataFrames has a way to join multiple data frames in one go:

using DataFrames

employer = DataFrame(
ID = Array{Int64}([01,02,03,04,05,09,11,20]),
name = Array{String}(["Matthews","Daniella", "Kofi", "Vladmir", "Jean", "James", "Ayo", "Bill"])
)

salary = DataFrame(
ID = Array{Int64}([01,02,03,04,05,06,08,23]),
amount = Array{Int64}([2050,3000,3500,3500,2500,3400,2700,4500])
)

hours = DataFrame(
ID = Array{Int64}([01,02,03,04,05,08,09,23]),
time = Array{Int64}([40,40,40,40,40,38,45,50])
)

# I tried passing them in an array, but of course that results in an error
empSalHrs = innerjoin([employer,salary,hours], on = :ID)

# In Python you can achieve this using
import pandas as pd
from functools import reduce

df = reduce(lambda l,r : pd.merge(l,r, on = "ID"), [employer, salary, hours])
Is there a similar way to do this in Julia?

Best answer

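You were almost there. As described in the DataFrames.jl manual, you only need to pass the data frames as separate positional arguments instead of wrapping them in an array.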

using DataFrames

employer = DataFrame(
ID = [01,02,03,04,05,09,11,20],
name = ["Matthews","Daniella", "Kofi", "Vladmir", "Jean", "James", "Ayo", "Bill"])


salary = DataFrame(
ID = [01,02,03,04,05,06,08,23],
amount = [2050,3000,3500,3500,2500,3400,2700,4500])


hours = DataFrame(
ID = [01,02,03,04,05,08,09,23],
time = [40,40,40,40,40,38,45,50]
)

empSalHrs = innerjoin(employer,salary,hours, on = :ID)
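An inner join keeps only the keys that occur in every table, so with the data above only IDs 1 through 5 should survive. A minimal sketch to check this, assuming DataFrames.jl is installed (the names common_ids and column_names are only illustrative):

```julia
using DataFrames

employer = DataFrame(ID = [1, 2, 3, 4, 5, 9, 11, 20],
                     name = ["Matthews", "Daniella", "Kofi", "Vladmir", "Jean", "James", "Ayo", "Bill"])
salary = DataFrame(ID = [1, 2, 3, 4, 5, 6, 8, 23],
                   amount = [2050, 3000, 3500, 3500, 2500, 3400, 2700, 4500])
hours = DataFrame(ID = [1, 2, 3, 4, 5, 8, 9, 23],
                  time = [40, 40, 40, 40, 40, 38, 45, 50])

empSalHrs = innerjoin(employer, salary, hours, on = :ID)

# Only IDs present in all three tables remain; sort so the check
# does not depend on the row order of the join result.
common_ids = sort(empSalHrs.ID)

# The key column comes first, followed by the non-key columns of each table.
column_names = names(empSalHrs)
```

If rows that lack a match in some table should be kept, outerjoin or leftjoin may be a better fit than innerjoin.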
If for some reason you need to keep the data frames in a Vector, you can use splatting to achieve the same result:
empSalHrs = innerjoin([employer,salary,hours]..., on = :ID)
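As a rough illustration of what the splat does, and of a closer analogue to the Python reduce pattern from the question, here is a small sketch. The tables a, b and c are hypothetical and exist only for this example; DataFrames.jl is assumed to be installed.

```julia
using DataFrames

# Small hypothetical tables, used only for illustration
a = DataFrame(ID = [1, 2, 3], x = ["p", "q", "r"])
b = DataFrame(ID = [2, 3, 4], y = [20, 30, 40])
c = DataFrame(ID = [3, 4, 5], z = [0.3, 0.4, 0.5])
dfs = [a, b, c]

# Splatting: `dfs...` expands the vector into separate positional arguments,
# so this call is equivalent to innerjoin(a, b, c, on = :ID)
via_splat = innerjoin(dfs..., on = :ID)

# Pairwise fold, mirroring functools.reduce + pd.merge from the question
via_reduce = reduce((l, r) -> innerjoin(l, r, on = :ID), dfs)

# Both approaches should give the same table (only ID 3 is in all three)
same_result = sort(via_splat, :ID) == sort(via_reduce, :ID)
```

The reduce form is mainly useful when each pairwise step needs different handling. For a plain join on one key, the splatted call is shorter.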
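Also, note that I slightly changed the definitions of the data frames. Since Array{Int} is not a concrete type (it leaves the number of dimensions unspecified), it should not be used in variable declarations, because that can hurt performance. In this particular case it probably does not matter, but it is better to build good habits from the start. Instead of Array{Int} one can use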
  • Array{Int, 1}([1, 2, 3, 4])
  • Vector{Int}([1, 2, 3, 4])
  • Int[1, 2, 3]
  • [1, 2, 3]

The last one is legitimate because in this simple scenario Julia can infer the container type by itself.
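The difference between these spellings can be checked directly in the REPL. The following sketch uses only Base Julia; the variable names are illustrative.

```julia
# Array{Int} leaves the number of dimensions open, so it is not a concrete type
array_int_is_concrete = isconcretetype(Array{Int})    # false
vector_int_is_concrete = isconcretetype(Vector{Int})  # true

# Vector{Int} is just an alias for Array{Int, 1}
alias_holds = Vector{Int} === Array{Int, 1}

# All four spellings from the list build the same concrete container type
v1 = Array{Int, 1}([1, 2, 3])
v2 = Vector{Int}([1, 2, 3])
v3 = Int[1, 2, 3]
v4 = [1, 2, 3]
all_same_type = typeof(v1) == typeof(v2) == typeof(v3) == typeof(v4) == Vector{Int}

# Calling Array{Int}(...) on a vector still returns a concrete Vector{Int} value,
# which is why the original definitions work without a performance penalty here
from_question = typeof(Array{Int}([1, 2, 3])) == Vector{Int}
```

The non-concrete type mostly matters when it appears in a declaration, such as a struct field or a container element type, where the compiler can no longer specialize on the exact array type.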

Regarding dataframe - joining multiple data frames, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/65649732/
