gpt4 book ai didi

dataframe - 在 Julia 中将 DataFrame 重新采样为每小时 15 分钟和 5 分钟

转载 作者:行者123 更新时间:2023-12-02 22:54:27 24 4
gpt4 key购买 nike

我对 Julia 还很陌生,但我正在尝试一下,因为基准测试声称它比 Python 快得多。

我正在尝试使用 ["unixtime", "price", "amount"] 格式的一些股票报价数据

我设法加载数据并将 unixtime 转换为 Julia 中的日期,但现在我需要重新采样数据以使用 olhc(开盘价、最高价、最低价、收盘价)作为价格和金额总和, Julia 的特定时段(每小时、15 分钟、5 分钟等...):

julia> head(btc_raw_data)
6x3 DataFrame:
date price amount
[1,] 2011-09-13T13:53:36 UTC 5.8 1.0
[2,] 2011-09-13T13:53:44 UTC 5.83 3.0
[3,] 2011-09-13T13:53:49 UTC 5.9 1.0
[4,] 2011-09-13T13:53:54 UTC 6.0 20.0
[5,] 2011-09-13T14:32:53 UTC 5.95 12.4521
[6,] 2011-09-13T14:35:04 UTC 5.88 7.458

我看到有一个名为 Resampling 的包,但它似乎不接受时间段,只接受我希望输出数据具有的行数。

还有其他选择吗?

最佳答案

您可以使用https://github.com/femtotrader/TimeSeriesIO.jl将DataFrame(来自DataFrames.jl)转换为TimeArray(来自TimeSeries.jl)

using TimeSeriesIO: TimeArray
ta = TimeArray(df, colnames=[:price], timestamp=:date)

您可以使用 TimeSeriesResampler https://github.com/femtotrader/TimeSeriesResampler.jl 重新采样时间序列(来自 TimeSeries.jl 的 TimeArray)和时间范围 https://github.com/femtotrader/TimeFrames.jl

using TimeSeriesResampler: resample, mean, ohlc, sum, TimeFrame

# Define a sample timeseries (prices for example)
idx = DateTime(2010,1,1):Dates.Minute(1):DateTime(2011,1,1)
idx = idx[1:end-1]
N = length(idx)
y = rand(-1.0:0.01:1.0, N)
y = 1000 + cumsum(y)
#df = DataFrame(Date=idx, y=y)
ta = TimeArray(collect(idx), y, ["y"])
println("ta=")
println(ta)

# Define how datetime should be grouped (timeframe)
tf = TimeFrame(dt -> floor(dt, Dates.Minute(15)))

# resample using OHLC values
ta_ohlc = ohlc(resample(ta, tf))
println("ta_ohlc=")
println(ta_ohlc)

# resample using mean values
ta_mean = mean(resample(ta, tf))
println("ta_mean=")
println(ta_mean)

# Define an other sample timeseries (volume for example)
vol = rand(0:0.01:1.0, N)
ta_vol = TimeArray(collect(idx), vol, ["vol"])
println("ta_vol=")
println(ta_vol)

# resample using sum values
ta_vol_sum = sum(resample(ta_vol, tf))
println("ta_vol_sum=")
println(ta_vol_sum)

你应该得到:

julia> ta
525600x1 TimeSeries.TimeArray{Float64,1,DateTime,Array{Float64,1}} 2010-01-01T00:00:00 to 2010-12-31T23:59:00

y
2010-01-01T00:00:00 | 1000.16
2010-01-01T00:01:00 | 1000.1
2010-01-01T00:02:00 | 1000.98
2010-01-01T00:03:00 | 1001.38

2010-12-31T23:56:00 | 972.3
2010-12-31T23:57:00 | 972.85
2010-12-31T23:58:00 | 973.74
2010-12-31T23:59:00 | 972.8


julia> ta_ohlc
35040x4 TimeSeries.TimeArray{Float64,2,DateTime,Array{Float64,2}} 2010-01-01T00:00:00 to 2010-12-31T23:45:00

Open High Low Close
2010-01-01T00:00:00 | 1000.16 1002.5 1000.1 1001.54
2010-01-01T00:15:00 | 1001.57 1002.64 999.38 999.38
2010-01-01T00:30:00 | 999.13 1000.91 998.91 1000.91
2010-01-01T00:45:00 | 1001.0 1006.42 1001.0 1006.42

2010-12-31T23:00:00 | 980.84 981.56 976.53 976.53
2010-12-31T23:15:00 | 975.74 977.46 974.71 975.31
2010-12-31T23:30:00 | 974.72 974.9 971.73 972.07
2010-12-31T23:45:00 | 972.33 973.74 971.49 972.8


julia> ta_mean
35040x1 TimeSeries.TimeArray{Float64,1,DateTime,Array{Float64,1}} 2010-01-01T00:00:00 to 2010-12-31T23:45:00

y
2010-01-01T00:00:00 | 1001.1047
2010-01-01T00:15:00 | 1001.686
2010-01-01T00:30:00 | 999.628
2010-01-01T00:45:00 | 1003.5267

2010-12-31T23:00:00 | 979.1773
2010-12-31T23:15:00 | 975.746
2010-12-31T23:30:00 | 973.482
2010-12-31T23:45:00 | 972.3427

julia> ta_vol
525600x1 TimeSeries.TimeArray{Float64,1,DateTime,Array{Float64,1}} 2010-01-01T00:00:00 to 2010-12-31T23:59:00

vol
2010-01-01T00:00:00 | 0.37
2010-01-01T00:01:00 | 0.67
2010-01-01T00:02:00 | 0.29
2010-01-01T00:03:00 | 0.28

2010-12-31T23:56:00 | 0.74
2010-12-31T23:57:00 | 0.66
2010-12-31T23:58:00 | 0.22
2010-12-31T23:59:00 | 0.47


julia> ta_vol_sum
35040x1 TimeSeries.TimeArray{Float64,1,DateTime,Array{Float64,1}} 2010-01-01T00:00:00 to 2010-12-31T23:45:00

vol
2010-01-01T00:00:00 | 7.13
2010-01-01T00:15:00 | 6.99
2010-01-01T00:30:00 | 8.73
2010-01-01T00:45:00 | 8.27

2010-12-31T23:00:00 | 6.11
2010-12-31T23:15:00 | 7.49
2010-12-31T23:30:00 | 5.75
2010-12-31T23:45:00 | 8.36

关于dataframe - 在 Julia 中将 DataFrame 重新采样为每小时 15 分钟和 5 分钟,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/20816813/

24 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com