
python - Building a regressor with a non-numeric target that is converted to a numeric value internally

Reposted. Author: 行者123. Updated: 2023-12-04 10:09:49

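I want to build a linear regression model that accepts timestamps as the target and internally uses the number of seconds since 1970-01-01 00:00:00 (pd.Timestamp(0)). predict should return timestamps.

I tried to do this with TransformedTargetRegressor. However, I run into TypeError: invalid type promotion, which I cannot resolve.

Demo code: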

import pandas as pd
import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import FunctionTransformer

# helper function to convert a 2D numpy array of seconds to a 2D array of timestamps
def _to_timestamp(seconds: np.ndarray):
    return pd.DataFrame(seconds).apply(pd.to_datetime, unit='s').values

# helper function to convert a 2D numpy array of timestamps to a 2D array of seconds
def _to_float(timestamps):
    deltas = pd.DataFrame(timestamps).sub(pd.Timestamp(0))
    return deltas.apply(lambda s: s.dt.total_seconds()).values

# build transformer from helper functions
TimeTransformer = FunctionTransformer(
    func=_to_float,
    inverse_func=_to_timestamp,
    validate=True,
    check_inverse=True
)

# make a LinearRegression chained with a TimeTransformer
def TimeTargetLinearRegression():
    return TransformedTargetRegressor(
        regressor=LinearRegression(),
        transformer=TimeTransformer
    )

# test run
if __name__ == '__main__':
    model = TimeTargetLinearRegression()
    X = np.array([[1], [2], [3]], dtype=float)
    y = pd.date_range(start=0, periods=3, freq='s')
    model.fit(X=X, y=y)  # raises TypeError

Output:
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2019.3.3\plugins\python-ce\helpers\pydev\_pydev_bundle\pydev_umd.py", line 197, in runfile
    pydev_imports.execfile(filename, global_vars, local_vars) # execute the script
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2019.3.3\plugins\python-ce\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "C:/Users/actualpanda/.PyCharmCE2019.3/config/scratches/scratch2.py", line 36, in <module>
    model.fit(X=X, y=y) # raises TypeError
  File "C:\Users\actualpanda\.virtualenvs\SomeProject--3333Ox_\lib\site-packages\sklearn\compose\_target.py", line 185, in fit
    self._fit_transformer(y_2d)
  File "C:\Users\actualpanda\.virtualenvs\SomeProject--3333Ox_\lib\site-packages\sklearn\compose\_target.py", line 139, in _fit_transformer
    self.transformer_.fit(y)
  File "C:\Users\actualpanda\.virtualenvs\SomeProject--3333Ox_\lib\site-packages\sklearn\preprocessing\_function_transformer.py", line 125, in fit
    self._check_inverse_transform(X)
  File "C:\Users\actualpanda\.virtualenvs\SomeProject--3333Ox_\lib\site-packages\sklearn\preprocessing\_function_transformer.py", line 102, in _check_inverse_transform
    if not _allclose_dense_sparse(X[idx_selected], X_round_trip):
  File "C:\Users\actualpanda\.virtualenvs\SomeProject--3333Ox_\lib\site-packages\sklearn\utils\validation.py", line 1288, in _allclose_dense_sparse
    return np.allclose(x, y, rtol=rtol, atol=atol)
  File "<__array_function__ internals>", line 5, in allclose
  File "C:\Users\actualpanda\.virtualenvs\SomeProject--3333Ox_\lib\site-packages\numpy\core\numeric.py", line 2159, in allclose
    res = all(isclose(a, b, rtol=rtol, atol=atol, equal_nan=equal_nan))
  File "<__array_function__ internals>", line 5, in isclose
  File "C:\Users\actualpanda\.virtualenvs\SomeProject--3333Ox_\lib\site-packages\numpy\core\numeric.py", line 2254, in isclose
    dt = multiarray.result_type(y, 1.)
  File "<__array_function__ internals>", line 5, in result_type
TypeError: invalid type promotion

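I am looking for an answer that explains or resolves the TypeError and, if my approach is flawed, suggests a way to build a regressor that can handle non-numeric targets (given the transform and inverse-transform functions).

I know I could apply the transform and inverse transform outside the regressor, but I want to encapsulate the process in one neat, user-friendly model that does not leak its internals.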

Best answer

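The output of running y through the transform function followed by the inverse transform function is compared against the original y.

This "round-trip" comparison happens when you set check_inverse=True. Both arrays are then passed to np.isclose (via np.allclose), and that call raises the error.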

y = pd.date_range(start=0, periods=3, freq='s')
y_ = TimeTransformer.inverse_func(TimeTransformer.func(y))

np.isclose(y, y_)
# raises:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-68-90ab2804af58> in <module>
----> 1 np.isclose(y, y_)

<__array_function__ internals> in isclose(*args, **kwargs)

C:\Anaconda3\lib\site-packages\numpy\core\numeric.py in isclose(a, b, rtol, atol, equal_nan)
2264 # This will cause casting of x later. Also, make sure to allow subclasses
2265 # (e.g., for numpy.ma).
-> 2266 dt = multiarray.result_type(y, 1.)
2267 y = array(y, dtype=dt, copy=False, subok=True)
2268

<__array_function__ internals> in result_type(*args, **kwargs)

TypeError: invalid type promotion
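
One way to make such a round-trip check well defined is to compare in the numeric domain rather than on datetime64 values. The following is a minimal numpy-only sketch, not scikit-learn's implementation; the helper names to_seconds, to_timestamps and round_trip_ok are hypothetical.

```python
import numpy as np

EPOCH = np.datetime64(0, "s")  # 1970-01-01T00:00:00

def to_seconds(timestamps: np.ndarray) -> np.ndarray:
    """Convert a datetime64 array to float seconds since the epoch."""
    return (timestamps - EPOCH) / np.timedelta64(1, "s")

def to_timestamps(seconds: np.ndarray) -> np.ndarray:
    """Convert float seconds since the epoch to a datetime64[ns] array."""
    nanos = np.round(np.asarray(seconds, dtype=float) * 1e9).astype("int64")
    return EPOCH.astype("datetime64[ns]") + nanos.astype("timedelta64[ns]")

def round_trip_ok(timestamps: np.ndarray) -> bool:
    """Check transform -> inverse -> transform in the float domain."""
    seconds = to_seconds(timestamps)
    # np.allclose only ever sees float arrays here, so no promotion error.
    return bool(np.allclose(seconds, to_seconds(to_timestamps(seconds))))

y = np.array(
    ["1970-01-01T00:00:00", "1970-01-01T00:00:01", "1970-01-01T00:00:02"],
    dtype="datetime64[ns]",
)
print(round_trip_ok(y))
```

Note that float64 seconds cannot represent present-day timestamps to the nanosecond (the spacing near 1.7e9 seconds is roughly 2.4e-7 s), so a tolerance-based comparison on seconds is more forgiving than exact equality on the datetime64 values.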

The actual error comes from the result_type C function, which determines the result dtype of an operation and is used to allocate the new array.
y2 = np.array(y)
y_ == y2
# returns:
array([[ True],
       [ True],
       [ True]])

np.isclose(y2, y_)
# raises: TypeError: invalid type promotion

np.core.multiarray.result_type(y_, 1.)
# raises: TypeError: invalid type promotion

My guess is that this function is not implemented for the np.datetime64 dtype.
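
The guess can be probed at the dtype level without involving scikit-learn. This is a small sketch, assuming only numpy; the helper promotes_with_float is a hypothetical name. It asks whether a dtype has a common type with float64, and shows that an integer view of the same data does.

```python
import numpy as np

datetimes = np.array(
    ["1970-01-01T00:00:00", "1970-01-01T00:00:01"], dtype="datetime64[ns]"
)

def promotes_with_float(arr: np.ndarray) -> bool:
    """Return True if arr's dtype has a common dtype with float64."""
    try:
        np.result_type(arr.dtype, np.float64)
    except TypeError:
        # Raised when numpy cannot find a common dtype.
        return False
    return True

print(promotes_with_float(datetimes))                  # datetime64 vs float64
print(promotes_with_float(datetimes.astype("int64")))  # nanoseconds as integers
print(np.array_equal(datetimes, datetimes.copy()))     # exact equality still works
```

Exact equality needs no promotion to float, which is why y_ == y2 succeeds while np.isclose fails.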

I opened an issue on the GitHub page.
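
For the encapsulation the question asks about, one possible direction is a small wrapper that converts targets to seconds before fitting and converts predictions back afterwards. The sketch below is an assumption-laden illustration using only numpy least squares; TimestampLinearRegression is a hypothetical class, not part of scikit-learn, and it does not implement the scikit-learn estimator interface.

```python
import numpy as np

class TimestampLinearRegression:
    """Hypothetical helper: least-squares fit with datetime64 targets."""

    _EPOCH = np.datetime64(0, "ns")
    _ONE_SECOND = np.timedelta64(1, "s")

    @staticmethod
    def _design(X):
        # Append a column of ones so the fit includes an intercept.
        X = np.asarray(X, dtype=float)
        return np.column_stack([X, np.ones(len(X))])

    def fit(self, X, y):
        # Targets become float seconds since the epoch before fitting.
        stamps = np.asarray(y, dtype="datetime64[ns]")
        seconds = (stamps - self._EPOCH) / self._ONE_SECOND
        self.coef_, *_ = np.linalg.lstsq(self._design(X), seconds, rcond=None)
        return self

    def predict(self, X):
        # Predict in seconds, then convert back to datetime64[ns].
        seconds = self._design(X) @ self.coef_
        nanos = np.round(seconds * 1e9).astype("int64")
        return self._EPOCH + nanos.astype("timedelta64[ns]")

X = np.array([[1.0], [2.0], [3.0]])
y = np.array(
    ["1970-01-01T00:00:00", "1970-01-01T00:00:01", "1970-01-01T00:00:02"],
    dtype="datetime64[ns]",
)
model = TimestampLinearRegression().fit(X, y)
print(model.predict(np.array([[4.0]])))
```

The caller only ever sees timestamps, and no datetime64 array is passed to a tolerance-based comparison.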

Regarding python - Building a regressor with a non-numeric target that is converted to a numeric value internally, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/61386425/
