- html - 出于某种原因,IE8 对我的 Sass 文件中继承的 html5 CSS 不友好?
- JMeter 在响应断言中使用 span 标签的问题
- html - 在 :hover and :active? 上具有不同效果的 CSS 动画
- html - 相对于居中的 html 内容固定的 CSS 重复背景?
谢谢大家试图帮助我理解下面的问题。我已经更新了问题并生成了 CPU-only 运行和 GPU-only 运行。通常,在任何一种情况下,直接 numpy
计算似乎都比 model. predict()
快数百倍。希望这可以澄清这似乎不是 CPU 与 GPU 问题(如果是,我希望得到解释)。
让我们用 keras 创建一个训练有素的模型。
import tensorflow as tf
(X,Y),(Xt,Yt) = tf.keras.datasets.mnist.load_data()
model = tf.keras.models.Sequential([
tf.keras.layers.Flatten(),
tf.keras.layers.Dense(1000,'relu'),
tf.keras.layers.Dense(100,'relu'),
tf.keras.layers.Dense(10,'softmax'),
])
model.compile('adam','sparse_categorical_crossentropy')
model.fit(X,Y,epochs=20,batch_size=1024)
现在让我们使用
model.predict
重新创建
numpy
函数。
import numpy as np
W = model.get_weights()
def predict(X):
X = X.reshape((X.shape[0],-1)) #Flatten
X = X @ W[0] + W[1] #Dense
X[X<0] = 0 #Relu
X = X @ W[2] + W[3] #Dense
X[X<0] = 0 #Relu
X = X @ W[4] + W[5] #Dense
X = np.exp(X)/np.exp(X).sum(1)[...,None] #Softmax
return X
我们可以很容易地验证这些是相同的功能(实现中的模块机器错误)。
print(model.predict(X[:100]).argmax(1))
print(predict(X[:100]).argmax(1))
我们还可以测试这些函数的运行速度。使用
ipython
:
%timeit model.predict(X[:10]).argmax(1) # 10 loops takes 37.7 ms
%timeit predict(X[:10]).argmax(1) # 1000 loops takes 356 µs
我知道
predict
在小批量下的运行速度比
model. predict
快
10,000 倍,而在大批量下,它的运行速度降低到
100 倍左右。无论如何,为什么
predict
这么快?事实上,
predict
甚至没有优化,我们可以使用
numba
,甚至直接在
predict
代码中重新编写
C
并编译它。
keras
内部执行的速度快数千倍?这也意味着编写脚本以利用
.h5
文件或类似文件,可能比手动重写预测函数要慢得多。一般来说,这是真的吗?
Python 3.8.5 (default, Sep 3 2020, 21:29:08) [MSC v.1916 64 bit (AMD64)]
Type 'copyright', 'credits' or 'license' for more information
IPython 7.19.0 -- An enhanced Interactive Python. Type '?' for help.
PyDev console: using IPython 7.19.0
Python 3.8.5 (default, Sep 3 2020, 21:29:08) [MSC v.1916 64 bit (AMD64)] on win32
import os
os.environ["CUDA_VISIBLE_DEVICES"]="-1"
import tensorflow as tf
(X,Y),(Xt,Yt) = tf.keras.datasets.mnist.load_data()
model = tf.keras.models.Sequential([
tf.keras.layers.Flatten(),
tf.keras.layers.Dense(1000,'relu'),
tf.keras.layers.Dense(100,'relu'),
tf.keras.layers.Dense(10,'softmax'),
])
model.compile('adam','sparse_categorical_crossentropy')
model.fit(X,Y,epochs=20,batch_size=1024)
2021-04-19 15:10:58.323137: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudart64_110.dll
2021-04-19 15:11:01.990590: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library nvcuda.dll
2021-04-19 15:11:02.039285: E tensorflow/stream_executor/cuda/cuda_driver.cc:328] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
2021-04-19 15:11:02.042553: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:169] retrieving CUDA diagnostic information for host: DESKTOP-G0U8S3P
2021-04-19 15:11:02.043134: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:176] hostname: DESKTOP-G0U8S3P
2021-04-19 15:11:02.128834: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:127] None of the MLIR optimization passes are enabled (registered 2)
Epoch 1/20
59/59 [==============================] - 4s 60ms/step - loss: 35.3708
Epoch 2/20
59/59 [==============================] - 3s 58ms/step - loss: 0.8671
Epoch 3/20
59/59 [==============================] - 3s 56ms/step - loss: 0.5641
Epoch 4/20
59/59 [==============================] - 3s 56ms/step - loss: 0.4359
Epoch 5/20
59/59 [==============================] - 3s 56ms/step - loss: 0.3447
Epoch 6/20
59/59 [==============================] - 3s 56ms/step - loss: 0.2891
Epoch 7/20
59/59 [==============================] - 3s 56ms/step - loss: 0.2371
Epoch 8/20
59/59 [==============================] - 3s 57ms/step - loss: 0.1977
Epoch 9/20
59/59 [==============================] - 3s 57ms/step - loss: 0.1713
Epoch 10/20
59/59 [==============================] - 3s 57ms/step - loss: 0.1381
Epoch 11/20
59/59 [==============================] - 4s 61ms/step - loss: 0.1203
Epoch 12/20
59/59 [==============================] - 3s 57ms/step - loss: 0.1095
Epoch 13/20
59/59 [==============================] - 3s 56ms/step - loss: 0.0877
Epoch 14/20
59/59 [==============================] - 3s 57ms/step - loss: 0.0793
Epoch 15/20
59/59 [==============================] - 3s 56ms/step - loss: 0.0727
Epoch 16/20
59/59 [==============================] - 3s 56ms/step - loss: 0.0702
Epoch 17/20
59/59 [==============================] - 3s 56ms/step - loss: 0.0701
Epoch 18/20
59/59 [==============================] - 3s 57ms/step - loss: 0.0631
Epoch 19/20
59/59 [==============================] - 3s 56ms/step - loss: 0.0539
Epoch 20/20
59/59 [==============================] - 3s 58ms/step - loss: 0.0493
Out[3]: <tensorflow.python.keras.callbacks.History at 0x143069fdf40>
import numpy as np
W = model.get_weights()
def predict(X):
X = X.reshape((X.shape[0],-1)) #Flatten
X = X @ W[0] + W[1] #Dense
X[X<0] = 0 #Relu
X = X @ W[2] + W[3] #Dense
X[X<0] = 0 #Relu
X = X @ W[4] + W[5] #Dense
X = np.exp(X)/np.exp(X).sum(1)[...,None] #Softmax
return X
%timeit model.predict(X[:10]).argmax(1) # 10 loops takes 37.7 ms
%timeit predict(X[:10]).argmax(1) # 1000 loops takes 356 µs
52.8 ms ± 2.13 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
640 µs ± 10.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
Ipython 输出(GPU):
Python 3.7.7 (default, Mar 26 2020, 15:48:22)
Type 'copyright', 'credits' or 'license' for more information
IPython 7.4.0 -- An enhanced Interactive Python. Type '?' for help.
In [1]: import tensorflow as tf
...:
...: (X,Y),(Xt,Yt) = tf.keras.datasets.mnist.load_data()
...:
...: model = tf.keras.models.Sequential([
...: tf.keras.layers.Flatten(),
...: tf.keras.layers.Dense(1000,'relu'),
...: tf.keras.layers.Dense(100,'relu'),
...: tf.keras.layers.Dense(10,'softmax'),
...: ])
...: model.compile('adam','sparse_categorical_crossentropy')
...: model.fit(X,Y,epochs=20,batch_size=1024)
2020-07-01 15:50:46.008518: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-07-01 15:50:46.054495: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: GeForce RTX 2080 Ti major: 7 minor: 5 memoryClockRate(GHz): 1.545
pciBusID: 0000:05:00.0
2020-07-01 15:50:46.059582: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-07-01 15:50:46.114562: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2020-07-01 15:50:46.142058: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2020-07-01 15:50:46.152899: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2020-07-01 15:50:46.217725: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2020-07-01 15:50:46.260758: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2020-07-01 15:50:46.374328: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-07-01 15:50:46.376747: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-07-01 15:50:46.377688: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX FMA
2020-07-01 15:50:46.433422: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 4018875000 Hz
2020-07-01 15:50:46.434383: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x563e4d0d71c0 executing computations on platform Host. Devices:
2020-07-01 15:50:46.435119: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): Host, Default Version
2020-07-01 15:50:46.596077: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x563e4a9379f0 executing computations on platform CUDA. Devices:
2020-07-01 15:50:46.596119: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): GeForce RTX 2080 Ti, Compute Capability 7.5
2020-07-01 15:50:46.597894: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: GeForce RTX 2080 Ti major: 7 minor: 5 memoryClockRate(GHz): 1.545
pciBusID: 0000:05:00.0
2020-07-01 15:50:46.597961: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-07-01 15:50:46.597988: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2020-07-01 15:50:46.598014: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2020-07-01 15:50:46.598040: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2020-07-01 15:50:46.598065: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2020-07-01 15:50:46.598090: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2020-07-01 15:50:46.598115: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-07-01 15:50:46.599766: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-07-01 15:50:46.600611: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-07-01 15:50:46.603713: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-07-01 15:50:46.603751: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165] 0
2020-07-01 15:50:46.603763: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0: N
2020-07-01 15:50:46.605917: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10311 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2080 Ti, pci bus id: 0000:05:00.0, compute capability: 7.5)
Train on 60000 samples
Epoch 1/20
2020-07-01 15:50:49.995091: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
60000/60000 [==============================] - 2s 26us/sample - loss: 9.9370
Epoch 2/20
60000/60000 [==============================] - 0s 4us/sample - loss: 0.6094
Epoch 3/20
60000/60000 [==============================] - 0s 4us/sample - loss: 0.3672
Epoch 4/20
60000/60000 [==============================] - 0s 4us/sample - loss: 0.2720
Epoch 5/20
60000/60000 [==============================] - 0s 4us/sample - loss: 0.2196
Epoch 6/20
60000/60000 [==============================] - 0s 4us/sample - loss: 0.1673
Epoch 7/20
60000/60000 [==============================] - 0s 4us/sample - loss: 0.1367
Epoch 8/20
60000/60000 [==============================] - 0s 4us/sample - loss: 0.1082
Epoch 9/20
60000/60000 [==============================] - 0s 4us/sample - loss: 0.0895
Epoch 10/20
60000/60000 [==============================] - 0s 4us/sample - loss: 0.0781
Epoch 11/20
60000/60000 [==============================] - 0s 4us/sample - loss: 0.0666
Epoch 12/20
60000/60000 [==============================] - 0s 4us/sample - loss: 0.0537
Epoch 13/20
60000/60000 [==============================] - 0s 4us/sample - loss: 0.0459
Epoch 14/20
60000/60000 [==============================] - 0s 4us/sample - loss: 0.0412
Epoch 15/20
60000/60000 [==============================] - 0s 4us/sample - loss: 0.0401
Epoch 16/20
60000/60000 [==============================] - 0s 4us/sample - loss: 0.0318
Epoch 17/20
60000/60000 [==============================] - 0s 4us/sample - loss: 0.0275
Epoch 18/20
60000/60000 [==============================] - 0s 4us/sample - loss: 0.0237
Epoch 19/20
60000/60000 [==============================] - 0s 4us/sample - loss: 0.0212
Epoch 20/20
60000/60000 [==============================] - 0s 4us/sample - loss: 0.0199
Out[1]: <tensorflow.python.keras.callbacks.History at 0x7f7c9000b550>
In [2]: import numpy as np
...:
...: W = model.get_weights()
...:
...: def predict(X):
...: X = X.reshape((X.shape[0],-1)) #Flatten
...: X = X @ W[0] + W[1] #Dense
...: X[X<0] = 0 #Relu
...: X = X @ W[2] + W[3] #Dense
...: X[X<0] = 0 #Relu
...: X = X @ W[4] + W[5] #Dense
...: X = np.exp(X)/np.exp(X).sum(1)[...,None] #Softmax
...: return X
...:
In [3]: print(model.predict(X[:100]).argmax(1))
...: print(predict(X[:100]).argmax(1))
[5 0 4 1 9 2 1 3 1 4 3 5 3 6 1 7 2 8 6 9 4 0 9 1 1 2 4 3 2 7 3 8 6 9 0 5 6
0 7 6 1 8 7 9 3 9 8 5 9 3 3 0 7 4 9 8 0 9 4 1 4 4 6 0 4 5 6 1 0 0 1 7 1 6
3 0 2 1 1 7 5 0 2 6 7 8 3 9 0 4 6 7 4 6 8 0 7 8 3 1]
/home/bobbyocean/anaconda3/bin/ipython3:12: RuntimeWarning: overflow encountered in exp
/home/bobbyocean/anaconda3/bin/ipython3:12: RuntimeWarning: invalid value encountered in true_divide
[5 0 4 1 9 2 1 3 1 4 3 5 3 6 1 7 2 8 6 9 4 0 9 1 1 2 4 3 2 7 3 8 6 9 0 5 6
0 7 6 1 8 7 9 3 9 8 5 9 3 3 0 7 4 9 8 0 9 4 1 4 4 6 0 4 5 6 1 0 0 1 7 1 6
3 0 2 1 1 7 5 0 2 6 7 8 3 9 0 4 6 7 4 6 8 0 7 8 3 1]
In [4]: %timeit model.predict(X[:10]).argmax(1)
37.7 ms ± 806 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [5]: %timeit predict(X[:10]).argmax(1)
361 µs ± 13.8 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
最佳答案
我们观察到 主要问题是 Eager Execution 的原因模式 .我们根据 对您的代码和相应的结果进行粗略的了解。中央处理器 和 显卡 基地。确实numpy
不适用于 显卡 , 所以不像 tf-gpu
,它不会遇到任何数据转移开销。
但它也是 很明显您定义的 predict
完成了多少快速计算使用 np
的方法比较 model. predict
与 tf. keras
,而输入测试集是 仅 10 个 sample .但是,我们没有做任何深入的分析,就像一件艺术品 here你可能喜欢阅读。
我的设置如下。我正在使用 Colab 环境和检查中央处理器 和 显卡 模式。
TensorFlow 1.15.2
Keras 2.3.1
Numpy 1.19.5
TensorFlow 2.4.1
Keras 2.4.0
Numpy 1.19.5
%tensorflow_version 1.x
import os
os.environ["CUDA_VISIBLE_DEVICES"]="-1"
import tensorflow as tf
from tensorflow.python.client import device_lib
print(tf.__version__)
print('A: ', tf.test.is_built_with_cuda)
print('B: ', tf.test.gpu_device_name())
local_device_protos = device_lib.list_local_devices()
([x.name for x in local_device_protos if x.device_type == 'GPU'],
[x.name for x in local_device_protos if x.device_type == 'CPU'])
TensorFlow 1.x selected.
1.15.2
A: <function is_built_with_cuda at 0x7f122d58dcb0>
B:
([], ['/device:CPU:0'])
现在,运行您的代码。
import tensorflow as tf
import keras
print(tf.executing_eagerly()) # False
(X,Y),(Xt,Yt) = keras.datasets.mnist.load_data()
model = keras.models.Sequential([])
model.compile
model.fit
%timeit model.predict(X[:10]).argmax(1) # yours: 10 loops takes 37.7 ms
%timeit predict(X[:10]).argmax(1) # yours: 1000 loops takes 356 µs
1000 loops, best of 5: 1.07 ms per loop
1000 loops, best of 5: 1.48 ms per loop
我们可以看到执行时间是
可比 与旧
keras
.现在,让我们用
进行测试显卡 还。
%tensorflow_version 1.x
import os
os.environ["CUDA_VISIBLE_DEVICES"]="0"
import tensorflow as tf
from tensorflow.python.client import device_lib
print(tf.__version__)
print('A: ', tf.test.is_built_with_cuda)
print('B: ', tf.test.gpu_device_name())
local_device_protos = device_lib.list_local_devices()
([x.name for x in local_device_protos if x.device_type == 'GPU'],
[x.name for x in local_device_protos if x.device_type == 'CPU'])
1.15.2
A: <function is_built_with_cuda at 0x7f0b5ad46830>
B: /device:GPU:0
(['/device:GPU:0'], ['/device:CPU:0'])
...
...
%timeit model.predict(X[:10]).argmax(1) # yours: 10 loops takes 37.7 ms
%timeit predict(X[:10]).argmax(1) # yours: 1000 loops takes 356 µs
1000 loops, best of 5: 1.02 ms per loop
1000 loops, best of 5: 1.44 ms per loop
现在,这里的执行时间也与旧的
keras
相当。并且没有渴望模式。现在让我们看看新的
tf. keras
首先使用 Eager 模式,然后我们在没有 Eager 模式的情况下进行观察。
import os
os.environ["CUDA_VISIBLE_DEVICES"]="-1"
import tensorflow as tf
from tensorflow.python.client import device_lib
print(tf.__version__)
print('A: ', tf.test.is_built_with_cuda)
print('B: ', tf.test.gpu_device_name())
local_device_protos = device_lib.list_local_devices()
([x.name for x in local_device_protos if x.device_type == 'GPU'],
[x.name for x in local_device_protos if x.device_type == 'CPU'])
2.4.1
A: <function is_built_with_cuda at 0x7fed85de3560>
B:
([], ['/device:CPU:0'])
现在,以 Eager 模式运行代码。
import tensorflow as tf
import keras
print(tf.executing_eagerly()) # True
(X,Y),(Xt,Yt) = keras.datasets.mnist.load_data()
model = keras.models.Sequential([ ])
model.compile
model.fit
%timeit model.predict(X[:10]).argmax(1) # yours: 10 loops takes 37.7 ms
%timeit predict(X[:10]).argmax(1) # yours: 1000 loops takes 356 µs
10 loops, best of 5: 28 ms per loop
1000 loops, best of 5: 1.73 ms per loop
急切禁用
import tensorflow as tf
import keras
# # Disables eager execution
tf.compat.v1.disable_eager_execution()
# or,
# Disables eager execution of tf.functions.
# tf.config.run_functions_eagerly(False)
print(tf.executing_eagerly())
False
(X,Y),(Xt,Yt) = keras.datasets.mnist.load_data()
model = keras.models.Sequential([])
model.compile
model.fit
%timeit model.predict(X[:10]).argmax(1) # yours: 10 loops takes 37.7 ms
%timeit predict(X[:10]).argmax(1) # yours: 1000 loops takes 356 µs
1000 loops, best of 5: 1.37 ms per loop
1000 loops, best of 5: 1.57 ms per loop
现在,我们可以看到执行时间是
可比 在新的
tf. keras
中禁用急切模式.现在,让我们用
进行测试显卡 模式也。
import os
os.environ["CUDA_VISIBLE_DEVICES"]="0"
import tensorflow as tf
from tensorflow.python.client import device_lib
print(tf.__version__)
print('A: ', tf.test.is_built_with_cuda)
print('B: ', tf.test.gpu_device_name())
local_device_protos = device_lib.list_local_devices()
([x.name for x in local_device_protos if x.device_type == 'GPU'],
[x.name for x in local_device_protos if x.device_type == 'CPU'])
2.4.1
A: <function is_built_with_cuda at 0x7f16ad88f680>
B: /device:GPU:0
(['/device:GPU:0'], ['/device:CPU:0'])
import tensorflow as tf
import keras
print(tf.executing_eagerly()) # True
(X,Y),(Xt,Yt) = keras.datasets.mnist.load_data()
model = keras.models.Sequential([ ])
model.compile
model.fit
%timeit model.predict(X[:10]).argmax(1) # yours: 10 loops takes 37.7 ms
%timeit predict(X[:10]).argmax(1) # yours: 1000 loops takes 356 µs
10 loops, best of 5: 26.3 ms per loop
1000 loops, best of 5: 1.48 ms per loop
急切禁用
# Disables eager execution
tf.compat.v1.disable_eager_execution()
# or,
# Disables eager execution of tf.functions.
# tf.config.run_functions_eagerly(False)
print(tf.executing_eagerly()) # False
(X,Y),(Xt,Yt) = keras.datasets.mnist.load_data()
model = keras.models.Sequential([ ])
model.compile
model.fit
%timeit model.predict(X[:10]).argmax(1) # yours: 10 loops takes 37.7 ms
%timeit predict(X[:10]).argmax(1) # yours: 1000 loops takes 356 µs
1000 loops, best of 5: 1.12 ms per loop
1000 loops, best of 5: 1.45 ms per loop
和以前一样,执行时间是
可比 在新的
tf. keras
中使用非急切模式.这就是为什么,
渴望模式是
tf. keras
性能变慢的根本原因比直
numpy
.
关于python - TF.Keras model.predict 比直接 Numpy 慢?,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/62681257/
如果我在 C 中调用一个函数并传入一个结构(对那些 C++ 读者来说不是通过指针或引用),它会复制该对象。如果我传入一个包含数组的结构,它会复制该数组(如教授在类里面所说)。但是,如果我传入一个包含对
在 vim 等中,您可以使用 CTRLA 和 CTRLX 增加或减少光标所在的数字。然而,这会增加总数,但我想简单地增加光标正下方的数字。这有点难以描述,所以这就是我的意思: Ctrl+A usage
我正在将 Spring 4.3.2 项目升级到 Spring 5.1.5。我的一个测试用例开始因错误而失败。 ClassNotFoundException: org.hibernate.propert
我想在 Java 中分配一个直接 IntBuffer,比如说 10 亿个元素(64 位系统)。我知道的唯一方法是创建一个直接 ByteBuffer 并将其视为直接 IntBuffer。但是,4*1,0
我正在寻找特定的打印机或某些打印机上存在的技术(接口(interface)、标准、协议(protocol)),这使得可以使用 AJAX 从 Web 浏览器实现直接打印。 这意味着打印机必须: 网络接口
我正在寻求实现删除确认表单的最佳实践建议。 除其他选项外,以下页面包含删除按钮... /website/features/f/123 ...当点击一个简单的表单时,会在以下 url 下加载: /web
我正在使用直接 Web 远程处理库在我的应用程序中执行一些 ajax 调用。我有一个问题,我认为归结为服务调用的延迟响应。以下是我认为有问题的部分代码。问题出在 getDefaultReviewerT
我想替换 Javascript confirm() 函数以允许自定义按钮而不是 Yes/Cancel。我尝试搜索,但所有解决方案都是事件驱动的,例如 jquery 对话框(代码不等待响应但它是事件驱动
我知道有几个类似的问题,但是,其中的示例并没有说明问题,或者我无法从中获利 - 我真可耻。 所以我的问题是在带有 GUI 的简单应用程序中加载图像。例如: 我在 "D:\javaeclipseprog
我想用不同的颜色为表格的行着色,所以我正在使用它 table#news tr:nth-child(even) { background-color: red; } table#news
下面的测试代码不起作用 from("direct:start").setExchangePattern(ExchangePattern.InOnly).threads(5).delay(2000).b
我在 python 中实现的第一个项目之一是对棒渗流进行蒙特卡罗模拟。代码不断增长。第一部分是棍子渗滤的可视化。在宽度*长度的区域中,使用随机起始坐标和方向绘制具有一定长度的直棒的定义密度(棒/面积)
跟踪直接文件下载的最佳方法是什么?我找到了一些解决方案,例如这个: http://www.gayadesign.com/diy/download-counter-in-php-using-htacce
我在一个线程中有一个直接的 ByteBuffer(堆外),并使用 JMM 给我的一种机制将它安全地发布到另一个线程。 happens-before 关系是否扩展到由 ByteBuffer 包装的 na
当我测试直接 java.nio.ByteBuffer 的读取性能时,我注意到绝对读取平均比相对读取快 2 倍。此外,如果我比较相对读取与绝对读取的源代码,除了相对读取维护和内部计数器外,代码几乎相同。
我知道这个问题已经被问了无数次,并且在很多情况下都得到了答案。我相信我已经阅读了其中的大部分内容。不幸的是,我在这上面能找到的一切 简单说明 ElementRef.nativeElement不好,不要
回到一些 C 语言工作。 我的许多函数看起来像这样: int err = do_something(arg1, arg2, arg3, &result); 根据意图,结果由函数填充,返回值是调用的状态
当我将 XML 提交到 https://secure-test.WorldPay.com/jsp/merchant/xml/paymentService.jsp 时: Personalised
就目前而言,这个问题不适合我们的问答形式。我们希望答案得到事实、引用或专业知识的支持,但这个问题可能会引起辩论、争论、投票或扩展讨论。如果您觉得这个问题可以改进并可能重新打开,visit the he
我的 Angular 路由行为有问题。刷新或输入的 url 像/user 总是将我重定向到/home。我还在 index.html 文件中设置了 。通过单击导航菜单按钮一切正常。但是一旦我尝试刷新页面
我是一名优秀的程序员,十分优秀!