python - Why does my tanh activation function perform so badly?

Reposted by 行者123. Last updated: 2023-11-28 17:29:13

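I have two perceptron algorithms that are identical except for the activation function. One uses the step function 1 if u >= 0 else -1; the other uses the tanh function np.tanh(u).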

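I expected tanh to outperform the step function, but it actually performs terribly by comparison. Am I doing something wrong here, or is there a reason it underperforms on this problem set?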

import numpy as np
import matplotlib.pyplot as plt

# generate 20 two-dimensional training points per class (40 in total)
# data must be linearly separable

# C1: u = (0,0) / E = [1 0; 0 1]; C2: u = (4,0), E = [1 0; 0 1] where u, E represent centre & covariance matrix of the
# Gaussian distribution respectively


def step(u):
    return 1 if u >= 0 else -1


def sigmoid(u):
    return np.tanh(u)

c1mean = [0, 0]
c2mean = [4, 0]
c1cov = [[1, 0], [0, 1]]
c2cov = [[1, 0], [0, 1]]
x = np.ones((40, 3))
w = np.zeros(3)     # [0, 0, 0]
w2 = np.zeros(3)    # second set of weights to see how another classifier compares
t = []  # target array

# +1 for the first 20 then -1
for i in range(0, 40):
    if i < 20:
        t.append(1)
    else:
        t.append(-1)

x1, y1 = np.random.multivariate_normal(c1mean, c1cov, 20).T
x2, y2 = np.random.multivariate_normal(c2mean, c2cov, 20).T

# concatenate x1 & x2 within the first dimension of x and the same for y1 & y2 in the second dimension
for i in range(len(x)):
    if i >= 20:
        x[i, 0] = x2[(i-20)]
        x[i, 1] = y2[(i-20)]
    else:
        x[i, 0] = x1[i]
        x[i, 1] = y1[i]

errors = []
errors2 = []
lr = 0.0001
n = 10

for i in range(n):
    count = 0
    for row in x:
        dot = np.dot(w, row)
        response = step(dot)
        errors.append(t[count] - response)
        w += lr * (row * (t[count] - response))
        count += 1

for i in range(n):
    count = 0
    for row in x:
        dot = np.dot(w2, row)
        response = sigmoid(dot)
        errors2.append(t[count] - response)
        w2 += lr * (row * (t[count] - response))
        count += 1

print(errors[-1], errors2[-1])

# distribution
plt.figure(1)
plt.plot((-(w[2]/w[0]), 0), (0, -(w[2]/w[1])))
plt.plot(x1, y1, 'x')
plt.plot(x2, y2, 'ro')
plt.axis('equal')
plt.title('Heaviside')

# training error
plt.figure(2)
plt.ylabel('error')
plt.xlabel('iterations')
plt.plot(errors)
plt.title('Heaviside Error')

plt.figure(3)
plt.plot((-(w2[2]/w2[0]), 0), (0, -(w2[2]/w2[1])))
plt.plot(x1, y1, 'x')
plt.plot(x2, y2, 'ro')
plt.axis('equal')
plt.title('Sigmoidal')

plt.figure(4)
plt.ylabel('error')
plt.xlabel('iterations')
plt.plot(errors2)
plt.title('Sigmoidal Error')

plt.show()

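Edit: Even in the error plots I have shown, the tanh function exhibits some convergence, so it is reasonable to assume that simply increasing the number of iterations or reducing the learning rate would reduce its error. However, I guess what I am really asking is: given the significantly better performance of the step function, for which problem sets is it ever viable to use tanh with a perceptron?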

Best answer

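As mentioned in the comments, your learning rate is too small, so it takes a very large number of iterations to converge. To get comparable output, you can therefore increase n and/or lr.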
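One way to read this: the recorded error t - response is exactly 0 for the step function as soon as the sign of the weighted sum is right, whereas for tanh it only approaches 0 as the weighted sum grows in magnitude, which takes many updates when lr is tiny. A minimal sketch of that difference follows; the sample values of u are made up for illustration and do not come from the question's data.

```python
import numpy as np

def step(u):
    return 1 if u >= 0 else -1

t = 1  # target label for a sample of the first class
# Illustrative pre-activation values, all already on the correct side of 0
for u in (0.01, 0.5, 2.0, 5.0):
    step_err = t - step(u)       # 0 as soon as the sign matches the target
    tanh_err = t - np.tanh(u)    # shrinks only as u moves away from 0
    print(f"u={u:<4}  step error={step_err}  tanh error={tanh_err:.4f}")
```

With weights that start at zero and move by steps of order lr = 0.0001, the weighted sum stays near 0 after only 10 passes, so the tanh error stays close to 1 in magnitude even when the sign is already correct.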

If you increase lr to, for example, 0.1 (1 also works fine) and n to 10000, the results look almost identical (see the plots below), and the line

print(errors[-1], errors2[-1])

returns

(0, -8.4289020207961585e-11)

If you run it again, these values may differ, because no seed is set for the random number generator.
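To make runs repeatable, the random generator can be seeded before sampling. A minimal sketch, assuming NumPy's default_rng Generator interface; the seed value 0 and the helper name sample_classes are arbitrary choices for illustration and are not part of the original code.

```python
import numpy as np

def sample_classes(seed):
    """Draw 20 points per class from the two Gaussians described in the question."""
    rng = np.random.default_rng(seed)
    cov = [[1, 0], [0, 1]]
    c1 = rng.multivariate_normal([0, 0], cov, 20)
    c2 = rng.multivariate_normal([4, 0], cov, 20)
    return c1, c2

a1, a2 = sample_classes(0)
b1, b2 = sample_classes(0)
# Same seed, same data: both comparisons hold
print(np.array_equal(a1, b1) and np.array_equal(a2, b2))
```

With the legacy interface used in the question, calling np.random.seed(0) before the two np.random.multivariate_normal calls serves the same purpose.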

These are the plots I got for the values above (plots not reproduced here):
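The comparison in this answer can also be checked numerically rather than by eye. The following is a rough sketch, not the answer's original script: the helpers train and last_pass_mae, the seed, and the use of 1000 passes (instead of the 10000 mentioned above, to keep the run short) are assumptions made for illustration. It applies the question's update rule to both activations at both settings and prints the mean absolute error over the last pass through the data.

```python
import numpy as np

def step(u):
    return 1.0 if u >= 0 else -1.0

def train(x, t, activation, lr, epochs):
    """Online training with the question's update rule; returns weights and per-sample errors."""
    w = np.zeros(x.shape[1])
    errors = []
    for _ in range(epochs):
        for row, target in zip(x, t):
            err = target - activation(np.dot(w, row))
            errors.append(err)
            w += lr * row * err
    return w, np.array(errors)

def last_pass_mae(errors, n_samples):
    """Mean absolute error over the final pass through the data."""
    return float(np.mean(np.abs(errors[-n_samples:])))

rng = np.random.default_rng(0)  # arbitrary seed, only for repeatability
cov = [[1, 0], [0, 1]]
pts = np.vstack([rng.multivariate_normal([0, 0], cov, 20),
                 rng.multivariate_normal([4, 0], cov, 20)])
x = np.hstack([pts, np.ones((40, 1))])   # append the constant bias input
t = np.array([1.0] * 20 + [-1.0] * 20)   # +1 for the first class, -1 for the second

for lr, epochs in ((0.0001, 10), (0.1, 1000)):
    _, e_step = train(x, t, step, lr, epochs)
    _, e_tanh = train(x, t, np.tanh, lr, epochs)
    print(f"lr={lr}, n={epochs}: step MAE={last_pass_mae(e_step, 40):.4f}, "
          f"tanh MAE={last_pass_mae(e_tanh, 40):.4f}")
```

The exact numbers depend on the sampled data. The point of the sketch is only that the tanh error at the small setting reflects how little the weights have moved, not a flaw in the activation itself.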

Source: a similar question on Stack Overflow, https://stackoverflow.com/questions/35803431/
