
python - FLOPs in a TensorFlow convolutional layer


I want to know the number of floating-point operations (FLOPs) in a TensorFlow convolutional layer.

While waiting for this functionality to be released for TF 2.x, I tried it on TF 1.x. The results leave me unable to work out how the count is computed, and one of them is wildly off (see Q3).

I have the following code:

import tensorflow as tf
from tensorflow.keras.layers import InputLayer, Conv2D, Flatten, Dense

tf.reset_default_graph()
model = tf.keras.models.Sequential([
    InputLayer((32, 32, 1)),
    # Conv2D(1, 5, padding='same'),
    # Flatten(),
    # Dense(1, activation='softmax')
])

model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')

# Profile the default graph for floating-point operations.
opts = tf.profiler.ProfileOptionBuilder.float_operation()
profile = tf.profiler.profile(tf.get_default_graph(), tf.RunMetadata(), cmd='op', options=opts)
print(profile.total_float_ops)

The full gist is here:

https://colab.research.google.com/gist/eduardo4jesus/6721ec992c402bcdc834ab2edbc1b2b4/tf1-flops.ipynb
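(Side note, since the question mentions waiting for TF 2.x: a similar count can reportedly be obtained there by profiling the graph of a traced concrete function through the compat API. A minimal sketch; wrapping the model in tf.function and the batch-1 TensorSpec are my assumptions, not part of the original setup:)

    import tensorflow as tf

    model = tf.keras.models.Sequential([
        tf.keras.layers.InputLayer((32, 32, 1)),
        tf.keras.layers.Conv2D(1, 5, padding='same'),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(1, activation='softmax'),
    ])

    # Trace the forward pass into a concrete graph, then profile that graph.
    concrete = tf.function(model).get_concrete_function(
        tf.TensorSpec([1, 32, 32, 1], tf.float32))
    opts = tf.compat.v1.profiler.ProfileOptionBuilder.float_operation()
    info = tf.compat.v1.profiler.profile(graph=concrete.graph, cmd='op', options=opts)
    print(info.total_float_ops)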

How should the results below be interpreted?

  1. If I run the code above, with only the InputLayer uncommented, the FLOPs output is 2.

Q1: Why 2?

  2. If I run the following code, the output is 2050:

     model = tf.keras.models.Sequential([
         InputLayer((32, 32, 1)),
         Flatten(),
         Dense(1, activation='softmax')
     ])

     Q2: Why 2050?? I was expecting 1026: 1024 plus those unexplained 2. The 1024 would come from the weights of the dense layer, since with a single neuron there is one parameter per input feature, hence 1024. Again, why double? (Backpropagation??)

  3. The most interesting and most important one. If I run the following code, the output is 2101:

     model = tf.keras.models.Sequential([
         InputLayer((32, 32, 1)),
         Conv2D(1, 5, padding='same'),
         Flatten(),
         Dense(1, activation='softmax')
     ])

     Q3: Why 2101?? I was expecting 2050 + 1024*5*5, which is far greater than 2101. The convolution layer alone should cost N*N*K*K operations, where N=32 and K=5 (see the sketch below). How can the whole model take fewer FLOPs than that one layer, given that the convolution produces an output of the same shape as its input? What kind of crazy optimization does it have?
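For reference, this is the naive count behind Q2 and Q3 (my own back-of-the-envelope arithmetic, not profiler output):

    N, K = 32, 5

    # Q2 expectation: one multiply per dense weight (1024 inputs -> 1 neuron).
    dense_weights = N * N * 1
    print(dense_weights)   # 1024

    # Q3 expectation: one K*K dot product per output pixel ('same' padding,
    # one input channel, one output channel).
    conv_mults = N * N * K * K
    print(conv_mults)      # 25600 -- already far more than the reported 2101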

[Update]

When I print the profile, these are the nodes contributing to total_float_ops. Most of them (see below) relate to the variable initializers, not to the model computation itself.

    name: "_TFProfRoot"
    total_float_ops: 2101
    children {
    name: "Mul"
    float_ops: 1050
    total_float_ops: 2101
    graph_nodes {
    name: "conv2d/kernel/Initializer/random_uniform/mul"
    float_ops: 25
    total_float_ops: 25
    input_shapes {
    key: 0
    value {
    dim {
    size: 5
    }
    dim {
    size: 5
    }
    dim {
    size: 1
    }
    dim {
    size: 1
    }
    }
    }
    input_shapes {
    key: 1
    value {
    dim {
    size: 1
    }
    }
    }
    total_definition_count: 1
    }
    graph_nodes {
    name: "dense/kernel/Initializer/random_uniform/mul"
    float_ops: 1024
    total_float_ops: 1024
    input_shapes {
    key: 0
    value {
    dim {
    size: 1024
    }
    dim {
    size: 1
    }
    }
    }
    input_shapes {
    key: 1
    value {
    dim {
    size: 1
    }
    }
    }
    total_definition_count: 1
    }
    graph_nodes {
    name: "loss/dense_loss/weighted_loss/Mul"
    input_shapes {
    key: 0
    value {
    dim {
    size: -1
    }
    }
    }
    input_shapes {
    key: 1
    value {
    dim {
    size: -1
    }
    }
    }
    total_definition_count: 1
    }
    graph_nodes {
    name: "loss/dense_loss/weighted_loss/broadcast_weights"
    input_shapes {
    key: 0
    value {
    dim {
    size: 1
    }
    }
    }
    input_shapes {
    key: 1
    value {
    dim {
    size: -1
    }
    }
    }
    total_definition_count: 1
    }
    graph_nodes {
    name: "loss/mul"
    float_ops: 1
    total_float_ops: 1
    input_shapes {
    key: 0
    value {
    dim {
    size: 1
    }
    }
    }
    input_shapes {
    key: 1
    value {
    dim {
    size: 1
    }
    }
    }
    total_definition_count: 1
    }
    children {
    name: "Add"
    float_ops: 1049
    total_float_ops: 1051
    graph_nodes {
    name: "conv2d/kernel/Initializer/random_uniform"
    float_ops: 25
    total_float_ops: 25
    input_shapes {
    key: 0
    value {
    dim {
    size: 5
    }
    dim {
    size: 5
    }
    dim {
    size: 1
    }
    dim {
    size: 1
    }
    }
    }
    input_shapes {
    key: 1
    value {
    dim {
    size: 1
    }
    }
    }
    total_definition_count: 1
    }
    graph_nodes {
    name: "dense/kernel/Initializer/random_uniform"
    float_ops: 1024
    total_float_ops: 1024
    input_shapes {
    key: 0
    value {
    dim {
    size: 1024
    }
    dim {
    size: 1
    }
    }
    }
    input_shapes {
    key: 1
    value {
    dim {
    size: 1
    }
    }
    }
    total_definition_count: 1
    }
    children {
    name: "Sub"
    float_ops: 2
    total_float_ops: 2
    graph_nodes {
    name: "conv2d/kernel/Initializer/random_uniform/sub"
    float_ops: 1
    total_float_ops: 1
    input_shapes {
    key: 0
    value {
    dim {
    size: 1
    }
    }
    }
    input_shapes {
    key: 1
    value {
    dim {
    size: 1
    }
    }
    }
    total_definition_count: 1
    }
    graph_nodes {
    name: "dense/kernel/Initializer/random_uniform/sub"
    float_ops: 1
    total_float_ops: 1
    input_shapes {
    key: 0
    value {
    dim {
    size: 1
    }
    }
    }
    input_shapes {
    key: 1
    value {
    dim {
    size: 1
    }
    }
    }
    total_definition_count: 1
    }
    }
    }
    }
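Since nearly all of these counts come from the initializers, one workaround may be to hide them via the profiler's name-regex options. A minimal sketch, assuming hide_name_regexes matches these scope names (float_operation() sets account_displayed_op_only, so hidden nodes should then not be accounted):

    opts = tf.profiler.ProfileOptionBuilder.float_operation()
    opts['hide_name_regexes'] = ['.*Initializer.*', 'loss.*']
    profile = tf.profiler.profile(tf.get_default_graph(), cmd='scope', options=opts)
    print(profile.total_float_ops)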

Best Answer

I think this API is experimental at best.

Q1. No idea where the 2 comes from.

Q2. As we saw, the 2 relates to the input. That leaves 2048. Your input is 32*32*1, which flattens to 1024. Your computation is xW + b, where x is [1024] and the corresponding W is [1024, 1]. The xW product accounts for 1024 multiplications and 1024 additions. The bias add seems to be ignored, because otherwise the total should be 2051: 2 + 1024 + 1024 + 1.
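Spelled out, the count the profiler appears to be doing for the dense layer (my restatement of the above, not profiler output):

    in_features = 32 * 32 * 1    # 1024 after Flatten
    mults = in_features          # one multiply per weight in xW
    adds = in_features           # one add to accumulate the products
    total = 2 + mults + adds
    print(total)                 # 2050 -- the bias add (+1) is apparently not counted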

Q3. I changed your filter size to 3 and got 21 flops, which is absurd. The count did not change between the CPU and GPU executors. My conclusion is that the convolution layer does not yield a credible number.

tf.keras.models.Sequential([
    InputLayer((32, 32, 1)),
    Conv2D(1, 3, padding='same'),
    Flatten(),
])  # => 21 ops



tf.keras.models.Sequential([
    InputLayer((32, 32, 1)),
    Conv2D(32, 3, padding='same'),
    Conv2D(1, 3, padding='same'),
    Flatten(),
])  # => 1.09K ops
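For comparison, a textbook multiply-add estimate (2*H*W*K*K*Cin*Cout for stride 1 and 'same' padding; my own arithmetic, not from the profiler) shows how far off the reported numbers are:

    def conv2d_flops(h, w, k, c_in, c_out):
        # Two ops (one multiply + one add) per kernel element per output pixel.
        return 2 * h * w * k * k * c_in * c_out

    print(conv2d_flops(32, 32, 3, 1, 1))    # 18432 expected vs. 21 reported
    print(conv2d_flops(32, 32, 3, 1, 32)
          + conv2d_flops(32, 32, 3, 32, 1)) # ~1.18M expected vs. 1.09K reported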

Regarding python - FLOPs in a TensorFlow convolutional layer, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/59460310/
