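I have read a lot of papers on convolutional networks, but there is one thing I don't understand: how are the filters in a convolutional layer initialized?
For example, the filters in the first layer are supposed to detect edges and similar low-level structure.
But if they are initialized randomly, won't they be inaccurate? The same goes for the following layers and their higher-level features.
One more question: what is the range of the values in these filters?
Thank you very much!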
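Best answer
You can initialize the filters randomly, or you can pretrain them on some other dataset. Random values are only a starting point: training with backpropagation then adjusts the filters toward useful detectors.
Some references: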
http://deeplearning.net/tutorial/lenet.html:
Notice that a randomly initialized filter acts very much like an edge detector!
Note that we use the same weight initialization formula as with the MLP. Weights are sampled randomly from a uniform distribution in the range [-1/fan-in, 1/fan-in], where fan-in is the number of inputs to a hidden unit. For MLPs, this was the number of units in the layer below. For CNNs however, we have to take into account the number of input feature maps and the size of the receptive fields.
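As a rough illustration of the range quoted above, here is a minimal NumPy sketch that samples conv filters uniformly from [-1/fan-in, 1/fan-in]. The function name `init_conv_filters`, the `(out_channels, in_channels, height, width)` layout, and the example sizes are assumptions made for this sketch, not part of the tutorial. Other initialization schemes scale the bound differently.

```python
import numpy as np

def init_conv_filters(out_channels, in_channels, kernel_h, kernel_w, seed=0):
    """Sample filters uniformly from [-1/fan_in, 1/fan_in] (hypothetical helper)."""
    # For a conv unit, fan-in counts the input feature maps times the
    # receptive field size, as the quote describes.
    fan_in = in_channels * kernel_h * kernel_w
    bound = 1.0 / fan_in
    rng = np.random.default_rng(seed)
    filters = rng.uniform(-bound, bound,
                          size=(out_channels, in_channels, kernel_h, kernel_w))
    return filters, bound

# Example: 32 filters of size 5x5 over a single input channel.
filters, bound = init_conv_filters(32, 1, 5, 5)
print(filters.shape, bound)
```

With one input channel and a 5x5 receptive field, fan-in is 25, so every initial filter value lies within ±0.04. This also answers the question about the value range at initialization; after training, the values are whatever the optimizer converges to.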
Transfer Learning
In practice, very few people train an entire Convolutional Network from scratch (with random initialization), because it is relatively rare to have a dataset of sufficient size. Instead, it is common to pretrain a ConvNet on a very large dataset (e.g. ImageNet, which contains 1.2 million images with 1000 categories), and then use the ConvNet either as an initialization or a fixed feature extractor for the task of interest. The three major Transfer Learning scenarios look as follows:
- ConvNet as fixed feature extractor. Take a ConvNet pretrained on ImageNet, remove the last fully-connected layer (this layer's outputs are the 1000 class scores for a different task like ImageNet), then treat the rest of the ConvNet as a fixed feature extractor for the new dataset. In an AlexNet, this would compute a 4096-D vector for every image that contains the activations of the hidden layer immediately before the classifier. We call these features CNN codes. It is important for performance that these codes are ReLUd (i.e. thresholded at zero) if they were also thresholded during the training of the ConvNet on ImageNet (as is usually the case). Once you extract the 4096-D codes for all images, train a linear classifier (e.g. Linear SVM or Softmax classifier) for the new dataset.
- Fine-tuning the ConvNet. The second strategy is to not only replace and retrain the classifier on top of the ConvNet on the new dataset, but to also fine-tune the weights of the pretrained network by continuing the backpropagation. It is possible to fine-tune all the layers of the ConvNet, or it's possible to keep some of the earlier layers fixed (due to overfitting concerns) and only fine-tune some higher-level portion of the network. This is motivated by the observation that the earlier features of a ConvNet contain more generic features (e.g. edge detectors or color blob detectors) that should be useful to many tasks, but later layers of the ConvNet become progressively more specific to the details of the classes contained in the original dataset. In case of ImageNet for example, which contains many dog breeds, a significant portion of the representational power of the ConvNet may be devoted to features that are specific to differentiating between dog breeds.
- Pretrained models. Since modern ConvNets take 2-3 weeks to train across multiple GPUs on ImageNet, it is common to see people release their final ConvNet checkpoints for the benefit of others who can use the networks for fine-tuning. For example, the Caffe library has a Model Zoo where people share their network weights.
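The "fixed feature extractor" scenario can be sketched in plain NumPy. This is a toy under loud assumptions: `W_frozen` is a random matrix standing in for the body of a pretrained ConvNet, and the data, sizes, and learning rate are invented for illustration. What it shows is the pattern from the text: the extractor's weights are never updated, the codes are ReLU'd, and only a new softmax classifier is trained on top.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the pretrained network body (frozen, never updated).
W_frozen = rng.normal(size=(20, 64))
W_frozen_before = W_frozen.copy()

def extract_codes(x):
    """Return ReLU'd activations, playing the role of the 'CNN codes'."""
    return np.maximum(x @ W_frozen, 0.0)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(probs, labels):
    return -np.log(probs[np.arange(len(labels)), labels] + 1e-12).mean()

# Toy "new dataset": 200 samples, 2 classes.
X = rng.normal(size=(200, 20))
y = (X[:, 0] > 0).astype(int)

# Extract the codes once, then standardize them for stable gradient steps.
codes = extract_codes(X)
codes = (codes - codes.mean(axis=0)) / (codes.std(axis=0) + 1e-8)

# Train only the new linear (softmax) classifier on top of the codes.
W_cls = np.zeros((64, 2))
b_cls = np.zeros(2)
lr = 0.01
loss_before = cross_entropy(softmax(codes @ W_cls + b_cls), y)
for _ in range(200):
    probs = softmax(codes @ W_cls + b_cls)
    grad = probs.copy()
    grad[np.arange(len(y)), y] -= 1.0   # dLoss/dLogits for cross-entropy
    grad /= len(y)
    W_cls -= lr * (codes.T @ grad)
    b_cls -= lr * grad.sum(axis=0)
loss_after = cross_entropy(softmax(codes @ W_cls + b_cls), y)
print(loss_before, loss_after)
```

Fine-tuning would differ only in that the gradient is also propagated into `W_frozen` (or part of it) instead of stopping at the codes.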
When and how to fine-tune? How do you decide what type of transfer learning you should perform on a new dataset? This is a function of several factors, but the two most important ones are the size of the new dataset (small or big), and its similarity to the original dataset (e.g. ImageNet-like in terms of the content of images and the classes, or very different, such as microscope images). Keeping in mind that ConvNet features are more generic in early layers and more original-dataset-specific in later layers, here are some common rules of thumb for navigating the 4 major scenarios:
- New dataset is small and similar to original dataset. Since the data is small, it is not a good idea to fine-tune the ConvNet due to overfitting concerns. Since the data is similar to the original data, we expect higher-level features in the ConvNet to be relevant to this dataset as well. Hence, the best idea might be to train a linear classifier on the CNN codes.
- New dataset is large and similar to the original dataset. Since we have more data, we can have more confidence that we won't overfit if we were to try to fine-tune through the full network.
- New dataset is small but very different from the original dataset. Since the data is small, it is likely best to only train a linear classifier. Since the dataset is very different, it might not be best to train the classifier from the top of the network, which contains more dataset-specific features. Instead, it might work better to train the SVM classifier from activations somewhere earlier in the network.
- New dataset is large and very different from the original dataset. Since the dataset is very large, we may expect that we can afford to train a ConvNet from scratch. However, in practice it is very often still beneficial to initialize with weights from a pretrained model. In this case, we would have enough data and confidence to fine-tune through the entire network.
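The four rules of thumb above can be summarized as a small lookup. The function name and the returned strings are illustrative only; the quoted text frames these as heuristics ("might", "likely"), so treat the output as a starting point rather than a rule.

```python
def suggest_transfer_strategy(dataset_is_large: bool, similar_to_original: bool) -> str:
    """Map the two factors from the text to the suggested strategy (hypothetical helper)."""
    if not dataset_is_large and similar_to_original:
        # Small + similar: fine-tuning risks overfitting; top features are relevant.
        return "train a linear classifier on the CNN codes"
    if dataset_is_large and similar_to_original:
        # Large + similar: enough data to fine-tune without overfitting.
        return "fine-tune through the full network"
    if not dataset_is_large and not similar_to_original:
        # Small + different: top layers are too dataset-specific.
        return "train a linear classifier on activations from an earlier layer"
    # Large + different: training from scratch is possible, but pretrained
    # weights are often still a better starting point.
    return "initialize from pretrained weights and fine-tune the entire network"

print(suggest_transfer_strategy(dataset_is_large=False, similar_to_original=True))
```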
Practical advice. There are a few additional things to keep in mind when performing Transfer Learning:
- Constraints from pretrained models. Note that if you wish to use a pretrained network, you may be slightly constrained in terms of the architecture you can use for your new dataset. For example, you can't arbitrarily take out Conv layers from the pretrained network. However, some changes are straight-forward: Due to parameter sharing, you can easily run a pretrained network on images of different spatial size. This is clearly evident in the case of Conv/Pool layers because their forward function is independent of the input volume spatial size (as long as the strides "fit"). In case of FC layers, this still holds true because FC layers can be converted to a Convolutional Layer: For example, in an AlexNet, the final pooling volume before the first FC layer is of size [6x6x512]. Therefore, the FC layer looking at this volume is equivalent to having a Convolutional Layer that has receptive field size 6x6, and is applied with padding of 0.
- Learning rates. It's common to use a smaller learning rate for ConvNet weights that are being fine-tuned, in comparison to the (randomly-initialized) weights for the new linear classifier that computes the class scores of your new dataset. This is because we expect that the ConvNet weights are relatively good, so we don't wish to distort them too quickly and too much (especially while the new Linear Classifier above them is being trained from random initialization).
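The claim that an FC layer looking at a 6x6 volume is equivalent to a 6x6 convolution with padding 0 can be checked numerically. The sketch below uses a naive NumPy convolution and small made-up sizes (`C = 8` channels, `D = 5` outputs) rather than AlexNet's real dimensions; `conv_valid` is a helper written for this illustration, not a library function.

```python
import numpy as np

def conv_valid(volume, filters):
    """Naive stride-1, padding-0 convolution.

    volume: (H, W, C); filters: (D, kh, kw, C); returns (H-kh+1, W-kw+1, D).
    """
    H, W, _ = volume.shape
    D, kh, kw, _ = filters.shape
    out = np.empty((H - kh + 1, W - kw + 1, D))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = volume[i:i + kh, j:j + kw, :]
            # Dot each filter with the patch over height, width and channels.
            out[i, j] = np.tensordot(filters, patch, axes=([1, 2, 3], [0, 1, 2]))
    return out

rng = np.random.default_rng(0)
C, D = 8, 5                                # assumed sizes for the sketch
volume = rng.normal(size=(6, 6, C))        # final pooling volume
W_fc = rng.normal(size=(D, 6 * 6 * C))     # FC weights over the flattened volume

fc_out = W_fc @ volume.reshape(-1)         # ordinary fully-connected forward pass
conv_filters = W_fc.reshape(D, 6, 6, C)    # same weights viewed as D filters of 6x6xC
conv_out = conv_valid(volume, conv_filters)

print(np.allclose(fc_out, conv_out[0, 0]))

# On a larger input, the converted layer produces a spatial map of outputs.
bigger = rng.normal(size=(8, 8, C))
print(conv_valid(bigger, conv_filters).shape)
```

On the 6x6 input the convolution yields a single 1x1xD output that matches the FC result; on the 8x8 input it slides over every 6x6 window, which is why a converted network can run on images of a different spatial size.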
Additional References
- CNN Features off-the-shelf: an Astounding Baseline for Recognition trains SVMs on features from ImageNet-pretrained ConvNet and reports several state of the art results.
- DeCAF reported similar findings in 2013. The framework in this paper (DeCAF) was a Python-based precursor to the C++ Caffe library.
- How transferable are features in deep neural networks? studies the transfer learning performance in detail, including some unintuitive findings about layer co-adaptations.
Regarding "tensorflow - How to initialize filters in a convnet", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/41020524/