An In-Depth Analysis of the Caffe Source Code

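1. Overview

1.1. Pros and cons of Caffe

Pros:

• It is fast. The Google Protocol Buffer data format improves Caffe's efficiency.
• Many academic papers use it. It may not be the most widely used framework, but a good number of the papers I have come across are tied to Caffe (R-CNN, DSN, and recently an LSTM implementation in Caffe).

Cons:

• Important function interfaces have changed in the past. Some users report that interfaces occasionally change, so code written long ago may no longer work well with a newer version. (Updates have since slowed down and the interfaces are gradually stabilizing; thanks to Wang Feng in the comments for pointing this out.)
• It is not a good fit for people working in certain research directions. Judging this requires some understanding of Caffe's structure.

1.2. Caffe code hierarchy

Four classes, Blob, Layer, Net and Solver, run through all of Caffe, in increasing order of complexity.

SyncedMem: The main job of this class is to encapsulate data exchange between the CPU and the GPU. Data usually flows as: disk -> CPU memory -> GPU memory -> CPU memory -> (disk). Code therefore has to transfer data between CPU and GPU again and again, while also maintaining memory pointers on both sides. None of this is hard, but it is tedious. SyncedMem wraps the CPU/GPU transfers so that a simple interface call returns data that is already synchronized for whichever side asks for it.

Blob: This class provides two kinds of encapsulation. The first covers data operations: with a Blob we can manipulate high-dimensional data, access its elements quickly, change its dimensions, and so on. The second covers the raw data together with its update: every Blob holds two data pointers, data and diff. data stores the raw values, and diff stores the gradient updates produced by backpropagation. Blob is built on SyncedMem, so it also gets convenient access from either processing side. Blob thus essentially encapsulates the whole data side of Caffe; in the Net class, all forward and backward data and all parameters are represented as Blobs and nothing else is needed.

That is as far as the data abstraction needs to go; next comes the layer-level abstraction. As analyzed earlier, the forward and backward computations of a neural network can be made completely independent from layer to layer, so as long as every layer is implemented against a fixed interface, the network as a whole is guaranteed to be correct.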
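The SyncedMem and Blob descriptions above can be made concrete with a small Python sketch. This is not Caffe's C++ code: the class names `SyncedMemSketch` and `BlobSketch` and the `copies` counter are hypothetical, the "GPU" buffer is simulated with a second Python list, and the accessor names are borrowed from what I understand Caffe's SyncedMemory to expose. The point it illustrates is the lazy synchronization: a copy happens only when one side asks for data that the other side modified last.

```python
from enum import Enum


class Head(Enum):
    """Where the most recent copy of the data lives."""
    UNINITIALIZED = 0
    AT_CPU = 1
    AT_GPU = 2
    SYNCED = 3


class SyncedMemSketch:
    """Toy stand-in for SyncedMem: two buffers plus a 'head' flag.

    Assumption: the GPU buffer is simulated by a plain Python list.
    """

    def __init__(self, size):
        self.size = size
        self.cpu = None
        self.gpu = None
        self.head = Head.UNINITIALIZED
        self.copies = 0  # number of simulated CPU<->GPU transfers

    def _to_cpu(self):
        if self.head is Head.UNINITIALIZED:
            self.cpu = [0.0] * self.size
            self.head = Head.AT_CPU
        elif self.head is Head.AT_GPU:
            self.cpu = list(self.gpu)  # simulated device-to-host copy
            self.copies += 1
            self.head = Head.SYNCED

    def _to_gpu(self):
        if self.head is Head.UNINITIALIZED:
            self.gpu = [0.0] * self.size
            self.head = Head.AT_GPU
        elif self.head is Head.AT_CPU:
            self.gpu = list(self.cpu)  # simulated host-to-device copy
            self.copies += 1
            self.head = Head.SYNCED

    def cpu_data(self):
        """Read access on the CPU side; copies only if the CPU copy is stale."""
        self._to_cpu()
        return self.cpu

    def gpu_data(self):
        """Read access on the (simulated) GPU side."""
        self._to_gpu()
        return self.gpu

    def mutable_cpu_data(self):
        """Write access on the CPU side; marks the GPU copy as stale."""
        self._to_cpu()
        self.head = Head.AT_CPU
        return self.cpu

    def mutable_gpu_data(self):
        """Write access on the GPU side; marks the CPU copy as stale."""
        self._to_gpu()
        self.head = Head.AT_GPU
        return self.gpu


class BlobSketch:
    """Toy Blob: a shape plus two synced buffers, data and diff."""

    def __init__(self, shape):
        self.shape = tuple(shape)
        count = 1
        for dim in self.shape:
            count *= dim
        self.count = count
        self.data = SyncedMemSketch(count)  # raw values
        self.diff = SyncedMemSketch(count)  # gradients from backpropagation
```

Writing through `mutable_cpu_data()` and then reading through `gpu_data()` triggers exactly one simulated transfer; a following `cpu_data()` call triggers none, because both sides are already in sync.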

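Layer: Caffe implements a base class, Layer, and some special kinds of layers have abstract classes of their own (for example base_conv_layer). These classes mainly follow the Template Method design pattern: the code that every layer needs is written in the base class, and the specifics are implemented in subclasses. Take Layer's Setup as an example: the function consists of several steps, some completed by the base class and some by the subclass. The same goes for the all-important Forward and Backward: the base class implements the logic they need, but the actual computation is left to the subclass. So when we implement a new layer we do not have to manage the trivia; we only have to take care of the layer's initialization and its forward and backward passes.

Net: Net combines data and layers into a further encapsulation and exposes initialization and forward/backward interfaces, so that from the outside it looks much like a single layer, although the internal composition can vary widely. It is worth noting that the input and output data of every layer are stored in Net, and so are the pointers to each layer's parameters; different layers can share the same parameters through WeightShare.

Solver: With Net we can already run the forward and backward computations of a network, but the functionality for learning and training is still missing. Solver therefore builds on top of Net and encapsulates the training and prediction functionality. It also opens up two kinds of interfaces. One is for updating parameters: by inheriting from Solver you can implement different update methods, such as the popular Momentum, Nesterov and Adagrad, so that different optimization algorithms can be plugged in. The other is a set of callbacks that can be injected at specific points of each training iteration; in the code, the direct user of these callback points is the multi-GPU training algorithm.

IO: Is everything above enough? Not yet. We still need input data and parameters; without data nothing else matters. DataReader and DataTransformer help prepare the input data, and Filler initializes the parameters. A set of Snapshot methods persist the model, which settles the IO of both models and data.

Multi-GPU: For single-GPU training the basic hierarchy ends here. For multi-GPU training there are two more classes on top, InternalThread and P2PSync. These are the topmost classes, and all they call are Solver and a few parameter classes.

The overall hierarchy of Caffe can be shown in one diagram:

[Figure: overall class hierarchy of Caffe]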
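The Template Method structure of Layer, and the way Net makes a chain of layers look like one layer, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Caffe's implementation: all class names (`LayerSketch`, `ScaleSketch`, `ReLUSketch`, `NetSketch`) and the hook names `check_bottom` and `layer_setup` are hypothetical, blobs are plain lists, and each layer takes exactly one bottom blob.

```python
class LayerSketch:
    """Base class: fixes the order of the steps; subclasses supply the math."""

    def setup(self, bottom):
        self.check_bottom(bottom)  # step done by the base class
        self.layer_setup(bottom)   # step delegated to the subclass

    def check_bottom(self, bottom):
        if len(bottom) != 1:
            raise ValueError("this sketch supports exactly one bottom blob")

    def layer_setup(self, bottom):
        pass  # optional hook; subclasses override it when they need state

    def forward(self, bottom):
        # Shared logic would live here; the computation itself is delegated.
        return self.forward_cpu(bottom)

    def backward(self, top_diff, bottom):
        return self.backward_cpu(top_diff, bottom)

    def forward_cpu(self, bottom):
        raise NotImplementedError

    def backward_cpu(self, top_diff, bottom):
        raise NotImplementedError


class ScaleSketch(LayerSketch):
    """Multiplies its input by a fixed factor chosen at setup time."""

    def layer_setup(self, bottom):
        self.factor = 2.0

    def forward_cpu(self, bottom):
        return [self.factor * x for x in bottom[0]]

    def backward_cpu(self, top_diff, bottom):
        return [self.factor * d for d in top_diff]


class ReLUSketch(LayerSketch):
    def forward_cpu(self, bottom):
        return [max(x, 0.0) for x in bottom[0]]

    def backward_cpu(self, top_diff, bottom):
        return [d if x > 0 else 0.0 for d, x in zip(top_diff, bottom[0])]


class NetSketch:
    """Chains layers and owns every intermediate blob."""

    def __init__(self, layers, input_size):
        self.layers = list(layers)
        self.blobs = [[0.0] * input_size]  # blobs[i] is the input of layer i
        for layer in self.layers:
            layer.setup([self.blobs[-1]])
            self.blobs.append(layer.forward([self.blobs[-1]]))

    def forward(self, data):
        self.blobs[0] = list(data)
        for i, layer in enumerate(self.layers):
            self.blobs[i + 1] = layer.forward([self.blobs[i]])
        return self.blobs[-1]

    def backward(self, top_diff):
        diff = list(top_diff)
        for i in reversed(range(len(self.layers))):
            diff = self.layers[i].backward(diff, [self.blobs[i]])
        return diff
```

Adding a new layer type to this sketch means overriding only `layer_setup`, `forward_cpu` and `backward_cpu`; the sequencing in `setup`, `forward` and `backward` stays in the base class, which is the division of labor the text describes.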

2. Caffe basics

2.1. Reading caffe.proto

To read the Caffe source, the first file to look at is caffe.proto. It is located in the …\src\caffe\proto directory. The same folder also contains a .pb.cc file and a .pb.h file, both generated by compiling caffe.proto. caffe.proto defines many structured data types, including:

• BlobProto
• Datum
• FillerParameter
• NetParameter
• SolverParameter
• SolverState
• LayerParameter
• ConcatParameter
• ConvolutionParameter
• DataParameter
• DropoutParameter
• HDF5DataParameter
• HDF5OutputParameter
• ImageDataParameter
• InfogainLossParameter
• InnerProductParameter
• LRNParameter
• MemoryDataParameter
• PoolingParameter
• PowerParameter
• WindowDataParameter
• V0LayerParameter

Everything in caffe.pb.cc is generated from caffe.proto. It contains nothing more than standardized operations on these data structures (classes), for example (argument lists omitted):

    void CopyFrom();
    void MergeFrom();
    void Clear();
    bool IsInitialized() const;
    int ByteSize() const;
    bool MergePartialFromCodedStream();
    void SerializeWithCachedSizes() const;
    uint8* SerializeWithCachedSizesToArray() const;
    int GetCachedSize() const;
    void SharedCtor();
    void SharedDtor();
    void SetCachedSize() const;

Several important data types in caffe.proto

1. BlobProto

    message BlobProto {  // attributes of a blob and the data it holds (data/diff)
      optional int32 num = 1 [default = 0];
      optional int32 channels = 2 [default = 0];
      optional int32 height = 3 [default = 0];
      optional int32 width = 4 [default = 0];
      repeated float data = 5 [packed = true];
      repeated float diff = 6 [packed = true];
    }
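BlobProto stores a four-dimensional blob as a flat `data` array plus the four sizes. The sketch below shows one way the two fit together, assuming row-major (num, channels, height, width) ordering, which is how I understand Caffe's Blob to compute offsets. `BlobProtoSketch` and its methods are hypothetical names for illustration; this is not the class generated from caffe.proto, and the `diff` field is left out for brevity.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BlobProtoSketch:
    """Plain-Python mirror of the BlobProto fields (diff omitted)."""
    num: int = 0
    channels: int = 0
    height: int = 0
    width: int = 0
    data: List[float] = field(default_factory=list)

    def count(self):
        """Number of elements the flat data array is expected to hold."""
        return self.num * self.channels * self.height * self.width

    def offset(self, n, c, h, w):
        """Flat index of element (n, c, h, w), assuming row-major NCHW order."""
        return ((n * self.channels + c) * self.height + h) * self.width + w

    def at(self, n, c, h, w):
        """Look up one element after checking that data matches the sizes."""
        if len(self.data) != self.count():
            raise ValueError("data length does not match num*channels*height*width")
        return self.data[self.offset(n, c, h, w)]
```

For a blob with num = 2, channels = 3, height = 2 and width = 2, the flat array has 24 entries and element (1, 2, 0, 1) sits at index ((1 * 3 + 2) * 2 + 0) * 2 + 1 = 21.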

2. Datum

    message Datum {
      optional int32 channels = 1;
      optional int32 height = 2;
      optional int32 width = 3;
      optional bytes data = 4;         // the actual image data, stored as bytes
      optional int32 label = 5;
      repeated float float_data = 6;   // a datum can also hold float data
    }

3. LayerParameter

    message LayerParameter {
      repeated string bottom = 2;  // names of the input blobs (string)
      repeated string top = 3;     // names of the output blobs (string)
      optional string name = 4;    // name of the layer
      enum LayerType {             // layer types (an enum, as in C++)
        NONE = 0;
        ACCURACY = 1;
        BNLL = 2;
        CONCAT = 3;
        CONVOLUTION = 4;
        DATA = 5;
        DROPOUT = 6;
        EUCLIDEAN_LOSS = 7;
        ELTWISE_PRODUCT = 25;
        FLATTEN = 8;
        HDF5_DATA = 9;
        HDF5_OUTPUT = 10;
        HINGE_LOSS = 28;
        IM2COL = 11;
        IMAGE_DATA = 12;
        INFOGAIN_LOSS = 13;
        INNER_PRODUCT = 14;
        LRN = 15;
        MEMORY_DATA = 29;
        MULTINOMIAL_LOGISTIC_LOSS = 16;
        POOLING = 17;
        POWER = 26;
        RELU = 18;
        SIGMOID = 19;
        SIGMOID_CROSS_ENTROPY_LOSS = 27;
        SOFTMAX = 20;
        SOFTMAX_LOSS = 21;
        SPLIT = 22;
        TANH = 23;
        WINDOW_DATA = 24;
      }
      optional LayerType type = 5;       // type of the layer
      repeated BlobProto blobs = 6;      // the numeric parameters of the layer, as blobs
      repeated float blobs_lr = 7;       // learning rate (repeated); to set the learning rate
                                         // of one blob you must set it for all blobs
      repeated float weight_decay = 8;   // weight decay (repeated)
      // parameters specific to a particular layer type (optional)
      optional ConcatParameter concat_param = 9;
      optional ConvolutionParameter convolution_param = 10;
      optional DataParameter data_param = 11;
      optional DropoutParameter dropout_param = 12;
      optional HDF5DataParameter hdf5_data_param = 13;
      optional HDF5OutputParameter hdf5_output_param = 14;
      optional ImageDataParameter image_data_param = 15;
      optional InfogainLossParameter infogain_loss_param = 16;
      optional InnerProductParameter inner_product_param = 17;
      optional LRNParameter lrn_param = 18;
      optional MemoryDataParameter memory_data_param = 22;
      optional PoolingParameter pooling_param = 19;
      optional PowerParameter power_param = 21;
      optional WindowDataParameter window_data_param = 20;
      optional V0LayerParameter layer = 1;
    }

4. NetParameter

    message NetParameter {
      optional string name = 1;             // name of the network
      repeated LayerParameter layers = 2;   // repeated behaves like an array
      repeated string input = 3;            // names of the input blobs
      repeated int32 input_dim = 4;         // dimensions of the input blobs; the number of
                                            // entries should equal 4 * #input
      optional bool force_backward = 5 [default = false];
                                            // whether to force backpropagation through the net;
                                            // if false, the network structure and the learning
                                            // rates decide whether backpropagation is run
    }
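Two of the comments above are really consistency rules: `input_dim` must contain four entries per `input` name, and `blobs_lr` must be given either for every blob of a layer or for none. A hedged Python sketch of both checks follows; the function names `input_shapes` and `check_blobs_lr` are hypothetical helpers, not part of Caffe, and the messages are modeled as plain lists.

```python
def input_shapes(input_names, input_dim):
    """Group the flat input_dim list into one 4-tuple per input blob name."""
    if len(input_dim) != 4 * len(input_names):
        raise ValueError(
            "input_dim has %d entries, expected 4 * %d"
            % (len(input_dim), len(input_names))
        )
    return {
        name: tuple(input_dim[4 * i:4 * i + 4])
        for i, name in enumerate(input_names)
    }


def check_blobs_lr(num_blobs, blobs_lr):
    """Accept blobs_lr only if it is empty or has one entry per blob."""
    if len(blobs_lr) not in (0, num_blobs):
        raise ValueError("set blobs_lr for all blobs of the layer or for none")
    return True
```

For example, `input_shapes(["data", "label"], [64, 3, 28, 28, 64, 1, 1, 1])` pairs `data` with (64, 3, 28, 28) and `label` with (64, 1, 1, 1); the sizes themselves are made up for the example.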

5. SolverParameter

    message SolverParameter {
      optional string train_net = 1;       // proto file of the training net
      optional string test_net = 2;        // proto file of the test net
      optional int32 test_iter = 3 [default = 0];      // number of iterations in each test pass
      optional int32 test_interval = 4 [default = 0];  // number of iterations between two test passes
      optional bool test_compute_loss = 19 [default = false];
      optional float base_lr = 5;          // base learning rate
      optional int32 display = 6;          // number of iterations between two displays
      optional int32 max_iter = 7;         // maximum number of iterations
      optional string lr_policy = 8;       // learning rate decay policy
      optional float gamma = 9;            // a parameter used to compute the learning rate
      optional float power = 10;           // a parameter used to compute the learning rate
      optional float momentum = 11;        // momentum
      optional float weight_decay = 12;    // weight decay
      optional int32 stepsize = 13;        // step size of the learning rate decay
      optional int32 snapshot = 14 [default = 0];           // snapshot interval
      optional string snapshot_prefix = 15;                 // prefix of the snapshot files
      optional bool snapshot_diff = 16 [default = false];   // whether to snapshot diff as well
      enum SolverMode {
        CPU = 0;
        GPU = 1;
      }
      optional SolverMode solver_mode = 17 [default = GPU];  // solver mode, GPU by default
      optional int32 device_id = 18 [default = 0];           // ID of the GPU
      optional int64 random_seed = 20 [default = -1];        // random seed
    }

2.2. The solver and its configuration

The solver is the core of Caffe's core: it coordinates the operation of the whole model. A solver configuration file is a required argument when running Caffe. The command usually looks like this:

    # caffe train --solver=*_solver.prototxt

The SolverParameter message:

    // NOTE
    // Update the next available ID when you add a new SolverParameter field.
    //
    // SolverParameter next available ID: 43 (last added: ap_version)
    message SolverParameter {
      //////////////////////////////////////////////////////////////////////////////
      // Specifying the train and test networks
      //
      // Exactly one train net must be specified using one of the following fields:
      //     train_net_param, train_net, net_param, net
      // One or more test nets may be specified using any of the following fields:
      //     test_net_param, test_net, net_param, net
      // If more than one test net field is specified (e.g., both net and
      // test_net are specified), they will be evaluated in the field order given
      // above: (1) test_net_param, (2) test_net, (3) net_param/net.
      // A test_iter must be specified for each test_net.
      // A test_level and/or a test_stage may also be specified for each test_net.
      //////////////////////////////////////////////////////////////////////////////

      // Proto filename for the train net, possibly combined with one or more
      // test nets.
      optional string net = 24;
      // Inline train net param, possibly combined with one or more test nets.
      optional NetParameter net_param = 25;

      optional string train_net = 1;  // Proto filename for the train net.
      repeated string test_net = 2;   // Proto filenames for the test nets.
      optional NetParameter train_net_param = 21;  // Inline train net params.
      repeated NetParameter test_net_param = 22;   // Inline test net params.

      // The states for the train/test nets. Must be unspecified or
      // specified once per net.
      //
      // By default, all states will have solver = true;
      // train_state will have phase = TRAIN,
      // and all test_state's will have phase = TEST.
      // Other defaults are set according to the NetState defaults.
      optional NetState train_state = 26;
      repeated NetState test_state = 27;

      // Evaluation type.
      optional string eval_type = 41 [default = "classification"];
      // ap_version: different ways of computing Average Precision.
      //    Check https://sanchom.wordpress.com/tag/average-precision/ for details.
