Chapters 1 to 4
Chapter 1—Introduction of Deep Learning
Background
In 1981, neurobiologist David Hubel shared the Nobel Prize in Physiology or Medicine for discoveries about how the visual system processes information, work showing that the visual cortex of the brain is organized hierarchically. The key insight is that visual recognition relies on two processes: abstraction and iteration.
Abstraction turns very concrete, low-level input, such as raw light signals (pixels), into meaningful concepts. Iteration passes these concepts upward through the hierarchy, where they combine into increasingly abstract concepts that people can perceive.
To recognize images, a computer therefore needs to simulate this process of abstraction and layer-by-layer iteration.
Modern Deep Learning
Convolutional neural networks (CNNs) simulate this process with a stack of convolutional layers.
The lower convolutional layers extract local features of an image, such as corners, edges, and lines. The higher convolutional layers combine these lower-level features into more complex ones, which are then used to classify and recognize the image.
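The stacking described above can be sketched in PyTorch as follows. This is only a minimal illustration: the layer sizes, the 28x28 grayscale input, and the 10 output classes are assumptions chosen for the example, not values from these notes.

```python
import torch
import torch.nn as nn

# Hypothetical two-stage stack; all sizes are illustrative assumptions.
model = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, padding=1),   # lower layer: local features such as edges
    nn.ReLU(),
    nn.MaxPool2d(2),                             # 28x28 -> 14x14
    nn.Conv2d(8, 16, kernel_size=3, padding=1),  # higher layer: combinations of lower features
    nn.ReLU(),
    nn.MaxPool2d(2),                             # 14x14 -> 7x7
    nn.Flatten(),
    nn.Linear(16 * 7 * 7, 10),                   # classifier over 10 assumed classes
)

x = torch.randn(4, 1, 28, 28)  # batch of 4 random grayscale "images"
logits = model(x)
print(logits.shape)  # torch.Size([4, 10])
```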
Reinforcement Learning
Reinforcement learning involves four main elements: agent, state, action, and reward.
- Features
There is no supervisor, only a feedback (reward) signal.
Feedback is delayed rather than immediate.
Reinforcement learning is sequential, so time plays an important role.
The agent's behavior affects the data it sees next, and therefore all of its future decisions.
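The four elements and the delayed feedback can be illustrated with a toy interaction loop. The `Corridor` environment and both policies below are hypothetical and exist only for this sketch; no learning algorithm is implemented here.

```python
import random


class Corridor:
    """Hypothetical environment: positions 0..4, reward only at the right end."""

    def __init__(self):
        self.state = 0

    def step(self, action):
        # action is -1 (left) or +1 (right); the position is clipped to [0, 4]
        self.state = max(0, min(4, self.state + action))
        done = self.state == 4
        reward = 1.0 if done else 0.0  # feedback is delayed until the goal is reached
        return self.state, reward, done


def run_episode(policy, max_steps=100):
    env = Corridor()
    state, total_reward, steps = env.state, 0.0, 0
    for _ in range(max_steps):
        action = policy(state)                  # the agent picks an action from the state
        state, reward, done = env.step(action)  # the environment returns next state and reward
        total_reward += reward
        steps += 1
        if done:
            break
    return total_reward, steps


def always_right(state):
    return 1


def random_policy(state):
    return random.choice([-1, 1])


random.seed(0)
print(run_episode(always_right))   # (1.0, 4)
print(run_episode(random_policy))  # depends on the random walk
```

Each action changes the state the agent observes next, which is why earlier behavior influences every later decision.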
Chapter 2—Deep Learning Framework
Caffe
Caffe (Convolutional Architecture for Fast Feature Embedding) is mainly used for image and video processing.
The official website of Caffe is http://caffe.berkeleyvision.org/
TensorFlow
TensorFlow is an open-source software library that uses data flow graphs for numerical computation. Nodes in the graph represent mathematical operations, and edges represent the multidimensional data arrays (tensors) that flow between nodes.
Computation Graph in TensorFlow:
- The leaf node or start node is always a tensor.
- Tensors cannot appear as non-leaf nodes.
- Computational graphs always express complex operations in a hierarchical order.
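The three properties above can be illustrated with a toy graph in plain Python. This is not TensorFlow's API: the `Leaf` and `Op` classes are hypothetical and only mimic the idea that leaves hold data while non-leaf nodes are operations.

```python
class Leaf:
    """Leaf node: holds a value (standing in for a tensor)."""

    def __init__(self, value):
        self.value = value

    def evaluate(self):
        return self.value


class Op:
    """Non-leaf node: an operation applied to the results of its input nodes."""

    def __init__(self, fn, *inputs):
        self.fn = fn
        self.inputs = inputs

    def evaluate(self):
        # Inputs are evaluated first, so the graph is computed from the leaves upward.
        return self.fn(*(node.evaluate() for node in self.inputs))


# Graph for (a + b) * c
a, b, c = Leaf(2.0), Leaf(3.0), Leaf(4.0)
total = Op(lambda x, y: x + y, a, b)
result = Op(lambda x, y: x * y, total, c)
print(result.evaluate())  # 20.0
```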
PyTorch
A major advantage of PyTorch is that it builds the computation graph dynamically as the code runs, whereas Caffe and TensorFlow (in its original graph mode) define a static graph before execution.
PyTorch's design follows three levels of abstraction, from low to high: $\text{Tensor}\rightarrow\text{Variable (autograd)}\rightarrow\text{nn.Module}$, representing multidimensional arrays, automatic differentiation, and neural networks, respectively. Since PyTorch 0.4, Variable has been merged into Tensor, so autograd works on tensors directly.
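A small sketch of the three levels, assuming a recent PyTorch version in which tensors carry autograd information themselves. `TinyNet` is a hypothetical module written for this illustration.

```python
import torch
import torch.nn as nn

# Level 1: a tensor is a multidimensional array.
x = torch.tensor([[1.0, 2.0], [3.0, 4.0]])

# Level 2: autograd tracks operations on tensors that require gradients.
w = torch.tensor([0.5, -1.0], requires_grad=True)
y = (x @ w).sum()  # y = (1 + 3) * w0 + (2 + 4) * w1
y.backward()
print(w.grad)      # tensor([4., 6.])


# Level 3: nn.Module groups parameters and the forward computation.
class TinyNet(nn.Module):  # hypothetical example module
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(2, 1)

    def forward(self, inputs):
        out = self.linear(inputs)
        # Ordinary Python control flow works because the graph is rebuilt on every call.
        return out if out.sum() > 0 else -out


net = TinyNet()
out = net(x)
print(out.shape)   # torch.Size([2, 1])
```

The data-dependent `if` in `forward` is the kind of construct that a dynamic graph makes straightforward.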
Chapter 3—Machine Learning Basics
Basic Concepts
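- Loss Function: $L(y, \hat{y})$ measures the error between the true value $y$ and the model's prediction $\hat{y}$.
- Training Error: average error on the training set.
- Generalization Error: expected error on unseen data, estimated in practice by the average error on the test set.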
Minimizing the training error alone tends to drive up model complexity, which leads to overfitting.
To prevent overfitting:
- Tune hyperparameters on a validation set
Hyperparameter selection (tuning) must be carried out on a dataset that is independent of both the training set and the test set; this dataset is called a development or validation set.
- Add a regularization term to the loss function
A regularization term is added to the optimization objective to penalize unnecessary model complexity.
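$$
\min_{\boldsymbol{\theta}} L(\boldsymbol{y},\boldsymbol{\hat{y}};\boldsymbol{\theta})+\lambda\cdot J(\boldsymbol{\theta})
$$
Here $J(\boldsymbol{\theta})$ measures model complexity and $\lambda \ge 0$ controls the strength of the penalty.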
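As a minimal sketch of this objective, the example below assumes mean squared error for $L$ and the sum of squared parameters (an L2 penalty) for $J$. Both choices, and all the numbers, are assumptions made for illustration; the notes do not fix a particular loss or penalty.

```python
def mse(y_true, y_pred):
    """Assumed loss L: mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def l2_penalty(theta):
    """Assumed complexity term J: sum of squared parameters."""
    return sum(w ** 2 for w in theta)


def regularized_loss(y_true, y_pred, theta, lam):
    return mse(y_true, y_pred) + lam * l2_penalty(theta)


y_true = [1.0, 2.0, 3.0]
y_pred = [1.0, 2.0, 5.0]
theta = [3.0, 4.0]  # hypothetical model parameters

print(regularized_loss(y_true, y_pred, theta, lam=0.0))  # data term only
print(regularized_loss(y_true, y_pred, theta, lam=0.1))  # data term plus 0.1 * 25
```

With a larger $\lambda$, large parameter values cost more, so the optimizer is pushed toward simpler models.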
Supervised Learning
Supervised learning mainly addresses two types of problems: regression and classification.
- Classification
Model evaluation metrics:
- Balanced problems: $\text{Accuracy} = \frac{k}{D}$, where $k$ is the number of correctly classified samples and $D$ is the total number of samples
- Imbalanced problems: the F-metric $F_{\alpha}$
Imbalanced problems
Define the minority class as the positive class and the majority class as the negative class.
Predictions:
- Predict a positive sample as the positive class (true positive, TP)
- Predict a negative sample as the positive class (false positive, FP)
- Predict a positive sample as the negative class (false negative, FN)
- Predict a negative sample as the negative class (true negative, TN)
Define recall:
$$
R = \frac{|TP|}{|TP|+|FN|}
$$
Recall measures the fraction of all actual positive samples that the model correctly detects; it is also called sensitivity or the true positive rate.
Define precision:
$$
P = \frac{|TP|}{|TP|+|FP|}
$$
Precision measures the fraction of samples predicted as positive that are actually positive. It should not be confused with accuracy, which counts correct predictions over all samples.
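The F-metric is a weighted harmonic mean of recall and precision, where $\alpha$ sets the relative weight: $\alpha > 1$ favors recall and $\alpha < 1$ favors precision.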
$$
F_{\alpha}=\frac{(1+\alpha^2)RP}{R+\alpha^2P}
$$
$$
\text{if } \alpha=1 \Longrightarrow F_1=\frac{2RP}{R+P}
$$
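The definitions above can be checked on a small example. The labels below are made up for illustration (class 1 is the minority, positive class), and the function names are hypothetical.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, FP, FN, TN) for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return tp, fp, fn, tn


def f_metric(y_true, y_pred, alpha=1.0, positive=1):
    """Return (precision, recall, F_alpha) using the formulas above."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred, positive)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    denom = recall + alpha ** 2 * precision
    f = (1 + alpha ** 2) * recall * precision / denom if denom else 0.0
    return precision, recall, f


# Made-up labels: 3 positives and 7 negatives
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 1, 0, 0, 0, 0, 0]

print(confusion_counts(y_true, y_pred))  # (2, 2, 1, 5)
precision, recall, f1 = f_metric(y_true, y_pred, alpha=1.0)
print(precision, recall, f1)             # P = 1/2, R = 2/3, F1 = 4/7
```

Accuracy on this data is 7/10, which looks better than $F_1 = 4/7$ because the many true negatives dominate the count.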
Chapter 4—Pytorch Deep Learning
PyTorch Tensor Features
- Tensors can run computations on a GPU.
- During computation, a tensor can be recorded automatically as a node in the computation graph, which enables automatic differentiation.
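Both features can be seen in a few lines. The sketch below falls back to the CPU when no CUDA device is available, so it should run on either kind of machine; the values are arbitrary.

```python
import torch

# Use the GPU if one is available, otherwise stay on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.tensor([1.0, 2.0, 3.0], device=device, requires_grad=True)
y = (x ** 2).sum()  # the operations are recorded in the computation graph
y.backward()        # dy/dx = 2x

print(x.device)
print(x.grad)       # gradient values 2, 4, 6 on the selected device
```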
Tensor Object
Basic Operations
```python
import torch
```
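The original listing survives only as its first line, so the following is a separate sketch of some common basic operations rather than a reconstruction of it. The tensor values are arbitrary.

```python
import torch

a = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
b = torch.ones(2, 2)

print(a.shape, a.dtype)  # torch.Size([2, 2]) torch.float32
print(a + b)             # elementwise addition
print(a * b)             # elementwise multiplication
print(a @ b)             # matrix multiplication
print(a.t())             # transpose of a 2-D tensor
```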
Indexes and Slices
```python
a = torch.arange(9).view(3, 3)
```
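Starting from the same 3x3 tensor, a few typical indexing and slicing patterns look like this. The selection of patterns is illustrative and is not taken from the original listing.

```python
import torch

a = torch.arange(9).view(3, 3)
# tensor([[0, 1, 2],
#         [3, 4, 5],
#         [6, 7, 8]])

print(a[0, 1])    # tensor(1): row 0, column 1
print(a[1])       # tensor([3, 4, 5]): second row
print(a[:, 0])    # tensor([0, 3, 6]): first column
print(a[1:, :2])  # tensor([[3, 4], [6, 7]]): rows 1-2, columns 0-1
print(a[a > 5])   # tensor([6, 7, 8]): boolean mask
```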
Tensor Transformation, Splicing and Splitting
```python
a = torch.rand(1, 2, 3, 4, 5)
```
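Using a tensor of the same shape, the sketch below shows one plausible set of transformation, splicing, and splitting calls. Which operations the original listing covered is unknown, so this selection is an assumption.

```python
import torch

a = torch.rand(1, 2, 3, 4, 5)

# Transformation: change the shape or the order of dimensions
print(a.view(6, 20).shape)             # torch.Size([6, 20])
print(a.squeeze(0).shape)              # torch.Size([2, 3, 4, 5])
print(a.transpose(1, 2).shape)         # torch.Size([1, 3, 2, 4, 5])
print(a.permute(4, 3, 2, 1, 0).shape)  # torch.Size([5, 4, 3, 2, 1])

# Splicing: join tensors along an existing or a new dimension
b = torch.cat([a, a], dim=1)
c = torch.stack([a, a], dim=0)
print(b.shape)                         # torch.Size([1, 4, 3, 4, 5])
print(c.shape)                         # torch.Size([2, 1, 2, 3, 4, 5])

# Splitting: cut a tensor into pieces along one dimension
parts = torch.split(b, 2, dim=1)       # pieces of size 2 along dim 1
chunks = torch.chunk(a, 5, dim=-1)     # 5 equal pieces along the last dim
print(len(parts), parts[0].shape)      # 2 torch.Size([1, 2, 3, 4, 5])
print(len(chunks), chunks[0].shape)    # 5 torch.Size([1, 2, 3, 4, 1])
```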
Tensor Reduction
```python
a = torch.tensor([[1, 2], [3, 4]])
```
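Reductions collapse one or more dimensions into a summary value. The sketch below applies some common ones to the same 2x2 tensor; the choice of operations is illustrative.

```python
import torch

a = torch.tensor([[1, 2], [3, 4]])

print(a.sum())                  # tensor(10): sum over all elements
print(a.sum(dim=0))             # tensor([4, 6]): sum down each column
print(a.sum(dim=1))             # tensor([3, 7]): sum across each row
print(a.prod())                 # tensor(24)
print(a.max())                  # tensor(4)
print(a.argmax())               # tensor(3): index into the flattened tensor

values, indices = a.max(dim=1)  # per-row maximum and its column index
print(values, indices)          # tensor([2, 4]) tensor([1, 1])

print(a.float().mean())         # tensor(2.5000): mean needs a floating-point dtype
```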