Swin Transformer recently won the best paper award at ICCV 2021. As a backbone model, it has achieved SOTA results on downstream tasks such as classification, detection, and segmentation. MMClassification (MMCls) is an open-source image classification toolbox and part of the OpenMMLab open-source algorithm library. This article shows how to use Swin Transformer to start a classification task in MMCls. The code for this article is available on Google Drive; the link is given at the end of the article.
Related tutorial: MMClassification 0.16.0 documentation (Chinese tutorial): https://mmclassification.readthedocs.io/zh_CN/latest/
Contents
1. MMClassification installation
2. Data set preparation
3. Use MMCls to fine tune the model
   Preparing to modify the configuration file
1. MMClassification installation
Before using MMClassification, we need to configure the environment. The steps are as follows:
- Install Python, CUDA, a C/C++ compiler, and git
- Install PyTorch (CUDA version)
- Install MMCV
- Clone the MMCls GitHub repository and install it
For installing Python, CUDA, PyTorch, and so on, refer to their official guides and online tutorials. After installation, verify the environment:
Check nvcc version
nvcc -V
Check gcc version
gcc --version
Check torch version
pip list | grep "torch"
Install MMCV:
MMCV is the foundational library of the OpenMMLab code bases. Pre-built wheel packages are provided for Linux, so you can download and install it directly with pip. The command format is as follows:
pip install mmcv -f https://download.openmmlab.com/mmcv/dist/{CUDA_v}/{Torch_v}/index.html
Make sure the PyTorch and CUDA versions in the URL match your environment so that the installation succeeds.
The previous steps printed the CUDA and PyTorch versions in the environment, 11.1 and 1.9.0 respectively, so we select the matching MMCV build.
Alternatively, you can install the full version, mmcv-full, which includes all features and a rich set of CUDA operators out of the box. The full version may take longer to compile.
pip install mmcv -f https://download.openmmlab.com/mmcv/dist/cu111/torch1.9.0/index.html
# pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/cu111/torch1.9.0/index.html
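The URL pattern above can be filled in mechanically from the two version strings. The following is a minimal sketch of that substitution; the helper name `mmcv_index_url` is hypothetical and is not part of MMCV, and the sketch assumes a CUDA build (CPU-only environments are not handled).

```python
def mmcv_index_url(cuda_version: str, torch_version: str) -> str:
    """Build the pip find-links URL for a given CUDA / PyTorch pair.

    Hypothetical helper for illustration; it only mirrors the
    {CUDA_v}/{Torch_v} pattern shown in the article.
    """
    # "11.1" -> "cu111"
    cuda_tag = "cu" + cuda_version.replace(".", "")
    # "1.9.0" -> "torch1.9.0"
    torch_tag = "torch" + torch_version
    return ("https://download.openmmlab.com/mmcv/dist/"
            f"{cuda_tag}/{torch_tag}/index.html")


url = mmcv_index_url("11.1", "1.9.0")
print(f"pip install mmcv -f {url}")
```

With the versions used in this article, the printed command matches the install command shown above.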
Install MMCls:
There are two installation modes.
python setup.py install: suitable when the package is stable. If you modify the code, you must reinstall it for the changes to take effect.
python setup.py develop: suitable when the package will be modified frequently. Code changes take effect without reinstalling.
git clone https://github.com/open-mmlab/mmclassification
cd mmclassification
python setup.py develop  # Install in developer mode
# python setup.py install  # Install in normal mode
2. Data set preparation
Cat dog classification dataset
This tutorial uses the cat and dog classification dataset as an example.
# Download the classification dataset into the mmclassification directory.
wget "https://www.dropbox.com/s/wml49yrtdo53mie/cats_dogs_dataset_reorg.zip?dl=0" -O cats_dogs_dataset.zip
mkdir data
unzip -q cats_dogs_dataset.zip -d ./data/
After downloading and extracting, the file structure under data/cats_dogs_dataset is as follows:
data/cats_dogs_dataset
├── classes.txt
├── test.txt
├── val.txt
├── training_set
│ ├── training_set
│ │ ├── cats
│ │ │ ├── cat.1.jpg
│ │ │ ├── cat.2.jpg
│ │ │ ├── ...
│ │ ├── dogs
│ │ │ ├── dog.2.jpg
│ │ │ ├── dog.3.jpg
│ │ │ ├── ...
├── val_set
│ ├── val_set
│ │ ├── cats
│ │ │ ├── cat.3.jpg
│ │ │ ├── cat.5.jpg
│ │ │ ├── ...
│ │ ├── dogs
│ │ │ ├── dog.1.jpg
│ │ │ ├── dog.6.jpg
│ │ │ ├── ...
├── test_set
│ ├── test_set
│ │ ├── cats
│ │ │ ├── cat.4001.jpg
│ │ │ ├── cat.4002.jpg
│ │ │ ├── ...
│ │ ├── dogs
│ │ │ ├── dog.4001.jpg
│ │ │ ├── dog.4002.jpg
│ │ │ ├── ...
You can use the shell command `tree data/cats_dogs_dataset` to view the file structure.
Support for new datasets
MMClassification requires images and labels to be placed in directories at the same level. There are two ways to support a custom dataset.
The simplest way is to convert the dataset to an existing dataset format (such as ImageNet). The other is to create a new dataset class. See the MMClassification documentation for details.
To make this tutorial easier to follow, we have organized the cat and dog classification dataset according to the ImageNet dataset format.
The standard files include:
1. Category list. Each line represents one category. The first line, cats, is labeled 0, and the second line, dogs, is labeled 1.
cats
dogs
2. Training / validation / test annotation files.
Each line includes a file name and its corresponding label.
...
cats/cat.3769.jpg 0
cats/cat.882.jpg 0
...
dogs/dog.3881.jpg 1
dogs/dog.3377.jpg 1
...
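If your own dataset is organized as one folder per class, annotation files in the format above can be generated with a short script. The following is a minimal sketch under that assumption; the function name `build_annotation_lines` and the set of accepted image suffixes are illustrative choices, not part of MMClassification.

```python
import tempfile
from pathlib import Path

# Assumed set of image extensions; extend it to match your data.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def build_annotation_lines(image_root, class_names):
    """Return lines of the form "<class>/<file> <label>".

    The label is the index of the class in class_names, which matches
    the line order of classes.txt (cats -> 0, dogs -> 1).
    """
    lines = []
    for label, class_name in enumerate(class_names):
        class_dir = Path(image_root) / class_name
        for image_path in sorted(class_dir.iterdir()):
            if image_path.suffix.lower() in IMAGE_SUFFIXES:
                lines.append(f"{class_name}/{image_path.name} {label}")
    return lines


# Small demo on a temporary directory with two empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    for rel in ("cats/cat.1.jpg", "dogs/dog.2.jpg"):
        path = Path(tmp) / rel
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch()
    demo_lines = build_annotation_lines(tmp, ["cats", "dogs"])

print("\n".join(demo_lines))
```

Writing the returned lines to a text file, one per line, gives a file in the same layout as val.txt and test.txt.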
3. Use MMCls to fine tune the model
The steps of fine tuning the model through the command line are as follows:
1. Prepare custom dataset
2. Adapt the dataset to MMCls requirements
3. Modify the configuration file (a .py script)
4. Use the command line tool to fine tune the model
Steps 1 and 2 were covered above. The next two steps are described below.
Preparing to modify the configuration file
To reuse the common parts of different configuration files, MMCls supports inheriting from multiple configuration files. For example, to fine tune Swin Transformer Tiny, the new configuration file can inherit "configs/_base_/models/swin_transformer/tiny_224.py" to build the basic model structure, "configs/_base_/datasets/imagenet_bs64_swin_224.py" to use the predefined dataset settings, and "configs/_base_/schedules/imagenet_bs1024_adamw_swin.py" to define the learning rate policy. To run training with these settings, it also needs to inherit "configs/_base_/default_runtime.py".
The beginning of the configuration file should look like this:
_base_ = [
'../_base_/models/swin_transformer/tiny_224.py', '../_base_/datasets/imagenet_bs64_swin_224.py',
'../_base_/schedules/imagenet_bs1024_adamw_swin.py','../_base_/default_runtime.py'
]
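Conceptually, fields defined in the new configuration file override the same fields in the inherited files, while fields that are not mentioned keep their inherited values. The sketch below illustrates this with a plain recursive dictionary merge. It is a simplified illustration, not MMCV's actual implementation, and the values in `base_model` are hypothetical placeholders rather than the real contents of tiny_224.py.

```python
import copy


def merge_config(base, override):
    """Recursively merge override into a copy of base.

    Nested dicts are merged key by key; any other value in override
    replaces the value in base.
    """
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical base config, standing in for an inherited model file.
base_model = dict(
    type='ImageClassifier',
    head=dict(type='LinearClsHead', num_classes=1000, in_channels=768))

# The fine-tuning config only states what changes.
override = dict(head=dict(num_classes=2))

merged_model = merge_config(base_model, override)
print(merged_model['head'])
```

Only `num_classes` changes in the merged result; the other head fields come from the base dictionary.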
First, modify the model configuration. The new configuration file needs to set num_classes in the model head according to the number of categories in the classification problem. The weights of the pre-trained model are generally reused for every layer except the last linear layer.
model = dict(
    backbone=dict(
        init_cfg=dict(
            type='Pretrained',
            checkpoint="https://download.openmmlab.com/mmclassification/v0/swin-transformer/swin_tiny_224_b16x64_300e_imagenet_20210616_090925-66df6be6.pth",
            prefix='backbone')),
    head=dict(num_classes=2, topk=(1, )),
    train_cfg=dict(augments=[
        dict(type='BatchMixup', alpha=0.8, num_classes=2, prob=0.5),
        dict(type='BatchCutMix', alpha=1.0, num_classes=2, prob=0.5)
    ]))
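The `prefix='backbone'` setting is what lets the backbone reuse pre-trained weights while the classification head is trained from scratch: only checkpoint entries whose names start with `backbone.` are loaded, with that prefix removed. The sketch below shows the idea on a plain dictionary. It is an illustration of the filtering step only, and the parameter names in `checkpoint` are made up for the example.

```python
def extract_submodule_weights(state_dict, prefix):
    """Keep entries under `prefix` and strip the prefix from their keys."""
    if not prefix.endswith("."):
        prefix += "."
    return {
        key[len(prefix):]: value
        for key, value in state_dict.items()
        if key.startswith(prefix)
    }


# Hypothetical checkpoint contents; real checkpoints hold tensors.
checkpoint = {
    "backbone.patch_embed.weight": "w0",
    "backbone.norm.bias": "b0",
    "head.fc.weight": "w1",  # 1000-class head, not reused
}

backbone_weights = extract_submodule_weights(checkpoint, "backbone")
print(sorted(backbone_weights))
```

The `head.fc.weight` entry is dropped, which is why the new two-class head does not conflict with the 1000-class ImageNet head in the checkpoint.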
Second, modify the data configuration. Adjust samples_per_gpu according to your GPU memory, specify the dataset paths, and evaluate once per epoch.
img_norm_cfg = dict(
    mean=[124.508, 116.050, 106.438],
    std=[58.577, 57.310, 57.437],
    to_rgb=True)
data = dict(
    # Batch size and number of workers per GPU; set these according to your hardware
    samples_per_gpu=32,
    workers_per_gpu=2,
    # Specify the training set path
    train=dict(
        data_prefix='data/cats_dogs_dataset/training_set/training_set',
        classes='data/cats_dogs_dataset/classes.txt'),
    # Specify the validation set path
    val=dict(
        data_prefix='data/cats_dogs_dataset/val_set/val_set',
        ann_file='data/cats_dogs_dataset/val.txt',
        classes='data/cats_dogs_dataset/classes.txt'),
    # Specify the test set path
    test=dict(
        data_prefix='data/cats_dogs_dataset/test_set/test_set',
        ann_file='data/cats_dogs_dataset/test.txt',
        classes='data/cats_dogs_dataset/classes.txt'))
# Modify the evaluation settings
evaluation = dict(interval=1, metric='accuracy', metric_options={'topk': (1, )})
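The `mean` and `std` values in `img_norm_cfg` are per-channel statistics in RGB order on the 0-255 scale. As a rough sketch of what normalization does to a single pixel (the real pipeline operates on whole image arrays, and `normalize_pixel` is a hypothetical name used only here):

```python
MEAN = [124.508, 116.050, 106.438]
STD = [58.577, 57.310, 57.437]


def normalize_pixel(rgb, mean=MEAN, std=STD):
    """Subtract the channel mean and divide by the channel std."""
    return [(c - m) / s for c, m, s in zip(rgb, mean, std)]


# A pixel equal to the mean maps to zero in every channel.
print(normalize_pixel(MEAN))
# A pure white pixel maps to a positive value in every channel.
print(normalize_pixel([255, 255, 255]))
```

If you compute statistics for your own dataset, they go in the same two fields, in the same channel order.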
Third, modify the learning rate strategy. The fine-tuning strategy differs considerably from the default training strategy: fine tuning generally uses a smaller learning rate and fewer training epochs.
optimizer = dict(lr=0.00025)
# learning policy
lr_config = dict(
    policy='CosineAnnealing',
    by_epoch=False,
    min_lr_ratio=1e-1,
    warmup='linear',
    warmup_ratio=1e-1,
    warmup_iters=100,
    warmup_by_epoch=False)
runner = dict(max_epochs=2)
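To get a feel for the schedule these settings describe, the sketch below computes the learning rate per iteration for cosine annealing with linear warmup. It is a simplified re-implementation written for illustration, so the hook in MMCV may differ in details. The total iteration count `max_iters=500` is an assumed value, since the real number depends on dataset size and batch size.

```python
import math


def cosine_lr(base_lr, cur_iter, max_iters, min_lr_ratio):
    """Cosine annealing from base_lr down to base_lr * min_lr_ratio."""
    min_lr = base_lr * min_lr_ratio
    cos_out = math.cos(math.pi * cur_iter / max_iters) + 1
    return min_lr + 0.5 * (base_lr - min_lr) * cos_out


def lr_at(cur_iter, max_iters, base_lr=0.00025, min_lr_ratio=0.1,
          warmup_iters=100, warmup_ratio=0.1):
    """Learning rate at an iteration, with linear warmup applied first."""
    lr = cosine_lr(base_lr, cur_iter, max_iters, min_lr_ratio)
    if cur_iter < warmup_iters:
        # Scale grows linearly from warmup_ratio to 1 during warmup.
        k = (1 - cur_iter / warmup_iters) * (1 - warmup_ratio)
        lr = lr * (1 - k)
    return lr


max_iters = 500  # assumed total number of training iterations
for it in (0, 50, 100, 250, 500):
    print(it, f"{lr_at(it, max_iters):.6f}")
```

The rate starts at a fraction of the base value, rises during the first 100 iterations, then decays along a cosine curve toward `base_lr * min_lr_ratio`.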
Finally, the runtime configuration. The default configuration is used directly.
Save the inheritance list and the modifications above in "configs/swin_transformer/swin-tiny_cats-dogs.py":
_base_ = [
    '../_base_/models/swin_transformer/tiny_224.py',
    '../_base_/datasets/imagenet_bs64_swin_224.py',
    '../_base_/schedules/imagenet_bs1024_adamw_swin.py',
    '../_base_/default_runtime.py'
]
model = dict(....)  # Copy the contents of the code boxes above
img_norm_cfg = dict(...)
data = dict(...)
evaluation = dict(...)
optimizer = dict(...)
lr_config = dict(...)
runner = dict(...)
To view the complete configuration:
python ./tools/misc/print_config.py ./configs/swin_transformer/swin-tiny_cats-dogs.py
Train the model
We use tools/train.py to fine tune the model:
python tools/train.py ${CONFIG_FILE} [optional arguments]
To specify where the files produced during training are stored, add the argument --work-dir ${YOUR_WORK_DIR}.
Adding --seed ${SEED} sets the random seed to make results reproducible, and --deterministic enables the cuDNN deterministic option to further ensure reproducibility, at some cost in efficiency.
The training code of this example is as follows:
python tools/train.py \
    configs/swin_transformer/swin-tiny_cats-dogs.py \
    --work-dir work_dirs/swin-tiny_cats-dogs \
    --seed 0 \
    --deterministic
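If you launch several runs from a script, for example with different seeds, it can help to assemble the command programmatically. The following is a small sketch that only builds the argument list; `build_train_command` is a hypothetical helper, and it uses just the three options described above.

```python
import shlex


def build_train_command(config, work_dir=None, seed=None,
                        deterministic=False):
    """Assemble the tools/train.py command as a list of arguments."""
    cmd = ["python", "tools/train.py", config]
    if work_dir is not None:
        cmd += ["--work-dir", work_dir]
    if seed is not None:
        cmd += ["--seed", str(seed)]
    if deterministic:
        cmd.append("--deterministic")
    return cmd


train_cmd = build_train_command(
    "configs/swin_transformer/swin-tiny_cats-dogs.py",
    work_dir="work_dirs/swin-tiny_cats-dogs",
    seed=0,
    deterministic=True)

# Print a shell-safe version of the command.
print(" ".join(shlex.quote(arg) for arg in train_cmd))
```

The resulting list can be passed to `subprocess.run(train_cmd, check=True)` from the mmclassification directory to start training.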
Test the model
Use tools/test.py to test the model:
python tools/test.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [optional arguments] [--out ${RESULT_FILE}]
Here are some optional parameters to configure:
--metrics: the evaluation metric, which depends on the dataset, for example accuracy
--metric-options: custom options for the evaluation process, for example topk=1
--out: the file name for the output results. If not specified, the results are not saved. Supported formats are json, pkl, and yml
The test code of this example is as follows:
python tools/test.py ./configs/swin_transformer/swin-tiny_cats-dogs.py work_dirs/swin-tiny_cats-dogs/latest.pth --metrics=accuracy --metric-options=topk=1
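For reference, top-k accuracy counts a sample as correct when its true label is among the k highest-scoring classes. The sketch below computes it on plain Python lists. It is written for illustration and is not the evaluation code used by tools/test.py, and the scores and labels are made-up examples.

```python
def topk_accuracy(scores, labels, k=1):
    """Return top-k accuracy in percent.

    scores: one list of per-class scores for each sample
    labels: the true class index for each sample
    """
    correct = 0
    for row, label in zip(scores, labels):
        # Class indices sorted from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        if label in ranked[:k]:
            correct += 1
    return 100.0 * correct / len(labels)


# Made-up predictions for four images (class 0 = cats, class 1 = dogs).
scores = [[0.9, 0.1], [0.3, 0.7], [0.6, 0.4], [0.2, 0.8]]
labels = [0, 1, 1, 1]

top1 = topk_accuracy(scores, labels, k=1)
print(f"top-1 accuracy: {top1:.1f}")
```

With only two classes, top-1 is the only informative setting, which is why the configuration sets topk to (1, ).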
Visualization results
We use the following command to run inference on a single image and visualize the result.
python demo/image_demo.py ${Image_Path} ${Config_Path} ${Checkpoint_Path} --device {cuda or cpu}
The command for this example is as follows:
python demo/image_demo.py ./data/cats_dogs_dataset/training_set/training_set/cats/cat.1.jpg ./configs/swin_transformer/swin-tiny_cats-dogs.py work_dirs/swin-tiny_cats-dogs/latest.pth
For the related code and the full workflow, see Google Drive: https://drive.google.com/file/d/1Z41vYvJkWbMAli81ppUmdIGPMV7bBPY2/view?usp=sharing
Related links:
Paper: https://arxiv.org/abs/2103.14030
MMClassification: https://github.com/open-mmlab/mmclassification
MMClassification documentation: https://mmclassification.readthedocs.io/zh_CN/latest/