What? Still can't train an AI model?

Preface:

On a bright, breezy day, thunder and lightning crossed the sky... okay, never mind that ~ 😊

This semester I have an artificial intelligence class, and we built a little model in it.

While chatting with a senior of mine, he asked me to run a few models for a comparison experiment and write a paper with him.

I thought: in machine learning I'm a complete beginner, so why not try! And thus began a long road of learning ~

What can you learn by reading this article?

  • A replay of the environment-setup journey ~
  • What each parameter and each folder 📁 means ~
  • Basic training runs on the major baseline models, for a sense of accomplishment 🚀~
  • An essential summary of lessons learned ✈️~

Setting up the PyTorch environment

Before I started, I was a dazed little chick 🐤, completely at a loss. What is all this~

Only I know how painful it was at first: biting the bullet, following blogs that were N years old, doing whatever they said, then reverting everything.

It wasn't until I was working on SSD that I finally got off to a good start (the environment was finally in shape).

Half an hour for a single step, patching source code, chasing errors one at a time... it is painful 😖

Back to the topic!

Set up environment: https://tangshusen.me/Dive-into-DL-PyTorch/#/

👆 The site above is a translated open-source machine learning tutorial. I browsed through the repository first...

Section 2.1 in the left-hand menu sets up the environment directly; it links to two articles that walk you through it hand in hand ~

For Mac, the setup has changed a bit (as of 2021-11), i.e. the latest version differs slightly, but Google will sort it out.

On Windows with a graphics card, you have to download CUDA and configure the environment, and you know what the download speeds are like.

Better to switch to a mirror: download the wheel from a mirror and install it locally, that's the fastest way ~

When you can print 🖨️ the torch version, the environment is set up successfully.
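For example, a minimal sanity check (the mirror in the comment is just one common choice, not the only one):

# install from a mirror first, e.g.: pip install torch torchvision -i https://pypi.tuna.tsinghua.edu.cn/simple
import torch

print(torch.__version__)           # the installed PyTorch version
print(torch.cuda.is_available())   # True means CUDA and the graphics card are usable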

Make your own dataset

Getting the labeling tool on Mac is easy: just Google labelImg directly and you'll find the site.

Combined with brew it installs quickly; I haven't tried Windows yet.

As for how to use it, there are plenty of tutorials online, so I won't repeat them ~ 😊

Getting started with YoloV4

While I was at a loss, my senior recommended a blogger.

Search 🔍 on Bilibili: Bubbliiiing, and browse their PyTorch posts

There you'll find: hands-on videos on training your own dataset

The blogger's teaching approach: briefly introduce the training ideas of the network, then how each layer of the network works,

and finally how to train your own dataset.

I watched the YoloV4 tutorial videos from start to finish.

Add your own classes

List the labels 🏷️ from your own dataset

Add them to /model_data/voc_classes.txt

Same format: one class per line, as below.
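For example, with the two shoe classes that appear later in this post, the whole file is just:

Dark shoes
Light shoes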

Make your own dataset

Here you can download the dataset shared by the blogger (extraction code: uack)

You can also make your own, but if you do, make sure the quantity keeps up.

Directory structure: it follows the standard VOC dataset layout (VOC2007); there are plenty of breakdowns of it online.

A note from me: my senior had already cleaned it (well, not completely).

I took it and directly deleted the files under ImageSets/Main,

because the blogger's voc_annotation.py automatically does a random 9:1 split.

Then 2007_val.txt and 2007_train.txt appear in the root directory.

So if you built the dataset on your own computer, remember to regenerate these after uploading, because the paths inside have changed.
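The reason: each line of those txt files hard-codes the absolute image path next to the boxes, roughly like this (hypothetical path; the exact layout can vary a little between repo versions):

/home/me/yolov4-pytorch/VOCdevkit/VOC2007/JPEGImages/000001.jpg 48,240,195,371,0 8,12,352,498,1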

Get the pre-trained weights and prepare for training

Careful ⚠️: everything below is done in a GPU environment.

In the corresponding repository, pick the matching pre-trained weights; download ⏬ and upload ⏫ them under /model_data/.

Start editing train.py and get to know it better:

First, configure the classes file and the model path.

Next, start configuring the parameters by hand.

batch_size is the number of samples used in one training pass.

lr is the learning rate.

num_workers is the number of data-loading threads.

  • batch_size combinations (see the sketch after these lists for what freezing means):

If your hardware is better:

Freeze_batch_size: 16; Unfreeze_batch_size: 8

If it's weaker:

Freeze_batch_size: 8; Unfreeze_batch_size: 4

Freeze_batch_size: 4; Unfreeze_batch_size: 2

  • Learning-rate combinations:

Freeze_lr: 1e-3 (0.001); Unfreeze_lr: 1e-4

Freeze_lr: 1e-4 (0.0001); Unfreeze_lr: 1e-5
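To make the Freeze/Unfreeze pairing concrete, here is a generic sketch of what freezing means; "backbone" and "head" are stand-in names, not the repo's real attributes:

import torch.nn as nn

# a toy model standing in for the real detector
model = nn.ModuleDict({"backbone": nn.Linear(8, 8), "head": nn.Linear(8, 2)})

# freeze phase (the first Freeze_Epoch epochs): the backbone is fixed, so less
# memory is used and Freeze_batch_size can be larger
for param in model["backbone"].parameters():
    param.requires_grad = False

# unfreeze phase (until UnFreeze_Epoch): everything trains again, which is why
# Unfreeze_batch_size is roughly half of Freeze_batch_size
for param in model["backbone"].parameters():
    param.requires_grad = True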

This is where the art of alchemy (tuning) comes in:

  • First: UnFreeze_Epoch=100; Freeze_Epoch=50;

    It will run 100 epochs in total, so make sure your disk space doesn't explode.

  • Second: the loss will gradually shrink and stabilize within a range.

    If the mAP you get at that point isn't satisfactory,

    there are several options:

    1. Increase iterations: UnFreeze_Epoch=150; the loss may keep dropping.
    2. Tune lr and batch_size.
  • Last:

    When your loss becomes NaN, stop and debug the parameters (see the sketch after this list).

    How to tune it comes down to experience and luck; my recommendation: search GitHub and read articles.
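As a tiny illustration of the NaN advice (my own sketch, not code from the repo):

import math
import torch

# bail out early instead of burning epochs on a diverged run
def check_loss(loss: torch.Tensor) -> None:
    if math.isnan(loss.item()):
        raise RuntimeError("loss is NaN: try a smaller lr or batch_size")

check_loss(torch.tensor(2.5))                # fine
# check_loss(torch.tensor(float("nan")))     # would raise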

If you run into a problem 🤨:

Solutions:

  1. Check the uploader's list of frequently asked questions
  2. Google the error ⚠️ message
  3. Open an Issue on GitHub and ask

One last thing, for multi-GPU CUDA errors (no graphics card specified ❌):

# Add these lines at the top of train.py
# The numbers are the indices of the cards you want to use
import os
os.environ['KMP_DUPLICATE_LIB_OK']='True'
os.environ["CUDA_VISIBLE_DEVICES"] = "1,4,7"

At this point your train.py should run smoothly.

Testing to get the mAP

When the 100 epochs have finished, it's time to test.

Your training weights are all under logs~

Edit yolo.py and get_map.py; they will run after you change a few paths.

Then the map_out folder is produced; just open map_out/results/mAP.png and you're done.

But testing every weight file one by one is unrealistic, right?

My initial approach was to test every tenth epoch: 10, 20, 30...

But I found that mAP only starts to rise after the loss stabilizes.

So you can delete the first dozen or so and keep about the last 20 to test.

Whichever looks good, test it!

Image validation ~

I don't use this much ~ just to see the effect.
You're still happy when the boxes show up 😄

On to Faster R-CNN and SSD

Faster R-CNN and SSD work the same way:

  • faster-rcnn-pytorch https://github.com/bubbliiiing/faster-rcnn-pytorch
  • SSD - Pytorch https://github.com/bubbliiiing/ssd-pytorch

Advanced Faster R-CNN

When I trained Faster R-CNN with the blogger's repo above, the results weren't ideal.

So I decided to switch repositories.

I went for the repository with the most GitHub stars ✨.

Repository: https://github.com/jwyang/faster-rcnn.pytorch

Let me just follow the steps:

Deploy Base Environment

# Clone
$ git clone https://github.com/jwyang/faster-rcnn.pytorch.git
$ cd faster-rcnn.pytorch

# Switch to the pytorch-1.0 branch (run this inside the repo)
$ git checkout pytorch-1.0

# With Anaconda you should already have almost all the dependencies
$ mkdir data && cd data && mkdir pretrained_model
$ cd ../lib
$ python setup.py build develop

Download corresponding weights

Put them under /data/pretrained_model.

Bring over the dataset

Take everything except the 2007_val.txt and 2007_train.txt that appeared in the root directory,

then put it all under data/ and rename it VOCdevkit2007.

Adjust the classes and mind the case

Pay special attention ⚠️ to letter case here.

Use your label classes exactly as written!!! Otherwise the test mAP will all be 0.

Edit lib/datasets/pascal_voc.py and replace the classes with your own:

self._classes = ('__background__',  # always index 0
                 'Dark shoes', 'Light shoes')

If you get an error ❌ when it starts running: a KeyError (missing key).

Look at the error stack and you'll find a .lower() call; just delete it from pascal_voc.py.
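For reference, a sketch of the kind of change involved (the exact line varies by repo version; the annotation loader lowercases the class name read from the XML):

# in lib/datasets/pascal_voc.py, the stock lookup is roughly:
#   cls = self._class_to_ind[obj.find('name').text.lower().strip()]
# with case-sensitive labels like 'Dark shoes' that raises a KeyError, so drop .lower():
cls = self._class_to_ind[obj.find('name').text.strip()]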

Train on the dataset

$ CUDA_VISIBLE_DEVICES=0 python trainval_net.py --dataset pascal_voc --net vgg16 --epochs 30 --bs 8 --nw 4 --lr 1e-3 --lr_decay_step 6 --use_tfb --mGPUs --cuda
  • CUDA_VISIBLE_DEVICES: which GPU(s) to use
  • --dataset pascal_voc: use the Pascal VOC dataset
  • --net vgg16: use vgg16 as the feature-extraction network (res101 is also available)
  • --epochs 30: each epoch passes over all the training images once
  • --bs 8: batch size, see above
  • --nw 4: num_workers=4, four threads for data loading
  • --lr: initial learning rate
  • --lr_decay_step: decay the learning rate every this many epochs
  • --use_tfb: record and visualize with tensorboardX; optional, omit if unwanted
  • --mGPUs: multi-GPU training; optional, omit if unwanted
  • --cuda: use CUDA

The output is the trained models. If you want to retrain,

remember to delete the cache (it lives under data) and the logs.

mAP on the test set

$ python test_net.py --dataset pascal_voc --net res101 --checksession 1 --checkepoch 30 --checkpoint 610 --cuda

How do you pick --checksession 1 --checkepoch 30 --checkpoint 610?

Look under models/res101/pascal_voc/: a weight file named like faster_rcnn_1_30_610.pth and you'll understand (session 1, epoch 30, checkpoint 610).

This repo won't generate a corresponding png, so watch the console output.

FLOPs

When I messaged the Bilibili blogger privately ~ he said to just search for it.

So I wrote a version for each model based on my own understanding.

SSD - Pytorch

from nets.ssd import SSD300
import torchvision.models as models
from ptflops import get_model_complexity_info


def get_classes(classes_path):  # read the class names and their count
    with open(classes_path, encoding='utf-8') as f:
        class_names = f.readlines()
    class_names = [c.strip() for c in class_names]
    return class_names, len(class_names)


if __name__ == "__main__":
    classes_path = 'model_data/voc_classes.txt'
    class_names, num_classes  = get_classes(classes_path)
    # create the corresponding model
    myNet = SSD300(num_classes + 1, 'vgg')  
    # set (channels, height, width) to match the image size used in training
    flops, params = get_model_complexity_info(myNet, (3,416,416), as_strings=True, print_per_layer_stat=True)
    print("Flops: {}".format(flops))
    print("Params: " + params)

FLOPs - YoloV4

from nets.yolo import YoloBody
from ptflops import get_model_complexity_info


def get_classes(classes_path):  # read the class names and their count
    with open(classes_path, encoding='utf-8') as f:
        class_names = f.readlines()
    class_names = [c.strip() for c in class_names]
    return class_names, len(class_names)


if __name__ == "__main__":
    classes_path = 'model_data/voc_clothes_classes.txt'
    class_names, num_classes  = get_classes(classes_path)
    # create the corresponding model
    anchors_mask = [[6, 7, 8], [3, 4, 5], [0, 1, 2]]
    myNet = YoloBody(anchors_mask, num_classes)  
    # set (channels, height, width) to match the image size used in training
    flops, params = get_model_complexity_info(myNet, (3,416,416), as_strings=True, print_per_layer_stat=True)
    print("Flops: {}".format(flops))
    print("Params: " + params)

Faster-RCNN

from nets.frcnn import FasterRCNN
import torchvision.models as models
from ptflops import get_model_complexity_info


def get_classes(classes_path):
    with open(classes_path, encoding='utf-8') as f:
        class_names = f.readlines()
    class_names = [c.strip() for c in class_names]
    return class_names, len(class_names)


if __name__ == "__main__":
    # print("Load model.")
    # ssd = SSD(confidence = 0.01, nms_iou = 0.5)
    # print("Load model done.")

    # net = models.vgg16()  # you could also measure a torchvision model instead
    classes_path = 'model_data/voc_classes.txt'
    class_names, num_classes  = get_classes(classes_path)
    myNet = FasterRCNN(num_classes, "predict", anchor_scales = [8, 16, 32], backbone = 'vgg')
    flops, params = get_model_complexity_info(myNet, (3,416,416), as_strings=True, print_per_layer_stat=True)
    print("Flops: {}".format(flops))
    print("Params: " + params)

Summary of experience

Alchemy is tedious!

Search more, experiment more!

Basic Linux commands

Keep training alive with the screen command, and keep an eye on the graphics card with watch -n 0.5 nvidia-smi 😁

Running get_map over and over is tedious, so I hacked together my own script

First, decide which weights you can rule out 🙅 and delete them, to avoid wasting time ⌚ ~

  • First, wrap get_map into a main function that takes model_path:
def main(model_path):
    # ... the original body of get_map goes here ...
  • Then append this at the bottom of get_map.py (it loops over the weight files and logs each result):
if __name__ == "__main__":
    Path = os.path.abspath(os.path.dirname(__file__))
    logs_path = Path + '/logs'
    files = os.listdir(logs_path)
    want = []
    for each in files:
        if '.pth' in each:
            want.append(each)

    # want = ['ep144-loss2.422-val_loss3.563.pth']  # test a single weight file
    for each in want:
        path = 'logs/' + each
        with open(Path + '/log.txt', "a+", encoding='utf-8') as f:
            f.write(each + '*****')  # mark which weight the next mAP belongs to
        main(path)
  • Change yolo.py so that the default self.model_path is replaced by an accepted argument:
    #---------------------------------------------------#
    #   Initialize YOLO
    #---------------------------------------------------#
    def __init__(self, model_path, **kwargs):  # manually changed to accept model_path
        self.__dict__.update(self._defaults)
        for name, value in kwargs.items():
            setattr(self, name, value)
            
        #---------------------------------------------------#
        #   Get the number of classes and the anchor boxes
        #---------------------------------------------------#
        self.class_names, self.num_classes  = get_classes(self.classes_path)
        self.anchors, self.num_anchors      = get_anchors(self.anchors_path)
        self.bbox_util                      = DecodeBox(self.anchors, self.num_classes, (self.input_shape[0], self.input_shape[1]), self.anchors_mask)

        #---------------------------------------------------#
        #   Assign each class's box a different color
        #---------------------------------------------------#
        hsv_tuples = [(x / self.num_classes, 1., 1.) for x in range(self.num_classes)]
        self.colors = list(map(lambda x: colorsys.hsv_to_rgb(*x), hsv_tuples))
        self.colors = list(map(lambda x: (int(x[0] * 255), int(x[1] * 255), int(x[2] * 255)), self.colors))
        self.generate(model_path)  # pass model_path in

    #---------------------------------------------------#
    #   Generate model
    #---------------------------------------------------#
    def generate(self, model_path):
        #---------------------------------------------------#
        #   Build a yolo model, load the weights of the yolo model
        #---------------------------------------------------#
        self.net    = YoloBody(self.anchors_mask, self.num_classes)
        device      = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
        self.net.load_state_dict(torch.load(model_path, map_location=device))  # use the argument, not self.model_path
        self.net    = self.net.eval()
        print('{} model, anchors, and classes loaded.'.format(model_path))  # likewise, not self.model_path

        if self.cuda:
            self.net = nn.DataParallel(self.net)
            self.net = self.net.cuda()
  • Change utils/utils_map.py starting around line 638:
    if show_animation:
        cv2.destroyAllWindows()

    results_file.write("\n# mAP of all classes\n")
    mAP     = sum_AP / n_classes
    text    = "mAP = {0:.2f}%".format(mAP*100)

    # append this run's mAP to log.txt
    PATH = os.path.abspath(os.path.dirname(os.path.dirname(__file__)))
    with open(PATH + '/log.txt', "a+", encoding='utf-8') as f:
        f.write(text + '\n')
    
    results_file.write(text + "\n")
    print(text)

A suspiciously large mAP 🔝

It's simply because the dataset is too small! Add in the random split,

and what you trained on becomes what you test on; of course the score is high.

Modify voc_annotation.py, starting at line 63:

        for xml in temp_xml:
            if xml.endswith(".xml"):
                total_xml.append(xml)

        total_xml.sort()  # fix the file order so the split is reproducible
        num     = len(total_xml)
        list    = range(num)
        tv      = int(num*trainval_percent)
        tr      = int(tv*train_percent)
        # trainval= random.sample(list,tv)
        trainval= list[:tv]   # deterministic slice instead of a random sample
        # train   = random.sample(trainval,tr)
        train   = trainval[:tr]
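The point of the sort() plus slicing: the split becomes deterministic, so regenerating the txt files on another machine gives the same train/val membership, and images you trained on can no longer sneak into the test side.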

Finally

Now I'm going back to fill in the fundamentals. Tired ~ 🥱

Practicing first and then studying the theory feels less abstract, I think.

That's all for now!

Tags: Python AI Pytorch

Posted on Tue, 09 Nov 2021 18:07:36 -0500 by stringman