SpringBoot recognizes handwritten digits with a deep learning model: development details

Overview of this article

  1. What is the model file minist-model.zip mentioned in the previous article, and how was it produced?
  2. Once the SpringBoot application has the model file, how does it use it? What does it have to do with recognition?
  • Today, let's write the code for that application together, which will also answer these questions. The article covers the following:
  1. Overview of the whole process
  2. Training practice
  3. SpringBoot application design: the first key point
  4. SpringBoot application design: the second key point
  5. SpringBoot application design: sorting out the complete process
  6. SpringBoot application coding
  7. Making the application into a docker image

Overview of the whole process

  • In short, to solve a practical problem with deep learning (taking image recognition as an example), you need to do two things:
  1. Train on existing data and save the training result in a model file
  2. Use the model file in business code to recognize images
  • Let's start with training:
  1. Prepare handwritten digit pictures in advance, each already labeled with the digit it represents; for example, all pictures of a handwritten 9 are under directory 9
  2. Write the training code and configure the neural network: normalization, activation function, loss function, convolution layers, and so on
  3. Run the training code against the pictures above
  4. Save the training result to a file; this is the model file, the minist-model.zip from the previous article
  5. Training is now complete
  • The next step is to use this model file to solve the practical problem:
  1. Develop a business application
  2. In the application, load the model file generated by the earlier training
  3. When the user submits data, hand it to the model for processing
  4. Return the processing result to the user
  • That is the general process of applying deep learning to a business scenario. Next, let's begin with training and start the hands-on work

Source download

  • The complete source code for this article can be downloaded from GitHub ( https://github.com/zq2599/blog_demos ); the addresses and links are shown in the table below:

name                           | link                                     | remarks
Project Home                   | https://github.com/zq2599/blog_demos     | The project's home page on GitHub
git repository address (https) | https://github.com/zq2599/blog_demos.git | Repository address of the project source code, https protocol
git repository address (ssh)   | git@github.com:zq2599/blog_demos.git     | Repository address of the project source code, ssh protocol

  • There are multiple folders in this git project; the source code of the DL4J hands-on series is in the dlfj-tutorials folder
  • The dlfj-tutorials folder contains several sub-projects; this article's code is in the predict-number-image sub-project

Training practice

  • The whole training process, that is, how minist-model.zip is generated, is explained in detail (with complete code) in the article DL4J practice 3: Classic convolution example (LeNet-5). You only need to follow that article once, but pay special attention to one issue first:
  • DL4J practice 3: Classic convolution example (LeNet-5) uses version 1.0.0-beta6 of the deeplearning4j framework; please change it to 1.0.0-beta7. To do so, open the pom.xml of the simple-convolution project (note: the simple-convolution project itself, not its parent project dlfj-tutorials) and change the framework version there
  • After the modification, run the program to generate the model file, then move on to the stage of using the model
  • You may wonder why DL4J practice 3: Classic convolution example (LeNet-5) used version 1.0.0-beta6: it was so a GPU could accelerate the training process, and at that time 1.0.0-beta7 did not support CUDA 9.2. GPU acceleration is not used in this article, so version 1.0.0-beta7 is recommended
  • Next, let's develop the SpringBoot application that uses the model to recognize images

SpringBoot application design (the first key point)

  • Two key points deserve attention in the design stage. The first concerns picture size: the pictures used to train the model are 28 x 28 pixels, so our application must scale any picture submitted by the user down to 28 x 28 pixels
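The scaling step can be sketched with plain java.awt classes. This is only an illustration (the class name ResizeSketch is made up for this sketch); the real implementation lives in ImageFileUtil.resize, shown later:

```java
import java.awt.Graphics;
import java.awt.Image;
import java.awt.image.BufferedImage;

// Illustrative sketch: scale any image down to the 28 x 28 size the model expects
public class ResizeSketch {

    public static BufferedImage resizeTo28(BufferedImage src) {
        // SCALE_SMOOTH trades speed for quality, which matters at such a small target size
        Image scaled = src.getScaledInstance(28, 28, Image.SCALE_SMOOTH);
        BufferedImage out = new BufferedImage(28, 28, BufferedImage.TYPE_INT_RGB);
        Graphics g = out.getGraphics();
        g.drawImage(scaled, 0, 0, null);
        g.dispose();
        return out;
    }
}
```

Whatever size the user uploads, the output is always 28 x 28, matching the training data.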

SpringBoot application design (the second key point)

  • Take another look at the pictures used for training: they are all white digits on a black background
  • Here is the problem: the model was trained on white-on-black images, so it cannot recognize black digits on a white background. How should black-on-white pictures be handled?
  • The answer: if the user submits a black-on-white picture, the application inverts its colors to white-on-black first, and then does the recognition
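The inversion rule is simply 255 minus each color channel. A minimal sketch (the class name InvertSketch is made up for illustration; the real code is in ImageFileUtil.colorRevert, shown later):

```java
// Illustrative sketch of per-pixel color inversion:
// keep the alpha channel, replace each color channel c with 255 - c
public class InvertSketch {

    public static int invert(int argb) {
        int a = (argb >>> 24) & 0xff;          // alpha is kept unchanged
        int r = 0xff - ((argb >> 16) & 0xff);  // invert red
        int g = 0xff - ((argb >> 8) & 0xff);   // invert green
        int b = 0xff - (argb & 0xff);          // invert blue
        return (a << 24) | (r << 16) | (g << 8) | b;
    }
}
```

Applied to every pixel, this turns black digits on white into white digits on black, which is what the model was trained on.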

SpringBoot application design (process design)

  • Now let's sort out the whole flow. If the user submits a picture with black digits on a white background, the application saves the upload, inverts the colors, resizes the result to 28 x 28, extracts the features, and hands them to the model for prediction
  • If the user submits a picture with white digits on a black background, the only difference is that the color-inversion step is skipped
  • A dedicated interface, predict-with-white-background, is provided for black-on-white pictures
  • A dedicated interface, predict-with-black-background, is provided for white-on-black pictures
  • Now that the design work is complete, we can start coding

Use model (coding)

  • To make it easy to manage the demo code and the versions of dependent libraries, a Maven parent project named dlfj-tutorials was created in the article DL4J practice one: preparation. The application we build today is a sub-project of dlfj-tutorials, named predict-number-image. Its own pom.xml is as follows:
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <parent>
        <artifactId>dlfj-tutorials</artifactId>
        <groupId>com.bolingcavalry</groupId>
        <version>1.0-SNAPSHOT</version>
    </parent>
    <modelVersion>4.0.0</modelVersion>

    <artifactId>predict-number-image</artifactId>

    <packaging>jar</packaging>

    <!-- spring-boot-starter-parent is not used as the parent here, so import Spring Boot's dependency management instead -->
    <dependencyManagement>
        <dependencies>
            <dependency>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-dependencies</artifactId>
                <version>${springboot.version}</version>
                <type>pom</type>
                <scope>import</scope>
            </dependency>
        </dependencies>
    </dependencyManagement>

    <dependencies>

        <dependency>
            <groupId>com.bolingcavalry</groupId>
            <artifactId>commons</artifactId>
            <version>${project.version}</version>
        </dependency>

        <dependency>
            <groupId>org.nd4j</groupId>
            <!-- Be careful to use nd4j-native-platform; otherwise the container fails at startup with: no jnind4jcpu in java.library.path -->
            <!--<artifactId>${nd4j.backend}</artifactId>-->
            <artifactId>nd4j-native-platform</artifactId>
        </dependency>

        <dependency>
            <groupId>ch.qos.logback</groupId>
            <artifactId>logback-classic</artifactId>
        </dependency>

        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>

    </dependencies>

    <build>
        <plugins>
            <!-- If the parent project is not spring-boot-starter-parent, the repackage goal must be configured explicitly to produce an executable jar -->
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
                <configuration>
                    <mainClass>com.bolingcavalry.predictnumber.PredictNumberApplication</mainClass>
                </configuration>
                <executions>
                    <execution>
                        <goals>
                            <goal>repackage</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>
</project>
  • As you can see, predict-number-image has little coupling to its parent project dlfj-tutorials: it only reuses the version numbers of a few libraries defined there. You could also create it as an independent project with no parent-child relationship;
  • Create a new configuration file application.properties with the picture-related settings:
# Maximum total uploaded files
spring.servlet.multipart.max-request-size=1024MB

# Maximum value for a single file
spring.servlet.multipart.max-file-size=10MB

# Directory for processing picture files
predict.imagefilepath=/app/images/

# Model location
predict.modelpath=/app/model/minist-model.zip
  • The static methods needed for image processing are collected in ImageFileUtil.java: save (save to disk), resize (scale), colorRevert (invert colors), clear (clean up), and getGrayImageFeatures (feature extraction, the same operation as during training):
package com.bolingcavalry.commons.utils;

import lombok.extern.slf4j.Slf4j;
import org.datavec.api.split.FileSplit;
import org.datavec.image.loader.NativeImageLoader;
import org.datavec.image.recordreader.ImageRecordReader;
import org.deeplearning4j.datasets.datavec.RecordReaderDataSetIterator;
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.dataset.api.iterator.DataSetIterator;
import org.nd4j.linalg.dataset.api.preprocessor.ImagePreProcessingScaler;
import org.springframework.web.multipart.MultipartFile;
import javax.imageio.ImageIO;
import java.awt.*;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.util.UUID;

@Slf4j
public class ImageFileUtil {

    /**
     * Adjusted file width
     */
    public static final int RESIZE_WIDTH = 28;

    /**
     * Adjusted file height
     */
    public static final int RESIZE_HEIGHT = 28;

    /**
     * Save the uploaded file on the server
     * @param base The directory of the file to process
     * @param file Files to process
     * @return
     */
    public static String save(String base, MultipartFile file) {

        // Check whether it is empty
        if (file.isEmpty()) {
            log.error("invalid file");
            return null;
        }

        // The file name comes from the original file
        String fileName = file.getOriginalFilename();

        // Location to save
        File dest = new File(base + fileName);

        // Start saving
        try {
            file.transferTo(dest);
        } catch (IOException e) {
            log.error("upload fail", e);
            return null;
        }

        return fileName;
    }

    /**
     * Convert picture to 28 * 28 pixels
     * @param base     Directory to process files
     * @param fileName File name to be adjusted
     * @return
     */
    public static String resize(String base, String fileName) {

        // The new file name is the original file name with a random number suffix, and the extension is fixed to png
        String resizeFileName = fileName.substring(0, fileName.lastIndexOf(".")) + "-" + UUID.randomUUID() + ".png";

        log.info("start resize, from [{}] to [{}]", fileName, resizeFileName);

        try {
            // Read original file
            BufferedImage bufferedImage = ImageIO.read(new File(base + fileName));

            // Scaled instance
            Image image = bufferedImage.getScaledInstance(RESIZE_WIDTH, RESIZE_HEIGHT, Image.SCALE_SMOOTH);

            BufferedImage resizeBufferedImage = new BufferedImage(RESIZE_WIDTH, RESIZE_HEIGHT, BufferedImage.TYPE_INT_RGB);
            Graphics graphics = resizeBufferedImage.getGraphics();

            // mapping
            graphics.drawImage(image, 0, 0, null);
            graphics.dispose();

            // The converted picture is written into a file
            ImageIO.write(resizeBufferedImage, "png", new File(base + resizeFileName));

        } catch (Exception exception) {
            log.error("resize error from [{}] to [{}]", fileName, resizeFileName, exception);
            resizeFileName = null;
        }

        log.info("finish resize, from [{}] to [{}]", fileName, resizeFileName);

        return resizeFileName;
    }

    /**
     * Convert RGB to int number
     * @param alpha
     * @param red
     * @param green
     * @param blue
     * @return
     */
    private static int colorToRGB(int alpha, int red, int green, int blue) {
        int pixel = 0;

        pixel += alpha;
        pixel = pixel << 8;

        pixel += red;
        pixel = pixel << 8;

        pixel += green;
        pixel = pixel << 8;

        pixel += blue;

        return pixel;
    }

    /**
     * Reverse color treatment
     * @param base Directory to process files
     * @param src Source file for processing
     * @return New file after reverse color processing
     * @throws IOException
     */
    public static String colorRevert(String base, String src) throws IOException {
        int color, r, g, b, pixel;

        // Read original file
        BufferedImage srcImage = ImageIO.read(new File(base + src));

        // Modified file
        BufferedImage destImage = new BufferedImage(srcImage.getWidth(), srcImage.getHeight(), srcImage.getType());

        for (int i=0; i<srcImage.getWidth(); i++) {

            for (int j=0; j<srcImage.getHeight(); j++) {
                color = srcImage.getRGB(i, j);
                r = (color >> 16) & 0xff;
                g = (color >> 8) & 0xff;
                b = color & 0xff;
                pixel = colorToRGB(255, 0xff - r, 0xff - g, 0xff - b);
                destImage.setRGB(i, j, pixel);
            }
        }

        // Name of the color-inverted output file
        String revertFileName =  src.substring(0, src.lastIndexOf(".")) + "-revert.png";

        // The converted picture is written into a file
        ImageIO.write(destImage, "png", new File(base + revertFileName));

        return revertFileName;
    }

    /**
     * Take the features of black-and-white pictures
     * @param base
     * @param fileName
     * @return
     * @throws Exception
     */
    public static INDArray getGrayImageFeatures(String base, String fileName) throws Exception {
        log.info("start getImageFeatures [{}]", base + fileName);

        // The same settings as when training the model
        ImageRecordReader imageRecordReader = new ImageRecordReader(RESIZE_HEIGHT, RESIZE_WIDTH, 1);

        FileSplit fileSplit = new FileSplit(new File(base + fileName),
                NativeImageLoader.ALLOWED_FORMATS);

        imageRecordReader.initialize(fileSplit);

        DataSetIterator dataSetIterator = new RecordReaderDataSetIterator(imageRecordReader, 1);
        dataSetIterator.setPreProcessor(new ImagePreProcessingScaler(0, 1));

        // Feature extraction
        return dataSetIterator.next().getFeatures();
    }

    /**
     * Batch cleanup files
     * @param base      Directory to process files
     * @param fileNames Collection of files to be cleaned
     */
    public static void clear(String base, String...fileNames) {
        for (String fileName : fileNames) {

            if (null==fileName) {
                continue;
            }

            File file = new File(base + fileName);

            if (file.exists()) {
                file.delete();
            }
        }
    }
}
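Note the ImagePreProcessingScaler(0, 1) in getGrayImageFeatures above: it rescales each 0-255 pixel intensity into the 0-1 range, exactly as was done during training. Its effect can be sketched in plain Java (ScaleSketch is an illustrative stand-in, not the DL4J class):

```java
// Illustrative sketch of what ImagePreProcessingScaler(0, 1) does to gray pixels:
// map each intensity from [0, 255] to [0.0, 1.0] by dividing by 255
public class ScaleSketch {

    public static double[] scaleToUnitRange(int[] pixels) {
        double[] out = new double[pixels.length];
        for (int i = 0; i < pixels.length; i++) {
            out[i] = pixels[i] / 255.0;
        }
        return out;
    }
}
```

Using the same normalization at inference time as at training time is essential; feeding raw 0-255 values to a model trained on 0-1 inputs would ruin the predictions.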
  • The service layer defines a single method; an input parameter decides whether color inversion is applied:
package com.bolingcavalry.predictnumber.service;

import org.springframework.web.multipart.MultipartFile;

public interface PredictService {

    /**
     * Get the uploaded picture, convert it and recognize it as a number
     * @param file Uploaded files
     * @param isNeedRevert Do you want to reverse color
     * @return
     */
    int predict(MultipartFile file, boolean isNeedRevert) throws Exception;
}
  • The service layer implementation is the core of this article. There are several points to note, which are discussed afterwards:
package com.bolingcavalry.predictnumber.service.impl;

import com.bolingcavalry.commons.utils.ImageFileUtil;
import com.bolingcavalry.predictnumber.service.PredictService;
import lombok.extern.slf4j.Slf4j;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.deeplearning4j.util.ModelSerializer;
import org.nd4j.linalg.api.ndarray.INDArray;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.stereotype.Service;
import org.springframework.web.multipart.MultipartFile;
import javax.annotation.PostConstruct;
import java.io.File;

@Service
@Slf4j
public class PredictServiceImpl implements PredictService {

    /**
     * -1 Indicates recognition failure
     */
    private static final int RLT_INVALID = -1;

    /**
     * Location of model files
     */
    @Value("${predict.modelpath}")
    private String modelPath;

    /**
     * Directory for processing picture files
     */
    @Value("${predict.imagefilepath}")
    private String imageFilePath;

    /**
     * neural network
     */
    private MultiLayerNetwork net;

    /**
     * Load the model as soon as the bean has been instantiated
     */
    @PostConstruct
    private void loadModel() {
        log.info("load model from [{}]", modelPath);

        // Loading model
        try {
            net = ModelSerializer.restoreMultiLayerNetwork(new File(modelPath));
            log.info("model summary\n{}", net.summary());
        } catch (Exception exception) {
            log.error("loadModel error", exception);
        }
    }

    @Override
    public int predict(MultipartFile file, boolean isNeedRevert) throws Exception {
        log.info("start predict, file [{}], isNeedRevert [{}]", file.getOriginalFilename(), isNeedRevert);

        // Save the uploaded file first
        String rawFileName = ImageFileUtil.save(imageFilePath, file);

        if (null==rawFileName) {
            return RLT_INVALID;
        }

        // File name after reverse color processing
        String revertFileName = null;

        // Resized file name
        String resizeFileName;

        // Whether reverse color processing is required
        if (isNeedRevert) {
            // Reverse the original file, and the return result is the new file after reverse color processing
            revertFileName = ImageFileUtil.colorRevert(imageFilePath, rawFileName);

            // Resize the inverted file to 28 x 28
            resizeFileName = ImageFileUtil.resize(imageFilePath, revertFileName);
        } else {
            // Directly adjust the original file to a file of 28 * 28 size
            resizeFileName = ImageFileUtil.resize(imageFilePath, rawFileName);
        }

        // We now have what we need: the resized (and possibly color-inverted) file,
        // so the original file and the intermediate inverted file can be deleted
        ImageFileUtil.clear(imageFilePath, rawFileName, revertFileName);

        // Take out the features of the black-and-white picture
        INDArray features = ImageFileUtil.getGrayImageFeatures(imageFilePath, resizeFileName);

        // Pass the features to the model for recognition
        return net.predict(features)[0];
    }
}
  • Two points in the above code deserve attention:
  1. The loadModel method runs during bean initialization and loads the model file via ModelSerializer.restoreMultiLayerNetwork
  2. The actual recognition is a single call to MultiLayerNetwork.predict; it really is that simple
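Under the hood, MultiLayerNetwork.predict runs a forward pass and, for each input, returns the index of the output neuron with the highest activation; for MNIST that index is the digit itself. The selection step amounts to an argmax over the ten outputs (sketch only; the class name ArgmaxSketch is made up for illustration):

```java
// Illustrative sketch: picking the predicted digit from the network's
// ten output activations is an argmax over the array
public class ArgmaxSketch {

    public static int argmax(double[] activations) {
        int best = 0;
        for (int i = 1; i < activations.length; i++) {
            if (activations[i] > activations[best]) {
                best = i;
            }
        }
        return best;
    }
}
```

This is why the service method can return an int directly: the class index and the recognized digit coincide for this dataset.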
  • Next comes the web layer, which exposes two interfaces:
package com.bolingcavalry.predictnumber.controller;

import com.bolingcavalry.predictnumber.service.PredictService;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.ResponseBody;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.multipart.MultipartFile;

@RestController
public class PredictController {

    final PredictService predictService;

    public PredictController(PredictService predictService) {
        this.predictService = predictService;
    }

    @PostMapping("/predict-with-black-background")
    @ResponseBody
    public int predictWithBlackBackground(@RequestParam("file") MultipartFile file) throws Exception {
        // When the model was trained, the digits were white on a black background,
        // so a white-on-black upload can be recognized directly, with no color inversion
        return predictService.predict(file, false);
    }

    @PostMapping("/predict-with-white-background")
    @ResponseBody
    public int predictWithWhiteBackground(@RequestParam("file") MultipartFile file) throws Exception {
        // When the model was trained, the digits were white on a black background,
        // so a black-on-white upload must have its colors inverted first;
        // after inversion it is white on black and can be recognized
        return predictService.predict(file, true);
    }
}
  • Finally, the startup class:
package com.bolingcavalry.predictnumber;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
@SpringBootApplication
public class PredictNumberApplication {
    public static void main(String[] args) {
        SpringApplication.run(PredictNumberApplication.class, args);
    }
}

Make docker image

  • There are many ways to package a SpringBoot application as a docker image; here we use the approach officially recommended by SpringBoot:
  • First write the Dockerfile and put it in the predict-number-image directory. As you can see, it only performs some simple file copies and then specifies the start command:
# The 8-jdk-alpine image crashes at startup, so use the full JDK image
FROM openjdk:8u292-jdk

# Create directory
RUN mkdir -p /app/images && mkdir -p /app/model

# Source location of the content to be copied into the image
ARG DEPENDENCY=target/dependency

# Copy the unpacked application content into the image
COPY ${DEPENDENCY}/BOOT-INF/lib /app/lib
COPY ${DEPENDENCY}/META-INF /app/META-INF
COPY ${DEPENDENCY}/BOOT-INF/classes /app

# Specify start command
ENTRYPOINT ["java","-cp","app:app/lib/*","com.bolingcavalry.predictnumber.PredictNumberApplication"]
  • Next, prepare the files the Dockerfile needs: run mvn clean package -U in the parent project directory. This is an ordinary Maven build and has nothing to do with docker
  • Enter the predict-number-image directory and run the following command to extract the classes, configuration files, dependency libraries, and other content from the jar into the target/dependency directory:
mkdir -p target/dependency && (cd target/dependency; jar -xf ../*.jar)
  • Finally, in the directory containing the Dockerfile, run docker build -t bolingcavalry/dl4j-model-app:0.0.3 . (note the trailing dot; don't miss it) to finish building the image
  • If you have a hub.docker.com account, you can also push the image to the central registry with docker push so that more people can use it:
  • Finally, recall the command used to start the docker container in the article Three minute experience: SpringBoot recognizes numbers with deep learning model, shown below. The two -v parameters map host directories into the container, so the /app/images and /app/model paths inside the container can stay unchanged as long as the host directory mappings are correct:
docker run \
--rm \
-p 18080:8080 \
-v /home/will/temp/202106/29/images:/app/images \
-v /home/will/temp/202106/29/model:/app/model \
bolingcavalry/dl4j-model-app:0.0.3
  • For more on the docker image packaging officially recommended by SpringBoot, see the article Making a Docker image with a SpringBoot (2.4) application (official scheme, Gradle version)
  • This completes the hands-on development of a SpringBoot application that recognizes digits with a deep learning model. If you are a Java programmer interested in deep learning, I hope this article provides a useful reference. For more deep learning practice, follow Xinchen's original DL4J series;

Posted on Mon, 06 Dec 2021 22:40:10 -0500 by Thikho