2021SC@SDUSC
In my last blog post, I introduced the general steps and the common classes and methods that ZXing uses to generate QR codes and barcodes. This post focuses on the QR code encoding algorithm. The first half covers background knowledge about QR codes and the encoding steps; the second half analyzes QRCodeWriter, the class that encodes QR codes.
1, QR code introduction
1.1 INTRODUCTION
A QR code is a type of matrix two-dimensional code. It was developed by Denso (a Japanese electrical equipment company) and standardized by JIS and ISO.
Its basic structure is as follows:
Finder patterns (position detection patterns), their separators, and timing patterns: used to locate the QR code. Their positions are fixed for every QR code, although the symbol size varies with the version;
Alignment patterns: once the version is determined, the number and positions of the alignment patterns are determined;
Format information: indicates the error correction level of the QR code, which is one of L, M, Q and H;
Version information: the version (size specification) of the QR code. QR code symbols come in 40 versions of (generally black and white) matrices, from 21x21 (version 1) to 177x177 (version 40). Each version has 4 more modules per side than the previous one.
Data and error correction codewords: the information actually stored in the QR code, plus the error correction codewords used to correct errors caused by damage to the symbol.
1.2 characteristics of QR code
① High-speed reading: QR stands for "Quick Response". It takes about three seconds from pointing the camera at the code to decoding and displaying the content, and the code can be scanned from any angle;
② High capacity and high density: in theory, a single symbol can store up to 7089 digits, 4296 alphanumeric characters, 2953 8-bit bytes, or 1817 Kanji characters;
③ Error correction: even if the code is dirty or damaged, the data can be recovered automatically. There are four error correction levels, and users can choose one according to the operating environment. Raising the level improves the error correction capability, but because more error correction codewords are added, the symbol needed for the same data also grows.
level L: up to 7% of errors can be corrected;
level M: up to 15% of errors can be corrected;
level Q: up to 25% of errors can be corrected;
level H: up to 30% of errors can be corrected;
④ Structured: the seemingly irregular pattern actually has strictly defined regions. The following figure shows the structure of a Model 2, version 1 QR code.
In the 21 x 21 matrix in the figure above, the black-and-white areas sit at fixed positions defined by the QR code specification; they are called the finder patterns and timing patterns. They help the decoder determine the coordinates of individual modules in the symbol.
The yellow area stores the encoded data and the error correction codewords.
The blue area identifies the error correction level (Level L to Level H) and the so-called "mask pattern". This area is called the "format information".
⑤ Extensibility: the Structured Append feature allows the data of one QR code to be split across multiple QR codes. Conversely, the data of multiple QR codes can be combined into one QR code, as shown in the figure:
1.3 QR code model and version
QR codes come in two models: Model 1 and Model 2. Model 1 is the original definition of QR and Model 2 is an extension of Model 1. Model 2 is the one in wide use today.
The size of a QR symbol is defined by its version, and the version number ranges from 1 to 40. Version 1 is a 21 x 21 matrix. Each time the version number increases by one, each side of the matrix grows by 4 modules, so version 40 is a 177 x 177 matrix. (The higher the version, the more content can be stored; the error correction capability is set by the error correction level rather than by the version.)
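The version-to-size rule above can be written as a one-line formula. The following is a minimal sketch; the class and method names are made up for illustration and are not part of ZXing.

```java
// Sketch: side length (in modules) of a QR symbol as a function of its version.
// QrVersionSize and sideLength are hypothetical names used only for this example.
final class QrVersionSize {

    // Version 1 is 21 modules wide and every later version adds 4 modules per side,
    // which gives side = 21 + 4 * (version - 1) = 17 + 4 * version.
    static int sideLength(int version) {
        if (version < 1 || version > 40) {
            throw new IllegalArgumentException("version must be in 1..40, got " + version);
        }
        return 17 + 4 * version;
    }

    public static void main(String[] args) {
        for (int v : new int[] {1, 2, 10, 40}) {
            int side = sideLength(v);
            System.out.println("version " + v + " -> " + side + " x " + side);
        }
    }
}
```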
2, QR code coding process

Data analysis: determine the type of the characters to be encoded and convert them into symbol characters according to the corresponding character set;
Select the error correction level. For a given version, the higher the error correction level, the smaller the capacity left for actual data.
Data encoding: convert the data characters into a bit stream, with every 8 bits forming one codeword; together these form the data codeword sequence. Once the data codeword sequence is known, the data content of the QR code is known.
The capacity is as follows:
Format | Capacity
--- | ---
Numeric | Up to 7089 characters
Alphanumeric | Up to 4296 characters
Binary (8-bit bytes) | Up to 2953 bytes
Kanji / Katakana | Up to 1817 characters (Shift JIS)
Chinese characters | Up to 984 characters (using UTF-8)
Chinese characters | Up to 1800 characters (BIG5)
The mode indicators are as follows:
Mode | Indicator
--- | ---
ECI | 0111
Numeric | 0001
Alphanumeric | 0010
8-bit byte | 0100
Japanese Kanji | 1000
Chinese characters | 1101
Structured Append | 0011
FNC1 | 0101 (first position), 1001 (second position)
Terminator (end of message) | 0000
Data can be encoded in the single mode that best fits its content, which keeps the encoded bit stream compact.
For example, to encode the data 01234567 (version 1-H):
① Group the digits in threes: 012 345 67
② Convert each group to binary: 012 → 0000001100, 345 → 0101011001, 67 → 1000011
③ Concatenate into a sequence: 0000001100 0101011001 1000011
④ Convert the character count to binary: 8 → 0000001000
⑤ Prepend the mode indicator 0001 (from the table above): 0001 0000001000 0000001100 0101011001 1000011
For alphanumeric, Chinese and Japanese data, only the grouping method and the mode indicator differ; the basic method is the same.
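The five steps of the numeric example can be sketched in a few lines of Java. This is an illustration of the worked example only, not ZXing's implementation; the class and method names are hypothetical, and the 10-bit character count is assumed to match the small version used in the example.

```java
// Sketch of numeric-mode encoding for the worked example "01234567".
// NumericModeSketch, toBits and encodeNumeric are hypothetical names.
final class NumericModeSketch {

    // Binary form of value, left-padded with zeros to at least `bits` characters.
    static String toBits(int value, int bits) {
        StringBuilder sb = new StringBuilder(Integer.toBinaryString(value));
        while (sb.length() < bits) {
            sb.insert(0, '0');
        }
        return sb.toString();
    }

    // Mode indicator + character count + digit groups, as in steps 1 to 5.
    static String encodeNumeric(String digits) {
        if (!digits.matches("[0-9]+")) {
            throw new IllegalArgumentException("numeric mode accepts digits only");
        }
        StringBuilder out = new StringBuilder();
        out.append("0001");                      // numeric mode indicator
        out.append(toBits(digits.length(), 10)); // character count (assumed 10 bits)
        for (int i = 0; i < digits.length(); i += 3) {
            String group = digits.substring(i, Math.min(i + 3, digits.length()));
            // 3 digits -> 10 bits, 2 digits -> 7 bits, 1 digit -> 4 bits
            int bits = group.length() == 3 ? 10 : (group.length() == 2 ? 7 : 4);
            out.append(toBits(Integer.parseInt(group), bits));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encodeNumeric("01234567"));
    }
}
```

A real encoder would continue from here by appending the terminator, padding to whole codewords and adding pad codewords, which this sketch leaves out.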

Error correction coding: split the codeword sequence above into blocks as required, generate error correction codewords from each block according to the error correction level, and append the error correction codewords to the data codeword sequence to form a new sequence.
Once the version and the error correction level are fixed, the total number of codewords and the number of error correction codewords the symbol can hold are also fixed. For example, version 10 at error correction level H holds a total of 346 codewords, of which 224 are error correction codewords.
That is, about 2 / 3 of the codewords in this QR code are redundant (224 / 346 ≈ 64.7%). These 224 error correction codewords can correct 112 substitution errors (such as black-and-white inversion) or 224 erasures (codewords that cannot be read or decoded), so the error correction capacity is 112 / 346 = 32.4%.
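The arithmetic for the version 10, level H example can be checked with a short sketch. It only restates the numbers quoted in the text (346 total codewords, 224 error correction codewords) and ignores how the codewords are split into blocks; the class and method names are hypothetical.

```java
// Sketch: redundancy and error correction capacity for version 10, level H.
// EcCapacitySketch and its methods are hypothetical names for this example.
final class EcCapacitySketch {

    // Two error correction codewords are needed to correct one substitution error.
    static int correctableSubstitutions(int ecCodewords) {
        return ecCodewords / 2;
    }

    // Share of all codewords that can be wrong and still be corrected, in percent.
    static double capacityPercent(int totalCodewords, int ecCodewords) {
        return 100.0 * correctableSubstitutions(ecCodewords) / totalCodewords;
    }

    // Share of all codewords that are error correction codewords, in percent.
    static double redundancyPercent(int totalCodewords, int ecCodewords) {
        return 100.0 * ecCodewords / totalCodewords;
    }

    public static void main(String[] args) {
        int total = 346; // total codewords, figure taken from the text
        int ec = 224;    // error correction codewords, figure taken from the text
        System.out.println("correctable substitutions: " + correctableSubstitutions(ec));
        System.out.printf("capacity: %.1f%%%n", capacityPercent(total, ec));
        System.out.printf("redundancy: %.1f%%%n", redundancyPercent(total, ec));
    }
}
```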
Construct the final message: with the version fixed, arrange the sequences generated above in order, block by block.
Divide the data into blocks as the specification requires, then compute the error correction codeword block for each data block. The codewords of the blocks are interleaved in order, and the error correction codewords are appended after the data codewords.
For example: D1, D12, D23, D35, D2, D13, D24, D36, ... D11, D22, D33, D45, D34, D46, E1, E23, E45, E67, E2, E24, E46, E68, ...
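The ordering in the example is what results from reading the blocks column by column. The sketch below reproduces it under the assumption, inferred from the labels in the example, that there are four data blocks of 11, 11, 12 and 12 codewords and four error correction blocks of 22 codewords each. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of block interleaving; block sizes are inferred from the example labels.
final class InterleaveSketch {

    // Emits the first codeword of every block, then the second of every block,
    // and so on; shorter blocks are skipped once they are exhausted.
    static List<String> interleave(List<List<String>> blocks) {
        int longest = 0;
        for (List<String> block : blocks) {
            longest = Math.max(longest, block.size());
        }
        List<String> out = new ArrayList<>();
        for (int i = 0; i < longest; i++) {
            for (List<String> block : blocks) {
                if (i < block.size()) {
                    out.add(block.get(i));
                }
            }
        }
        return out;
    }

    // Builds consecutive labels such as D12, D13, ... for `count` codewords.
    static List<String> labels(String prefix, int start, int count) {
        List<String> block = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            block.add(prefix + (start + i));
        }
        return block;
    }

    // Data blocks D1..D11, D12..D22, D23..D34, D35..D46 (assumed sizes).
    static List<List<String>> exampleDataBlocks() {
        List<List<String>> blocks = new ArrayList<>();
        blocks.add(labels("D", 1, 11));
        blocks.add(labels("D", 12, 11));
        blocks.add(labels("D", 23, 12));
        blocks.add(labels("D", 35, 12));
        return blocks;
    }

    // Error correction blocks E1..E22, E23..E44, E45..E66, E67..E88 (assumed sizes).
    static List<List<String>> exampleEcBlocks() {
        List<List<String>> blocks = new ArrayList<>();
        for (int b = 0; b < 4; b++) {
            blocks.add(labels("E", 1 + 22 * b, 22));
        }
        return blocks;
    }

    public static void main(String[] args) {
        List<String> message = new ArrayList<>(interleave(exampleDataBlocks()));
        message.addAll(interleave(exampleEcBlocks()));
        System.out.println(message);
    }
}
```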
Matrix construction: place the finder patterns, separators, timing patterns, alignment patterns and codeword modules into the matrix.
Fill the complete sequence above into the data area of the QR code matrix for the chosen version.
Masking: a mask pattern is applied to the encoding region of the symbol so that the dark and light (black and white) modules are distributed as evenly as possible.
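As an illustration of what applying a mask means, the sketch below uses the first of the eight mask patterns defined for QR codes (pattern 000), which flips every module whose row plus column is even. The `reserved` array, which marks function-pattern modules that must not be masked, and all names here are assumptions made for this example; an encoder normally tries every pattern and keeps the one with the best dark/light balance.

```java
// Sketch: applying mask pattern 000, i.e. flip where (row + column) % 2 == 0.
// MaskSketch, applyMask0 and the `reserved` parameter are hypothetical.
final class MaskSketch {

    // modules[r][c] is true for a dark module; reserved[r][c] is true for
    // modules that belong to function patterns and are left untouched.
    static boolean[][] applyMask0(boolean[][] modules, boolean[][] reserved) {
        boolean[][] out = new boolean[modules.length][];
        for (int r = 0; r < modules.length; r++) {
            out[r] = new boolean[modules[r].length];
            for (int c = 0; c < modules[r].length; c++) {
                boolean flip = !reserved[r][c] && ((r + c) % 2 == 0);
                out[r][c] = modules[r][c] ^ flip;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        boolean[][] modules = new boolean[4][4];  // all light
        boolean[][] reserved = new boolean[4][4]; // nothing reserved
        boolean[][] masked = applyMask0(modules, reserved);
        for (boolean[] row : masked) {
            StringBuilder line = new StringBuilder();
            for (boolean dark : row) {
                line.append(dark ? '#' : '.');
            }
            System.out.println(line);
        }
    }
}
```

Because the mask is an XOR, applying the same pattern a second time restores the original modules, which is how a decoder removes it.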

Format and version information: generate the format and version information and place it in the corresponding areas.
Versions 7 to 40 contain version information; versions 1 to 6 have no version information area. The version information appears at two positions in the QR code, which makes it redundant.
The version information has 18 bits in total, laid out as a 6x3 matrix. The first 6 bits are data bits holding the version number (for version 8, the data bits are 001000), and the following 12 bits are error correction bits.
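The 12 error correction bits are the remainder of a polynomial division over GF(2). The sketch below assumes the generator polynomial 0x1F25 (x^12 + x^11 + x^10 + x^9 + x^8 + x^5 + x^2 + 1) of the (18,6) BCH code used for QR version information; the class and method names are hypothetical.

```java
// Sketch: computing the 18-bit version information (6 data bits + 12 BCH bits).
// VersionInfoSketch and versionInfoBits are hypothetical names.
final class VersionInfoSketch {

    // Assumed generator polynomial of the (18,6) BCH code, degree 12.
    static final int GENERATOR = 0x1F25;

    static int versionInfoBits(int version) {
        if (version < 7 || version > 40) {
            throw new IllegalArgumentException("version information exists only for versions 7..40");
        }
        // Shift the 6 data bits up by 12 and reduce modulo the generator.
        int remainder = version << 12;
        for (int bit = 17; bit >= 12; bit--) {
            if ((remainder & (1 << bit)) != 0) {
                remainder ^= GENERATOR << (bit - 12);
            }
        }
        // Data bits in the high 6 positions, remainder in the low 12.
        return (version << 12) | remainder;
    }

    public static void main(String[] args) {
        int bits = versionInfoBits(8);
        String binary = String.format("%18s", Integer.toBinaryString(bits)).replace(' ', '0');
        System.out.println(binary);
    }
}
```

For version 8 this gives the data bits 001000 followed by the 12 error correction bits 010110111100.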
3, Code analysis
Understanding the QR code encoding process above is very helpful when analyzing the ZXing source code.
3.1 QRCodeWriter
The QRCodeWriter class is ZXing's entry point for generating QR codes. It implements the Writer interface, which was mentioned in the previous blog post. A QRCodeWriter object renders a QR code as a BitMatrix, a two-dimensional array of greyscale values.
Its default quiet zone is 4 modules (QUIET_ZONE_SIZE = 4). The quiet zone is the blank margin around the code that lets the scanner read it accurately. In most cases the quiet zone does not need to be large.
The encoding method is shown below. It overrides the method declared in Writer, and the code is annotated at the relevant positions.
    @Override
    public BitMatrix encode(String contents,
                            BarcodeFormat format,
                            int width,
                            int height,
                            Map<EncodeHintType,?> hints) throws WriterException {
        // The content to be encoded is empty
        if (contents.isEmpty()) {
            throw new IllegalArgumentException("Found empty contents");
        }
        // The requested format is not QR_CODE
        if (format != BarcodeFormat.QR_CODE) {
            throw new IllegalArgumentException("Can only encode QR_CODE, but got " + format);
        }
        // The requested width or height is negative
        if (width < 0 || height < 0) {
            throw new IllegalArgumentException("Requested dimensions are too small: " + width + 'x' + height);
        }
        // Default error correction level
        ErrorCorrectionLevel errorCorrectionLevel = ErrorCorrectionLevel.L;
        // Default quiet zone size
        int quietZone = QUIET_ZONE_SIZE;
        // Read the optional parameters from the hints
        if (hints != null) {
            // Error correction level
            if (hints.containsKey(EncodeHintType.ERROR_CORRECTION)) {
                errorCorrectionLevel = ErrorCorrectionLevel.valueOf(hints.get(EncodeHintType.ERROR_CORRECTION).toString());
            }
            // Margin (quiet zone)
            if (hints.containsKey(EncodeHintType.MARGIN)) {
                quietZone = Integer.parseInt(hints.get(EncodeHintType.MARGIN).toString());
            }
        }
        // Call the encoder
        QRCode code = Encoder.encode(contents, errorCorrectionLevel, hints);
        // Render the encoded QR code at the requested size
        return renderResult(code, width, height, quietZone);
    }
The private method that renders the result is shown below; its output is a BitMatrix.
    // Note that the input matrix uses 0 == white, 1 == black, while the output matrix
    // uses 0 == black, 255 == white (that is, an 8-bit greyscale bitmap).
    private static BitMatrix renderResult(QRCode code, int width, int height, int quietZone) {
        ByteMatrix input = code.getMatrix();
        if (input == null) {
            throw new IllegalStateException();
        }
        int inputWidth = input.getWidth();
        int inputHeight = input.getHeight();
        int qrWidth = inputWidth + (quietZone * 2);
        int qrHeight = inputHeight + (quietZone * 2);
        int outputWidth = Math.max(width, qrWidth);
        int outputHeight = Math.max(height, qrHeight);

        int multiple = Math.min(outputWidth / qrWidth, outputHeight / qrHeight);
        // Padding includes both the quiet zone and the extra white pixels needed to reach
        // the requested size. For example, if the input is 25x25, the QR will be 33x33
        // including the quiet zone. If the requested size is 200x160, the multiple will be 4,
        // for a QR of 132x132. The padding covers everything from 100x100 (the actual QR)
        // up to 200x160.
        int leftPadding = (outputWidth - (inputWidth * multiple)) / 2;
        int topPadding = (outputHeight - (inputHeight * multiple)) / 2;

        BitMatrix output = new BitMatrix(outputWidth, outputHeight);

        for (int inputY = 0, outputY = topPadding; inputY < inputHeight; inputY++, outputY += multiple) {
            // Write the contents of this row of the barcode
            for (int inputX = 0, outputX = leftPadding; inputX < inputWidth; inputX++, outputX += multiple) {
                if (input.get(inputX, inputY) == 1) {
                    output.setRegion(outputX, outputY, multiple, multiple);
                }
            }
        }

        return output;
    }
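The scaling arithmetic in renderResult can be tried out without ZXing on the classpath. The sketch below repeats only the integer calculations for a square input matrix, using the 25x25 input, quiet zone of 4 and requested size of 200x160 from the comment in the method; the class and method names are hypothetical.

```java
// Sketch: the scale factor and padding computed by renderResult, for a square input.
// RenderScaleSketch and layout are hypothetical names; no ZXing types are used.
final class RenderScaleSketch {

    // Returns {multiple, leftPadding, topPadding, outputWidth, outputHeight}.
    static int[] layout(int inputSize, int quietZone, int width, int height) {
        int qrSize = inputSize + quietZone * 2;         // symbol plus quiet zone
        int outputWidth = Math.max(width, qrSize);      // never smaller than the symbol
        int outputHeight = Math.max(height, qrSize);
        // Largest whole-number scale that fits in both directions
        int multiple = Math.min(outputWidth / qrSize, outputHeight / qrSize);
        // Remaining space is split evenly and absorbs the quiet zone
        int leftPadding = (outputWidth - inputSize * multiple) / 2;
        int topPadding = (outputHeight - inputSize * multiple) / 2;
        return new int[] {multiple, leftPadding, topPadding, outputWidth, outputHeight};
    }

    public static void main(String[] args) {
        int[] r = layout(25, 4, 200, 160);
        System.out.println("multiple=" + r[0] + " leftPadding=" + r[1] + " topPadding=" + r[2]
            + " output=" + r[3] + "x" + r[4]);
    }
}
```

With these inputs each module becomes a 4x4 pixel square, and the 100x100 symbol is centered with 50 pixels of padding on the left and 30 on top.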