Notes on a string compression task

One scenario in our project: a batch of data has to be sent to the app, and in practice there is a limit on the length of that data, so the string needs to be compressed.
The app is written in Java and the backend in Golang. Compression uses gzip, and the process also involves base64 encoding and transcoding between the Chinese (GBK) and Western European (ISO 8859-1, Latin-1) character sets.

Process description

  1. Backend:

    1. Convert the character set (see: A complex Chinese coding problem)
    2. Compress the string with gzip
    3. Encode the result with base64 so that it contains only visible characters
    4. Send it over the network
  2. App side:

    1. Receive the network response
    2. base64-decode it to get the (compressed) byte array
    3. Read the compressed byte stream with gzip and decompress it
    4. Transcode back to Chinese
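
The transport part of the steps above (gzip plus base64, leaving the character set conversion aside) can be sketched with the Go standard library alone. The names `encodePayload` and `decodePayload` are hypothetical and not taken from the project; this is only a minimal round trip under that assumption, not the production code.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"encoding/base64"
	"fmt"
	"io"
	"strings"
)

// encodePayload gzips data and base64-encodes the result (backend steps 2 and 3).
func encodePayload(data []byte) (string, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write(data); err != nil {
		return "", err
	}
	// Close flushes pending data and writes the gzip trailer.
	if err := zw.Close(); err != nil {
		return "", err
	}
	return base64.StdEncoding.EncodeToString(buf.Bytes()), nil
}

// decodePayload reverses encodePayload (app steps 2 and 3).
func decodePayload(s string) ([]byte, error) {
	compressed, err := base64.StdEncoding.DecodeString(s)
	if err != nil {
		return nil, err
	}
	zr, err := gzip.NewReader(bytes.NewReader(compressed))
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	return io.ReadAll(zr)
}

func main() {
	original := []byte(strings.Repeat("hello, gzip; ", 20))
	encoded, err := encodePayload(original)
	if err != nil {
		panic(err)
	}
	restored, err := decodePayload(encoded)
	if err != nil {
		panic(err)
	}
	// Print the sizes before and after, then whether the round trip preserved the data.
	fmt.Println(len(original), len(encoded))
	fmt.Println(bytes.Equal(original, restored))
}
```

Running both directions in one process like this is a cheap way to check the framing before the Java side is involved.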

Sample code

All sample code can be found here.

  1. Server side
import (
    "bytes"
    "compress/gzip"
    "encoding/base64"

    "github.com/sirupsen/logrus"
    "golang.org/x/text/encoding/charmap"
    "golang.org/x/text/encoding/simplifiedchinese"
)

func compress(s string) string {
    // Encode the UTF-8 input as GBK
    gbk, err := simplifiedchinese.GBK.NewEncoder().Bytes([]byte(s))
    if err != nil {
        logrus.Error(err)
        return ""
    }

    // Treat each GBK byte as an ISO8859_1 (Latin-1) character; the decoder outputs UTF-8
    latin1, err := charmap.ISO8859_1.NewDecoder().Bytes(gbk)
    if err != nil {
        logrus.Error(err)
        return ""
    }

    // Compress with gzip
    var buf bytes.Buffer
    zw := gzip.NewWriter(&buf)

    if _, err = zw.Write(latin1); err != nil {
        logrus.Error(err)
        return ""
    }

    // Close flushes the remaining data and writes the gzip trailer
    if err := zw.Close(); err != nil {
        logrus.Error(err)
        return ""
    }

    // Encode as base64 so that the result contains only visible characters
    return base64.StdEncoding.EncodeToString(buf.Bytes())
}
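
To make the Latin-1 step in the server code easier to follow, here is a standard-library-only sketch of what that conversion amounts to: every byte becomes the code point with the same value, and the result is written out as UTF-8. The helpers `latin1ToUTF8` and `utf8ToLatin1` are hypothetical stand-ins for illustration, not the `charmap` API, and the input bytes are the GBK encoding of "中文".

```go
package main

import "fmt"

// latin1ToUTF8 treats every input byte as one Latin-1 character and
// returns the UTF-8 encoding of the resulting text.
func latin1ToUTF8(raw []byte) []byte {
	runes := make([]rune, len(raw))
	for i, b := range raw {
		runes[i] = rune(b)
	}
	return []byte(string(runes))
}

// utf8ToLatin1 is the reverse mapping. It reports false when the text
// contains a code point above U+00FF, which has no Latin-1 byte.
func utf8ToLatin1(text []byte) ([]byte, bool) {
	raw := make([]byte, 0, len(text))
	for _, r := range string(text) {
		if r > 0xFF {
			return nil, false
		}
		raw = append(raw, byte(r))
	}
	return raw, true
}

func main() {
	gbk := []byte{0xD6, 0xD0, 0xCE, 0xC4} // "中文" encoded as GBK
	out := latin1ToUTF8(gbk)
	fmt.Printf("% X\n", out)        // C3 96 C3 90 C3 8E C3 84
	fmt.Println(len(gbk), len(out)) // 4 8

	back, ok := utf8ToLatin1(out)
	fmt.Printf("% X %v\n", back, ok) // D6 D0 CE C4 true
}
```

Note that bytes from 0x80 to 0xFF take two bytes each in UTF-8, so this step enlarges the data before gzip sees it; the mapping itself is lossless, which is why the app can recover the original GBK bytes.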
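  2. App side
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.GZIPInputStream;

private static String uncompress(String s) throws IOException {

        // base64-decode to get the compressed bytes
        byte[] byteArray = Base64.getDecoder().decode(s);

        // gzip decompression; the payload is UTF-8 text whose characters are all <= U+00FF
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(
                new GZIPInputStream(new ByteArrayInputStream(byteArray)), StandardCharsets.UTF_8)) {
            // Read by character buffer; readLine() would silently drop line breaks
            char[] chunk = new char[4096];
            int n;
            while ((n = reader.read(chunk)) != -1) {
                sb.append(chunk, 0, n);
            }
        }

        // Use the Latin-1 character set to map each character back to one byte
        byte[] latin1 = sb.toString().getBytes(StandardCharsets.ISO_8859_1);
        // Reinterpret those bytes as GBK
        return new String(latin1, Charset.forName("GBK"));
    }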

base64 encoding is used mainly because gzip output contains many non-printable bytes. If those bytes were converted directly into a string, the server framework could escape them during transmission and corrupt the data.
The code is only an example. Add proper error and exception handling in real business code.
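
The point about non-printable bytes can be checked with a small sketch. The helpers `gzipBytes` and `isPrintableASCII` are hypothetical names used only for this illustration. It also shows a caveat for the length limit: for very short inputs the result can be longer than the original, because gzip adds a fixed header and trailer and base64 grows the data by about one third.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"encoding/base64"
	"fmt"
)

// gzipBytes compresses data and returns the raw gzip stream.
func gzipBytes(data []byte) ([]byte, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write(data); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// isPrintableASCII reports whether every byte of s is in the range 0x20 to 0x7E.
func isPrintableASCII(s string) bool {
	for i := 0; i < len(s); i++ {
		if s[i] < 0x20 || s[i] > 0x7E {
			return false
		}
	}
	return true
}

func main() {
	gz, err := gzipBytes([]byte("hi"))
	if err != nil {
		panic(err)
	}
	encoded := base64.StdEncoding.EncodeToString(gz)

	fmt.Println(isPrintableASCII(string(gz))) // false: the gzip header alone starts with 0x1f 0x8b
	fmt.Println(isPrintableASCII(encoded))    // true: base64 uses only A-Z, a-z, 0-9, +, / and =

	// Compare the sizes of the input, the gzip stream, and the base64 text.
	fmt.Println(len("hi"), len(gz), len(encoded))
}
```

If the length limit is tight, it may be worth comparing the encoded length with the original and sending whichever is shorter, though that would need a flag the app can read and is outside what the code above does.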

Tags: Go encoding network Java

Posted on Tue, 03 Dec 2019 12:14:04 -0500 by jwb666