In one of our projects, a batch of data has to be sent to the app. In practice the payload length is limited, so the strings need to be compressed first.
The app is written in Java and the backend in Golang. Compression uses gzip, and the pipeline also involves base64 encoding and transcoding between the Chinese (GBK) and Western European (ISO 8859-1, Latin-1) character sets.
Process description
- Backend:
  - Convert the character set (see "A complex Chinese coding problem")
  - Compress the string with gzip
  - Encode the compressed bytes as visible characters with base64
  - Send the result over the network
- App:
  - Receive the network response
  - Decode the base64 text to get the compressed byte array
  - Read the compressed byte stream through gzip to decompress it
  - Transcode the result back to Chinese
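The character set steps rely on ISO 8859-1 mapping every byte value to exactly one character, so arbitrary GBK bytes can travel inside a string without loss. The following is a minimal sketch of that idea only, not the project's code; the class `Latin1Carrier` and its two helpers are hypothetical names introduced for illustration, and the sketch assumes the JDK in use ships the GBK charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class Latin1Carrier {
    // Hypothetical helper: encode text as GBK, then wrap each byte in one Latin-1 char.
    static String toCarrier(String text) {
        byte[] gbk = text.getBytes(Charset.forName("GBK"));
        return new String(gbk, StandardCharsets.ISO_8859_1);
    }

    // Hypothetical helper: undo toCarrier by mapping chars back to bytes and decoding GBK.
    static String fromCarrier(String carrier) {
        byte[] gbk = carrier.getBytes(StandardCharsets.ISO_8859_1);
        return new String(gbk, Charset.forName("GBK"));
    }

    public static void main(String[] args) {
        // "ni hao" in Chinese, written with Unicode escapes to keep the source ASCII-only.
        String original = "\u4f60\u597d";
        String carrier = toCarrier(original);
        System.out.println(carrier.length());                      // 4: two GBK bytes per character
        System.out.println(fromCarrier(carrier).equals(original)); // true
    }
}
```

The carrier string is unreadable on its own; it only becomes Chinese text again once its characters are mapped back to bytes and decoded as GBK.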
Sample code
All of the sample code can be found here.
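- Server side (Go)

      import (
          "bytes"
          "compress/gzip"
          "encoding/base64"

          "github.com/sirupsen/logrus"
          "golang.org/x/text/encoding/charmap"
          "golang.org/x/text/encoding/simplifiedchinese"
      )

      // compress returns the gzip-compressed, base64-encoded form of s,
      // or "" if any step fails.
      func compress(s string) string {
          // Encode the UTF-8 input as GBK bytes.
          gbk, err := simplifiedchinese.GBK.NewEncoder().Bytes([]byte(s))
          if err != nil {
              logrus.Error(err)
              return ""
          }
          // Reinterpret each GBK byte as an ISO 8859-1 (Latin-1) character;
          // the decoder emits those characters as UTF-8.
          latin1, err := charmap.ISO8859_1.NewDecoder().Bytes(gbk)
          if err != nil {
              logrus.Error(err)
              return ""
          }
          // Compress with gzip.
          var buf bytes.Buffer
          zw := gzip.NewWriter(&buf)
          if _, err := zw.Write(latin1); err != nil {
              logrus.Error(err)
              return ""
          }
          // Close flushes the remaining data and writes the gzip footer.
          if err := zw.Close(); err != nil {
              logrus.Error(err)
              return ""
          }
          // Encode the compressed bytes as base64 text.
          return base64.StdEncoding.EncodeToString(buf.Bytes())
      }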
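- App side (Java)

      import java.io.ByteArrayInputStream;
      import java.io.IOException;
      import java.io.InputStreamReader;
      import java.io.Reader;
      import java.nio.charset.Charset;
      import java.nio.charset.StandardCharsets;
      import java.util.Base64;
      import java.util.zip.GZIPInputStream;

      private static String uncompress(String s) throws IOException {
          // Decode the base64 text into the compressed bytes.
          byte[] compressed = Base64.getDecoder().decode(s);
          StringBuilder sb = new StringBuilder();
          // Decompress with gzip and read the payload as UTF-8 text.
          try (Reader reader = new InputStreamReader(
                  new GZIPInputStream(new ByteArrayInputStream(compressed)),
                  StandardCharsets.UTF_8)) {
              // Read in chunks; readLine() would silently drop line breaks.
              char[] chunk = new char[4096];
              int n;
              while ((n = reader.read(chunk)) != -1) {
                  sb.append(chunk, 0, n);
              }
          }
          // Map each Latin-1 character back to its original byte.
          byte[] gbk = sb.toString().getBytes(StandardCharsets.ISO_8859_1);
          // Decode those bytes as GBK to recover the Chinese text.
          return new String(gbk, Charset.forName("GBK"));
      }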
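To check the app-side decoding without a running backend, one option is a Java mirror of the server function. The sketch below is an assumption-laden stand-in, not the project's code: `RoundTripCheck` is a hypothetical class, its `compress` follows my reading of the Go version (GBK bytes, each byte widened to a Latin-1 character, UTF-8, gzip, base64), and it uses `readAllBytes`, which needs Java 9 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class RoundTripCheck {
    static final Charset GBK = Charset.forName("GBK");

    // Assumed Java mirror of the Go compress function.
    static String compress(String s) {
        // GBK bytes, each widened to one Latin-1 character.
        String latin1 = new String(s.getBytes(GBK), StandardCharsets.ISO_8859_1);
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buf)) {
            gz.write(latin1.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return Base64.getEncoder().encodeToString(buf.toByteArray());
    }

    // Same steps as the app-side uncompress, in reverse order of compress.
    static String uncompress(String s) {
        byte[] compressed = Base64.getDecoder().decode(s);
        try (GZIPInputStream gis = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            String latin1 = new String(gis.readAllBytes(), StandardCharsets.UTF_8);
            return new String(latin1.getBytes(StandardCharsets.ISO_8859_1), GBK);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Chinese text plus a line break and ASCII, written with Unicode escapes.
        String original = "\u4f60\u597d\nhello";
        System.out.println(uncompress(compress(original)).equals(original)); // true
    }
}
```

A test string that contains a line break is worth keeping, because a line-based reader on the decoding side would lose it.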
Base64 encoding is needed because gzip output contains many non-printable bytes. If those bytes were converted directly into a string, the server framework would escape them during transmission and corrupt the data.
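The difference is easy to see in a small experiment. This is an illustrative sketch only; `VisibleCharsDemo`, `gzip`, and `isPrintableAscii` are hypothetical names, and "printable" is taken here to mean the ASCII range 0x20 to 0x7e.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.GZIPOutputStream;

class VisibleCharsDemo {
    // Compress a byte array with gzip entirely in memory.
    static byte[] gzip(byte[] data) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buf)) {
            gz.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    // True only if every byte is a printable ASCII character (0x20..0x7e).
    static boolean isPrintableAscii(byte[] bytes) {
        for (byte b : bytes) {
            if (b < 0x20 || b > 0x7e) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] compressed = gzip("hello hello hello".getBytes(StandardCharsets.US_ASCII));
        // The gzip header alone starts with the bytes 0x1f 0x8b.
        System.out.println(isPrintableAscii(compressed)); // false

        String encoded = Base64.getEncoder().encodeToString(compressed);
        System.out.println(isPrintableAscii(encoded.getBytes(StandardCharsets.US_ASCII))); // true
    }
}
```

Note that the standard base64 alphabet still contains `+`, `/`, and `=`, so text placed in a URL may need the URL-safe variant or percent-encoding on both sides.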
The code is for illustration only. Real business code should check errors and handle exceptions properly.
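As one possible shape for that handling on the app side, the sketch below wraps the decoding steps and reports failure as an empty `Optional` instead of letting exceptions escape. `SafeUncompress` and `tryUncompress` are hypothetical names, returning `Optional` is just one design choice, and `readAllBytes` assumes Java 9 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Optional;
import java.util.zip.GZIPInputStream;

class SafeUncompress {
    // Hypothetical wrapper: empty result for null, malformed base64, or malformed gzip input.
    static Optional<String> tryUncompress(String s) {
        if (s == null) {
            return Optional.empty();
        }
        try {
            // Throws IllegalArgumentException if s is not valid base64.
            byte[] compressed = Base64.getDecoder().decode(s);
            // Throws an IOException subtype if the bytes are not a valid gzip stream.
            try (GZIPInputStream gis = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
                String latin1 = new String(gis.readAllBytes(), StandardCharsets.UTF_8);
                byte[] gbk = latin1.getBytes(StandardCharsets.ISO_8859_1);
                return Optional.of(new String(gbk, Charset.forName("GBK")));
            }
        } catch (IllegalArgumentException | IOException e) {
            // Real code would likely log e before giving up.
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryUncompress("not base64!!").isPresent()); // false: illegal base64 characters
        System.out.println(tryUncompress("aGVsbG8=").isPresent());     // false: valid base64, but not gzip
    }
}
```

Whether to swallow, log, or rethrow these failures depends on the caller; the point is only that both the base64 step and the gzip step can reject the input.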