N suggestions for improving Java programs

This post covers several recommendations related to String operations.

Recommendation 54: use String, StringBuffer and StringBuilder correctly

Java's CharSequence interface has three string-related implementation classes: String, StringBuffer and StringBuilder.
String is immutable: once a String object is created, its contents cannot be modified. Even when a method of the string itself appears to change the value, it actually returns a new String object.

String str = "hello";
String str1 = str.substring(1);

Calling substring on str produces a new string, str1, with the value "ello"; str itself is unchanged. Can a String method ever return the object itself instead of creating a new one? Yes: str.substring(0) does not create a new object. In common JDK implementations it simply returns the str reference itself.
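A minimal sketch of this behavior follows. The class name SubstringDemo and its helper method are hypothetical. Returning the same reference from substring(0) is an implementation detail of common OpenJDK builds, not something the Javadoc promises.

```java
class SubstringDemo {
    // Returns true when substring(beginIndex) hands back the very same object.
    static boolean returnsSameObject(String s, int beginIndex) {
        return s.substring(beginIndex) == s;
    }

    public static void main(String[] args) {
        String str = "hello";
        String str1 = str.substring(1);
        System.out.println(str1);                       // ello
        System.out.println(str);                        // hello (the original is untouched)
        System.out.println(returnsSameObject(str, 1));  // false: a new String was created
        System.out.println(returnsSameObject(str, 0));  // true on common OpenJDK builds (assumption)
    }
}
```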

Like String, StringBuffer stores an ordered character sequence in memory. The difference is that a StringBuffer object is mutable, for example:

StringBuffer sb = new StringBuffer("hello");
sb.append(" world");

The value of sb in the above code changes in place: after append, it becomes "hello world". How does this differ from concatenating String values with "+"?

The difference lies in the reference. When strings are joined with the plus sign, the String variable ends up pointing to a new object, while a StringBuffer keeps the same reference and modifies its own contents.
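The reference behavior can be checked with a short sketch. The class and method names here are hypothetical and only serve the illustration.

```java
class ReferenceDemo {
    // Joining with "+" yields a different String object.
    static boolean plusKeepsReference(String s) {
        String before = s;
        s += " world";
        return before == s;
    }

    // append modifies the buffer in place and returns the same object.
    static boolean appendKeepsReference(StringBuffer sb) {
        StringBuffer before = sb;
        sb = sb.append(" world");
        return before == sb;
    }

    public static void main(String[] args) {
        System.out.println(plusKeepsReference("hello"));                      // false
        System.out.println(appendKeepsReference(new StringBuffer("hello")));  // true
    }
}
```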

StringBuilder and StringBuffer are basically the same. The difference is that StringBuffer is thread safe (its methods are synchronized), while StringBuilder is not. Because every modification of a String creates a new object, repeated operations on a String are much slower than the same operations on a StringBuffer or StringBuilder.

After understanding the principles of the three, let's take a look at their usage scenarios:

  • String: use it when the value does not change frequently, such as constants and a small number of variables.
  • StringBuffer: use it for frequent string operations (concatenation, replacement, deletion) in a multi-threaded environment, such as XML parsing or HTTP parameter parsing and encapsulation.
  • StringBuilder: use it for frequent string operations (concatenation, replacement, deletion) in a single-threaded environment, such as SQL statement assembly or JSON encapsulation.
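As a rough sketch of the last two scenarios: a StringBuffer shared by two threads, and a StringBuilder confined to a single method call. The class ScenarioDemo, its methods, and the table and column names are all hypothetical.

```java
class ScenarioDemo {
    // Multi-threaded: one StringBuffer shared by two threads.
    // append is synchronized, so no character is lost.
    static int appendFromTwoThreads(int perThread) {
        StringBuffer shared = new StringBuffer();
        Runnable task = () -> {
            for (int i = 0; i < perThread; i++) {
                shared.append('x');
            }
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return shared.length();
    }

    // Single-threaded: the builder never leaves this method, so StringBuilder is enough.
    static String buildSelect(String table, String... columns) {
        StringBuilder sql = new StringBuilder("SELECT ");
        for (int i = 0; i < columns.length; i++) {
            if (i > 0) {
                sql.append(", ");
            }
            sql.append(columns[i]);
        }
        return sql.append(" FROM ").append(table).toString();
    }

    public static void main(String[] args) {
        System.out.println(appendFromTwoThreads(1000));          // 2000
        System.out.println(buildSelect("users", "id", "name"));  // SELECT id, name FROM users
    }
}
```

Replacing the shared StringBuffer with a StringBuilder in the first method would remove that guarantee, because StringBuilder's append is not synchronized.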


Recommendation 56: choose the string concatenation method freely

There are generally three ways to concatenate strings: the plus sign, the concat method, and the append method of StringBuffer or StringBuilder. What is the difference between the three? Let's look at the following example:

str += "a";	// Plus sign connection
str = str.concat("a");	// concat method connection
sb.append("a");	// append method connection (sb is a StringBuilder or StringBuffer)

Concatenate strings with each of the three methods and, after 100,000 iterations, compare the execution times:

public class Proposal_56 {
	public static void doWithAdd() {
		String str = "a";
		for (int i = 0; i < 100000; i++) {
			str += "c";
		}
	}

	public static void doWithConcat() {
		String str = "a";
		for (int i = 0; i < 100000; i++) {
			str = str.concat("c");
		}
	}

	public static void doWithStringBuilder() {
		StringBuilder sb = new StringBuilder("a");
		for (int i = 0; i < 100000; i++) {
			sb.append("c");
		}
	}

	public static void main(String[] args) {
		long startTime = System.currentTimeMillis();
		doWithAdd();
		long endTime = System.currentTimeMillis();
		System.out.println("doWithAdd Run time:" + (endTime - startTime) + "ms");

		startTime = System.currentTimeMillis();
		doWithConcat();
		endTime = System.currentTimeMillis();
		System.out.println("doWithConcat Run time:" + (endTime - startTime) + "ms");

		startTime = System.currentTimeMillis();
		doWithStringBuilder();
		endTime = System.currentTimeMillis();
		System.out.println("doWithStringBuilder Run time:" + (endTime - startTime) + "ms");
	}
}

The results are as follows:

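1. Plus sign concatenation:
The compiler rewrites string concatenation that uses the plus sign. With Java 8 and earlier compilers it becomes calls to the append method of StringBuilder. (Since Java 9, javac emits an invokedynamic call instead, but each "+" inside a loop still builds a brand-new String.) The effect is the same as the following code: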

str = new StringBuilder(str).append("c").toString();

In principle, shouldn't the efficiency be the same as that of StringBuilder? Why did the plus sign take 4372ms in this run, while StringBuilder took only 2ms? There are two reasons. First, a new StringBuilder object is created on every iteration: 100,000 iterations mean 100,000 objects. Second, toString is called at the end of every iteration, and each call copies the whole, ever-growing character sequence into a new String.

2. concat method concatenation:
Let's take a look at the source code of concat method:

public String concat(String str) {
        int otherLen = str.length();
        if (otherLen == 0) {
            return this;
        }
        int len = value.length;
        char buf[] = Arrays.copyOf(value, len + otherLen);
        str.getChars(buf, len);
        return new String(buf, true);
}

On the whole it is just an array copy, which is a fast in-memory operation. But pay attention to the final return statement: every call to concat creates a new String object and copies all existing characters into it, which is why concat slows down. 100,000 iterations create 100,000 String objects in the same way.
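The two return paths of concat can be observed directly. This is a small sketch in which the class and method names are hypothetical. The String Javadoc does state that the same object is returned for an empty argument.

```java
class ConcatDemo {
    // Returns true when concat hands back the receiver itself.
    static boolean concatReturnsSameObject(String s, String suffix) {
        return s.concat(suffix) == s;
    }

    public static void main(String[] args) {
        String str = "a";
        System.out.println(concatReturnsSameObject(str, ""));   // true: empty argument, "this" is returned
        System.out.println(concatReturnsSameObject(str, "c"));  // false: a new String was created
        System.out.println(str.concat("c"));                    // ac
        System.out.println(str);                                // a (the original is unchanged)
    }
}
```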

3. append method concatenation:
Let's also take a look at the source code of append:

public AbstractStringBuilder append(String str) {
        if (str == null)
            return appendNull();
        int len = str.length();
        ensureCapacityInternal(count + len);
        str.getChars(0, len, value, count);
        count += len;
        return this;
}

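The whole append method works on the character array: it enlarges the array if necessary and then copies the new characters in. These are basic data operations. No new String or StringBuilder object is created (only the internal array is replaced when it has to grow), so it is fast.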
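A small sketch makes the "same object, growing array" behavior visible through the public capacity() method. The class AppendDemo and its helper are hypothetical. The default capacity of 16 for the no-argument constructor is documented in the StringBuilder Javadoc, while the exact growth amount is left unasserted because it is an implementation detail.

```java
class AppendDemo {
    // Appends "c" the given number of times to an empty builder and reports its capacity.
    static int capacityAfter(int appends) {
        StringBuilder sb = new StringBuilder();  // documented default capacity: 16
        for (int i = 0; i < appends; i++) {
            sb.append("c");
        }
        return sb.capacity();
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder("a");
        System.out.println(sb.append("c") == sb);    // true: append returns the same builder
        System.out.println(capacityAfter(16));       // 16: everything still fits, no growth
        System.out.println(capacityAfter(17) > 16);  // true: the internal array was enlarged
    }
}
```

When the final length is roughly known in advance, passing it to the StringBuilder(int capacity) constructor avoids the intermediate array copies.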

These three methods produce the same result with different performance, but that does not mean we must always use StringBuilder. "+" matches our programming habits and is easy to read, so in most cases it is fine to use the plus sign. Consider concat or append only when performance is critical.
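One way to read this advice, sketched with hypothetical names: keep "+" for one-off expressions and use a builder when concatenating inside a loop.

```java
class ChoiceDemo {
    // One-off concatenation: "+" is the most readable choice.
    static String greeting(String name, int count) {
        return "Hello " + name + ", you have " + count + " new messages";
    }

    // Concatenation inside a loop: a single StringBuilder avoids a new String per iteration.
    // Assumes times is not negative.
    static String repeat(String s, int times) {
        StringBuilder sb = new StringBuilder(s.length() * times);
        for (int i = 0; i < times; i++) {
            sb.append(s);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(greeting("Ann", 3));  // Hello Ann, you have 3 new messages
        System.out.println(repeat("ab", 3));     // ababab
    }
}
```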

Recommendation 57: regular expressions are recommended for complex string operations

Operations such as appending, merging, replacing, reversing and splitting are common in everyday string handling, and Java provides methods such as append, replace, reverse and split for them. More often, however, complex processing calls for regular expressions. The following example counts the number of English words in each line of input:

import java.util.Scanner;

public class Proposal_57 {
	public static void main(String[] args) {
		Scanner scan = new Scanner(System.in);
		while (scan.hasNext()) {
			String str = scan.nextLine();
			int wordsCount = str.split(" ").length;
			System.out.println(str + " Number of words:" + wordsCount);
		}
	}
}

The returned results are as follows:

Every result except the first is wrong. The second input contains consecutive spaces, the third contains words run together without a space between them, and the fourth contains an apostrophe ("'"), none of which split(" ") handles. How do we deal with this? We can use a regular expression:

import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Proposal_57 {
	public static void main(String[] args) {
		Scanner scan = new Scanner(System.in);
		while (scan.hasNext()) {
			String str = scan.nextLine();
//			int wordsCount = str.split(" ").length;
			Pattern pattern = Pattern.compile("\\b\\w+\\b");
			Matcher matcher = pattern.matcher(str);
			int wordsCount = 0;
			while (matcher.find()) {
				wordsCount++;
			}
			System.out.println(str + " Number of words:" + wordsCount);
		}
	}
}

After changing to the above code, the following results are obtained:

Now all the results are correct. In the pattern, \b matches a word boundary and \w matches a word character (a letter, digit or underscore), so each match of \b\w+\b is one word. Regular-expression matching is useful in many situations, such as the common task of server log analysis.
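The two counting approaches can be compared side by side without reading from the console. The sample sentences below are hypothetical stand-ins for the kinds of input described, not the original test data, and the class WordCountDemo is likewise an assumption. Note that this pattern counts "I'm" as two words, because the apostrophe is not a word character.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WordCountDemo {
    // Compiled once and reused for every line.
    private static final Pattern WORD = Pattern.compile("\\b\\w+\\b");

    static int countBySplit(String line) {
        return line.split(" ").length;
    }

    static int countByRegex(String line) {
        Matcher matcher = WORD.matcher(line);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String[] samples = {
            "Today is Monday",      // split: 3, regex: 3
            "Today is  Monday",     // split: 4 (empty token between the two spaces), regex: 3
            "Today is Monday?No!",  // split: 3, regex: 4
            "I'm OK."               // split: 2, regex: 3
        };
        for (String s : samples) {
            System.out.println(s + " -> split: " + countBySplit(s) + ", regex: " + countByRegex(s));
        }
    }
}
```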


Tags: Java Back-end

Posted on Mon, 06 Dec 2021 15:28:38 -0500 by dswain