The strings package in the Go standard library

Go's strings package implements the common operations on strings. This article introduces its most frequently used functions and the strings.Builder type.

Common functions in the strings package

Comparing strings: Compare/EqualFold

// Compares two strings lexicographically. Returns 0 if a == b, -1 if a < b, and 1 if a > b
// It is usually clearer and faster to use the built-in ==, < and > operators directly
func Compare(a, b string) int

// Reports whether two strings are equal when compared case-insensitively
func EqualFold(s, t string) bool

The example code is as follows:

func main() {
    s1, s2 := "aBc", "AbC"
    fmt.Println(strings.Compare(s1, s2))   // 1
    fmt.Println(s1 == s2)                  // false
    fmt.Println(s1 > s2)                   // true
    fmt.Println(s1 < s2)                   // false
    
    // Case-insensitive comparison
    fmt.Println(strings.EqualFold(s1, s2)) // true
}

Checking for a prefix or suffix: HasPrefix/HasSuffix

// Reports whether the string s begins with prefix
func HasPrefix(s, prefix string) bool

// Reports whether the string s ends with suffix
func HasSuffix(s, suffix string) bool

The example code is as follows:

func main() {
    fmt.Println(strings.HasPrefix("hello", "he")) // true
    fmt.Println(strings.HasPrefix("hello", "eh")) // false
    fmt.Println(strings.HasPrefix("hello", ""))   // true

    fmt.Println(strings.HasSuffix("hello", "lo")) // true
    fmt.Println(strings.HasSuffix("hello", "ol")) // false
    fmt.Println(strings.HasSuffix("hello", ""))   // true
}

Checking for a substring or character: Contains/ContainsRune/ContainsAny

// Reports whether the string s contains substr
func Contains(s, substr string) bool

// Reports whether the string s contains the character r
func ContainsRune(s string, r rune) bool

// Reports whether the string s contains any character from chars. If chars is an empty string, it returns false
func ContainsAny(s, chars string) bool

The example code is as follows:

func main() {
    fmt.Println(strings.Contains("hello", "el")) // true
    fmt.Println(strings.Contains("hello", ""))   // true
    fmt.Println(strings.Contains("", ""))        // true

    fmt.Println(strings.ContainsRune("hello", 'e')) // true
    fmt.Println(strings.ContainsRune("hello", 'a')) // false

    fmt.Println(strings.ContainsAny("hello", "eo")) // true
    fmt.Println(strings.ContainsAny("hello", "ei")) // true
    fmt.Println(strings.ContainsAny("hello", ""))   // false
    fmt.Println(strings.ContainsAny("", ""))        // false
}

Counting occurrences of a substring: Count

// Returns the number of non-overlapping occurrences of sep in s. If sep is empty, it returns the number of UTF-8 characters in s plus 1 (which equals len(s) + 1 for ASCII strings)
func Count(s, sep string) int

The example code is as follows:

func main() {
    fmt.Println(strings.Count("aaa", "a"))  // 3
    fmt.Println(strings.Count("aaa", "aa")) // 1
    fmt.Println(strings.Count("aaa", ""))   // 4
}
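
The empty-separator case is easy to misread for non-ASCII input, so here is a minimal sketch of it. The helper name emptySepCount is only for illustration and is not part of the strings package; the sample string is an assumption chosen because its byte length and character count differ.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// emptySepCount is a hypothetical wrapper around strings.Count(s, "").
// With an empty separator, Count returns the number of UTF-8 code points
// in s plus one.
func emptySepCount(s string) int {
	return strings.Count(s, "")
}

func main() {
	s := "日本語" // 3 characters, 9 bytes in UTF-8

	fmt.Println(len(s))                    // 9
	fmt.Println(utf8.RuneCountInString(s)) // 3
	fmt.Println(emptySepCount(s))          // 4: character count + 1, not len(s) + 1
	fmt.Println(emptySepCount("aaa"))      // 4: for ASCII the two coincide
}
```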

Finding the first index: Index/IndexByte/IndexRune/IndexAny/IndexFunc

// Returns the index of the first occurrence of the substring sep in s, or -1 if sep is not present
func Index(s, sep string) int

// Returns the index of the first occurrence of the byte c in s, or -1 if c is not present
func IndexByte(s string, c byte) int

// Returns the index of the first occurrence of the character r in s, or -1 if r is not present
func IndexRune(s string, r rune) int

// Returns the index of the first occurrence in s of any character from chars, or -1 if none is present or chars is an empty string
func IndexAny(s, chars string) int

// Returns the index of the first character r in s that satisfies f(r) == true, or -1 if there is none
func IndexFunc(s string, f func(rune) bool) int

The example code is as follows:

func main() {
    fmt.Println(strings.Index("hello world", " wor")) // 5
    fmt.Println(strings.Index("hello world", "aaa"))  // -1

    fmt.Println(strings.IndexByte("hello world", 'l')) // 2
    fmt.Println(strings.IndexByte("hello world", 'x')) // -1

    fmt.Println(strings.IndexRune("hello world", 'l')) // 2
    fmt.Println(strings.IndexRune("hello world", 'x')) // -1

    fmt.Println(strings.IndexAny("hello world", "ie")) // 1
    fmt.Println(strings.IndexAny("hello world", "mc")) // -1

    f := func(c rune) bool {
        return c == 'w'
    }
    fmt.Println(strings.IndexFunc("hello world", f)) // 6
    fmt.Println(strings.IndexFunc("hello go", f))    // -1
}

Finding the last index: LastIndex/LastIndexByte/LastIndexAny/LastIndexFunc

// Returns the index of the last occurrence of the substring sep in s, or -1 if sep is not present
func LastIndex(s, sep string) int

// Returns the index of the last occurrence of the byte c in s, or -1 if c is not present
func LastIndexByte(s string, c byte) int

// Returns the index of the last occurrence in s of any character from chars, or -1 if none is present or chars is an empty string
func LastIndexAny(s, chars string) int

// Returns the index of the last character r in s that satisfies f(r) == true, or -1 if there is none
func LastIndexFunc(s string, f func(rune) bool) int

The example code is as follows:

func main() {
    fmt.Println(strings.LastIndex("hello world", " wor")) // 5
    fmt.Println(strings.LastIndex("hello world", "aaa"))  // -1

    fmt.Println(strings.LastIndexByte("hello world", 'l')) // 9
    fmt.Println(strings.LastIndexByte("hello world", 'x')) // -1

    fmt.Println(strings.LastIndexAny("hello world", "ie")) // 1
    fmt.Println(strings.LastIndexAny("hello world", "mc")) // -1

    f := func(c rune) bool {
        return c == 'l'
    }
    fmt.Println(strings.LastIndexFunc("hello world", f)) // 9
    fmt.Println(strings.LastIndexFunc("hello", f))       // 3
}

Case conversion ToLower/ToUpper

// Returns a new string that converts all letters of string s to corresponding lowercase
func ToLower(s string) string

// Returns a new string that converts all letters of string s to the corresponding uppercase
func ToUpper(s string) string

The example code is as follows:

func main() {
    fmt.Println(strings.ToLower("ABCdeF")) // abcdef
    fmt.Println(strings.ToUpper("abcDEf")) // ABCDEF
}

Repeating a string: Repeat

// Returns a new string consisting of count copies of s. It panics if count is negative
func Repeat(s string, count int) string

The example code is as follows:

func main() {
    fmt.Println("ba" + strings.Repeat("na", 2)) // banana
}

Replacing substrings: Replace/ReplaceAll

// Returns a new string in which the first n non-overlapping occurrences of old in s are replaced by new. If n < 0, all occurrences are replaced
func Replace(s, old, new string, n int) string

// Returns a new string in which all non-overlapping occurrences of old in s are replaced by new; equivalent to calling Replace with n < 0
func ReplaceAll(s, old, new string) string

The example code is as follows:

func main() {
    fmt.Println(strings.Replace("aaa aaa aaa", "aa", "A", 2))  // Aa Aa aaa
    fmt.Println(strings.Replace("aaa aaa aaa", "aa", "A", -1)) // Aa Aa Aa

    fmt.Println(strings.ReplaceAll("aaa aaa aaa", "aa", "A")) // Aa Aa Aa
}

Mapping characters: Map

// Returns a new string in which each character r of s is replaced by mapping(r). If mapping returns a negative value, the character is dropped
func Map(mapping func(rune) rune, s string) string

The example code is as follows:

func main() {
    mapping := func(r rune) rune {
        if 'a' <= r && r <= 'z' {
            return r - 'a' + 'A'
        }
        return r
    }

    fmt.Println(strings.Map(mapping, "abcdef")) // ABCDEF
}
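
Besides replacing characters, Map can also delete them: when the mapping function returns a negative value, that character is dropped from the result. The sketch below illustrates this; keepDigits is a hypothetical helper and the phone-number input is made up for the example.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// keepDigits is a hypothetical helper that keeps decimal digits
// and drops every other character.
func keepDigits(s string) string {
	return strings.Map(func(r rune) rune {
		if unicode.IsDigit(r) {
			return r
		}
		return -1 // a negative value tells Map to drop this character
	}, s)
}

func main() {
	fmt.Println(keepDigits("+1 (555) 010-9999")) // 15550109999
}
```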

Trimming both ends: Trim/TrimSpace/TrimFunc

// Returns a new string with all leading and trailing characters contained in cutset removed from s
func Trim(s string, cutset string) string

// Returns a new string with all leading and trailing white space (as defined by unicode.IsSpace) removed from s
func TrimSpace(s string) string

// Returns a new string with all leading and trailing characters r that satisfy f(r) == true removed from s
func TrimFunc(s string, f func(rune) bool) string

The example code is as follows:

func main() {
    fmt.Println(strings.Trim("?!?hello world!?!", "?!")) // hello world

    fmt.Println(strings.TrimSpace("   hello world   ")) // hello world

    f := func(r rune) bool {
        if r == '!' || r == '?' {
            return true
        }
        return false
    }
    fmt.Println(strings.TrimFunc("?!?hello world!?!", f)) // hello world
}

Trimming the start: TrimLeft/TrimLeftFunc/TrimPrefix

// Returns a new string with all leading characters contained in cutset removed from s
func TrimLeft(s string, cutset string) string

// Returns a new string with all leading characters r that satisfy f(r) == true removed from s
func TrimLeftFunc(s string, f func(rune) bool) string

// Returns s without the leading prefix. If s does not start with prefix, s is returned unchanged
func TrimPrefix(s, prefix string) string

The example code is as follows:

func main() {
    fmt.Println(strings.TrimLeft("?!?hello world!?!", "?!")) // hello world!?!

    f := func(r rune) bool {
        if r == '!' || r == '?' {
            return true
        }
        return false
    }
    fmt.Println(strings.TrimLeftFunc("?!?hello world!?!", f)) // hello world!?!

    fmt.Println(strings.TrimPrefix("?!?hello world!?!", "?!?hell")) // o world!?!
}
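
TrimLeft and TrimPrefix are easy to confuse: the second argument of TrimLeft is a set of characters, while the second argument of TrimPrefix is one exact substring. The sketch below passes the same argument to both to make the difference visible; trimLeftAndPrefix and the sample host name are illustrative assumptions only.

```go
package main

import (
	"fmt"
	"strings"
)

// trimLeftAndPrefix is a hypothetical helper that applies TrimLeft and
// TrimPrefix with the same argument so the two results can be compared.
func trimLeftAndPrefix(s, arg string) (byCutset, byPrefix string) {
	return strings.TrimLeft(s, arg), strings.TrimPrefix(s, arg)
}

func main() {
	byCutset, byPrefix := trimLeftAndPrefix("www.wwdc.example", "www.")

	// TrimLeft strips every leading 'w' and '.', not just the text "www."
	fmt.Println(byCutset) // dc.example

	// TrimPrefix removes the exact prefix "www." once
	fmt.Println(byPrefix) // wwdc.example
}
```

The same distinction applies to TrimRight and TrimSuffix.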

Trimming the end: TrimRight/TrimRightFunc/TrimSuffix

// Returns a new string with all trailing characters contained in cutset removed from s
func TrimRight(s string, cutset string) string

// Returns a new string with all trailing characters r that satisfy f(r) == true removed from s
func TrimRightFunc(s string, f func(rune) bool) string

// Returns s without the trailing suffix. If s does not end with suffix, s is returned unchanged
func TrimSuffix(s, suffix string) string

The example code is as follows:

func main() {
    fmt.Println(strings.TrimRight("?!?hello world!?!", "?!")) // ?!?hello world

    f := func(r rune) bool {
        if r == '!' || r == '?' {
            return true
        }
        return false
    }
    fmt.Println(strings.TrimRightFunc("?!?hello world!?!", f)) // ?!?hello world

    fmt.Println(strings.TrimSuffix("?!?hello world!?!", "orld!?!")) // ?!?hello w
}

Splitting into fields: Fields/FieldsFunc

// Splits s around each run of one or more consecutive white space characters (as defined by unicode.IsSpace) and returns the resulting substrings
// If s is empty or contains only white space, an empty slice is returned
func Fields(s string) []string

// Splits s around each run of separator characters r (those satisfying f(r) == true) and returns the resulting substrings
// If s is empty or contains only separators, an empty slice is returned
func FieldsFunc(s string, f func(rune) bool) []string

The example code is as follows:

func main() {
    fmt.Printf("%q\n", strings.Fields("  hello world go ")) // ["hello" "world" "go"]

    f := func(r rune) bool {
        return !unicode.IsLetter(r) && !unicode.IsNumber(r)
    }
    fmt.Printf("%q\n", strings.FieldsFunc(" hello  world go   ", f)) // ["hello" "world" "go"]
}

Splitting by separator: Split/SplitN/SplitAfter/SplitAfterN

// Splits s at every occurrence of sep and returns a slice of the substrings between those separators; sep itself is removed
// Every sep causes a split, so two adjacent separators produce an empty string between them
// If sep is an empty string, Split breaks s into one substring per UTF-8 character
func Split(s, sep string) []string

// Like Split, but n limits the number of substrings returned. n < 0: same as Split(s, sep); n == 0: returns an empty result (nil);
// n > 0: returns at most n substrings, and the last one contains the unsplit remainder
func SplitN(s, sep string, n int) []string

// Splits s after every occurrence of sep and returns a slice of the resulting substrings; each substring keeps its trailing sep
// Every sep causes a split, so two adjacent separators produce a substring that consists of sep alone
// If sep is an empty string, SplitAfter breaks s into one substring per UTF-8 character
func SplitAfter(s, sep string) []string

// Like SplitAfter, but n limits the number of substrings returned. n < 0: same as SplitAfter(s, sep);
// n == 0: returns an empty result (nil); n > 0: returns at most n substrings, and the last one contains the unsplit remainder
func SplitAfterN(s, sep string, n int) []string

The example code is as follows:

func main() {
    // Split removes each sep, so the resulting substrings do not contain it
    fmt.Printf("%q\n", strings.Split("aAa", "a")) // ["" "A" ""]

    // SplitN limits the length of the returned slice; the last element holds the unsplit remainder
    fmt.Printf("%q\n", strings.SplitN("aAa", "a", 2))  // ["" "Aa"]
    fmt.Printf("%q\n", strings.SplitN("aAa", "a", 0))  // []
    fmt.Printf("%q\n", strings.SplitN("aAa", "a", -1)) // ["" "A" ""]

    // SplitAfter cuts after each sep, so each substring keeps its trailing sep
    fmt.Printf("%q\n", strings.SplitAfter("aAa", "a")) // ["a" "Aa" ""]

    fmt.Printf("%q\n", strings.SplitAfterN("aAa", "a", 2))  // ["a" "Aa"]
    fmt.Printf("%q\n", strings.SplitAfterN("aAa", "a", 0))  // []
    fmt.Printf("%q\n", strings.SplitAfterN("aAa", "a", -1)) // ["a" "Aa" ""]
}
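
A typical use of the n > 0 case is to split at the first separator only. The following is a small sketch under that assumption; parseKV and the "key=value" input format are hypothetical and only serve to show SplitN with n = 2.

```go
package main

import (
	"fmt"
	"strings"
)

// parseKV is a hypothetical helper that splits "key=value" at the first "="
// only, so the value may itself contain "=".
func parseKV(line string) (key, value string, ok bool) {
	parts := strings.SplitN(line, "=", 2)
	if len(parts) != 2 {
		return "", "", false // no separator found
	}
	return parts[0], parts[1], true
}

func main() {
	k, v, ok := parseKV("query=a=1&b=2")
	fmt.Println(k, v, ok) // query a=1&b=2 true

	_, _, ok = parseKV("no-separator")
	fmt.Println(ok) // false
}
```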

Joining strings: Join

// Concatenates the elements of elems into a single new string, with sep placed between adjacent elements
func Join(elems []string, sep string) string

The example code is as follows:

func main() {
    ss := []string{"abc", "def", "gh"}
    fmt.Println(strings.Join(ss, ",")) // abc,def,gh
}

Using strings.Builder

When concatenating strings, strings.Builder builds the result efficiently through its Write methods and keeps memory copying to a minimum.

Methods

// Preallocates memory for at least n more bytes
func (b *Builder) Grow(n int)

// Return the length of the accumulated data and the capacity of the underlying []byte that stores it
func (b *Builder) Len() int
func (b *Builder) Cap() int

// Resets b to be empty
func (b *Builder) Reset()

// Append data of different types to b. They return the number of bytes written and an error (WriteByte returns only the error)
func (b *Builder) Write(p []byte) (int, error)
func (b *Builder) WriteByte(c byte) error
func (b *Builder) WriteRune(r rune) (int, error)
func (b *Builder) WriteString(s string) (int, error)

// Returns the data accumulated in b as a string
func (b *Builder) String() string

As can be seen from the above method, strings.Builder implements the io.Writer interface:

type Writer interface {
    Write(p []byte) (n int, err error)
}

A simple example:

func main() {
    var b strings.Builder
    // Four ways to write
    b.Write([]byte("hello"))
    b.WriteByte(' ')
    b.WriteRune('你') // a rune literal holds exactly one character
    b.WriteString("好")

    for i := 1; i <= 3; i++ {
        // strings.Builder implements the io.Writer interface
        fmt.Fprintf(&b, "%d...", i)
    }
    fmt.Println(b.String()) // hello 你好1...2...3...
    fmt.Println(b.Len())    // 24 (bytes, not characters)
    fmt.Println(b.Cap())    // 48 here; the exact capacity depends on the Go version
}

Internals

☕ Storage structure

strings.Builder stores its data in an internal []byte:

type Builder struct {
    addr *Builder // of receiver, to detect copies by value
    buf  []byte
}

When a write method is called, the data is simply appended to that []byte:

func (b *Builder) Write(p []byte) (int, error) {
    b.copyCheck()
    b.buf = append(b.buf, p...)
    return len(p), nil
}

Because the underlying storage is a slice, it may have to grow during writes. strings.Builder therefore provides the Grow() method to preallocate memory and avoid repeated reallocation. Grow() is implemented as follows:

func (b *Builder) grow(n int) {
    buf := make([]byte, len(b.buf), 2*cap(b.buf)+n)
    copy(buf, b.buf)
    b.buf = buf
}

func (b *Builder) Grow(n int) {
    b.copyCheck()
    if n < 0 {
        panic("strings.Builder.Grow: negative count")
    }
    if cap(b.buf)-len(b.buf) < n {
        b.grow(n)
    }
}

The Grow() method guarantees that the internal slice has room for n more bytes. The slice is reallocated only when its remaining capacity is less than n bytes.
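
To see the effect of preallocation, the sketch below reserves the total size up front and checks that the capacity does not change while writing. joinPreallocated is a hypothetical helper written for this illustration; the exact capacity value is not shown because it can vary between Go versions.

```go
package main

import (
	"fmt"
	"strings"
)

// joinPreallocated is a hypothetical helper that concatenates parts after
// reserving enough space up front. It also returns the capacity observed
// right after Grow and after the last write.
func joinPreallocated(parts []string) (result string, capAfterGrow, capAfterWrites int) {
	total := 0
	for _, p := range parts {
		total += len(p)
	}

	var b strings.Builder
	b.Grow(total) // guarantees room for total more bytes
	capAfterGrow = b.Cap()

	for _, p := range parts {
		b.WriteString(p) // fits in the reserved space, so no reallocation
	}
	return b.String(), capAfterGrow, b.Cap()
}

func main() {
	s, c1, c2 := joinPreallocated([]string{"go", "lang", "strings"})
	fmt.Println(s)        // golangstrings
	fmt.Println(c1 == c2) // true: the capacity did not change while writing
}
```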

⭐ Copying is not allowed

A strings.Builder must not be copied once it has been written to. Copying a non-empty strings.Builder and then writing to the copy makes the program panic:

func main() {
    var b1 strings.Builder
    b1.WriteString("aaa")
    b2 := b1
    b2.WriteString("bbb")
    fmt.Println(b2.String())
}

// panic: strings: illegal use of non-zero Builder copied by value

The strings.Builder struct has an addr field of type *Builder. After a write method is called, this pointer points to the builder itself:

func (b *Builder) Write(p []byte) (int, error) {
    b.copyCheck()
    //...
}

func (b *Builder) copyCheck() {
    if b.addr == nil {
        // This hack works around a failing of Go's escape analysis
        // that was causing b to escape and be heap allocated.
        // See issue 23382.
        // TODO: once issue 7921 is fixed, this should be reverted to
        // just "b.addr = b".
        b.addr = (*Builder)(noescape(unsafe.Pointer(b)))
    } else if b.addr != b {
        panic("strings: illegal use of non-zero Builder copied by value")
    }
}

When we execute b2 := b1, the addr pointer inside the struct is copied along with everything else, so b2.addr == &b1. When b2 is then written to, copyCheck() runs again, finds that b.addr != b, and panics.

Copying an empty strings.Builder is allowed: its addr pointer is still nil, so copyCheck() takes the first branch and never reaches the b.addr != b check.

func main() {
    var b1 strings.Builder
    fmt.Println(b1) // {<nil> []}
    b2 := b1
    b2.WriteString("aaa")
    fmt.Println(b2) // {0xc0000cdf30 [97 97 97]}
}

✏️ String()

To avoid an extra allocation, String() does not copy the data: it uses an unsafe pointer conversion to reinterpret the internal []byte as a string, which saves both time and memory. The implementation is as follows:

func (b *Builder) String() string {
    return *(*string)(unsafe.Pointer(&b.buf))
}

📚 Concurrency is not supported

strings.Builder is not safe for concurrent reads and writes, so it is best used from a single goroutine. If strings.Builder were safe for concurrent use, the following code would print 10000:

func main() {
    var b strings.Builder
    var wait sync.WaitGroup
    for n := 0; n < 10000; n++ {
        wait.Add(1)
        go func() {
            b.WriteString("1")
            wait.Done()
        }()
    }
    wait.Wait()
    fmt.Println(b.Len()) // e.g. 9349; the value varies between runs
}
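
If several goroutines really do need to write to one builder, the writes have to be serialized by the caller. A minimal sketch, assuming a hypothetical safeBuilder wrapper that guards the builder with a sync.Mutex:

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// safeBuilder is a hypothetical wrapper that serializes access to a
// strings.Builder with a mutex. It is not part of the standard library.
type safeBuilder struct {
	mu sync.Mutex
	b  strings.Builder
}

func (s *safeBuilder) WriteString(str string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.b.WriteString(str)
}

func (s *safeBuilder) Len() int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.b.Len()
}

// concurrentLen writes one byte from each of n goroutines and
// returns the final length of the builder.
func concurrentLen(n int) int {
	var sb safeBuilder
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			sb.WriteString("1")
		}()
	}
	wg.Wait()
	return sb.Len()
}

func main() {
	fmt.Println(concurrentLen(10000)) // 10000
}
```

The mutex adds contention, so when it is possible, building separate strings per goroutine and joining them afterwards may be the simpler design.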

Posted on Tue, 30 Nov 2021 07:40:58 -0500 by sportryd