Handwritten prefix tree

preface:

        Some problems I ran into earlier: how many times does a word appear in an article? How many words in the article start with a given prefix? How many times does a given fragment appear inside the words of the article? After studying the prefix tree (trie), I had an answer to all of them.

Approach

        1. To build the tree, each node records two counters: pass, the number of words whose path goes through the node, and end, the number of words that end at the node.

        2. Each node holds an array of size 26 (one slot per lowercase letter) that stores its child nodes.

        3. When adding a word, every node on its path gets pass + 1 and the last node gets end + 1. If a node on the path does not exist yet, create it.

        4. When deleting a word, every node on its path gets pass - 1 and the last node gets end - 1. If the pass of a node drops to 0, delete that node together with the branch below it.
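
To make the pass / end bookkeeping from the four points above concrete, here is a minimal sketch. The class name PassEndSketch and the helper counters are made up for this illustration, and it assumes the words contain only the lowercase letters a-z.

```java
import java.util.Arrays;

// Minimal sketch of the pass/end bookkeeping; all names are illustrative only.
class PassEndSketch {
    static final class Node {
        int pass;                          // words whose path goes through this node
        int end;                           // words that end exactly at this node
        final Node[] next = new Node[26];  // one slot per lowercase letter
    }

    final Node root = new Node();

    // Adding: create missing nodes, pass + 1 along the path, end + 1 on the last node.
    void insert(String word) {
        if (word.isEmpty()) {
            return;
        }
        Node cur = root;
        for (char c : word.toCharArray()) {
            int idx = c - 'a';             // assumes c is in 'a'..'z'
            if (cur.next[idx] == null) {
                cur.next[idx] = new Node();
            }
            cur = cur.next[idx];
            cur.pass++;
        }
        cur.end++;
    }

    // Returns {pass, end} of the node reached by following path, or {0, 0} if there is none.
    int[] counters(String path) {
        Node cur = root;
        for (char c : path.toCharArray()) {
            cur = cur.next[c - 'a'];
            if (cur == null) {
                return new int[] {0, 0};
            }
        }
        return new int[] {cur.pass, cur.end};
    }

    // Deleting: pass - 1 along the path; a node whose pass reaches 0 is dropped with its branch.
    void delete(String word) {
        if (word.isEmpty() || counters(word)[1] == 0) {
            return;                        // the word is not stored, nothing to do
        }
        Node cur = root;
        for (char c : word.toCharArray()) {
            int idx = c - 'a';
            if (--cur.next[idx].pass == 0) {
                cur.next[idx] = null;
                return;
            }
            cur = cur.next[idx];
        }
        cur.end--;
    }

    public static void main(String[] args) {
        PassEndSketch t = new PassEndSketch();
        t.insert("hi");
        t.insert("hi");
        t.insert("hello");
        System.out.println(Arrays.toString(t.counters("h")));   // [3, 0]
        System.out.println(Arrays.toString(t.counters("hi")));  // [2, 2]
        t.delete("hi");
        System.out.println(Arrays.toString(t.counters("hi")));  // [1, 1]
        t.delete("hello");
        System.out.println(Arrays.toString(t.counters("he")));  // [0, 0]
    }
}
```

After deleting hello, the node for e is gone because its pass reached 0, while h survives with pass 1 because hi still goes through it.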

Method design

1)void insert(String str)           Add a string. The same string can be added repeatedly, one copy per call

2)void delete(String str)           Delete a string. The same string can be deleted repeatedly, one copy per call

3)int search(String str)            Query how many times the string str has been added to the structure

4)int prefixNumber(String str)      Query how many of the added strings have str as a prefix

5)int searchStr(String str)         Query how many times the fragment str appears inside the added strings, for example hello contains ll, so the count is 1
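
The "hello contains ll, count 1" example leaves one detail open: should a word that contains the fragment twice count once or twice? The sketch below is a brute-force reference for both readings, in the same spirit as a HashMap-based checker. The class and method names are hypothetical and not part of the trie itself. As far as I can tell, a trie query that adds up the pass values of every matching start node corresponds to the second reading (every occurrence counts).

```java
import java.util.Arrays;
import java.util.List;

// Brute-force reference for the fragment query; class and method names are illustrative only.
class FragmentCountReference {

    // Number of start positions at which fragment occurs in word (overlapping matches included).
    static int occurrences(String word, String fragment) {
        if (fragment.isEmpty()) {
            return 0; // by convention here, the empty fragment matches nothing
        }
        int count = 0;
        for (int at = word.indexOf(fragment); at >= 0; at = word.indexOf(fragment, at + 1)) {
            count++;
        }
        return count;
    }

    // Reading A: each word counts at most once.
    static int wordsContaining(List<String> words, String fragment) {
        int count = 0;
        for (String w : words) {
            if (occurrences(w, fragment) > 0) {
                count++;
            }
        }
        return count;
    }

    // Reading B: every occurrence counts.
    static int totalOccurrences(List<String> words, String fragment) {
        int count = 0;
        for (String w : words) {
            count += occurrences(w, fragment);
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("hello", "hi", "name", "family", "amy");
        System.out.println(wordsContaining(words, "ll"));     // 1
        System.out.println(wordsContaining(words, "am"));     // 3
        List<String> repeated = Arrays.asList("mama");
        System.out.println(wordsContaining(repeated, "ma"));  // 1
        System.out.println(totalOccurrences(repeated, "ma")); // 2
    }
}
```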

Time complexity

        1. Adding n words takes O(n*k), where n is the number of words and k is the length of a word. Because words are not very long in practice, k can be treated as a constant, which leaves O(n). Adding a single word is O(k), or effectively O(1).

        2. Deleting n words also takes O(n*k), with n and k as above, which again is effectively O(n). Deleting a single word can be regarded as O(1).

        3. Querying a word is O(k), because one loop over the characters of the string is enough to follow its path.

        4. Same as 3.

        5. Every node of the prefix tree has to be visited once, which is O(n), where n here is the number of nodes of the prefix tree. Let p be the number of nodes that match the first letter of the string. For each of these p nodes, the following letters are checked against the rest of the string, which costs at most O(k), and the counts of the nodes that match are added up. The total is therefore O(n + p*k). Because p is at most n, and k is the word length, which is assumed here not to exceed 10, the time complexity is effectively O(n).
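
The n in point 5 is the number of nodes, not the number of words. The small sketch below (TrieSizeSketch and its fields are names invented for this illustration) counts how many nodes are created while inserting, and returns the number of loop steps of one insert. It assumes lowercase a-z input.

```java
// Sketch that only tracks the shape of the tree; names are illustrative only.
class TrieSizeSketch {
    static final class Node {
        final Node[] next = new Node[26];
    }

    final Node root = new Node();
    int nodeCount = 0; // nodes created so far, root not included

    // Inserts word and returns the number of loop steps, which equals the word length k.
    int insert(String word) {
        Node cur = root;
        int steps = 0;
        for (char c : word.toCharArray()) {
            int idx = c - 'a'; // assumes c is in 'a'..'z'
            if (cur.next[idx] == null) {
                cur.next[idx] = new Node();
                nodeCount++;
            }
            cur = cur.next[idx];
            steps++;
        }
        return steps;
    }

    public static void main(String[] args) {
        TrieSizeSketch t = new TrieSizeSketch();
        System.out.println(t.insert("hi"));    // 2
        System.out.println(t.insert("hello")); // 5
        System.out.println(t.insert("help"));  // 4
        // 11 characters were inserted, but shared prefixes are stored only once
        System.out.println(t.nodeCount);       // 7
    }
}
```

The node count can never exceed the total number of inserted characters, so a full traversal as in point 5 is bounded by the size of the input.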

code

package algorithm;

import java.util.*;

public class trie  {

    /**
     * Prefix tree node
     */
    public static class Node{
        //Number of words whose path passes through this node
        public int pass;
        //Number of words that end at this node
        public int end;
        //Child nodes, one slot per lowercase letter a-z
        public Node[] nodes;
        public Node() {
            nodes = new Node[26];
        }
    }
    /**
     * trie 
     */
    public static class tree{
        private Node nodeTree;//Prefix tree root node

        public tree() {
            this.nodeTree = new Node();
        }

        //Add a string; it can be added repeatedly, one copy per call
        public void insert(String str){
            if(str.isEmpty()){
                return;
            }
            //Split string
            char[] chars = str.toCharArray();
            //Get a reference to the tree
            Node node = nodeTree;
            //Loop each character into the prefix tree
            for (int i = 0; i < chars.length; i++) {
                //ASCII ranges: a=97, z=122, A=65, Z=90
                //All letters are assumed to be lowercase, so the array subscript is the offset from 'a'
                int subscript = chars[i]-'a';
                //If there is no child node for this letter yet, create one
                if(node.nodes[subscript]==null){
                    node.nodes[subscript] = new Node();
                }
                //Skip to the corresponding node node
                node = node.nodes[subscript];
                //Passing++
                node.pass++;
            }
            node.end++;
        }
        //Delete a string, you can delete it repeatedly, one at a time
        public void delete(String str){
            if(search(str)==0){
                return;
            }
            if(str.isEmpty()){
                return;
            }
            //Split into char
            char[] chars = str.toCharArray();
            //Get the reference of node
            Node node = nodeTree;
            for (int i = 0; i < chars.length; i++) {
                //ASCII ranges: a=97, z=122, A=65, Z=90
                //All letters are assumed to be lowercase, so the array subscript is the offset from 'a'
                int subscript = chars[i]-'a';
                //Decrement pass; if it reaches 0 no word uses this node any more, so drop the whole branch
                if(--node.nodes[subscript].pass==0){
                    node.nodes[subscript] = null;
                    return;
                }
                node = node.nodes[subscript];
            }
            node.end--;
        }
        //Query how many times a string appears in the structure, that is, how many times a word appears
        public int search(String str) {
            Node node = nodeTree;
            char[] chars = str.toCharArray();
            for (int i = 0; i < chars.length; i++) {
                int subscript = chars[i]-'a';
                node = node.nodes[subscript];
                if (node == null) {
                    return 0;
                }
            }
            return node.end;
        }

        // Query how many stored strings are prefixed with str, that is, how many times the prefix has appeared
        public int prefixNumber(String str){
            Node node = nodeTree;
            char[] chars = str.toCharArray();
            for (int i = 0; i < chars.length; i++) {
                int subscript = chars[i]-'a';
                node = node.nodes[subscript];
                if (node == null) {
                    return 0;
                }
            }
            return node.pass;
        }


        /**
         * Query how many times a fragment appears inside the stored strings. For example, with hello, hi, name, family and amy stored, the fragment am appears three times.
         * Approach:
         *  1 Traverse the whole tree level by level, find every node for the first letter (a) and put it in a queue
         *  2 For each node in the queue, check whether the following letters spell out the rest (m), and accumulate the pass count of the last matched node
         * @param str the fragment to look for, lowercase letters only
         * @return the number of occurrences of str in the stored strings
         */
        public int searchStr(String str) {
            //An empty fragment has no first letter to look for
            if (str.isEmpty()) {
                return 0;
            }
            Node node = nodeTree;
            char[] chars = str.toCharArray();

            //1 first traverse the whole tree, find every node for the first letter and put it in the queue
            Queue<Node> nodeQueue = new LinkedList<>();
            Queue<Node> nodeStrQueue = new LinkedList<>();
            nodeQueue.add(node);
            int headWord = chars[0]-'a';

            while (nodeQueue.peek()!=null){
                Node node1 = nodeQueue.poll();
                for (int i = 0; i < 26; i++) {
                    //Add nodes that are not empty to the queue and continue to traverse to find nodes equal to the first letter
                    if (node1.nodes[i]!=null) {
                        //Determine whether it is equal to the first letter
                        if (i == headWord) {
                            //Put it in the queue with the first letter and wait for traversal to find out how many match
                            nodeStrQueue.add(node1.nodes[i]);
                        }
                        //Put it in the search queue and continue to find the node starting with the first letter
                        nodeQueue.add(node1.nodes[i]);
                    }
                }
            }
            //2 check whether the letters after each first-letter node match the rest of str
            int num = 0;//Accumulated count
            while (!nodeStrQueue.isEmpty()){
                Node node1 = nodeStrQueue.poll();
                //Start with the second letter
                for (int i = 1; i < chars.length && node1 != null; i++) {
                    //Find the next character; null means this start node does not match
                    node1 = node1.nodes[chars[i] - 'a'];
                }
                //If the whole fragment was matched, accumulate the pass count of its last node
                if (node1 != null) {
                    num+=node1.pass;
                }
            }
            return num;
        }

    }
    // =========================Test code for prefix tree
    // for test
    public static String generateRandomString(int strLen) {
        char[] ans = new char[(int) (Math.random() * strLen) + 1];
        for (int i = 0; i < ans.length; i++) {
            int value = (int) (Math.random() * 6);
            ans[i] = (char) (97 + value);
        }
        return String.valueOf(ans);
    }

    // for test
    public static String[] generateRandomStringArray(int arrLen, int strLen) {
        String[] ans = new String[(int) (Math.random() * arrLen) + 1];
        for (int i = 0; i < ans.length; i++) {
            ans[i] = generateRandomString(strLen);
        }
        return ans;
    }
    //Brute-force reference implementation backed by a HashMap, used to check the prefix tree
    public static class Right {

        private HashMap<String, Integer> box;

        public Right() {
            box = new HashMap<>();
        }

        public void insert(String word) {
            if (!box.containsKey(word)) {
                box.put(word, 1);
            } else {
                box.put(word, box.get(word) + 1);
            }
        }

        public void delete(String word) {
            if (box.containsKey(word)) {
                if (box.get(word) == 1) {
                    box.remove(word);
                } else {
                    box.put(word, box.get(word) - 1);
                }
            }
        }

        public int search(String word) {
            if (!box.containsKey(word)) {
                return 0;
            } else {
                return box.get(word);
            }
        }

        public int prefixNumber(String pre) {
            int count = 0;
            for (String cur : box.keySet()) {
                if (cur.startsWith(pre)) {
                    count += box.get(cur);
                }
            }
            return count;
        }
    }
    // =========================Test code for prefix tree
    public static void main(String[] args) {

        //Manual test of add and delete
        tree trie = new tree();
        trie.insert("aaa");
        trie.insert("bbb");
        trie.insert("cc");

        trie.delete("aaa");
        trie.delete("bbb");
        trie.delete("cc");

        //Find whole words
        trie.insert("hi");
        trie.insert("hi");
        trie.insert("hello");
        trie.insert("mama");
        System.out.println("hi Occurred:"+trie.search("hi"));
        System.out.println("hello Occurred:"+trie.search("hello"));
        System.out.println("ma: "+trie.search("ma"));

        //Find prefix
        System.out.println("ll Occurred:"+trie.prefixNumber("ll"));
        System.out.println("h Occurred:"+trie.prefixNumber("h"));
        System.out.println("hi Occurred:"+trie.prefixNumber("hi"));
        System.out.println("ma Occurred:"+trie.prefixNumber("ma"));

        //Find fragments
        trie.insert("myhi");
        trie.insert("ahid");
        System.out.println("hi Occurred:"+trie.searchStr("hi"));
        System.out.println("ll Occurred:"+trie.searchStr("ll"));

        /***
         * Randomized comparison against the brute-force Right class (the validator shown in the video)
         */
        int arrLen = 100;
        int strLen = 20;
        int testTimes = 100000;
        for (int i = 0; i < testTimes; i++) {
            String[] arr = generateRandomStringArray(arrLen, strLen);
            tree trie1 = new tree();
            Right right = new Right();
            for (int j = 0; j < arr.length; j++) {
                double decide = Math.random();
                if (decide < 0.25) {
                    trie1.insert(arr[j]);
                    right.insert(arr[j]);
                } else if (decide < 0.5) {
                    trie1.delete(arr[j]);
                    right.delete(arr[j]);
                } else if (decide < 0.75) {
                    int ans1 = trie1.search(arr[j]);
                    int ans3 = right.search(arr[j]);
                    if (ans1 !=  ans3) {
                        System.out.println("Oops!");
                    }
                } else {
                    int ans1 = trie1.prefixNumber(arr[j]);
                    int ans3 = right.prefixNumber(arr[j]);
                    if (ans1 != ans3) {
                        System.out.println("Oops!");
                    }
                }
            }
        }
        System.out.println("finish!");

    }
}

Tags: Algorithm

Posted on Fri, 29 Oct 2021 05:26:40 -0400 by DexterMorgan