[Algorithm] Implementing a shuffle algorithm in Python and verifying its fairness by simulation

m samples are randomly shuffled within an array of size n, and the experiment is repeated N times to check whether the shuffle algorithm is fair.

Shuffle algorithm (first version)

First, place all the samples to be shuffled at the front of the container. Then, for every position in turn, generate a random position and swap the two.

# Author: WenRuo 
# CreateDate: 2021/11/29

import random


class KnuthShuffle:
    """
    Use a one-dimensional array to check by simulation whether the shuffle results are truly random
    """

    def __init__(self, N, n, m):
        """
        Randomly distribute m samples among n containers and count how often each container receives a sample
        :param N: Number of trials
        :param n: Total number of containers
        :param m: Total number of samples
        """
        if N <= 0:
            raise ValueError("The number of trials must be greater than 0")
        if n < m:
            raise ValueError("The number of samples cannot exceed the number of containers")
        self.N = N
        self.n = n
        self.m = m

    def run(self):
        """
        Run N simulation trials: each trial first places all samples at the front of the array, then shuffles it
        :return:
        """
        # freq[i] counts how many times a sample ended up at index i (initialized to 0)
        freq = [0] * self.n
        # The container array that gets shuffled
        arr = [0] * self.n
        # Run the N simulation trials
        for i in range(self.N):
            self.reset(arr)  # Reset the array: all samples at the front
            self.shuffle(arr)  # Shuffle the array
            for j in range(self.n):
                freq[j] += arr[j]
        # Printing frequency
        print(">>>Index: probability of occurrence")
        for i in range(self.n):
            msg = "{0} : {1}".format(i, round(freq[i] / self.N, 3))
            print(msg)

        print(">>> Print the scrambled array:")
        for i in arr:
            print(i, end=" ")

        print("\n>>> Print the frequency of each location:")
        for i in freq:
            print(i, end=" ")

    def reset(self, arr):
        """
        Place all the samples at the front of the container:
        set the first m entries of arr to 1 and all remaining entries to 0
        :param arr:
        :return:
        """
        for i in range(self.m):
            arr[i] = 1
        for i in range(self.m, self.n):
            arr[i] = 0

    def shuffle(self, arr):
        """
        Naive shuffle: swap every position with a random position anywhere in the array
        """
        for i in range(self.n):
            x = int(random.random() * self.n)
            arr[i], arr[x] = arr[x], arr[i]


if __name__ == '__main__':
    N = 100000
    n = 10
    m = 5
    exp = KnuthShuffle(N, n, m)
    exp.run()

Print results:

>>>Index: probability of occurrence
0 : 0.564
1 : 0.544
2 : 0.519
3 : 0.49
4 : 0.461
5 : 0.47
6 : 0.478
7 : 0.484
8 : 0.492
9 : 0.499
>>> Print the scrambled array:
0 0 0 1 1 1 0 1 1 0 
>>> Print the frequency of each location:
56421 54446 51852 49049 46108 46970 47751 48364 49154 49885 

The probabilities range from about 0.46 to 0.56, which is far too wide a spread. The naive shuffle is biased: it does not give each position the expected 50% (m/n = 5/10) chance of holding a sample.
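One way to see where the bias comes from is to enumerate a tiny case exactly. The loop makes n independent choices among n indices, so there are n^n equally likely swap sequences but only n! permutations. For n = 3 that is 27 sequences spread over 6 permutations, and 27 is not divisible by 6, so some permutations must come up more often than others. The sketch below counts them; the helper name `naive_shuffle_outcomes` is made up for this illustration and is not part of the program above.

```python
from collections import Counter
from itertools import product


def naive_shuffle_outcomes(n):
    """Count how often each permutation results from the naive shuffle.

    Every possible sequence of random indices (n**n in total) is
    enumerated instead of sampled, so the counts are exact.
    """
    counts = Counter()
    for choices in product(range(n), repeat=n):
        arr = list(range(n))
        # Replay the naive loop: swap position i with the chosen index x.
        for i, x in enumerate(choices):
            arr[i], arr[x] = arr[x], arr[i]
        counts[tuple(arr)] += 1
    return counts


if __name__ == "__main__":
    for perm, count in sorted(naive_shuffle_outcomes(3).items()):
        print(perm, count)
```

For n = 3, three of the permutations are produced by 4 swap sequences each and the other three by 5 each, so the outcomes cannot be equally likely.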

An attempt to improve the shuffle function (second version)

Since the goal is only to scatter the m samples, swap each sample with a random position. The loop then needs to run only m times, once per sample.

def shuffle(self, arr):
    """
    Naive shuffle, second version: only the first m positions are swapped
    """
    for i in range(self.m):
        x = int(random.random() * self.n)
        arr[i], arr[x] = arr[x], arr[i]

Print results:

>>>Index: probability of occurrence
0 : 0.671
1 : 0.635
2 : 0.593
3 : 0.548
4 : 0.5
5 : 0.413
6 : 0.411
7 : 0.408
8 : 0.407
9 : 0.413
>>> Print the scrambled array:
1 0 1 0 1 0 0 0 1 1 
>>> Print the frequency of each location:
67110 63510 59333 54835 50023 41253 41092 40832 40692 41320 

The probabilities now range from about 0.41 to 0.67, a wider spread and a stronger bias than in the first version. The shuffled array looks random, but the frequencies above clearly show that this algorithm is not fair!

Both attempts above produce biased results: they cannot generate each possible outcome with equal probability. Neither shuffle is acceptable for programs that demand strict fairness!
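The measured frequencies can also be cross-checked without any random sampling. The following is a minimal sketch, not part of the original program: it follows one sample through the swap loop as a Markov chain (at step i the sample either sits at index i and jumps to a uniformly random index, or it sits elsewhere and moves to i only if the random index happens to hit it), then adds up the distributions of all m samples. The function names and the `steps` parameter are assumptions made for this example; `steps=n` corresponds to the first version and `steps=m` to the second.

```python
from fractions import Fraction


def final_position_distribution(start, n, steps):
    """Exact distribution of where the element starting at `start` ends up
    after the naive swap loop has run for i = 0 .. steps - 1."""
    dist = [Fraction(0)] * n
    dist[start] = Fraction(1)
    for i in range(steps):
        new = [Fraction(0)] * n
        for p, prob in enumerate(dist):
            if prob == 0:
                continue
            if p == i:
                # The element is at index i, so it moves to the random index x.
                for x in range(n):
                    new[x] += prob / n
            else:
                # The element moves to i only when x == p (probability 1/n).
                new[i] += prob / n
                new[p] += prob * (n - 1) / n
        dist = new
    return dist


def occupancy_probabilities(n, m, steps):
    """Probability that each index holds a sample (a 1) after the loop."""
    totals = [Fraction(0)] * n
    for start in range(m):
        dist = final_position_distribution(start, n, steps)
        for j, prob in enumerate(dist):
            totals[j] += prob
    return totals


if __name__ == "__main__":
    for label, steps in (("first version", 10), ("second version", 5)):
        probs = occupancy_probabilities(10, 5, steps)
        print(label, [round(float(p), 3) for p in probs])
```

In the second version, index 9 ends up holding a sample exactly when at least one of the five steps picks it, so its probability is 1 - 0.9^5, about 0.41, which agrees with the simulated frequencies for indices 5 to 9.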

Equal-probability shuffle algorithm (optimal solution)

Suppose there is a deck of playing cards to shuffle. First, one of the 54 cards is chosen at random and placed in the first position of the array; then one of the remaining 53 cards is chosen at random and placed in the second position, and so on.

To do this in place, without allocating extra space, suppose the array holds 54 elements: swap a randomly chosen element with the last element, then swap an element chosen at random from the remaining 53 with the second-to-last element, and so on, maintaining an index i that marks the end of the part not yet shuffled.

def shuffle(self, arr):
    """
    Fisher-Yates (Knuth) shuffle: swap position i with a random position in 0..i
    """
    for i in range(self.n - 1, -1, -1):
        x = int(random.random() * (i + 1))
        arr[i], arr[x] = arr[x], arr[i]

Print results:

>>>Index: probability of occurrence
0 : 0.501
1 : 0.499
2 : 0.5
3 : 0.501
4 : 0.499
5 : 0.5
6 : 0.497
7 : 0.5
8 : 0.501
9 : 0.503
>>> Print the scrambled array:
1 0 0 0 1 1 1 0 1 0 
>>> Print the frequency of each location:
50051 49861 49978 50139 49869 49977 49743 49976 50148 50258 

The results show that every probability stays between 0.497 and 0.503, a spread of less than 1%, so each position is equally likely to hold a sample. This is the better algorithm!
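The simulation only estimates the probabilities. For a small array the uniformity can be checked exactly. The sketch below replays the same backward loop (`i` running from n - 1 down to 0, with `x` drawn from 0..i) for every possible sequence of random indices and counts the resulting permutations. The name `fisher_yates_outcomes` is an illustrative choice and does not appear in the program above.

```python
from collections import Counter
from itertools import product


def fisher_yates_outcomes(n):
    """Count how often each permutation results from the backward swap loop.

    At step i (from n - 1 down to 0) the random index x ranges over 0..i,
    so there are n * (n - 1) * ... * 1 = n! possible index sequences.
    """
    steps = list(range(n - 1, -1, -1))
    index_ranges = [range(i + 1) for i in steps]
    counts = Counter()
    for choices in product(*index_ranges):
        arr = list(range(n))
        for i, x in zip(steps, choices):
            arr[i], arr[x] = arr[x], arr[i]
        counts[tuple(arr)] += 1
    return counts


if __name__ == "__main__":
    counts = fisher_yates_outcomes(4)
    print(len(counts), set(counts.values()))
```

For n = 4 there are 24 index sequences and 24 permutations, and each permutation is produced by exactly one sequence, so every ordering is equally likely as long as each random index is drawn uniformly.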

Tags: Algorithm

Posted on Tue, 30 Nov 2021 04:10:31 -0500 by christophe