Wednesday, August 12, 2015

Python: random (2)

Abstract: shuffling

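The random module provides a shuffling function, random.shuffle(), which shuffles a list in place using the Durstenfeld (Knuth) variant of the Fisher-Yates algorithm. The original Fisher-Yates method, which repeatedly draws a random element from the remaining pool and appends it to a new list, is another choice.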
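
For comparison with the pool-drawing version in the script, the in-place swapping idea behind the Durstenfeld variant can be sketched as follows. This is only an illustration: durstenfeld_shuffle is a made-up name, and the function returns a shuffled copy rather than modifying its argument as random.shuffle() does.

```python
import random

def durstenfeld_shuffle(items, rng=random):
    """Return a shuffled copy of items (Durstenfeld variant of Fisher-Yates)."""
    pool = list(items)
    # Walk from the last index down to 1, swapping each slot with a
    # uniformly chosen slot at or before it.
    for i in range(len(pool) - 1, 0, -1):
        j = rng.randint(0, i)  # randint is inclusive on both ends
        pool[i], pool[j] = pool[j], pool[i]
    return pool

print(durstenfeld_shuffle('qwertyuiop'))
```

Each element is moved at most once per step and nothing is removed from the middle of a list, so the work grows linearly with the length of the pool.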

The result:
In: ['q', 'w', 'e', 'r', 't', 'y', 'u', 'i', 'o', 'p']
Default method by random: ['o', 'w', 'i', 'y', 'p', 'q', 'e', 'r', 'u', 't']

Test random:
theoretical frequency: 1000
e [974, 1030, 980, 1029, 1016, 998, 1027, 1023, 959, 964]
i [1018, 1016, 985, 1020, 1014, 1006, 939, 962, 998, 1042]
o [974, 965, 1037, 1025, 1007, 987, 968, 1001, 1009, 1027]
q [957, 976, 993, 965, 1008, 1040, 1025, 959, 999, 1078]
p [1015, 1008, 969, 1022, 987, 997, 980, 1036, 1013, 973]
r [1034, 977, 984, 987, 1003, 997, 957, 1040, 1061, 960]
u [989, 1020, 1017, 1029, 914, 973, 1066, 1020, 964, 1008]
t [993, 968, 1031, 984, 1007, 971, 1023, 1007, 1022, 994]
w [1045, 1045, 972, 944, 1026, 1039, 977, 972, 1009, 971]
y [1001, 995, 1032, 995, 1018, 992, 1038, 980, 966, 983]
Errors percentage of frequency:
e [2, 3, 2, 2, 1, 0, 2, 2, 4, 3]
i [1, 1, 1, 2, 1, 0, 6, 3, 0, 4]
o [2, 3, 3, 2, 0, 1, 3, 0, 0, 2]
q [4, 2, 0, 3, 0, 4, 2, 4, 0, 7]
p [1, 0, 3, 2, 1, 0, 2, 3, 1, 2]
r [3, 2, 1, 1, 0, 0, 4, 4, 6, 4]
u [1, 2, 1, 2, 8, 2, 6, 2, 3, 0]
t [0, 3, 3, 1, 0, 2, 2, 0, 2, 0]
w [4, 4, 2, 5, 2, 3, 2, 2, 0, 2]
y [0, 0, 3, 0, 1, 0, 3, 2, 3, 1]
Mean Errors: 2.01, SD Errors: 1.68223066195
Fisher-Yates method: ['w', 'q', 'p', 'y', 'e', 'r', 'o', 'u', 't', 'i']

Test random:
theoretical frequency: 1000
e [1016, 1002, 991, 1023, 1006, 1000, 994, 953, 997, 1018]
i [997, 1028, 953, 1022, 1009, 985, 988, 1030, 1023, 965]
o [987, 1045, 940, 978, 1044, 1012, 977, 1004, 966, 1047]
q [995, 961, 1026, 1044, 1032, 984, 1006, 986, 1003, 963]
p [973, 977, 1001, 1013, 1002, 1043, 992, 992, 992, 1015]
r [1018, 981, 1037, 922, 985, 996, 1031, 996, 1000, 1034]
u [974, 1043, 1012, 1025, 981, 967, 1009, 1008, 1016, 965]
t [969, 1015, 1001, 1052, 981, 1014, 1055, 992, 968, 953]
w [1068, 991, 1006, 936, 992, 1022, 964, 1032, 992, 997]
y [1003, 957, 1033, 985, 968, 977, 984, 1007, 1043, 1043]
Errors percentage of frequency:
e [1, 0, 0, 2, 0, 0, 0, 4, 0, 1]
i [0, 2, 4, 2, 0, 1, 1, 3, 2, 3]
o [1, 4, 6, 2, 4, 1, 2, 0, 3, 4]
q [0, 3, 2, 4, 3, 1, 0, 1, 0, 3]
p [2, 2, 0, 1, 0, 4, 0, 0, 0, 1]
r [1, 1, 3, 7, 1, 0, 3, 0, 0, 3]
u [2, 4, 1, 2, 1, 3, 0, 0, 1, 3]
t [3, 1, 0, 5, 1, 1, 5, 0, 3, 4]
w [6, 0, 0, 6, 0, 2, 3, 3, 0, 0]
y [0, 4, 3, 1, 3, 2, 1, 0, 4, 4]
Mean Errors: 1.81, SD Errors: 1.72449992752
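
To judge whether mean errors of about 2% are more than sampling noise, it helps to estimate what a perfectly uniform shuffle would give. A rough sketch, assuming each count is approximately binomial with 10000 trials and probability 1/10, and using a normal approximation for the absolute deviation (expected_noise is a made-up helper name):

```python
import math

def expected_noise(trials, positions):
    """Standard deviation and mean absolute deviation of one count,
    as percentages of the theoretical frequency, for a uniform shuffle."""
    p = 1.0 / positions
    expected = trials * p
    sd = math.sqrt(trials * p * (1 - p))
    # Under a normal approximation, E|X - mean| = sd * sqrt(2/pi).
    mean_abs = sd * math.sqrt(2 / math.pi)
    return 100 * sd / expected, 100 * mean_abs / expected

sd_pct, mad_pct = expected_noise(10000, 10)
print('SD of a count: %.2f%%' % sd_pct)
print('Mean absolute error: %.2f%%' % mad_pct)
```

Under these assumptions one standard deviation of a count is about 3% and the mean absolute error is about 2.4%. The script truncates each percentage to a whole number, which pulls the reported mean down somewhat, so the observed means of 2.01 and 1.81 look consistent with a uniform shuffle for both methods.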


The script:
# -*- coding: utf-8 -*-
"""
Created on Tue Aug 11 21:02:55 2015

@author: yuan
"""

import random
import numpy as np

class shuffling:
    def __init__(self, pool):
        self.pool=pool
        self.len=len(pool)
       
    #Knuth-Durstenfeld method
    def default_method(self):
        mypool=list(self.pool)
        random.shuffle(mypool)
        return mypool
       
    #original Fisher-Yates method: draw from a shrinking pool
    def Fisher_Yates(self):
        mypool=list(self.pool)
        shuffled=[]
        for i in range(self.len):
            #pick a random remaining element and move it to the result
            index=random.randrange(len(mypool))
            shuffled.append(mypool.pop(index))
        return shuffled
       
    def random_testing(self, permutation, FUN, *args):
        theo_freq=permutation/self.len
        #initiate counting with a dictionary of per-position tallies
        counting={}
        for a in self.pool:
            counting[a]=[0]*self.len
        #print counting['q'][3]  
        #run the shuffle repeatedly and tally each element's position
        for i in range(permutation):
            shuffled=FUN()
            for index,element in enumerate(shuffled):
                counting[element][index] +=1
        print 'theoretical frequency:', theo_freq
        for e in counting.keys():
            print e, counting[e]
        print 'Errors percentage of frequency:'
        err=[]
        for e in counting.keys():
            #integer division truncates each percentage to a whole number
            err_perc=[abs(x-theo_freq)*100/theo_freq for x in counting[e]]
            print e, err_perc
            err.extend(err_perc)
        err=np.array(err)
        print 'Mean Errors: %s, SD Errors: %s' % (np.mean(err), np.std(err))
        return counting
               

if __name__=="__main__":
    pool=list('qwertyuiop')#ten characters
    print 'In:', pool
    s=shuffling(pool)
    #method 1:
    print 'Default method by random: %s' % s.default_method()
    #test random, should be 1000 times at each position
    print '\nTest random:'
    s.random_testing(10000,s.default_method)
   
    #method 2:
    print 'Fisher-Yates method: %s' % s.Fisher_Yates()
    #test random, should be 1000 times at each position
    print '\nTest random:'
    s.random_testing(10000,s.Fisher_Yates)
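
The tallies returned by random_testing can also be checked with a chi-square goodness-of-fit statistic, one per letter. A minimal sketch, assuming the tabulated 5% critical value of 16.919 for 9 degrees of freedom (ten positions minus one). chi_square is a hypothetical helper, not part of the script above, and the row is copied from the random.shuffle() results table.

```python
def chi_square(observed, expected):
    """Pearson chi-square statistic for a list of observed counts."""
    return sum((o - expected) ** 2 / float(expected) for o in observed)

# Row 'e' from the random.shuffle() test results.
row_e = [974, 1030, 980, 1029, 1016, 998, 1027, 1023, 959, 964]
stat = chi_square(row_e, 1000)
CRITICAL_5PCT_DF9 = 16.919  # chi-square table value, 9 degrees of freedom
print('chi-square: %.3f, uniform not rejected: %s' % (stat, stat < CRITICAL_5PCT_DF9))
```

For this row the statistic is well below the critical value, so the counts give no reason to doubt that 'e' lands in every position equally often. The same check could be repeated for each key of the returned dictionary.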
   
   
