Cryptography secure adecuate randomness means that is officially random, don't have any patterns that anybody could use for brake the algorithm, if you are making one time pad, you have a pseudorandom generator that has patterns.
If we have patterns or periodicity we have a vulnerability in the one time pad and somebody can synchronize for have my keys that I made with the key generator, and people could get the sequence of the key.
In this report, I'm going to explain and made different test for randomness, and demonstrate if I have a secure or insecure system, at the end I'm going to make a conclusion about it.
First at all, I'm going to make a statistical test for randomness, as you need to know, randomness is a probabilistic property, so we can use different test for make sure that my data have a true random data, I made two different test, a frequency test and a Runs test, I implemented in python, this is my code that generate key using random.choice and select a 15 symbols in ascii then I translate those into integer for made an XOR in the one time pad implementation.
In the frecuency test this are the steps:
- We need to compute the test statistic: Sobs = abs(s)/sqrt(n) where sn are sum of all the zeros in -1.
- Then, we need to compute P-value = erfc(Sobs/sqrt(2)), where erfc is the complementary error function.
- At the end, we need to know if P-value is small (< 0.01), accept the sequence as random
In runs test this are the steps:
- We need to compute the pre-test proportion pi of ones in the input sequence:
- sum_pi = SUM(number zeros-ones)/n
- Compute the test statistic Vobs = SUM(r(k))+1
- When r(k)=0 if the value of array of number zeros-ones it's the same that the next one, and r(k) = 1 otherwise.
- Compute P-value = erfc( abs(Vobs-2n*sum_pi*(1-sum_pi))/2*sqrt(2n)*sum_pi*1-sum_pi)
- If >= 0.01, accept the sequence as random
In based on Secure Telecomunication System slides[1], the recommendation for made the test is for both, Runs and frequency, has a input size minimum 100, so this are the result.
Then, I change the random import from python to randint from numpy, I run the same test and I have the same result, sequence are not accept as random by frequency test, as well as runs test.
Then, I made a comparison between have a compress file with 100000 random numbers and make zip, for pass this test, they cannot be smaller because, as you know, the compress algorithm look for repeat sequence and replace long sequences with short sequences, we can look at the weight compress by kb.
This are the result:
So in the test, we don't have a good random sequence again, because we can compress the file about 46 %, we can conclude, that my implementation is insecure because it not truly random because we can compress the file.
So, for make a conclusion, we can see that in the three test, we don't have a good response and we have a one time-pad very insecure, so we need to develop with a truly random generator key for have a good implementation, only change the key generator, because in the algorithm, we don't have problem, we have problem at making the keys, we need to know that most of the computation random generators are called pseudo-random, because they use a deterministic algorithm, in most of the case is OK for make programs but, in this case, make the one time pad insecure, so, I investigate and we can use a random generate by OS like os.urandom() in python, but never we can have a truly random generator by computer, only if we can have something mechanical like random.org or maybe in the future.
Bibliography
[1] Slides of Secure Telecommunication System here.