# ppt

## Transcription


### Revisions to the Spectral Test and the Lempel-Ziv Compression Test in the NIST Statistical Test Suite

National Institute of Information and Communications Technology, JAPAN

Song-Ju Kim and Ken Umeno (ChaosWare Inc.)

It is well known that the NIST Statistical Test Suite was used in the evaluation of the AES candidate algorithms. It is also used worldwide by external audiences to evaluate their pseudorandom number generators.

### The NIST Statistical Test Suite

“A Statistical Test Suite for Random and Pseudorandom Number Generators for Cryptographic Applications”, National Institute of Standards and Technology (2001), http://csrc.nist.gov/rng/

| Number | Test Name |
|---|---|
| 1 | Frequency |
| 2 | Block Frequency |
| 3 | Runs |
| 4 | Longest Run |
| 5 | Binary Matrix Rank |
| 6 | Discrete Fourier Transform |
| 7 | Non-overlapping Template Matching |
| 8 | Overlapping Template Matching |
| 9 | Universal |
| 10 | Lempel Ziv Compression |
| 11 | Linear Complexity |
| 12 | Serial |
| 13 | Approximate Entropy |
| 14 | Cumulative Sums |
| 15 | Random Excursions |
| 16 | Random Excursions Variant |

### OUTLINE

- On the NIST Statistical Test Suite
- Test Results (AES, SHA-1, and MUGI)
- Checking of the Uniformity of P-values
- Corrections to the Spectral (DFT) Test
- Corrections to the LZC Test
- Summary

### The test procedure

A set of sequences, each of length n, is produced from the selected generator. Each statistical test evaluates the sequence and returns one or more P-values. If the P-value ≥ α (= 0.01), then we call the sequence a “success”.

1. Checking of the success rate.
2. Checking of the uniformity of the distribution of P-values.

### What is a P-value?

P-value: the probability that a perfect random number generator would have produced a sequence less random than the sequence that was tested.

### 1. Checking the success rate

The range of acceptable proportions:

$$(1-\alpha) \pm 3\sqrt{\frac{\alpha(1-\alpha)}{m}}$$

※ $(\mu \pm 3\sigma)/m$: the 99.73% range of the binomial distribution, where $\mu = m(1-\alpha)$ and $\sigma = \sqrt{m\alpha(1-\alpha)}$.

α = 0.01: significance level

### Success Rate (Example)

[Figure: success rate plots for Key 1 and Key 4]
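The range of acceptable proportions can be computed directly from m and α. The following is a minimal sketch, assuming a two-sided check as the range is stated on the slide; the function names `acceptable_proportion_range` and `success_rate_passes` are hypothetical and are not part of the NIST suite.

```python
import math


def acceptable_proportion_range(m, alpha=0.01):
    """Return (low, high) bounds on the success rate for m tested sequences.

    Implements (1 - alpha) +/- 3 * sqrt(alpha * (1 - alpha) / m),
    i.e. (mu +/- 3 sigma) / m for a binomial count of successes.
    """
    center = 1.0 - alpha
    half_width = 3.0 * math.sqrt(alpha * (1.0 - alpha) / m)
    return center - half_width, center + half_width


def success_rate_passes(p_values, alpha=0.01):
    """Check whether the observed success rate lies in the acceptable range.

    A sequence counts as a "success" when its P-value is >= alpha.
    Assumption: proportions above the upper bound are also rejected.
    """
    m = len(p_values)
    successes = sum(1 for p in p_values if p >= alpha)
    low, high = acceptable_proportion_range(m, alpha)
    return low <= successes / m <= high
```

For m = 1000 and α = 0.01 the range is roughly 0.9806 to 0.9994, so a test with 990 successes out of 1000 sequences is accepted and one with 970 is not.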
### 2. Checking the uniformity of the P-value distribution

The interval [0,1] is divided into 10 sub-intervals, and the P-values that lie within each sub-interval are counted ($F_i$).

P-value of P-values: $\mathrm{IGMC}(9/2,\ \chi^2/2)$, where

$$\mathrm{IGMC}(n, x) = \frac{1}{\Gamma(n)} \int_x^{\infty} e^{-t}\, t^{n-1}\, dt \quad\text{and}\quad \chi^2 = \sum_{i=1}^{10} \frac{(F_i - m/10)^2}{m/10}.$$

The test passes if the P-value of P-values ≥ 0.0001.

### Uniformity of P-values (Example)

[Figure: P-value histograms for Key 1 (fail) and Key 4 (pass)]

### The parameters we used

| Test name | Block length |
|---|---|
| Block Frequency | 20000 |
| Template Matching | 9 |
| Universal (Initialization Steps) | 7 (1280) |
| Linear Complexity | 500 (5000) |
| Serial | 10 |
| Approximate Entropy | 10 |

n = 10^6, α = 0.01, 1000 samples

10 keys × 1000 samples × 10^6 bits (sequence length): 10^10 bits in total

### Test Results: AES (OFB)

| Key | Success Rate | Uniformity |
|---|---|---|
| 1 | pass | pass |
| 2 | pass | pass |
| 3 | REX | pass |
| 4 | pass | pass |
| 5 | NOTM(2) | Lempel-Ziv |
| 6 | CUSUM | Lempel-Ziv |
| 7 | NOTM, OTM | pass |
| 8 | pass | pass |
| 9 | pass | Lempel-Ziv |
| 10 | pass | Lempel-Ziv |

### Test Results: SHA-1

| Key | Success Rate | Uniformity |
|---|---|---|
| 1 | pass | pass |
| 2 | pass | Lempel-Ziv |
| 3 | NOTM(2) | pass |
| 4 | NOTM(2) | FFT |
| 5 | pass | Lempel-Ziv |
| 6 | NOTM, REX, REXV | pass |
| 7 | NOTM(2) | pass |
| 8 | NOTM | pass |
| 9 | pass | pass |
| 10 | pass | Lempel-Ziv |

### Test Results: MUGI

| Key | Success Rate | Uniformity |
|---|---|---|
| 1 | NOTM | pass |
| 2 | pass | Lempel-Ziv |
| 3 | Lempel-Ziv | Lempel-Ziv |
| 4 | pass | pass |
| 5 | NOTM | pass |
| 6 | pass | pass |
| 7 | pass | pass |
| 8 | pass | pass |
| 9 | NOTM | pass |
| 10 | pass | FFT |

If we focus on the uniformity of P-values, only the DFT test and the LZC test fail frequently. If we choose a sample size m greater than 10000, we cannot find any PRNG that passes these two tests.

### P-value of P-values (SHA-1)

These distributions of P-values indicate an apparent deviation from randomness, although we use a well-known good PRBG (SHA-1). This observation suggests that the settings of these two tests are not accurate.

### The DFT test

Test description (NIST document):

- The zeros and ones of the input sequence are converted to values of −1 and +1 to create the sequence X.
- Apply a DFT on X to produce S = DFT(X).
- Calculate M = modulus(S′), where S′ is the substring consisting of the first n/2 elements in S.
- Compute $T = \sqrt{3n}$: the 95% peak height threshold value.
- Compute $N_0 = 0.95n/2$.
- Compute $N_1$ = the actual observed number of peaks in M that are less than T.
- Compute $d = \dfrac{N_1 - N_0}{\sqrt{n(0.95)(0.05)/2}}$.
- Compute P-value = $\mathrm{erfc}\left(\dfrac{|d|}{\sqrt{2}}\right)$.

### The probability distribution (SHA-1)

[Figure: probability distribution for SHA-1, 300,000 samples; labels: $\sqrt{2.995732274n}$, $\sqrt{3n}$, $npq/4$, $npq/2$]

### The LZC test

Test description (NIST document):

- Parse the sequence into consecutive, disjoint and distinct words that will form a “dictionary” of words in the sequence, e.g. 0|1|00|01|000|11|011|
- Compute P-value = $\dfrac{1}{2}\,\mathrm{erfc}\left(\dfrac{\mu - W_{obs}}{\sqrt{2\sigma^2}}\right)$, where $W_{obs}$ is the number of words found.

### The probability distribution (SHA-1)

$\mu = 69588.09$, $\sigma_L^2 = 75.574336518$, $\sigma_R^2 = 72.42178447$

Despite the best fitting of the distribution, the uniformity of P-values cannot be improved. This is because the distribution of the number of words is too narrow. In other words, the variety of P-values that can appear is limited.

### The effect of discreteness

Because the variety of P-values that appear in the central bins is too scarce, we never obtain uniformity of P-values in this situation. The histogram of P-values always has some biases even if we use a good PRNG. However, these biases are always the same if we use a good PRNG.

### Checking of Uniformity (LZ)

$$\chi^2 = \sum_{i=1}^{10} \frac{(F_i - m/10)^2}{m/10} \;\longrightarrow\; \chi^2 = \sum_{i=1}^{10} \frac{(F_i - mS_i)^2}{mS_i}$$

| $i$ | $S_i$ |
|---|---|
| 1 | 0.1097085 |
| 2 | 0.0791270 |
| 3 | 0.1076910 |
| 4 | 0.0844650 |
| 5 | 0.1369235 |
| 6 | 0.0911150 |
| 7 | 0.0858035 |
| 8 | 0.1098615 |
| 9 | 0.1028565 |
| 10 | 0.0924485 |

[Figures: P-value of P-values (before) and P-value of P-values (after)]

### Summary

We corrected two points for the DFT test:

- (1) the threshold: $T = \sqrt{3n} \to T = \sqrt{2.995732274n}$
- (2) the variance of the theoretical distribution: $\sigma^2 = npq/2 \to \sigma^2 = npq/4$

We corrected two points for the LZ test:

- (1) the setting of the standard distribution (asymmetric), which has no algorithm dependence;
- (2) the re-definition of the uniformity of P-values.
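The re-defined uniformity check for the LZ test replaces the expected count m/10 in each bin by m·S_i. The following is a minimal sketch of the "P-value of P-values" under both definitions, assuming the S_i values listed on the slide. The function names are hypothetical, and IGMC(9/2, x) is evaluated with the standard closed form for half-integer arguments rather than with any routine from the NIST suite.

```python
import math

# Expected bin probabilities S_1..S_10 for the LZ test, as listed on the slide.
LZ_BIN_PROBS = [0.1097085, 0.0791270, 0.1076910, 0.0844650, 0.1369235,
                0.0911150, 0.0858035, 0.1098615, 0.1028565, 0.0924485]


def igamc_9_2(x):
    """Regularized upper incomplete gamma function IGMC(9/2, x).

    Uses the identity Q(n + 1/2, x) = erfc(sqrt(x))
    + exp(-x) * sum_{k=1..n} x**(k - 1/2) / Gamma(k + 1/2), here with n = 4.
    """
    if x <= 0.0:
        return 1.0
    total = math.erfc(math.sqrt(x))
    for k in range(1, 5):
        total += math.exp(-x) * x ** (k - 0.5) / math.gamma(k + 0.5)
    return total


def uniformity_p_value(p_values, bin_probs=None):
    """P-value of P-values over 10 bins of [0, 1].

    bin_probs=None reproduces the original check (expected count m/10);
    passing LZ_BIN_PROBS gives the re-defined check (expected count m * S_i).
    """
    if bin_probs is None:
        bin_probs = [0.1] * 10
    m = len(p_values)
    counts = [0] * 10
    for p in p_values:
        counts[min(int(p * 10), 9)] += 1  # a P-value of exactly 1.0 goes in the last bin
    chi2 = sum((f - m * s) ** 2 / (m * s) for f, s in zip(counts, bin_probs))
    return igamc_9_2(chi2 / 2.0)
```

A set of P-values passes when the returned value is at least 0.0001. Keeping the bin probabilities as a parameter lets the same routine serve every test in the suite, with only the LZ test supplying its own S_i.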