
The DIEHARD suite is a set of 13 tests adopted by NIST (National Institute of Standards and Technology) to evaluate programs and equipment whose purpose is to generate random numbers.

 

Such numbers are widely used in cryptography and other sciences, but precisely because they are involved in matters of security and privacy, they must not be "predictable". Otherwise, an attacker with access to the algorithm could know exactly which numbers were generated and then derive the protected information.
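To make the predictability concern concrete, here is a minimal Python sketch. It assumes the attacker knows both the algorithm and its seed; the helper name `first_outputs` and the seed value are hypothetical, and Python's built-in Mersenne Twister stands in for any deterministic generator (it is not the CIFRA EXTREMA generator).

```python
import random


def first_outputs(seed, count=5):
    """Return the first `count` 32-bit outputs of a generator seeded with `seed`."""
    # random.Random is a deterministic Mersenne Twister; it is used here only
    # to illustrate reproducibility and is not suitable for cryptography.
    rng = random.Random(seed)
    return [rng.getrandbits(32) for _ in range(count)]


if __name__ == "__main__":
    victim = first_outputs(seed=2001)
    attacker = first_outputs(seed=2001)  # same algorithm, same seed
    # The attacker reproduces the victim's sequence exactly.
    print(victim == attacker)
```

Two runs with the same seed yield the same sequence, which is why a generator meant for security use must not expose or leak its internal state.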

 

DIEHARD is extensive and thorough. It does not look for repeated numbers (which, after all, do not compromise randomness) but for recurring patterns that could reveal how the numbers were generated and so allow someone to guess which numbers will be generated, in whatever order.

 

In general, programming languages provide their own random number generators, called "pseudo-generators" because, being programs built on mathematical formulas, they are understood not to be "true random generators". True randomness can only be obtained from something totally unpredictable, which is much harder to achieve than it seems.

 

Therefore NIST, during the Cold War, began to establish tests to detect which pseudo-generators are effective and which are not. These tests have been continually refined and elaborated, since the subject directly involves systems used by the Armed Forces and by intelligence and national security agencies.

 

In this PDF document we see that an analysis presented in 2001, at the annual meeting of the American Statistical Association, used DIEHARD to assess the effectiveness of the main pseudo-generators used in business (Excel, Access, ACL, RAT-STAT). The result was that all of them, with the exception of RAT-STAT, failed some of the tests performed.

 

Below is the result of a session of the CIFRA EXTREMA generator with 2,560,000 numbers, followed further down by the explanation of each test executed.

 

The texts that follow are a faithful copy of the generated files, with no editing or post-production. On request, we can provide a copy of the files shown below.

 

 

CIFRA EXTREMA - 2,560,000 NUMBERS

 

 

 BIRTHDAY SPACINGS TEST, M= 512 N=2**24 LAMBDA=  2.0000
           cxt-random.bin    using bits  1 to 24 p-value=  .035512
           cxt-random.bin    using bits  2 to 25 p-value=  .735541
           cxt-random.bin    using bits  3 to 26 p-value=  .014013
           cxt-random.bin    using bits  4 to 27 p-value=  .284940
           cxt-random.bin    using bits  5 to 28 p-value=  .271352
           cxt-random.bin    using bits  6 to 29 p-value=  .985182
           cxt-random.bin    using bits  7 to 30 p-value=  .884997
           cxt-random.bin    using bits  8 to 31 p-value=  .549738
           cxt-random.bin    using bits  9 to 32 p-value=  .844081
   The 9 p-values were
        .035512   .735541   .014013   .284940   .271352
        .985182   .884997   .549738   .844081
  A KSTEST for the 9 p-values yields  .478628
--------------------------------------------------------------------------------
           OPERM5 test for file cxt-random.bin   
 chisquare for 99 degrees of freedom=126.470; p-value= .967304
           OPERM5 test for file cxt-random.bin   
 chisquare for 99 degrees of freedom= 85.684; p-value= .172365
--------------------------------------------------------------------------------
    Binary rank test for cxt-random.bin   
         Rank test for 31x31 binary matrices:
        rows from leftmost 31 bits of each 32-bit integer
      rank   observed  expected (o-e)^2/e  sum
        28       215     211.4   .060688     .061
        29      5091    5134.0   .360319     .421
        30     23261   23103.0  1.079909    1.501
        31     11433   11551.5  1.216120    2.717
  chisquare= 2.717 for 3 d. of f.; p-value= .612301
    Binary rank test for cxt-random.bin   
         Rank test for 32x32 binary matrices:
        rows from leftmost 32 bits of each 32-bit integer
      rank   observed  expected (o-e)^2/e  sum
        29       218     211.4   .204914     .205
        30      5154    5134.0   .077832     .283
        31     23116   23103.0   .007262     .290
        32     11512   11551.5   .135236     .425
  chisquare=  .425 for 3 d. of f.; p-value= .322187
--------------------------------------------------------------------------------
 b-rank test for bits  1 to  8 p=1-exp(-SUM/2)= .35488
 b-rank test for bits  2 to  9 p=1-exp(-SUM/2)= .97368
 b-rank test for bits  3 to 10 p=1-exp(-SUM/2)= .27099
 b-rank test for bits  4 to 11 p=1-exp(-SUM/2)= .14104
 b-rank test for bits  5 to 12 p=1-exp(-SUM/2)= .23149
 b-rank test for bits  6 to 13 p=1-exp(-SUM/2)= .34509
 b-rank test for bits  7 to 14 p=1-exp(-SUM/2)= .85622
 b-rank test for bits  8 to 15 p=1-exp(-SUM/2)= .45829
 b-rank test for bits  9 to 16 p=1-exp(-SUM/2)= .67986
 b-rank test for bits 10 to 17 p=1-exp(-SUM/2)= .89331
 b-rank test for bits 11 to 18 p=1-exp(-SUM/2)= .24424
 b-rank test for bits 12 to 19 p=1-exp(-SUM/2)= .93987
 b-rank test for bits 13 to 20 p=1-exp(-SUM/2)= .18864
 b-rank test for bits 14 to 21 p=1-exp(-SUM/2)= .47907
 b-rank test for bits 15 to 22 p=1-exp(-SUM/2)= .90854
 b-rank test for bits 16 to 23 p=1-exp(-SUM/2)= .88570
 b-rank test for bits 17 to 24 p=1-exp(-SUM/2)= .99514
 b-rank test for bits 18 to 25 p=1-exp(-SUM/2)= .61827
 b-rank test for bits 19 to 26 p=1-exp(-SUM/2)= .90697
 b-rank test for bits 20 to 27 p=1-exp(-SUM/2)= .60637
 b-rank test for bits 21 to 28 p=1-exp(-SUM/2)= .33981
 b-rank test for bits 22 to 29 p=1-exp(-SUM/2)= .26243
 b-rank test for bits 23 to 30 p=1-exp(-SUM/2)= .41706
 b-rank test for bits 24 to 31 p=1-exp(-SUM/2)= .79093
 b-rank test for bits 25 to 32 p=1-exp(-SUM/2)= .08200
   TEST SUMMARY, 25 tests on 100,000 random 6x8 matrices
 These should be 25 uniform [0,1] random variables:
     .354877     .973679     .270988     .141044     .231494
     .345092     .856219     .458291     .679856     .893314
     .244242     .939868     .188643     .479071     .908545
     .885700     .995142     .618273     .906965     .606369
     .339812     .262428     .417063     .790926     .082005
   brank test summary for cxt-random.bin   
       The KS test for those 25 supposed UNI's yields
                    KS p-value= .679094
--------------------------------------------------------------------------------
  No. missing words should average  141909. with sigma=428.
 tst no  1:  141556 missing words,    -.83 sigmas from mean, p-value= .20453
 tst no  2:  141612 missing words,    -.69 sigmas from mean, p-value= .24362
 tst no  3:  141379 missing words,   -1.24 sigmas from mean, p-value= .10766
 tst no  4:  142003 missing words,     .22 sigmas from mean, p-value= .58662
 tst no  5:  141187 missing words,   -1.69 sigmas from mean, p-value= .04574
 tst no  6:  142134 missing words,     .52 sigmas from mean, p-value= .70019
 tst no  7:  141615 missing words,    -.69 sigmas from mean, p-value= .24583
 tst no  8:  142526 missing words,    1.44 sigmas from mean, p-value= .92518
 tst no  9:  142598 missing words,    1.61 sigmas from mean, p-value= .94620
 tst no 10:  142006 missing words,     .23 sigmas from mean, p-value= .58935
 tst no 11:  141759 missing words,    -.35 sigmas from mean, p-value= .36271
 tst no 12:  140636 missing words,   -2.98 sigmas from mean, p-value= .00146
 tst no 13:  142315 missing words,     .95 sigmas from mean, p-value= .82839
 tst no 14:  142613 missing words,    1.64 sigmas from mean, p-value= .94992
 tst no 15:  141383 missing words,   -1.23 sigmas from mean, p-value= .10940
 tst no 16:  141711 missing words,    -.46 sigmas from mean, p-value= .32154
 tst no 17:  141739 missing words,    -.40 sigmas from mean, p-value= .34533
 tst no 18:  141701 missing words,    -.49 sigmas from mean, p-value= .31322
 tst no 19:  141663 missing words,    -.58 sigmas from mean, p-value= .28247
 tst no 20:  141384 missing words,   -1.23 sigmas from mean, p-value= .10984
--------------------------------------------------------------------------------
    OPSO for cxt-random.bin    using bits 23 to 32        142502  2.044  .9795
    OPSO for cxt-random.bin    using bits 22 to 31        142053   .495  .6898
    OPSO for cxt-random.bin    using bits 21 to 30        142037   .440  .6701
    OPSO for cxt-random.bin    using bits 20 to 29        141899  -.036  .4858
    OPSO for cxt-random.bin    using bits 19 to 28        141790  -.411  .3404
    OPSO for cxt-random.bin    using bits 18 to 27        141887  -.077  .4693
    OPSO for cxt-random.bin    using bits 17 to 26        142104   .671  .7490
    OPSO for cxt-random.bin    using bits 16 to 25        142115   .709  .7609
    OPSO for cxt-random.bin    using bits 15 to 24        141861  -.167  .4338
    OPSO for cxt-random.bin    using bits 14 to 23        142179   .930  .8238
    OPSO for cxt-random.bin    using bits 13 to 22        142234  1.120  .8685
    OPSO for cxt-random.bin    using bits 12 to 21        141511 -1.374  .0848
    OPSO for cxt-random.bin    using bits 11 to 20        141733  -.608  .2716
    OPSO for cxt-random.bin    using bits 10 to 19        141786  -.425  .3353
    OPSO for cxt-random.bin    using bits  9 to 18        141576 -1.149  .1252
    OPSO for cxt-random.bin    using bits  8 to 17        142095   .640  .7390
    OPSO for cxt-random.bin    using bits  7 to 16        141217 -2.387  .0085
    OPSO for cxt-random.bin    using bits  6 to 15        141735  -.601  .2739
    OPSO for cxt-random.bin    using bits  5 to 14        141909  -.001  .4995
    OPSO for cxt-random.bin    using bits  4 to 13        142111   .695  .7566
    OPSO for cxt-random.bin    using bits  3 to 12        142741  2.868  .9979
    OPSO for cxt-random.bin    using bits  2 to 11        142368  1.582  .9431
    OPSO for cxt-random.bin    using bits  1 to 10        142097   .647  .7412
    OQSO for cxt-random.bin    using bits 28 to 32        142076   .565  .7140
    OQSO for cxt-random.bin    using bits 27 to 31        141316 -2.011  .0221
    OQSO for cxt-random.bin    using bits 26 to 30        142139   .779  .7819
    OQSO for cxt-random.bin    using bits 25 to 29        141683  -.767  .2215
    OQSO for cxt-random.bin    using bits 24 to 28        141940   .104  .5414
    OQSO for cxt-random.bin    using bits 23 to 27        141867  -.143  .4430
    OQSO for cxt-random.bin    using bits 22 to 26        141952   .145  .5575
    OQSO for cxt-random.bin    using bits 21 to 25        141896  -.045  .4820
    OQSO for cxt-random.bin    using bits 20 to 24        141951   .141  .5562
    OQSO for cxt-random.bin    using bits 19 to 23        141861  -.164  .4349
    OQSO for cxt-random.bin    using bits 18 to 22        141680  -.777  .2185
    OQSO for cxt-random.bin    using bits 17 to 21        141598 -1.055  .1456
    OQSO for cxt-random.bin    using bits 16 to 20        142237  1.111  .8667
    OQSO for cxt-random.bin    using bits 15 to 19        142193   .962  .8319
    OQSO for cxt-random.bin    using bits 14 to 18        142069   .541  .7058
    OQSO for cxt-random.bin    using bits 13 to 17        141857  -.177  .4296
    OQSO for cxt-random.bin    using bits 12 to 16        142391  1.633  .9487
    OQSO for cxt-random.bin    using bits 11 to 15        141905  -.015  .4941
    OQSO for cxt-random.bin    using bits 10 to 14        142016   .362  .6412
    OQSO for cxt-random.bin    using bits  9 to 13        142262  1.195  .8841
    OQSO for cxt-random.bin    using bits  8 to 12        141203 -2.394  .0083
    OQSO for cxt-random.bin    using bits  7 to 11        141695  -.727  .2338
    OQSO for cxt-random.bin    using bits  6 to 10        141974   .219  .5868
    OQSO for cxt-random.bin    using bits  5 to  9        141837  -.245  .4032
    OQSO for cxt-random.bin    using bits  4 to  8        141744  -.560  .2876
    OQSO for cxt-random.bin    using bits  3 to  7        142073   .555  .7105
    OQSO for cxt-random.bin    using bits  2 to  6        141733  -.598  .2750
    OQSO for cxt-random.bin    using bits  1 to  5        141725  -.625  .2660
     DNA for cxt-random.bin    using bits 31 to 32        141794  -.340  .3669
     DNA for cxt-random.bin    using bits 30 to 31        141726  -.541  .2943
     DNA for cxt-random.bin    using bits 29 to 30        142492  1.719  .9572
     DNA for cxt-random.bin    using bits 28 to 29        142019   .324  .6268
     DNA for cxt-random.bin    using bits 27 to 28        141953   .129  .5513
     DNA for cxt-random.bin    using bits 26 to 27        142153   .719  .7639
     DNA for cxt-random.bin    using bits 25 to 26        141589  -.945  .1723
     DNA for cxt-random.bin    using bits 24 to 25        141913   .011  .5043
     DNA for cxt-random.bin    using bits 23 to 24        142015   .312  .6224
     DNA for cxt-random.bin    using bits 22 to 23        142429  1.533  .9374
     DNA for cxt-random.bin    using bits 21 to 22        142214   .899  .8156
     DNA for cxt-random.bin    using bits 20 to 21        142254  1.017  .8454
     DNA for cxt-random.bin    using bits 19 to 20        141453 -1.346  .0891
     DNA for cxt-random.bin    using bits 18 to 19        142116   .610  .7290
     DNA for cxt-random.bin    using bits 17 to 18        141789  -.355  .3613
     DNA for cxt-random.bin    using bits 16 to 17        142102   .568  .7151
     DNA for cxt-random.bin    using bits 15 to 16        141786  -.364  .3580
     DNA for cxt-random.bin    using bits 14 to 15        142360  1.329  .9081
     DNA for cxt-random.bin    using bits 13 to 14        142129   .648  .7415
     DNA for cxt-random.bin    using bits 12 to 13        141874  -.104  .4585
     DNA for cxt-random.bin    using bits 11 to 12        141721  -.556  .2893
     DNA for cxt-random.bin    using bits 10 to 11        141821  -.261  .3972
     DNA for cxt-random.bin    using bits  9 to 10        141665  -.721  .2355
     DNA for cxt-random.bin    using bits  8 to  9        142022   .332  .6302
     DNA for cxt-random.bin    using bits  7 to  8        142121   .624  .7338
     DNA for cxt-random.bin    using bits  6 to  7        142499  1.739  .9590
     DNA for cxt-random.bin    using bits  5 to  6        141678  -.682  .2475
     DNA for cxt-random.bin    using bits  4 to  5        141761  -.438  .3309
     DNA for cxt-random.bin    using bits  3 to  4        141693  -.638  .2617
     DNA for cxt-random.bin    using bits  2 to  3        141713  -.579  .2812
     DNA for cxt-random.bin    using bits  1 to  2        142050   .415  .6609
--------------------------------------------------------------------------------
   Test results for cxt-random.bin   
 Chi-square with 5^5-5^4=2500 d.of f. for sample size:2560000
                               chisquare  equiv normal  p-value
  Results fo COUNT-THE-1's in successive bytes:
 byte stream for cxt-random.bin     2475.56      -.346      .364821
 byte stream for cxt-random.bin     2364.67     -1.914      .027820
--------------------------------------------------------------------------------
 Chi-square with 5^5-5^4=2500 d.of f. for sample size: 256000
                      chisquare  equiv normal  p value
  Results for COUNT-THE-1's in specified bytes:
           bits  1 to  8  2466.18      -.478      .316243
           bits  2 to  9  2602.48      1.449      .926379
           bits  3 to 10  2505.68       .080      .532036
           bits  4 to 11  2469.92      -.425      .335258
           bits  5 to 12  2532.22       .456      .675664
           bits  6 to 13  2525.81       .365      .642424
           bits  7 to 14  2338.06     -2.290      .011007
           bits  8 to 15  2404.06     -1.357      .087424
           bits  9 to 16  2553.03       .750      .773342
           bits 10 to 17  2391.88     -1.529      .063127
           bits 11 to 18  2614.83      1.624      .947810
           bits 12 to 19  2516.63       .235      .592986
           bits 13 to 20  2587.77      1.241      .892757
           bits 14 to 21  2442.35      -.815      .207467
           bits 15 to 22  2669.15      2.392      .991626
           bits 16 to 23  2659.92      2.262      .988138
           bits 17 to 24  2436.99      -.891      .186432
           bits 18 to 25  2430.91      -.977      .164277
           bits 19 to 26  2626.67      1.791      .963387
           bits 20 to 27  2356.54     -2.029      .021238
           bits 21 to 28  2683.73      2.598      .995316
           bits 22 to 29  2558.77       .831      .797068
           bits 23 to 30  2534.21       .484      .685732
           bits 24 to 31  2541.19       .582      .719869
           bits 25 to 32  2553.49       .756      .775313
--------------------------------------------------------------------------------
           CDPARK: result of ten tests on file cxt-random.bin   
            Of 12,000 tries, the average no. of successes
                 should be 3523 with sigma=21.9
            Successes: 3522    z-score:  -.046 p-value: .481790
            Successes: 3549    z-score:  1.187 p-value: .882429
            Successes: 3540    z-score:   .776 p-value: .781201
            Successes: 3523    z-score:   .000 p-value: .500000
            Successes: 3481    z-score: -1.918 p-value: .027568
            Successes: 3560    z-score:  1.689 p-value: .954438
            Successes: 3535    z-score:   .548 p-value: .708135
            Successes: 3540    z-score:   .776 p-value: .781201
            Successes: 3558    z-score:  1.598 p-value: .944998
            Successes: 3523    z-score:   .000 p-value: .500000
 
           square size   avg. no.  parked   sample sigma
             100.            3533.100       21.735
            KSTEST for the above 10: p=  .886433
--------------------------------------------------------------------------------
               This is the MINIMUM DISTANCE test
              for random integers in the file cxt-random.bin   
     Sample no.    d^2     avg     equiv uni            
           5    4.2457   1.3667     .985977
          10     .9913    .9221     .630767
          15    3.5213   1.1198     .970958
          20    1.2824   1.2817     .724406
          25     .6396   1.2906     .474165
          30    4.8530   1.4317     .992383
          35     .8917   1.3683     .591873
          40     .6037   1.3550     .454891
          45     .6368   1.3559     .472724
          50     .5475   1.3898     .423189
          55     .4340   1.3111     .353477
          60    1.4956   1.2899     .777553
          65     .1540   1.2183     .143352
          70     .2093   1.1710     .189738
          75     .3388   1.1494     .288622
          80     .7392   1.1451     .524275
          85     .1653   1.1508     .153100
          90     .0521   1.1105     .051020
          95     .7667   1.0765     .537229
         100     .7915   1.1453     .548654
     MINIMUM DISTANCE TEST for cxt-random.bin   
          Result of KS test on 20 transformed mindist^2's:
                                  p-value= .562841
--------------------------------------------------------------------------------
               The 3DSPHERES test for file cxt-random.bin   
 sample no:  1     r^3=   6.303     p-value= .18950
 sample no:  2     r^3=  35.199     p-value= .69065
 sample no:  3     r^3=   3.769     p-value= .11805
 sample no:  4     r^3=  43.014     p-value= .76160
 sample no:  5     r^3=   1.465     p-value= .04765
 sample no:  6     r^3=  22.555     p-value= .52850
 sample no:  7     r^3=  28.714     p-value= .61601
 sample no:  8     r^3=  88.976     p-value= .94848
 sample no:  9     r^3=  79.909     p-value= .93030
 sample no: 10     r^3=  10.148     p-value= .28699
 sample no: 11     r^3=  13.938     p-value= .37161
 sample no: 12     r^3= 100.973     p-value= .96546
 sample no: 13     r^3=  58.747     p-value= .85889
 sample no: 14     r^3=  38.540     p-value= .72326
 sample no: 15     r^3=   2.297     p-value= .07371
 sample no: 16     r^3=  10.814     p-value= .30264
 sample no: 17     r^3=    .320     p-value= .01060
 sample no: 18     r^3=  18.920     p-value= .46776
 sample no: 19     r^3=  83.657     p-value= .93849
 sample no: 20     r^3=  20.114     p-value= .48853
       3DSPHERES test for file cxt-random.bin         p-value= .190515
--------------------------------------------------------------------------------
            RESULTS OF SQUEEZE TEST FOR cxt-random.bin   
         Table of standardized frequency counts
     ( (obs-exp)/sqrt(exp) )^2
        for j taking values <=6,7,8,...,47,>=48:
     -.1      .1      .6      .3      .4    -1.5
    -1.8    -1.1     1.4    -1.3     -.3     -.1
      .9     -.9      .7      .6    -1.5    -1.0
     1.0    -1.5     1.4     1.3    -1.6     -.1
     1.5      .1      .5      .9      .4     -.2
     1.3     -.5     1.9     -.5      .8    -1.2
      .3     1.1     -.8     -.1     -.6    -1.0
      .8
           Chi-square with 42 degrees of freedom: 41.645
              z-score=  -.039  p-value= .513536
______________________________________________________________
--------------------------------------------------------------------------------
                Test no.  1      p-value  .956031
                Test no.  2      p-value  .875327
                Test no.  3      p-value  .177829
                Test no.  4      p-value  .139778
                Test no.  5      p-value  .090718
                Test no.  6      p-value  .602393
                Test no.  7      p-value  .751880
                Test no.  8      p-value  .591441
                Test no.  9      p-value  .986117
                Test no. 10      p-value  .694083
   Results of the OSUM test for cxt-random.bin   
        KSTEST on the above 10 p-values:  .614884
--------------------------------------------------------------------------------
           The RUNS test for file cxt-random.bin   
     Up and down runs in a sample of 10000
_________________________________________________ 
                 Run test for cxt-random.bin   :
       runs up; ks test for 10 p's: .691860
     runs down; ks test for 10 p's: .151376
                 Run test for cxt-random.bin   :
       runs up; ks test for 10 p's: .145507
     runs down; ks test for 10 p's: .552330
--------------------------------------------------------------------------------
                Results of craps test for cxt-random.bin   
  No. of wins:  Observed Expected
                                98666    98585.86
 Chisq=  20.13 for 20 degrees of freedom, p=  .55004
               Throws Observed Expected  Chisq     Sum
            SUMMARY  FOR cxt-random.bin   
                p-value for no. of wins: .639991
                p-value for throws/game: .550040
  Test completed.  File cxt-random.bin   
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::

 

 

 

 

 

     :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
     ::            This is the BIRTHDAY SPACINGS TEST                 ::
     :: Choose m birthdays in a year of n days.  List the spacings    ::
     :: between the birthdays.  If j is the number of values that     ::
     :: occur more than once in that list, then j is asymptotically   ::
     :: Poisson distributed with mean m^3/(4n).  Experience shows n   ::
     :: must be quite large, say n>=2^18, for comparing the results   ::
     :: to the Poisson distribution with that mean.  This test uses   ::
     :: n=2^24 and m=2^9,  so that the underlying distribution for j  ::
     :: is taken to be Poisson with lambda=2^27/(2^26)=2.  A sample   ::
     :: of 500 j's is taken, and a chi-square goodness of fit test    ::
     :: provides a p value.  The first test uses bits 1-24 (counting  ::
     :: from the left) from integers in the specified file.           ::
     ::   Then the file is closed and reopened. Next, bits 2-25 are   ::
     :: used to provide birthdays, then 3-26 and so on to bits 9-32.  ::
     :: Each set of bits provides a p-value, and the nine p-values    ::
     :: provide a sample for a KSTEST.                                ::
     :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
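The statistic j described in the box above can be sketched in Python as follows. This is an illustrative reading of the description, not DIEHARD's own code; the function names are hypothetical, and j is taken literally as the number of distinct spacing values that occur more than once.

```python
import random
from collections import Counter


def birthday_spacings_j(birthdays):
    """Count the spacing values that occur more than once among sorted birthdays."""
    ordered = sorted(birthdays)
    spacings = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    occurrences = Counter(spacings)
    return sum(1 for count in occurrences.values() if count > 1)


def expected_lambda(m, n):
    """Asymptotic Poisson mean m^3 / (4n) quoted in the test description."""
    return m ** 3 / (4 * n)


def sample_j(rng, m=2 ** 9, day_bits=24):
    """Draw m birthdays in a year of 2**day_bits days and return j for that sample."""
    return birthday_spacings_j([rng.getrandbits(day_bits) for _ in range(m)])


if __name__ == "__main__":
    rng = random.Random(1)  # stand-in generator, used only for illustration
    print(expected_lambda(2 ** 9, 2 ** 24))
    print([sample_j(rng) for _ in range(10)])
```

With m=2^9 and n=2^24 the mean is 2, matching the LAMBDA= 2.0000 shown in the output header. The chi-square comparison of 500 such j values against the Poisson distribution is not implemented here.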
     :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
     ::            THE OVERLAPPING 5-PERMUTATION TEST                 ::
     :: This is the OPERM5 test.  It looks at a sequence of one mill- ::
     :: ion 32-bit random integers.  Each set of five consecutive     ::
     :: integers can be in one of 120 states, for the 5! possible or- ::
     :: derings of five numbers.  Thus the 5th, 6th, 7th,...numbers   ::
     :: each provide a state. As many thousands of state transitions  ::
     :: are observed,  cumulative counts are made of the number of    ::
     :: occurences of each state.  Then the quadratic form in the     ::
     :: weak inverse of the 120x120 covariance matrix yields a test   ::
     :: equivalent to the likelihood ratio test that the 120 cell     ::
     :: counts came from the specified (asymptotically) normal dis-   ::
     :: tribution with the specified 120x120 covariance matrix (with  ::
     :: rank 99).  This version uses 1,000,000 integers, twice.       ::
     :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
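The state-counting step of the description above can be sketched in Python. The sketch only classifies each overlapping window of five integers by its ordering and tallies the states; the quadratic form in the weak inverse of the covariance matrix is not implemented. Function names are hypothetical, and ties between equal integers are broken by position, which is an assumption.

```python
from collections import Counter


def ordering_state(window):
    """Return the ordering of a window as the tuple of indices sorted by value."""
    return tuple(sorted(range(len(window)), key=lambda i: window[i]))


def count_states(values, width=5):
    """Tally the ordering state of every overlapping window of `width` values."""
    windows = (values[i:i + width] for i in range(len(values) - width + 1))
    return Counter(ordering_state(window) for window in windows)


if __name__ == "__main__":
    sample = [17, 3, 99, 42, 8, 61, 25]
    for state, count in count_states(sample).items():
        print(state, count)
```

For five distinct values there are 5! = 120 possible tuples, which correspond to the 120 states in the test description.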
     :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
     :: This is the BINARY RANK TEST for 31x31 matrices. The leftmost ::
     :: 31 bits of 31 random integers from the test sequence are used ::
     :: to form a 31x31 binary matrix over the field {0,1}. The rank  ::
     :: is determined. That rank can be from 0 to 31, but ranks< 28   ::
     :: are rare, and their counts are pooled with those for rank 28. ::
     :: Ranks are found for 40,000 such random matrices and a chisqua-::
     :: re test is performed on counts for ranks 31,30,29 and <=28.   ::
     :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
     :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
     :: This is the BINARY RANK TEST for 32x32 matrices. A random 32x ::
     :: 32 binary matrix is formed, each row a 32-bit random integer. ::
     :: The rank is determined. That rank can be from 0 to 32, ranks  ::
     :: less than 29 are rare, and their counts are pooled with those ::
     :: for rank 29.  Ranks are found for 40,000 such random matrices ::
     :: and a chisquare test is performed on counts for ranks  32,31, ::
     :: 30 and <=29.                                                  ::
     :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
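Both rank tests above rely on computing the rank of a binary matrix over the field {0,1}. A minimal Python sketch of that step follows, storing each row as an integer and eliminating with XOR. The function names are hypothetical, and the chi-square comparison against the expected rank counts is not included.

```python
import random


def gf2_rank(rows):
    """Rank over GF(2) of a binary matrix whose rows are given as integers."""
    remaining = list(rows)
    rank = 0
    while remaining:
        pivot = remaining.pop()
        if pivot == 0:
            continue  # an all-zero row adds nothing to the rank
        rank += 1
        lowest_bit = pivot & -pivot  # isolate the lowest set bit of the pivot
        # Clear that bit from every remaining row that has it set.
        remaining = [row ^ pivot if row & lowest_bit else row for row in remaining]
    return rank


def random_matrix_rank(rng, size=32):
    """Rank of a size x size matrix whose rows are `size`-bit random integers."""
    return gf2_rank(rng.getrandbits(size) for _ in range(size))


if __name__ == "__main__":
    rng = random.Random(5)  # stand-in generator, used only for illustration
    print([random_matrix_rank(rng) for _ in range(10)])
```

For the 31x31 variant, each row would be the leftmost 31 bits of a 32-bit integer, for example `word >> 1`.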
     :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
     :: This is the BINARY RANK TEST for 6x8 matrices.  From each of  ::
     :: six random 32-bit integers from the generator under test, a   ::
     :: specified byte is chosen, and the resulting six bytes form a  ::
     :: 6x8 binary matrix whose rank is determined.  That rank can be ::
     :: from 0 to 6, but ranks 0,1,2,3 are rare; their counts are     ::
     :: pooled with those for rank 4. Ranks are found for 100,000     ::
     :: random matrices, and a chi-square test is performed on        ::
     :: counts for ranks 6,5 and <=4.                                 ::
     :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
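Two small pieces of the 6x8 test can be sketched in Python: selecting a byte by bit position counted from the left, and the formula p=1-exp(-SUM/2) printed in the output. With three pooled categories (ranks 6, 5 and <=4) the chi-square statistic has 2 degrees of freedom, and that formula is exactly its cumulative distribution function. The function names are hypothetical.

```python
import math


def bits_from_left(word, first, last, width=32):
    """Extract bits first..last (1-based, counted from the left) of a width-bit word."""
    length = last - first + 1
    return (word >> (width - last)) & ((1 << length) - 1)


def chi2_2df_cdf(chi_square_sum):
    """CDF of a chi-square variable with 2 degrees of freedom: 1 - exp(-x/2)."""
    return 1.0 - math.exp(-chi_square_sum / 2.0)


if __name__ == "__main__":
    word = 0xAB0000CD
    print(hex(bits_from_left(word, 1, 8)))    # leftmost byte
    print(hex(bits_from_left(word, 25, 32)))  # rightmost byte
    print(chi2_2df_cdf(2.0))
```

Six such bytes, taken from six successive integers, would form the rows of one 6x8 matrix.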
     :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
     ::                   THE BITSTREAM TEST                          ::
     :: The file under test is viewed as a stream of bits. Call them  ::
     :: b1,b2,... .  Consider an alphabet with two "letters", 0 and 1 ::
     :: and think of the stream of bits as a succession of 20-letter  ::
     :: "words", overlapping.  Thus the first word is b1b2...b20, the ::
     :: second is b2b3...b21, and so on.  The bitstream test counts   ::
     :: the number of missing 20-letter (20-bit) words in a string of ::
     :: 2^21 overlapping 20-letter words.  There are 2^20 possible 20 ::
     :: letter words.  For a truly random string of 2^21+19 bits, the ::
     :: number of missing words j should be (very close to) normally  ::
     :: distributed with mean 141,909 and sigma 428.  Thus            ::
     ::  (j-141909)/428 should be a standard normal variate (z score) ::
     :: that leads to a uniform [0,1) p value.  The test is repeated  ::
     :: twenty times.                                                 ::
     :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
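The counting and scoring steps of the bitstream test can be sketched in Python. The word length is a parameter so the sketch can be tried on short inputs; the mean 141,909 and sigma 428 are the values from the description and only apply to 20-bit words over 2^21 overlapping words. Function names are hypothetical, and the normal CDF uses `math.erf`, which may differ slightly from DIEHARD's own approximation.

```python
import math


def count_missing_words(bits, word_len):
    """Count the word_len-bit words that never appear among overlapping words."""
    mask = (1 << word_len) - 1
    seen = set()
    word = 0
    for index, bit in enumerate(bits):
        word = ((word << 1) | bit) & mask  # slide the window by one bit
        if index >= word_len - 1:          # window is full from here on
            seen.add(word)
    return (1 << word_len) - len(seen)


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def bitstream_p_value(missing, mean=141909.0, sigma=428.0):
    """Turn a missing-word count into a p-value via the z score (j - mean) / sigma."""
    return normal_cdf((missing - mean) / sigma)


if __name__ == "__main__":
    print(count_missing_words([0, 0, 1, 1, 0], word_len=2))
    print(bitstream_p_value(141556))
```

Applied to the first line of the output above (141556 missing words), this gives a value close to the reported .20453.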
     :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
     ::             The tests OPSO, OQSO and DNA                      ::
     ::         OPSO means Overlapping-Pairs-Sparse-Occupancy         ::
     :: The OPSO test considers 2-letter words from an alphabet of    ::
     :: 1024 letters.  Each letter is determined by a specified ten   ::
     :: bits from a 32-bit integer in the sequence to be tested. OPSO ::
     :: generates  2^21 (overlapping) 2-letter words  (from 2^21+1    ::
     :: "keystrokes")  and counts the number of missing words---that  ::
     :: is 2-letter words which do not appear in the entire sequence. ::
     :: That count should be very close to normally distributed with  ::
     :: mean 141,909, sigma 290. Thus (missingwrds-141909)/290 should ::
     :: be a standard normal variable. The OPSO test takes 32 bits at ::
     :: a time from the test file and uses a designated set of ten    ::
     :: consecutive bits. It then restarts the file for the next de-  ::
     :: signated 10 bits, and so on.                                  ::
     ::                                                               ::
     ::     OQSO means Overlapping-Quadruples-Sparse-Occupancy        ::
     ::   The test OQSO is similar, except that it considers 4-letter ::
     :: words from an alphabet of 32 letters, each letter determined  ::
     :: by a designated string of 5 consecutive bits from the test    ::
     :: file, elements of which are assumed 32-bit random integers.   ::
     :: The mean number of missing words in a sequence of 2^21 four-  ::
     :: letter words,  (2^21+3 "keystrokes"), is again 141909, with   ::
     :: sigma = 295.  The mean is based on theory; sigma comes from   ::
     :: extensive simulation.                                         ::
     ::                                                               ::
     ::    The DNA test considers an alphabet of 4 letters::  C,G,A,T,::
     :: determined by two designated bits in the sequence of random   ::
     :: integers being tested.  It considers 10-letter words, so that ::
     :: as in OPSO and OQSO, there are 2^20 possible words, and the   ::
     :: mean number of missing words from a string of 2^21  (over-    ::
     :: lapping)  10-letter  words (2^21+9 "keystrokes") is 141909.   ::
     :: The standard deviation sigma=339 was determined as for OQSO   ::
     :: by simulation.  (Sigma for OPSO, 290, is the true value (to   ::
     :: three places), not determined by simulation.                  ::
     :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
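OPSO, OQSO and DNA share one mechanism: letters are cut from designated bits, overlapping words are formed, and the missing words are counted. A hedged Python sketch of that shared mechanism follows, using the mean and sigma values quoted above; the function names are hypothetical and this is not DIEHARD's implementation.

```python
MEAN_MISSING = 141909.0
SIGMA = {"OPSO": 290.0, "OQSO": 295.0, "DNA": 339.0}  # values from the description


def extract_letter(word32, last_bit, letter_bits):
    """Take letter_bits consecutive bits ending at last_bit (1-based from the left)."""
    return (word32 >> (32 - last_bit)) & ((1 << letter_bits) - 1)


def count_missing(letters, letter_bits, letters_per_word):
    """Count the possible words that never occur among the overlapping words."""
    word_bits = letter_bits * letters_per_word
    mask = (1 << word_bits) - 1
    seen = set()
    word = 0
    for index, current in enumerate(letters):
        word = ((word << letter_bits) | current) & mask  # append one letter
        if index >= letters_per_word - 1:
            seen.add(word)
    return (1 << word_bits) - len(seen)


def z_score(test, missing):
    """Standardized missing-word count for the named test."""
    return (missing - MEAN_MISSING) / SIGMA[test]


if __name__ == "__main__":
    print(extract_letter(0x000003FF, last_bit=32, letter_bits=10))
    print(z_score("OPSO", 142502))
```

In this sketch OPSO corresponds to `letter_bits=10, letters_per_word=2`, OQSO to `letter_bits=5, letters_per_word=4`, and DNA to `letter_bits=2, letters_per_word=10`; each gives 2^20 possible words. The z score for 142502 missing words comes out near the 2.044 reported for OPSO on bits 23 to 32.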
     :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
     ::     This is the COUNT-THE-1's TEST on a stream of bytes.      ::
     :: Consider the file under test as a stream of bytes (four per   ::
     :: 32 bit integer).  Each byte can contain from 0 to 8 1's,      ::
     :: with probabilities 1,8,28,56,70,56,28,8,1 over 256.  Now let  ::
     :: the stream of bytes provide a string of overlapping  5-letter ::
     :: words, each "letter" taking values A,B,C,D,E. The letters are ::
     :: determined by the number of 1's in a byte::  0,1,or 2 yield A,::
     :: 3 yields B, 4 yields C, 5 yields D and 6,7 or 8 yield E. Thus ::
     :: we have a monkey at a typewriter hitting five keys with vari- ::
     :: ous probabilities (37,56,70,56,37 over 256).  There are 5^5   ::
     :: possible 5-letter words, and from a string of 256,000 (over-  ::
     :: lapping) 5-letter words, counts are made on the frequencies   ::
     :: for each word.   The quadratic form in the weak inverse of    ::
     :: the covariance matrix of the cell counts provides a chisquare ::
     :: test::  Q5-Q4, the difference of the naive Pearson sums of    ::
     :: (OBS-EXP)^2/EXP on counts for 5- and 4-letter cell counts.    ::
     :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
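The letter weights 37, 56, 70, 56, 37 follow directly from the binomial counts 1, 8, 28, 56, 70, 56, 28, 8, 1. A minimal sketch that derives them by enumerating all 256 byte values (helper names are illustrative, not taken from DIEHARD):

```python
from math import comb


def letter_for_byte(byte):
    """Map a byte to A..E according to how many 1 bits it contains."""
    ones = bin(byte & 0xFF).count("1")
    if ones <= 2:
        return "A"
    if ones >= 6:
        return "E"
    return "BCD"[ones - 3]  # 3 -> B, 4 -> C, 5 -> D


def letter_weights():
    """Count how many of the 256 byte values fall on each letter."""
    weights = dict.fromkeys("ABCDE", 0)
    for byte in range(256):
        weights[letter_for_byte(byte)] += 1
    return weights


def expected_word_count(word, num_words=256000):
    """Expected frequency of one 5-letter word among num_words words."""
    weights = letter_weights()
    probability = 1.0
    for letter in word:
        probability *= weights[letter] / 256
    return num_words * probability


print([comb(8, k) for k in range(9)])
print(letter_weights())
print(expected_word_count("CCCCC"))
```

The unequal letter probabilities are why the expected count differs from word to word, unlike in the missing-words tests.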
     :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
     ::     This is the COUNT-THE-1's TEST for specific bytes.        ::
     :: Consider the file under test as a stream of 32-bit integers.  ::
     :: From each integer, a specific byte is chosen , say the left-  ::
     :: most::  bits 1 to 8. Each byte can contain from 0 to 8 1's,   ::
     :: with probabilities 1,8,28,56,70,56,28,8,1 over 256.  Now let  ::
     :: the specified bytes from successive integers provide a string ::
     :: of (overlapping) 5-letter words, each "letter" taking values  ::
     :: A,B,C,D,E. The letters are determined  by the number of 1's,  ::
     :: in that byte::  0,1,or 2 ---> A, 3 ---> B, 4 ---> C, 5 ---> D,::
     :: and  6,7 or 8 ---> E.  Thus we have a monkey at a typewriter  ::
     :: hitting five keys with various probabilities::  37,56,70,     ::
     :: 56,37 over 256. There are 5^5 possible 5-letter words, and    ::
     :: from a string of 256,000 (overlapping) 5-letter words, counts ::
     :: are made on the frequencies for each word. The quadratic form ::
     :: in the weak inverse of the covariance matrix of the cell      ::
     :: counts provides a chisquare test::  Q5-Q4, the difference of  ::
     :: the naive Pearson  sums of (OBS-EXP)^2/EXP on counts for 5-   ::
     :: and 4-letter cell counts.                                     ::
     :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
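The specific-bytes variant differs only in how bytes are taken from the integers. The sketch below picks one byte per 32-bit integer and numbers each overlapping 5-letter word from 0 to 5^5 - 1 so it can index a count table. Treating position 0 as the leftmost byte (bits 1 to 8 in the text's numbering) and the base-5 word numbering are both assumptions of this example.

```python
def select_byte(value, position):
    """Return one byte of a 32-bit integer; position 0 is the leftmost byte."""
    shift = 24 - 8 * position
    return (value >> shift) & 0xFF


def letter_index(byte):
    """0..4 for letters A..E, based on the number of 1 bits in the byte."""
    ones = bin(byte).count("1")
    return min(max(ones, 2), 6) - 2


def overlapping_word_indices(values, position=0):
    """Yield a base-5 index for every overlapping 5-letter word."""
    letters = [letter_index(select_byte(v, position)) for v in values]
    for start in range(len(letters) - 4):
        index = 0
        for letter in letters[start:start + 5]:
            index = index * 5 + letter
        yield index


print(hex(select_byte(0x12345678, 0)))
print(list(overlapping_word_indices([0xFF000000] * 6)))
```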
     :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
     ::               THIS IS A PARKING LOT TEST                      ::
     :: In a square of side 100, randomly "park" a car---a circle of  ::
     :: radius 1.   Then try to park a 2nd, a 3rd, and so on, each    ::
     :: time parking "by ear".  That is, if an attempt to park a car  ::
     :: causes a crash with one already parked, try again at a new    ::
     :: random location. (To avoid path problems, consider parking    ::
     :: helicopters rather than cars.)   Each attempt leads to either ::
     :: a crash or a success, the latter followed by an increment to  ::
     :: the list of cars already parked. If we plot n:  the number of ::
     :: attempts, versus k::  the number successfully parked, we get a::
     :: curve that should be similar to those provided by a perfect   ::
     :: random number generator.  Theory for the behavior of such a   ::
     :: random curve seems beyond reach, and as graphics displays are ::
     :: not available for this battery of tests, a simple characteriz-::
     :: ation of the random experiment is used: k, the number of cars ::
     :: successfully parked after n=12,000 attempts. Simulation shows ::
     :: that k should average 3523 with sigma 21.9 and is very close  ::
     :: to normally distributed.  Thus (k-3523)/21.9 should be a st-  ::
     :: andard normal variable, which, converted to a uniform varia-  ::
     :: ble, provides input to a KSTEST based on a sample of 10.      ::
     :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
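A rough simulation of the parking experiment is sketched below using Python's own generator. Note the crash rule: two cars are treated as colliding when both coordinate differences are at most 1. This is an assumption of the sketch and not the literal circle-of-radius-1 description; it is chosen because it is the rule under which a count near 3523 is plausible. The grid of unit cells only speeds up the neighbour search.

```python
import math
import random


def park_cars(attempts=12000, side=100.0, rng=None):
    """Return k, the number of cars parked after the given number of attempts."""
    rng = rng or random.Random()
    cells = {}  # (int x, int y) -> list of parked positions in that unit cell
    parked = 0
    for _ in range(attempts):
        x, y = rng.random() * side, rng.random() * side
        cx, cy = int(x), int(y)
        # Assumed crash rule: |dx| <= 1 and |dy| <= 1 with any parked car.
        crashed = any(
            abs(px - x) <= 1.0 and abs(py - y) <= 1.0
            for i in (cx - 1, cx, cx + 1)
            for j in (cy - 1, cy, cy + 1)
            for px, py in cells.get((i, j), ())
        )
        if not crashed:
            cells.setdefault((cx, cy), []).append((x, y))
            parked += 1
    return parked


def parking_p_value(k, mean=3523.0, sigma=21.9):
    """Convert k to a uniform variable through the standard normal CDF."""
    z = (k - mean) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


k = park_cars(rng=random.Random(2001))
print(k, parking_p_value(k))
```

Ten such values would then feed the KSTEST mentioned in the description.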
     :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
     ::               THE MINIMUM DISTANCE TEST                       ::
     :: It does this 100 times::   choose n=8000 random points in a   ::
     :: square of side 10000.  Find d, the minimum distance between   ::
     :: the (n^2-n)/2 pairs of points.  If the points are truly inde- ::
     :: pendent uniform, then d^2, the square of the minimum distance ::
     :: should be (very close to) exponentially distributed with mean ::
     :: .995 .  Thus 1-exp(-d^2/.995) should be uniform on [0,1) and  ::
     :: a KSTEST on the resulting 100 values serves as a test of uni- ::
     :: formity for random points in the square. Test numbers=0 mod 5 ::
     :: are printed but the KSTEST is based on the full set of 100    ::
     :: random choices of 8000 points in the 10000x10000 square.      ::
     :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
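One round of the minimum distance test can be sketched as follows. Sorting the points by x and stopping the inner loop once the x-gap alone exceeds the best distance found so far is an optimisation chosen for this example; it gives the same minimum as checking all (n^2-n)/2 pairs. Function names are illustrative.

```python
import math
import random


def min_squared_distance(points):
    """Smallest squared distance between any two points (sorted sweep)."""
    pts = sorted(points)
    best = math.inf
    for i in range(len(pts)):
        x1, y1 = pts[i]
        for j in range(i + 1, len(pts)):
            dx = pts[j][0] - x1
            if dx * dx >= best:
                break  # later points are even further away in x
            dy = pts[j][1] - y1
            best = min(best, dx * dx + dy * dy)
    return best


def min_distance_p_value(d_squared, mean=0.995):
    """Map d^2 to a value that should be uniform on [0,1)."""
    return 1.0 - math.exp(-d_squared / mean)


rng = random.Random(7)
points = [(rng.random() * 10000, rng.random() * 10000) for _ in range(8000)]
d2 = min_squared_distance(points)
print(d2, min_distance_p_value(d2))
```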
     :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
     ::              THE 3DSPHERES TEST                               ::
     :: Choose  4000 random points in a cube of edge 1000.  At each   ::
     :: point, center a sphere large enough to reach the next closest ::
     :: point. Then the volume of the smallest such sphere is (very   ::
     :: close to) exponentially distributed with mean 120pi/3.  Thus  ::
     :: the radius cubed is exponential with mean 30. (The mean is    ::
     :: obtained by extensive simulation).  The 3DSPHERES test gener- ::
     :: ates 4000 such spheres 20 times.  Each min radius cubed leads ::
     :: to a uniform variable by means of 1-exp(-r^3/30.), then a     ::
     ::  KSTEST is done on the 20 p-values.                           ::
     :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
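The two means quoted for 3DSPHERES are the same statement in different units, since a sphere of radius r has volume (4/3)·pi·r^3. The sketch below checks that arithmetic. It also includes a back-of-the-envelope estimate of the mean of r^3 that treats the pairs of points as independent and ignores the cube's boundary; this heuristic is an assumption of the example, not the simulation the text refers to.

```python
import math


def sphere_volume(r_cubed):
    """Volume of a sphere given its radius cubed."""
    return 4.0 / 3.0 * math.pi * r_cubed


def spheres_p_value(r_cubed, mean=30.0):
    """Map the minimum radius cubed to a value that should be uniform on [0,1)."""
    return 1.0 - math.exp(-r_cubed / mean)


def heuristic_mean_r_cubed(n=4000, edge=1000.0):
    """Rough mean of min r^3: 1 / (pairs * sphere volume per unit r^3 / cube volume)."""
    pairs = n * (n - 1) / 2
    return edge ** 3 / (pairs * sphere_volume(1.0))


print(sphere_volume(30.0), 120 * math.pi / 3)
print(heuristic_mean_r_cubed())
```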
     :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
     ::      This is the SQUEEZE test                                 ::
     ::  Random integers are floated to get uniforms on [0,1). Start- ::
     ::  ing with k=2^31-1=2147483647, the test finds j, the number of::
     ::  iterations necessary to reduce k to 1, using the reduction   ::
     ::  k=ceiling(k*U), with U provided by floating integers from    ::
     ::  the file being tested.  Such j's are found 100,000 times,    ::
     ::  then counts for the number of times j was <=6,7,...,47,>=48  ::
     ::  are used to provide a chi-square test for cell frequencies.  ::
     :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
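The reduction loop of the squeeze test is short enough to write out. In this sketch the uniforms come from any iterable, and a guard keeps k from dropping to 0 if a uniform is exactly 0; that guard is an assumption added here, not something the description specifies.

```python
import math
from itertools import repeat


def squeeze_iterations(uniforms, start=2147483647):
    """Count the iterations of k = ceiling(k*U) needed to bring k down to 1."""
    k = start
    j = 0
    for u in uniforms:
        k = max(1, math.ceil(k * u))  # guard against U == 0 (assumption)
        j += 1
        if k == 1:
            return j
    raise ValueError("ran out of uniforms before k reached 1")


def cell_index(j):
    """Index of the chi-square cell for j: <=6, 7, ..., 47, >=48 (43 cells)."""
    return min(max(j, 6), 48) - 6


print(squeeze_iterations(repeat(0.5)))
```

With U fixed at 0.5 the first step gives 2^30 and each later step halves k, which makes the loop easy to check by hand.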
     :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
     ::             The  OVERLAPPING SUMS test                        ::
     :: Integers are floated to get a sequence U(1),U(2),... of uni-  ::
     :: form [0,1) variables.  Then overlapping sums,                 ::
     ::   S(1)=U(1)+...+U(100), S(2)=U(2)+...+U(101),... are formed.  ::
     :: The S's are virtually normal with a certain covariance mat-   ::
     :: rix.  A linear transformation of the S's converts them to a   ::
     :: sequence of independent standard normals, which are converted ::
     :: to uniform variables for a KSTEST. The  p-values from ten     ::
     :: KSTESTs are given still another KSTEST.                       ::
     :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
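The overlapping sums can be formed with a sliding window, adding the newest uniform and dropping the oldest. This sketch covers only that first step; the linear transformation to independent normals is not reproduced. The running update can accumulate small floating-point drift, which is acceptable for an illustration.

```python
import random


def overlapping_sums(uniforms, window=100):
    """Return S(1), S(2), ... where S(i) = U(i) + ... + U(i + window - 1)."""
    u = list(uniforms)
    if len(u) < window:
        return []
    current = sum(u[:window])
    sums = [current]
    for i in range(window, len(u)):
        current += u[i] - u[i - window]  # slide the window by one position
        sums.append(current)
    return sums


rng = random.Random(11)
sample = [rng.random() for _ in range(300)]
sums = overlapping_sums(sample)
print(len(sums), sum(sums) / len(sums))
```

Each sum has mean 50 and variance 100/12, and neighbouring sums share 99 of their 100 terms, which is why the description needs a covariance matrix.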
     :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
     ::     This is the RUNS test.  It counts runs up, and runs down, ::
     :: in a sequence of uniform [0,1) variables, obtained by float-  ::
     :: ing the 32-bit integers in the specified file. This example   ::
     :: shows how runs are counted:  .123,.357,.789,.425,.224,.416,.95::
     :: contains an up-run of length 3, a down-run of length 2 and an ::
     :: up-run of (at least) 2, depending on the next values.  The    ::
     :: covariance matrices for the runs-up and runs-down are well    ::
     :: known, leading to chisquare tests for quadratic forms in the  ::
     :: weak inverses of the covariance matrices.  Runs are counted   ::
     :: for sequences of length 10,000.  This is done ten times. Then ::
     :: repeated.                                                     ::
     :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
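The counting convention in the example can be reproduced with a small function that splits the sequence into consecutive monotone runs. This follows the partitioning shown in the example only; how DIEHARD tallies runs internally and the covariance-based chi-square statistic are not reproduced here.

```python
def monotone_runs(values):
    """Split a sequence into consecutive up-runs and down-runs with their lengths."""
    runs = []
    i, n = 0, len(values)
    while i < n - 1:
        going_up = values[i + 1] > values[i]
        j = i + 1
        # Extend the run while the direction stays the same.
        while j + 1 < n and (values[j + 1] > values[j]) == going_up:
            j += 1
        runs.append(("up" if going_up else "down", j - i + 1))
        i = j + 1  # the next run starts at the following element
    return runs


example = [.123, .357, .789, .425, .224, .416, .95]
print(monotone_runs(example))
```

A single element left over at the end has no direction yet and is ignored, matching the "at least 2, depending on the next values" remark.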
     :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
     :: This is the CRAPS TEST. It plays 200,000 games of craps, finds::
     :: the number of wins and the number of throws necessary to end  ::
     :: each game.  The number of wins should be (very close to) a    ::
     :: normal with mean 200000p and variance 200000p(1-p), with      ::
     :: p=244/495.  Throws necessary to complete the game can vary    ::
     :: from 1 to infinity, but counts for all>21 are lumped with 21. ::
     :: A chi-square test is made on the no.-of-throws cell counts.   ::
     :: Each 32-bit integer from the test file provides the value for ::
     :: the throw of a die, by floating to [0,1), multiplying by 6    ::
     :: and taking 1 plus the integer part of the result.             ::
     :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
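The win probability p=244/495 and the die mapping can both be checked directly. The sketch below assumes the usual craps rules (7 or 11 wins on the first throw, 2, 3 or 12 loses, otherwise throw until the point or a 7 appears); the function names are illustrative.

```python
from fractions import Fraction


def die_from_int(value):
    """Float a 32-bit integer to [0,1), multiply by 6, add 1 to the integer part."""
    return 1 + int(6 * (value / 2**32))


def win_probability():
    """Exact probability of winning a game of craps."""
    ways = {total: 6 - abs(total - 7) for total in range(2, 13)}
    p = Fraction(ways[7] + ways[11], 36)
    for point in (4, 5, 6, 8, 9, 10):
        # Make the point, then hit it again before a 7 shows up.
        p += Fraction(ways[point], 36) * Fraction(ways[point], ways[point] + ways[7])
    return p


def play_craps(throw):
    """Play one game; throw() returns the sum of two dice. Returns (won, throws)."""
    first = throw()
    throws = 1
    if first in (7, 11):
        return True, throws
    if first in (2, 3, 12):
        return False, throws
    while True:
        total = throw()
        throws += 1
        if total == first:
            return True, throws
        if total == 7:
            return False, throws


def throws_cell(throws):
    """Games longer than 21 throws are lumped with 21."""
    return min(throws, 21)


print(win_probability())
print(play_craps(iter([4, 5, 4]).__next__))
```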
       NOTE: Most of the tests in DIEHARD return a p-value, which
       should be uniform on [0,1) if the input file contains truly
       independent random bits.   Those p-values are obtained by
       p=F(X), where F is the assumed distribution of the sample
       random variable X---often normal. But that assumed F is just
       an asymptotic approximation, for which the fit will be worst
       in the tails. Thus you should not be surprised with
       occasional p-values near 0 or 1, such as .0012 or .9983.
       When a bit stream really FAILS BIG, you will get p's of 0 or
       1 to six or more places.  By all means, do not, as a
       Statistician might, think that a p < .025 or p > .975 means
       that the RNG has "failed the test at the .05 level".  Such
       p's happen among the hundreds that DIEHARD produces, even
       with good RNG's.  So keep in mind that "p happens".
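The note's argument can be made concrete with two small helpers: one computes p=F(X) for a normal F, the other estimates how many p-values fall in the tails purely by chance. The figure of 200 p-values is only an illustration of "hundreds".

```python
import math


def normal_p_value(x, mean=0.0, sigma=1.0):
    """p = F(x) for a normal distribution with the given mean and sigma."""
    z = (x - mean) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_extreme_count(num_p_values, low=0.025, high=0.975):
    """Expected number of uniform p-values falling below low or above high."""
    return num_p_values * (low + (1.0 - high))


print(normal_p_value(3540, mean=3523, sigma=21.9))
print(expected_extreme_count(200))
```

With 200 uniform p-values, about 10 are expected outside the interval from .025 to .975 even for a good generator, which is the point the note makes.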

 
