Cornell University ECE4760
Random number generators for

Pi Pico RP2040

 

Random numbers on RP2040
No purely software technique can produce truely random numbers. You need a physical source of input that is uncorrelated with the current program. For low rates, the input could be a human mouse click, or if there is an internet connection, maybe packet arrival time. The rp2040 has a high speed, ring oscillator clock (ROSC), which is not phase-locked to the system clock. The ring oscillator frequency depends on temperature, voltage, and process variations. The documentation suggests using the clock output bit as the input to a random number generator, but warns that the function is uncharacterized (see the datasheet, section 2.17). We wrote two different schemes to convert the random bits into useable 32-bit numbers:

DISCLAIMER! The ROSC is not shown by the manufacturer or by me to have any reliable level of random generation. Further, of the three rp2040's I have tested, each gives a somewhat different oscillator speed, and different distributions of bits. Do not use this for any critical project without doing your own tests!


Software-driven, Uniform and normal distributed integer, bits, and s15x16 fixed point numbers.
The basic 32-bit unsigned int functions describled in more detail below were modified to produce two qualities of uniform 32-bit unsigned integers, fixed point uniform distribution between -1 and 1, fixed point normal distribution with mean zero and standard deviation of one, as well as three qualities of single bits. The lower quality routines are faster.


I wrote an application that allows you to play with the five random routines. At the command line you can specify generation of one of five distributions and specify how many events to allow. The event count is specified as 0 to 8. Each larger integer doubles the number of event. Zero corresponds to 400 events in the peak bin.
The distributions are specified as:

Code, ZIP


DMA-driven, uniform, normal and one-bit integers.
The DMA channel sniffer can compute a CRC32 in one cycle on any value sent through it, using the initial value of the sniffer data register as a seed. The slower, software, integer routines from the last example are used once to generate an unpredictable seed, which is copied to the sniffer data register. One DMA sends the ROSC random bit through the sniffer. The next DMA block in the chain moves the sniffer data register to a C variable. The third DMA block in the chain moves a true to a C status vaiable indicating new data, then chains back to the first DMA channel. The three channel chain executes in about 120 nSec, or about 8 million random number calculations per sec But by the time that it is packaged and used in C you get 3 million/sec. The binomial distribution, normal distribution and serial correlation all look good, but again, no formal estimation of quality. The DLA program computes a Diffusion-limited Aggregation, shown below. The serial correlation program should generate completely random dot-field if everything is correct. The distributions program allows you to use the serial interface to plot binomial and normal distributions. The distributions shown have small red markers for the expected histogram shapes. The binomial represents events consisting of 20 coin tosses. The normal distribution is built up from 12 uniform, random integers..

Code: DLA, serial correlation, distributions, ZIP


History; early steps in using ROSC

Uniform random 32-bit unsigned integers
The first thing to get running is a relatively unbiased 32-bit generator. There are two steps. First extract and concantenate 32 bits, then use this as a seed for the C rand function. The rand function served to scramble any remaining correlation between concantenated bits. The resulting distribution is displayed on VGA. I implemented two bit extractors, with different biases and speeds. One is simple and fast, but still shows some correlations. The second one is 5 times slower, but much less biased. The simple one is:

for(k=0;k<32;k++){    
      // at least one nop need to be here for longer timing  
      //extractor_count++ ;
      asm("nop") ; asm("nop") ; // adds one microsec to execution
      //
      random_bit1=0x00000001 & (*rnd_reg);
      random=(random << 1) | random_bit1 ;
    } 

For each of 32 bits, wait for two machine cycles, which is more than 3 ROSC cycles at the ROSC frequency I chose, grab a bit from the rnd_reg and concantenate. A 32 bit random number is generated in about 6 uSec. The better extractor is:
for(k=0;k<32;k++){
      // von Neumann bit extractor         
      while(1){
        //extractor_count++ ;
        random_bit1=0x00000001 & (*rnd_reg);
        //  a small delay decorrelates the bits
        asm("nop") ; asm("nop") ;
        asm("nop") ; asm("nop") ;
        asm("nop") ; asm("nop") ;
        asm("nop") ; asm("nop") ;      
        random_bit2=0x00000001 & (*rnd_reg);
        // if the two are diferent, use the first one
        if(random_bit1!=random_bit2) break;
      }  
      // build the 32 bit sample
      random=(random << 1) | random_bit1 ;
    }

This code uses a Von Neumann extractor, plus a little delay, to produce an unbiased bit stream. Execution speed varies, depending on how may passes are required in the while loop, but is approximately 25-35 uSec. Each of the extractors is followed by a pass through the C rand function. As a sanity check, I computed and printed one generated number in main as soon as serial is enabled. The number is unpredictable, unlike the number returned from C rand. The speed of the ROSC is also printed. On my cpu the speed was 241 MHz with the speed range set to HIGH and full drive turned on in the freqA and freqB registers. You can try setting the speed range to TOOHIGH, but it may not work. My specific system gives 325 MHz on the TOOHIGH setting.

Code, ZIP


Normal distribution by adding uniformly distributed unsigned integers.
The Central Limit Theorem says that if you add together enough uniformly distributed numbers (UDN) that you must get a normal distribution. The Irwin-Hall distribution converges fairly closely to a normal distribution after adding about 12 UDN. The result of adding 12 UDN generated using the lower quality extractor is shown below. The samples are in green and the theoretical normal distribution is show in red. There is probably a slight deficit in samples near the peak. The deficit disappears if 20 UDN are used. For graphics or game use 12 is probably enough. Even 4 UDN added together has strong central tendency. In this code, the normal distribution is computed in the main loop and you can experiment with the effect of different sums of UDNs. (See the next project for better packaging).

Code, ZiP


 


Copyright Cornell University January 19, 2023