rp2040 PIO Design
Hardware/software projects
2022 - 2024
- rp2040 Microcontroller
- C-SDK -- The C-SDK is open source, has good examples, and is very well documented. It directly supports PIO assembly language and DMA transactions, and it supports the highest interrupt rate of the development systems listed here. The instructor for ece4760, Van Hunter Adams, has produced many cool C applications for the rp2040.
My Applications:
- ece4760 Support
- Protothreads 1.3 (written by Adam Dunkels) to be used in 2024
A port to the rp2040, extended for two cores with support for hardware locks, core-safe semaphores, and core-to-core hardware FIFOs. Protothreads provides a simple, cooperative thread environment with very fast context switching and low memory overhead. A modified thread dispatcher implements priority scheduling in addition to the default round-robin.
- Visual Studio configuration.
Configure VS for editing, programming, and debugging using the Pi Debug Probe.
- Add program/reset button
A trivial hardware and software addition simplifies programming.
Superseded by using the debugger.
- WIFI setup.
The physical considerations and software necessary for WIFI.
- Fixed point arithmetic
Somewhat faster than floating point on the rp2040. We implemented s15x16 (signed, with 15 bits of integer and 16 bits of fraction) and s1x14 fixed point.
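As a rough illustration of the s15x16 format described above, the value can be held in a 32-bit integer scaled by 2^16. This is a minimal sketch and not the code from the linked page: the names `s15x16`, `s15x16_mul`, `s15x16_div`, and the conversion helpers are assumptions made here. The multiply and divide widen to 64 bits so the intermediate result does not overflow.

```c
#include <stdint.h>

/* Hypothetical names for illustration; the linked page may use different ones. */
typedef int32_t s15x16;            /* sign bit, 15 integer bits, 16 fraction bits */

#define S15X16_ONE 65536           /* 1.0 in s15x16, that is 2^16 */

/* Conversions: scale by 2^16 on the way in, divide on the way out. */
static inline s15x16 s15x16_from_int(int x)     { return (s15x16)(x * S15X16_ONE); }
static inline s15x16 s15x16_from_float(float x) { return (s15x16)(x * 65536.0f); }
static inline float  s15x16_to_float(s15x16 x)  { return (float)x / 65536.0f; }

/* Multiply: the 64-bit product carries 32 fraction bits, so shift 16 away.
   Right-shifting a negative value is implementation-defined in C; GCC and
   Clang use an arithmetic shift, which is what this sketch assumes. */
static inline s15x16 s15x16_mul(s15x16 a, s15x16 b)
{
    return (s15x16)(((int64_t)a * (int64_t)b) >> 16);
}

/* Divide: pre-scale the numerator by 2^16 so the quotient keeps 16 fraction bits. */
static inline s15x16 s15x16_div(s15x16 a, s15x16 b)
{
    return (s15x16)(((int64_t)a * S15X16_ONE) / b);
}
```

Addition and subtraction are ordinary integer operations on the `s15x16` type, which is where most of the speed advantage over software floating point comes from.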
- Graphics
- VGA 16-color 640x480 with better fonts (LWIP compatible, protothreads 1.3)
Adds the classic IBM VGA-437 font, and boldface for the 5x7 font. (Older, deprecated version: VGA Four-bit color)
An extension of Hunter's PIO VGA driver from 3-bit color to 4-bit color, with Protothreads.
Two bits of green allow more shades of blue and red, and add orange. Color Map
- VGA 256 color 320x240. (LWIP compatible, protothreads 1.3, PIO bug fix)
An extension of Hunter's PIO VGA driver for 8-bit color.
Three bits of green, three bits of red, and two bits of blue.
- Serial command interface
Modern serial terminal emulators can produce a good looking user interface.
- Joystick and encoder wheel with VGA
A simple analog joystick with select button, and a digital encoder wheel with direction and selector buttons.
- External Memory
- LWIP on PicoW
(Lightweight TCP/IP stack)
- TCP protocol
- Network Time Protocol
- UDP protocol
- Symmetric send/receive between two PicoWs (access point to station)
Each PicoW can send or receive packets.
The code running on the two nodes is almost identical.
- Audio-rate UDP from PicoW to PicoW (access point to station)
Sending real-time data from PicoW to PicoW with no router involved.
- Data array UDP send/receive (station to station thru hotspot)
Sending an array of data allows performance testing, and helps in understanding data sizes and packet efficiency.
- UDP send/receive from desktop (station to station thru hotspot)
UDP is a simple and fast protocol, but does little error checking. It is therefore useful for data streams where a bad value is not a show stopper. You might send music but not code.
- UDP send/receive pico-to-pico. (station to station thru hotspot)
A simple scheme for figuring out IP addresses on the fly allows two picos to send data to each other.
- USB on Pico
- Amusing (to me) computational projects
- DMA computing machine.
The DMA subsystem is capable of running a compute-universal system.
This implementation uses three channels for fetch/execute, and a list of DMA control blocks as a program.
- Random number generation.
Using the ROSC to make reasonable quality random numbers. Routines for integers, fixed-point uniform and normal distributions, and for single bits.
- Lattice-Boltzmann fluid flow simulation
A strange hybrid of finite difference and cellular automaton using fixed point arithmetic for speed. The fixed point system used is s1x25. The shorter, and faster, 16-bit fixed point did not have enough precision for stable solutions.
- 16-bit floating point
Similar bit layout to the IEEE standard FP16, but without infinities or NaNs. The reason for doing this annoying exercise is to see if the Lattice-Boltzmann solver runs better in limited-precision, and therefore faster, floating point than in fixed point. The 16-bit floats have a dynamic range of 1e5 and resolution of 1e-4.
- rp2040 PIO control
Experiments to see if modifying the code of a running PIO channel is feasible.
- rp2040 Interpolator hardware
- DDS and FM waveform synthesis using interpolator peripheral.
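For readers new to DDS (direct digital synthesis), the core idea is a phase accumulator whose top bits index a waveform table. The sketch below is a plain software version for illustration only. It does not use the interpolator hardware, and the names `dds_osc`, `dds_increment`, `dds_init`, and `dds_next` are assumptions made here, as is the 256-entry table size. The caller supplies the table, typically one cycle of a sine wave computed once at startup.

```c
#include <stdint.h>

#define DDS_TABLE_BITS 8                       /* assumed table size: 256 entries */
#define DDS_TABLE_SIZE (1u << DDS_TABLE_BITS)

/* Hypothetical oscillator state for this sketch. */
typedef struct {
    uint32_t phase;          /* 32-bit phase accumulator, wraps at 2^32 */
    uint32_t increment;      /* phase step added once per output sample */
    const int16_t *table;    /* one waveform cycle, DDS_TABLE_SIZE entries */
} dds_osc;

/* increment = f_out / f_sample * 2^32, computed in integers.
   Assumes f_out_hz < f_sample_hz and f_sample_hz != 0. */
uint32_t dds_increment(uint32_t f_out_hz, uint32_t f_sample_hz)
{
    return (uint32_t)(((uint64_t)f_out_hz << 32) / f_sample_hz);
}

void dds_init(dds_osc *osc, const int16_t *table,
              uint32_t f_out_hz, uint32_t f_sample_hz)
{
    osc->phase = 0;
    osc->increment = dds_increment(f_out_hz, f_sample_hz);
    osc->table = table;
}

/* Produce one sample: look up the table with the top phase bits, then advance.
   Unsigned overflow of the accumulator gives the phase wrap for free. */
int16_t dds_next(dds_osc *osc)
{
    int16_t sample = osc->table[osc->phase >> (32 - DDS_TABLE_BITS)];
    osc->phase += osc->increment;
    return sample;
}
```

Calling `dds_next` once per sample period (for example from a timer interrupt) gives a frequency resolution of f_sample / 2^32, which is why the accumulator is kept much wider than the table index.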
- DSP
- IIR filter designer
Using s1x14 fixed point format. Designs lowpass, bandpass, and highpass filters. Draws the Bode plot of the designed filter, and allows the new filter to be used in real time on audio input.
- FIR filter designer
Using s1x14 fixed point format. Designs lowpass, bandpass, and highpass filters. Draws the Bode plot of the designed filter, and allows the new filter to be used in real time on audio input. Also a version for linear frequency plots.
- FFT Spectrogram
Plotting the power spectrum, log power spectrum, and spectrogram of voice signals. The spectrogram actually has enough info to decode what is being said.
- DSP development
IIR filters in fixed point s7x24 format. Butterworth lowpass and bandpass filters.
- Sound Synthesis
- FM synthesis of sounds.
Some FM synthesis even sounds like music. Widely used in the 1970s, '80s, and even into the '90s for music and sound effects.
- Karplus-Strong strings
Plucked and bowed string physical synthesis. An approximation of the actual PDE of a string.
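The plucked-string case can be sketched with the textbook Karplus-Strong loop: a delay line filled with noise, where each sample read out is replaced by a damped average of itself and its neighbor. This is a minimal sketch under stated assumptions, not the code from the linked page. It uses floats for clarity, covers only plucking (not bowing), and the names `ks_string`, `ks_pluck`, and `ks_tick`, the maximum delay length, and the small noise generator are all made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

#define KS_MAX_DELAY 1024            /* assumed upper limit on delay-line length */

/* Hypothetical string state for this sketch. */
typedef struct {
    float  line[KS_MAX_DELAY];       /* circular delay line */
    size_t length;                   /* active length; pitch is about fs / (length + 0.5) */
    size_t pos;                      /* current read/write position */
    float  damping;                  /* loss per trip around the loop, slightly below 1 */
} ks_string;

/* Small linear congruential generator, used only to fill the line with noise. */
static uint32_t ks_lcg(uint32_t *state)
{
    *state = *state * 1664525u + 1013904223u;
    return *state;
}

/* "Pluck" the string: load the delay line with noise in [-1, 1). */
void ks_pluck(ks_string *s, size_t length, float damping, uint32_t seed)
{
    if (length < 2) length = 2;
    if (length > KS_MAX_DELAY) length = KS_MAX_DELAY;
    s->length = length;
    s->pos = 0;
    s->damping = damping;
    for (size_t i = 0; i < length; i++) {
        uint32_t r = ks_lcg(&seed) >> 8;          /* keep 24 bits */
        s->line[i] = (float)r / 8388608.0f - 1.0f;
    }
}

/* Produce one output sample and update the delay line in place. */
float ks_tick(ks_string *s)
{
    size_t next = (s->pos + 1) % s->length;
    float out = s->line[s->pos];
    /* Two-point average acts as a lowpass; damping sets the decay time. */
    s->line[s->pos] = s->damping * 0.5f * (out + s->line[next]);
    s->pos = next;
    return out;
}
```

A typical use is to call `ks_pluck` once per note and `ks_tick` once per audio sample. The averaging step removes high harmonics faster than low ones, which is what gives the decaying-string timbre.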
- MicroPython -- I played with this system for a couple of months just after the Pico was announced. Python is a very nice environment, but fairly slow for real-time programming. I got DMA running, but it was tedious because Python really does not want you to mess directly with memory. In the end, we decided to move to straight C-SDK for development.
Applications:
- A simple program to sample the ADC and copy the values to a PWM channel at 500 ksamples/sec using zero CPU cycles.
- I figured out a DDS system using the settable DMA transfer clock to produce sine waves on a PWM channel with no CPU cycles.
- MBED/Arduino -- The MBED framework (arduino site) allows use of a simple RTOS on core0, while running bare C-SDK on core1. The tool chain is installed in the usual trivial Arduino way. Speed is good and the RTOS is quite handy. It is a bit confusing because there are three different peripheral libraries to handle: Arduino, MBED, and C-SDK. I completely eliminated the Arduino constructs and used a mixture of MBED and C-SDK. Using the PIO subsystems required an external assembler for the PIO assembly language, which was a little inconvenient. In the end, we decided to move to straight C-SDK for development.
Applications:
- ADC and PWM DAC using DMA with
ADC>DMA>PWM on Core0
DDS>DMA>PWM on Core1
The DDS algorithm is completely carried out by a DMA channel and the PWM DAC.
- A PIO input event time/duration capture which works up to about 10 MHz. The RP2040 has no "input capture" peripheral that uses hardware to grab a time stamp for an external event (edge on i/o pin). Both the AVR and PIC32 that I have used can capture times in hardware, and I find it useful. The PIO can be used to implement a fast timer/counter, detect i/o pin edges, and log the time stamps at full bus rate to an 8-slot hardware FIFO.
- PIO based stepper motor drivers that took high level move commands from the cpu.
The implementation ran two motors. You specify steps/sec, pattern, and step count for each motor. The pattern can be forward/backward using steps/half-steps. The motors run until the step count is reached, then the PIO throws an interrupt.
- DMA and PIO based NTSC video generator which used zero CPU cycles to refresh the screen.
There are 255x200 1-bit black/white and 4-bit gray scale versions.
One use of this was to simulate diffusion-limited aggregation.
- A scheme to extract the dominant frequencies (called formants) of vowels in speech. This is mathematically involved (three FFTs), but runs in real time on the two cores. The code uses two interesting approximations. The first is the alpha-max, beta-min algorithm to speed up the square root of a sum of squares. It is accurate to within 2%. The second is an approximation of log base two from "Generation of Products and Quotients Using Approximate Binary Logarithms for Digital Filtering Applications", IEEE Transactions on Computers, 1970, vol. 19, no. 2. It is accurate to within 0.02 log units.
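The alpha-max, beta-min idea mentioned above replaces sqrt(re*re + im*im) with a weighted sum of the larger and smaller absolute values. The sketch below shows only the basic single-coefficient-pair form. The function name `ambm_magnitude` and the coefficient values are assumptions made here, not taken from the project code. This particular pair has a worst-case error of roughly 4%, so the 2% figure quoted above implies different coefficients or a refined variant.

```c
/* Assumed coefficients: the pair that minimizes worst-case error for the
   single-term form (about 4%). The project code may use other values. */
#define AMBM_ALPHA 0.96043387f
#define AMBM_BETA  0.39782473f

/* Hypothetical helper: approximate magnitude of the complex value (re, im)
   without a square root. */
float ambm_magnitude(float re, float im)
{
    float a  = re < 0.0f ? -re : re;      /* |re| */
    float b  = im < 0.0f ? -im : im;      /* |im| */
    float hi = a > b ? a : b;             /* larger magnitude  ("max") */
    float lo = a > b ? b : a;             /* smaller magnitude ("min") */
    return AMBM_ALPHA * hi + AMBM_BETA * lo;
}
```

In an FFT power-spectrum loop this costs two multiplies and one add per bin, which matters when the magnitude is computed for every bin of every frame.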
- There are also some tests for ISR speed, IIR filter speed, and FIR speed. The FIRs are just fast enough to do Head Related Transfer Functions in real time.
- Future:
- packet radio or packet IR
- adaptive filters
- head related transfer function
- Obsolete ece4760 page -- for reference only!
- Cyclone5 FPGA/SoC
- HPS Lattice-Boltzmann fluid flow solver aimed at the FPGA, but currently running on HPS side. The current code uses 20-bit fixed point in s1x18 format (sign-bit, one bit to left of binary-point, 18 to right). This format fits nicely into the M10k memory blocks which can be formatted as 512, 20-bit words.
- Quartus 18.1 ports. We need to move from Quartus 15 to 18.1 for continued support. This activity is aimed at checking compatibility (and cleaning up examples). The only compatibility issue so far is a change in a Qsys module of the VGA control system. The change invalidates two obsolete i/o devices, which happen to be set as defaults in Quartus 15. The fix is to use the pull-down device menus to choose valid devices.
- Using Verilog to understand the rp2040 PIO processors. The Pi rp2040 microcontroller has 8 single-cycle, deterministic, programmable i/o blocks (PIO). Each PIO is programmed using a 9-instruction assembly language. We have been speculating about how you make a single-cycle machine (including jump). This project attempts to answer that, without implementing the entire PIO architecture.
- ece5760 page
- Analog
- Antialiasing.
Build your own or buy a module.
- Analog computer
- Math
- Lattice-Boltzmann
- Cellular Automata
- Computer graphics
- Cepstral analysis of speech
Copyright Cornell University
April 19, 2024