ECE4760 ProtoThreads

Cornell University ECE4760
ProtoThreads
PIC32MX250F128B

Protothreads on PIC32
Protothreads is a very lightweight, stackless threading library written entirely as C macros by Adam Dunkels. As such, it is trivial to move to PIC32. Adam Dunkels' documentation is very good and easy to understand. There is support for a thread to wait for an event, spawn a thread, and use semaphores. The Protothreads system is a cooperative multithread system. As such, there is no thread preemption. All thread switching happens at an explicit wait or yield statement. There is no built-in scheduler. You can write your own, use mine (version 1.3.2), or just use a simple round-robin scheme, calling each thread in succession in main. Because there is no preemption, handling shared variables is easier: you always know exactly when a thread switch will occur. Because there is no separate stack for each thread, the memory footprint is quite small, but automatic (stack) local variables must be avoided. You can use static local variables.

The Protothreads system is a cooperative multithread system.
There is no thread preemption. All thread switching happens at an explicit yield statement.
This means that any thread that does not yield blocks all other threads.
Yield commands include:
- PT_YIELD(pt)
- PT_YIELD_UNTIL(pt, condition)
- PT_YIELD_TIME_msec(delay_time)
A thread may be scheduled by main (or another thread) or it may be spawned by a parent thread.
- A scheduled thread acts as an independent flow of control with its own
  initialization and endless loop. A scheduled thread must never exit!
- A spawned (child) thread acts like a function call which blocks only the parent thread until it exits.
  A child thread can yield to the scheduler.
  A function called (not spawned) from a thread cannot yield and will block all threads. Avoid this.
There is no separate stack for each thread, so using automatic (stack) local variables must be avoided.
Use static local variables!
Thread switch time is quite short, less than 1 microsec.
Repeatable (low jitter) timing must be done by ISR because
the actual time of next execution of a given task depends on all other tasks.

You must read sections 1.3-1.6 of the reference manual (local copy) to see all of the implementation details.
I added:

A millisecond resolution time thread yield macro, which uses a timer ISR.
Read the protothreads-header timer setup before you modify any timer control registers for timer 1.
Turning off interrupts (or failing to turn them on) crashes Protothreads.
Nonblocking UART receive thread for human typing at a terminal (optional)
Nonblocking UART machine-to-machine receive thread (optional)
Nonblocking normal and DMA UART transmit thread (optional)
A simple rate scheduler that allows some threads more cpu time than others
A 1-pin event debugger using the settable voltage reference pin. (optional --uses RB14, pin 25)

Debugging ProtoThreads

If your program just sits there and does nothing:
-- Did you turn on interrupts? The thread timer depends on an interrupt.
-- Did you write each thread to include at least one YIELD (or YIELD_UNTIL, or YIELD_TIME_msec) in the while(1) loop?
-- Did you schedule the threads?
-- Did you ask to initialize the TFT display in the code, while not having a TFT actually connected?
-- Does the board have correct Vdd?

If your program reboots about 5 times/sec:
-- Did you exit from any scheduled thread? This blows out the stack.
-- Did you turn on any interrupt source and fail to write an ISR for that source?
-- Did you divide by zero? A divide by zero is an untrapped exception.
-- Did you write to a non-existent memory location, perhaps by using a negative (or very large) array index?
A write to a non-existent memory location is an untrapped exception.

If your thread state variables seem to change on their own:
-- Did you define any automatic local variables? Local variables in a thread should be static.
-- Are your global arrays big enough to clobber the stack in high memory? You should be able to use over 30 kbytes.

Version 1_3_2: Works with the Big development board (SECABB)
Use config_1_3_2.h and pt_cornell_1_3_2.h on the big development board (SECABB).
ZIP file. Keep reading for the details. All current example codes for 1_3_2.
All previous software examples for the SECABB will run with this version.
You must still select the hardware features you need in config_1_3_2.h.
-- Turning on the use_vref_debug feature disables RB14 for everything except Vref output.
-- Turning on the use_uart_serial feature disables RB10 and RA1 for everything except USART2.
(For serial connection to a PC, use the cable specified near the bottom of the serial page.)
See the tables below for the Protothreads Function Summary.

Added features:

Moved thread system timer to timer 1 for better use of more flexible timers 4 and 5.
A get_machine_serial_buffer routine designed for machine (non-human) communication.
The routine uses DMA UART2-to-memory for high speed communication.
Better format control for serial console interactions with humans.
Higher default baud rate in config file. You may need to change this!
This new version has cleaner scheduler format. There is a round-robin and a rate scheduler,
which are packaged up so that a single thread now acts as the scheduler and was moved to the header file.
(Yes indeed, a thread is the thread scheduler.)
A new structure hides the protothread structs and is set up using one function call per thread:
thread_identifier = pt_add(function_name, thread_scheduler_rate)
Where thread_identifier is an int.
The rate can be set from 0 to 4, with each increasing integer value halving the scheduled rate.
Setting the rate to a value greater than 4 freezes the thread until the rate is set between 0-4.
If you are using the round-robin scheduler, then the rate parameter has no effect.
The rate can later be modified with
PT_SET_RATE(thread_identifier, new_rate)
and read with
PT_GET_RATE(thread_identifier)
The last few lines of main will be:
PT_INIT(&pt_sched) ;
CHOOSE the scheduler method:
pt_sched_method = SCHED_RATE ; // or SCHED_ROUND_ROBIN ;
The scheduler never exits
PT_SCHEDULE(protothread_sched(&pt_sched));
There is a variant (1.3.3) on this version on the UART/Serial page that handles two UARTs.
One for console/machine communication, and another for machine communication.
If you turn on UART2 in the 1.3.3 config file by defining use_uart_serial, then U2TX is on RB10 and U2RX is on RA1.
If you turn on aux UART by defining use_uart_serial_aux, then U1TX is on RB7 and U1RX is on RB13.
(See the ProtoThreads support section of the serial page.)

This demo code uses improved scheduler syntax.
Test example SECABB_test_1_3_2.c (code, ZIP) includes:

DAC with DDS and with appropriate SPI share with port_expander
Port_expander running keypad with appropriate SPI share with ISR (critical sections)
Keypad attached to port_expander with shift-key.
Blink the LED attached to RA0.
TFT showing system time and keypad button presses
Serial showing protothreads version, and a command line for setting parameters
Note that you can choose human input or machine input routines.
- DDS frequency. Use command f 200 to set 200 Hz.
- Toggle between human and machine input using command m.
  In machine mode, there is NO visual feedback to the serial console.
- Set terminator conditions for machine input mode only.
  - Syntax is t char_number max_chars time_out
    See ascii table for char_number.
    Time out is in milliseconds.
  - Terminator condition defaults to <enter> with no count and no time-out.
  - Refer to ascii table for numerical equivalents of characters.
    The character # is ascii 35. The character <enter> is ascii 13.
  - Use command t 35 10 1000 to change machine-mode terminator character to '#', or terminate at 10 characters, or after 1 second.
  - Setting any of the three conditions to zero disables it. For example setting
    command t 0 5 0 waits for exactly 5 characters only.
  - Command t 13 0 2000 terminates on a <enter>, called 'carriage return' in the ASCII table, or after 2 seconds.

Protothreads Function Summary
The following table gives the syntax for the thread macros and functions written by Adam Dunkels.

Protothreads Functions (from Adam Dunkels)	Description
`PT_INIT(pt)`	Initialize a protothread.
`PT_THREAD(name_args)`	Declaration of a protothread. name_args The name and arguments of the C function implementing the protothread.
`PT_WAIT_THREAD(pt, thread)`	Wait until a child protothread completes. pt A pointer to the protothread control structure. thread The child protothread with arguments
`PT_WAIT_UNTIL(pt, condition) PT_WAIT_WHILE(pt, condition)`	Block the thread until (PT_WAIT_UNTIL) or while (PT_WAIT_WHILE) the condition is true.
`PT_BEGIN(pt)`	Declare the start of a protothread inside the C function implementing the protothread. pt A pointer to the protothread control structure.
`PT_END(pt)`	Declare the end of a protothread. pt A pointer to the protothread control structure.
`PT_EXIT(pt)`	This macro causes the protothread to exit. If the protothread was spawned by another protothread, the parent protothread will become unblocked and can continue to run. pt A pointer to the protothread control structure.
`PT_RESTART(pt)`	This macro will block and cause the running protothread to restart its execution at the place of the `PT_BEGIN()` call. pt A pointer to the protothread control structure.
`PT_SCHEDULE(f)`	This function schedules a protothread. The return value of the function is non-zero if the protothread is running or zero if the protothread has exited. f The call to the C function implementing the protothread to be scheduled
`PT_SPAWN(pt, child, thread)`	Spawn a child protothread (example) and block the thread until it exits. Parameters: pt A pointer to the protothread control structure. child A pointer to the child protothread’s control structure. thread The child protothread with arguments
`PT_YIELD(pt)`	Yield from the protothread, thereby allowing other processing to take place in the system.
`PT_YIELD_UNTIL(pt, cond)`	Yield from the protothread until a condition occurs.
`PT_SEM_INIT(s, c)`	This macro initializes a semaphore with a value for the counter. Internally, the semaphores use an "unsigned int" to represent the counter, and therefore the "count" argument should be within range of an unsigned int. s A pointer to the pt_sem struct representing the semaphore
`PT_SEM_SIGNAL(pt, s)`	This macro carries out the "signal" operation on the semaphore. The signal operation increments the counter inside the semaphore, which eventually will cause waiting protothreads to continue executing. pt A pointer to the protothread (struct pt) in which the operation is executed. s A pointer to the pt_sem struct representing the semaphore
`PT_SEM_WAIT(pt, s)`	This macro carries out the "wait" operation on the semaphore. The wait operation causes the protothread to yield while the counter is zero. When the counter reaches a value larger than zero, the protothread will continue.

The following table has the protothread macro extensions and functions
I wrote for the PIC32 and which are included in the Cornell header file.

`Added Protothreads functions`	Description
`PT_YIELD_TIME_msec(delay_time)`	Causes the current thread to yield (stop executing) for the `delay_time` in milliseconds. The time is derived from a 1 mSec timer ISR. Note that this is a non-blocking delay, not an interval timer.
`PT_GET_TIME()`	Returns the current millisecond count since boot time. Overflows in about 5 weeks. The time is derived from a 1 mSec timer ISR.
`PT_RATE_INIT()`	Sets up variables for the optional rate scheduler. Called once before main schedule loop.
`PT_RATE_LOOP()`	House keeping for the optional rate scheduler. Called once at the beginning of the main schedule loop.
`PT_RATE_SCHEDULE(f,rate)`	For thread `f`, set the `rate`=0 to execute always, rate=1 to execute every other traversal for PT_RATE_LOOP, rate=2 to every fourth traversal, rate=3 to every 8th, and rate=4 to every 16th. rate=5 DISABLES the thread.
`PT_DEBUG_VALUE(level, duration)`	Causes a voltage `level` from 0 to 15 (1 implies ~150 mV) to appear at pin 25 (CVrefOut) for `duration` microseconds (approximately). Zero duration means hold the voltage until changed by another call. To use this, `config_1_x_x.h` must contain `#define use_vref_debug`
`int PT_GetSerialBuffer(struct pt *pt)`	A thread which is spawned to get nonblocking string input from UART2. This function assumes that a human is typing at a terminal and thus supports backspace, termination by the enter key, and echoes characters! String is returned in `char PT_term_buffer[max_chars]`. If more than one thread can spawn this thread, then there must be semaphore protection. Control returns to the scheduler after every character is received. The thread dies when it receives an `<enter>`. To use this, `config_1_x_x.h` must contain `#define use_uart_serial`
`int PT_GetMachineBuffer(struct pt *pt)`	A thread which is spawned to get nonblocking string input from UART2. This function assumes that a machine is sending data and thus does not assume any particular termination condition and does not echo characters. You can specify the termination method: a terminator character (PT_terminate_char), a character count (PT_terminate_count), or a timeout (PT_terminate_time). A zero for any of the three disables that termination. String is returned in `char PT_term_buffer[max_chars]`. No string is returned if there is a timeout. If more than one thread can spawn this thread, then there must be semaphore protection. Control returns to the scheduler after every character is received. The thread dies when it meets a termination condition. To use this, `config_1_x_x.h` must contain `#define use_uart_serial`. The 1_3_2 version uses DMA transfer for higher speed.
`int PutSerialBuffer(struct pt *pt)`	A thread which is spawned to send a string to UART2. String to be sent is in `char PT_send_buffer[max_chars]`. If more than one thread can spawn this thread, then there must be semaphore protection. Control returns to the scheduler after every character is loaded to be sent. The thread dies after it sends the entire string. To use this, `config_1_x_x.h` must contain `#define use_uart_serial`
`int PT_DMA_PutSerialBuffer(struct pt *pt)`	A thread which is spawned to send a string to UART2. String to be sent is in `char PT_send_buffer[max_chars]`. If more than one thread can spawn this thread, then there must be semaphore protection. Control returns to the scheduler immediately. The thread dies after it sends the entire string. To use this, `config_1_x_x.h` must contain `#define use_uart_serial`
`void PT_setup (void)`	Configures system frequency, UART2, DMA channel 1 for the UART2 send, system timer, and the debug pin Vref controller, depending on the features chosen in `config_1_x_x.h`.
Added in version 1_3_2	-- System time switched to Timer1 -- DMA GetMachineBuffer! -- Modified scheduler! See example code! The 1_2_3 code style still works.
thread_identifier = pt_add(function_name, thread_scheduler_rate)	A thread is now defined only by its function name and there is an additional rate scheduler setting. See example code! The old code style still works.
PT_SET_RATE(thread_identifier, new_rate)	Sets a new relative execution rate for a thread.
PT_GET_RATE(thread_identifier)	Reads the current relative execution rate for a thread.
PT_INIT(&pt_sched) ; pt_sched_method = SCHED_ROUND_ROBIN ; /* or pt_sched_method = SCHED_RATE ; */ PT_SCHEDULE(protothread_sched(&pt_sched)) ;	Main now starts a scheduler thread (see examples). Using the SCHED_RATE option executes some threads more often than others: rate=0 fastest, rate=1 half, rate=2 quarter, rate=3 eighth, rate=4 sixteenth, rate=5 or greater DISABLES the thread! In round-robin mode, the rate parameter has no effect.

Development History and Details:

Version 1_2_3: Works with the Big development board (SECABB)
Conversion to use the big development board.
U2TX moved to RB10. U2RX moved to RA1. Use the cable specified on the serial page.
Use config_1_2_3.h and pt_cornell_1_2_3.h on the big development board.
You must still select the hardware features you need in config_1_2_3.h.
---- Turning on the use_vref_debug feature disables RB14 for everything except Vref output.
---- Turning on the use_uart_serial feature disables RB10 and RA1 for everything except USART2.
Peripheral test example (code, ZIP) includes:
(1) DAC with DDS and with appropriate SPI share with port_expander
(2) Port_expander running keypad with appropriate SPI share with ISR (critical sections)
(3) keypad attached to port_expander with shift-key.
(4) TFT showing system time, color patches, and keypad button presses
(5) Serial with formatting showing protothreads version,
and a formatted command line for setting DDS frequency and showing system time.
Added features:
(1) A get_machine_serial_buffer routine designed for machine (non-human) communication.
(2) Better format control for serial console interactions with humans.
(3) Higher default baud rate in config file. You may need to change this!

Version 1_2_2: Works with the Big development board
Conversion to use the big development board.
U2TX moved to RB10. U2RX moved to RA1. Use the cable specified on the serial page.
Use config_1_2_2.h and pt_cornell_1_2_2.h on the big development board.
Example code is on the Development board page. Search for PuTTY on that page.

TEST Version 1_3_0: Works with the Big development board (SECABB), but is to be considered BETA
This new version has cleaner scheduler format, improved readability, but no new functionality. As before, there is a round-robin and a rate scheduler, but they are packaged up so that a single thread now acts as the scheduler and was moved to the header file. (Yes indeed, a thread is the thread scheduler.) A new structure hides the protothread structs and is set up using one function call per thread:
thread_identifier = pt_add(function_name, thread_scheduler_rate)
The rate can be set from 0 to 4, with each increasing integer value halving the scheduled rate.
Setting the rate to a value greater than 4 freezes the thread until the rate is set between 0-4.
If you are using the round-robin scheduler, then the rate parameter has no effect.
The rate can later be modified with
PT_SET_RATE(thread_identifier, new_rate)
and read with
PT_GET_RATE(thread_identifier)

The last few lines of main will be:
PT_INIT(&pt_sched) ;
CHOOSE the scheduler method:
pt_sched_method = SCHED_RATE ; // or SCHED_ROUND_ROBIN ;
The scheduler never exits
PT_SCHEDULE(protothread_sched(&pt_sched));

The demo code has the same functions as Version 1_2_3 above, but with the improved scheduler syntax.
(Demo code, config_1_3_0, pt_cornell_1_3_0, project ZIP)
Based on the work of edartuz and properly licensed in the code.

A very simple test program in which there are three threads that each just increment a counter, then yield (plus a print thread and a 100 kHz ISR) gives an estimate of the number of thread yields/sec the scheduler can sustain. For the rate scheduler, at maximum rate (rate=0) for each of the three, about 3.2x10⁵ yields/sec is possible. At rates of 0, 1, and 2 for the three threads respectively, about 2.1x10⁵ yields/sec. For the simpler round-robin scheduler, which ignores the relative thread rates, you get about 4.2x10⁵ yields/sec. Thus, for lightweight, fast-executing threads you probably want to just use the simpler scheduler. Turning off the ISR in the round-robin test program increases the thread rate to 8.6x10⁵ yields/sec.
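
It can help to turn these rates into an average time per yield, which here includes the scheduler overhead, the counter increment, and any ISR time. This is only the reciprocal of the figures quoted above, rounded:

```latex
t_{\text{yield}} = \frac{1}{f_{\text{yield}}}
\qquad
\begin{aligned}
\text{rate scheduler, all rate 0:} \quad & 1/(3.2\times10^{5}\,\text{s}^{-1}) \approx 3.1\ \mu\text{s} \\
\text{rate scheduler, rates 0, 1, 2:} \quad & 1/(2.1\times10^{5}\,\text{s}^{-1}) \approx 4.8\ \mu\text{s} \\
\text{round-robin, ISR on:} \quad & 1/(4.2\times10^{5}\,\text{s}^{-1}) \approx 2.4\ \mu\text{s} \\
\text{round-robin, ISR off:} \quad & 1/(8.6\times10^{5}\,\text{s}^{-1}) \approx 1.2\ \mu\text{s}
\end{aligned}
```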

What about threads that have a fixed execution speed ratio? By matching the thread schedule rate to the thread compute time, you can approximate a rate-monotonic scheduler using the rate scheduler. For three threads which take 1, 2, and 4 mSec respectively to execute, and using the round-robin scheduler, the best you can get is about 143 executions for each of the three threads in 1 second because each one must wait for the others (1000 mS/(1+2+4)=143). The rate scheduler, set up with rates 0, 1, and 2 respectively, gives 336 executions (per second) for the fast thread, 168 for the medium speed thread, and 84 for the slow thread. So the fast one executes more, the slow one less. Overall, I think that most of the time you want to use the simple round-robin scheduler.

Version 1.2.1:
Fixes a bug caused by starting a second DMA-to-UART burst before the UART transmit FIFO is empty. See pt_cornell_1_2_1.h.
// Wait for the DMA transfer to complete -- existed in 1.2
PT_YIELD_UNTIL(pt, DmaChnGetEvFlags(DMA_CHANNEL1) & DMA_EV_BLOCK_DONE);
// Wait until the UART transmit buffer is empty -- added in 1.2.1 based on section 21.5.2 of Reference Manual
PT_YIELD_UNTIL(pt, U2STA&0x100);

Version 1.2:
To run protothreads 1.2 you need to download config.h, pt_cornell_1_2.h, plus the TFT routines, or the project ZIP (see below) file.
The main change from Version 1.1 is a fix for the limitation on having any thread-yield statement inside a switch statement.
This version allows yield, spawn, and wait statements anywhere. This version depends on documented, but seldom used, features of GCC.
You must still select the hardware features you need in config.h.
--Turning on the use_vref_debug feature disables pin 25 for everything except Vref output.
--Turning on the use_uart_serial feature disables pins 21 and 22 for everything except the USART
--Make sure that all of the special feature pins are disabled so that you can use them as i/o. Select:
-- #pragma config POSCMOD = OFF, FWDTEN = OFF, FSOSCEN = OFF, JTAGEN = OFF, DEBUG = OFF

Keyboard Thread control: The first example code takes keyboard commands (thread 3) to spawn a thread, or signal a thread, or wait from inside a switch statement.
Three LEDs must be connected as explained in the code comments. This code will not work in Version 1.1.
The ZIP file has serial support and debug enabled in config.h.
Output Compare Units: This example starts two output compare units connected to Timer 2 with settable pulse lengths:
- OC3 mapped to pin 18 and running a PWM with rising edge at the timer event. If the pwm_on_time is set to zero, the pin never goes high. If pwm_on_time is greater than the timer period, then it never goes low. If pwm_on_time = (timer period)/2, then the duty cycle is 50%.
- OC2 mapped to pin 14 and running a pulse with rising and falling edges settable to arbitrary phase within the timer cycle.
The program uses serial communication to set the parameters, so the serial support must be enabled in config.h.
The Timer 2 ISR just toggles a pin as a trigger reference for testing.
ADC setup: This example uses the TFT to display the ADC (as in Version 1.1 below)
The ZIP file therefore has serial support and debug disabled in config.h because the pins are used by the TFT.
TFT alternate SPI channel: This example uses the TFT to display the ADC (as in Version 1.1 below) BUT uses SPI channel 2 to run the TFT display.
The project which reads the AN11 analog input and draws the voltage on the TFT using the SPI2 master and Protothreads 1.2 is ZIPPED here.

Version 1.1:
To run protothreads 1.1 you need to download config.h, pt_cornell_1_1.h, plus the TFT routines, or one of the project ZIP (see below) files.
The ProtoThreads include file structure has been simplified. This version of Protothreads uses a switch-statement type construct to handle thread switching, so it is not possible to embed a thread-wait statement in a switch stanza.
The config.h file now sets:

CPU clock speed and peripheral bus speed
Defines macros for CPU and peripheral speeds, which correctly set the thread-timer millisecond time tick clock.
By default this version of protothreads starts timer5 and uses a timer ISR to count milliseconds.
Defines macros to enable the UART and Vref output pins, if desired. These must be disabled (commented out) to use TFT.
You must uncomment the #define to activate these features.
#define use_vref_debug turns pin 25 into a settable voltage source as described in the table below
#define use_uart_serial activates pin 21 and pin 22 as a console interface.
Defines the terminal BAUD rate if you are using the serial options.

There are examples:

Serial support and debugging pin. The serial test code also requires a UART connection to a terminal, as explained in a project further down the page. The test code toggles three i/o pins and supports a small user interface through the UART. It also emits three different amplitude debugging pulses on pin 25. You must uncomment the defines mentioned above to run this code. (ZIP for serial)
TFT support with serial and debugging turned off. The Protothreads library has appropriate defines disabled (see above).
The (ZIP for TFT) includes the project, source and libraries for TFT and for Protothreads. Test codes:
- TFT_animation_BRL4.c. -- bounces a ball with gravity and drag, using 16:16 fixed point arithmetic
- TFT_ADC_read.c -- reads the AN9 ADC channel (pin 26) then prints raw ADC counts, floating voltage, and 16:16 fixed point voltage
- TFT_test_BRL4.c. -- displays color patches, system time, and moves a ball.

Bugs to be fixed:

FIXED in 1.1: in the pt_cornell header file, the variable time_tick_millsec should be made volatile
FIXED in 1.1: The timer 5 ISR is named Timer2Handler, this is not a problem unless you also define a timer 2 handler with the same name

Timers, Output Compare, PWM, and Input Capture
All of the following examples use Protothreads. The PIC architecture separates timers from output compare units and from input capture units. This means that one timer can drive several output compare units for waveform generation, or act as a time reference for several input capture units. In all the examples, the CPU is running at 64 MHz and the peripheral bus at 32 MHz. It might be safer to run everything at 40 MHz.
-- This example sets up timer2 to drive two pulse trains from OC2 and OC3. Either of these pulse trains can be hooked to an input capture unit, which uses timer3 as a time reference. Timer three is set up to overflow so that periods are correct when computed from sequential edge capture times. The print thread prints out the generated interval, and the min, max and current value of the captured interval. The command thread listens for user input to set the timer2 period, and a one second clock thread gives system time (using timer5, as explained below in the protothreads section). The example code.
-- Example 2 sets up OC3 as a PWM unit with settable timer2 period (and thus PWM resolution) and settable PWM on-time. The on-time is then auto incremented in the timer2 ISR to sweep the on-time from zero to the timer2 period. Setting the timer2 period and OC2 pulse period in the user interface thread is cleaner.
-- Example 3 sets up OC3 as a PWM unit with timer2 period (and thus PWM resolution) equal to 64 cycles (500 kHz). PWM on-time is set by a sine wave Direct Digital Synthesis (DDS) unit. The frequency synthesized is set by the UART user interface. The PWM output (Pin 18) must be passed through an analog lowpass filter. Choose the time constant of the filter consistent with the frequencies you wish to generate. Spectral purity is about 32 dB at low frequencies. You could get better spectral purity by increasing the PWM resolution, but that, of course, lowers the sample rate. Eight-bit samples have a PWM sample rate of 125 kHz.

=============================

Version 1.0
To run protothreads you need to download pt_cornell.h and you need to download software from Dunkels' site or use a local copy. The Example1 test code also requires a UART connection to a terminal, as explained in a project further down the page. The test code toggles three i/o pins and supports a small user interface through the UART. It also emits three different amplitude debugging pulses on pin 25. By default this version of protothreads starts timer5 and uses a timer ISR to count milliseconds.

=============================
To run these older examples you need to download software from Dunkels' site or use a local copy. Most examples also require a UART connection to a terminal, as explained in a project further down the page.
Older examples:
-- The first example has two threads executing at a rate based on a hardware timer ISR, which generates a millisecond time counter. Each thread yields for a waiting time and when executing prints the thread number and time. Thread 1 executes once per second. Thread 2 executes every 4 seconds. Main just sets up the timer ISR and UART, then initializes the threads and schedules them.
-- The second example has three threads. Threads 1 and 2 wait on semaphores, each of which is signaled by the other thread. The two threads therefore alternate. Thread 3 just executes every few seconds. I defined a new macro to make it easier for a thread to wait for a specific time. PT_YIELD_TIME(wait_time) takes the wait time parameter and uses a local variable and the millisecond timer variable to yield the processor to another thread for wait_time milliseconds. The second example also has a small routine to compute approximate microseconds since reset and return it as a 64-bit long long int.

#define PT_YIELD_TIME(delay_time) \
    do { static int time_thread; \
    PT_YIELD_UNTIL(pt, milliSec >= time_thread); \
    time_thread = milliSec + delay_time ;} while(0);

-- The third example has three threads. Threads 1 and 2 wait on semaphores, each of which is signaled by the other thread. The two threads therefore alternate. Thread 3 takes input from a serial terminal. The actual input routine is a thread which is spawned by thread 3. Thread 3 then waits for the input thread to terminate, which it does when the human presses <enter>. The input thread yields the processor while it is waiting for the slow human to type each character, so other threads do not stall. The key statement is below, which causes protothreads to wait/yield on a hardware flag. The flag is defined as part of plib.h.
PT_YIELD_UNTIL(pt, UARTReceivedDataIsAvailable(UART2));
Note that the spawn command
PT_SPAWN(pt, &pt_input, GetSerialBuffer(&pt_input) );
initializes the input thread and schedules it. The three parameters are the current thread structure, a pointer to the spawned thread, and the actual thread function. If more than one thread is using serial input, then the spawn command should be surrounded by semaphore wait/signal commands because GetSerialBuffer is not reentrant.
-- The fourth example investigates non-blocking UART transmit. In a printf, there is a wait loop for each character. We can replace that with a thread yield on a per-character basis. Doing this speeds up processing by a factor of 2 or so. But how fast is the switch between two threads? Is it worth a thread yield on every character? Commenting out all UART code and just waiting/signaling on a semaphore between thread 1 and thread 2 gives a switch time between threads (twice) of 2.1 microseconds or about 126 cpu cycles. This value includes the signaling, waiting, and thread switch code two times (thread 1 to thread 2 and back). For a 1 mSec character transmit time, the thread switch is worth the overhead.
-- The fifth example implements a terminal command interface in thread 3 using non-blocking UART send/receive. Threads 1 and 2 toggle and are dependent upon signaling each other unless turned off by a flag from the interactive input. Thread 4 just toggles at a fixed rate, unless it is turned off by the interactive input, working through the scheduler. The code assumes that port pins B0, A0, and A1 are connected to LEDs (with 300 ohm resistor to ground). Hitting the <enter> key to finish a command results in a 9 microSec pause in the toggling of the other threads.
There are 8 commands:

command	effect
`t1 time`	sets blink rate of thread 1/2 to `time`
`t2 time`	sets blink rate of thread 4 to `time`
`g1`	starts thread 1/2 blink
`s1`	stops thread 1/2 blink
`g2`	starts thread 4 blink
`s2`	stops thread 4 blink
`k`	kills the interactive input until RESET
`p`	prints the current blink times

-- The sixth example runs the same interface as example five, but uses a DMA channel to drive the UART output with no software overhead. The DMA pattern matching feature detects the end of a string to stop the UART automatically. Using the DMA transfer allows a per-string thread yield, rather than a per-character thread yield. The code assumes that port pins B0, A0, and A1 are connected to LEDs (with 300 ohm resistor to ground). The thread switch after hitting <enter> now takes 5 microSec. With both t1 and t2 set to 1 milliSec, the dispersion in actual times for both is less than 10 microSec (<1%).
-- The seventh example adds a microsecond resolution yield option. The option is marginally useful down to about 10 microseconds, where the timing uncertainty reaches about 10%. At 100 microseconds the accuracy is good. This means that you could attempt audio synthesis in a thread at 10 kHz sample rate. Thread 4 is timed by the microsecond timer. With three threads running below 100 microSec repeat rate, the system starts to miss events. The previous PT_YIELD_TIME macro has been replaced by two: one for millisecond timing and one for microsecond timing. The millisecond timer overflows about once/month. The microsecond timer overflows every 64 milliseconds. The maximum time delay using the microsecond timer is 64000 microseconds.

// macro to time a thread execution interval
#define PT_YIELD_TIME_msec(delay_time) \
    do { static int time_thread; \
    time_thread = milliSec + delay_time ; \
    PT_YIELD_UNTIL(pt, milliSec >= time_thread); \
    } while(0);
// macro to time a thread execution interval
// parameter is in MICROSEC < 64000
//ReadTimer2()
#define PT_YIELD_TIME_usec(delay_time) \
    do { static unsigned int time_thread, T3, c ; \
      time_thread = T3 + delay_time ; c = 0;\
      if(time_thread >= 0xffff) { c = 0xffff-T3; }\
      PT_YIELD_UNTIL(pt, ((ReadTimer3()+c)& 0xffff) >= ((time_thread+c) & 0xffff)); \
      T3 = ReadTimer3() ;\
    } while(0);

-- The eighth example introduces a minimal scheduler which allows each thread to execute at a rate determined as a fraction of full speed. The default protothreads thread swap is so fast that it is a challenge to introduce scheduling which does not slow down thread execution rates. The approach taken is to allow some threads to execute every time through the main while-loop, but allow others to only execute at 1/2, 1/4, 1/8, or 1/16 of the times through the main loop. The approach is consistent with a nonpreemptive thread system and gives better execution consistency if one thread has to execute at a much higher rate than the others. Rate 0 executes every time through the loop, rate 1 every other time, rate 2 every four times, rate 3 every 8 times, and rate 4 every 16 times through the main while-loop. Any other value freezes the thread execution. With thread 4 executing at a nominal 10 microSec period, the actual time varies from 11 to 13 microSec, but the actual time can vary widely depending on the exact interval picked due to coincidence with other processes. This version also fixes the microsecond timer by using timer45 as a 32-bit counter.
-- Finally we get to something like a final version of the code.