In this project, we implemented a system that sends audio packets over the Internet using the Jacktrip protocol, which was specifically designed for low-latency audio for musicians. We used the PIC32 microcontroller along with an ENC28J60 ethernet chip as both a sender and receiver, communicating with a Raspberry Pi set up in the lab. Hardware components used in the project include the ADC for sampling audio played by a musician, various DMA channels for data movement, SPI for communicating with peripherals and the ENC28J60 chip, and the DAC for playing received audio packets. We utilized a modified version of the Microchip network stack to handle the details of the transport layer and below, ultimately settling on UDP as the transport protocol for its speed and simplicity.
COVID-19 has made it impossible for musicians around the world to meet in person and rehearse together. Unfortunately, most video calls or even simple audio calls are incapable of synchronizing audio produced by musicians playing remotely due to long network delays. In most cases network latency is not a problem for verbal communication because dialogue naturally follows a turn-based structure; the listener will simply wait until they hear what the speaker has said before speaking themselves. In musical ensembles, however, musicians are expected to perform together in perfect synchrony, meaning that any significant network delay will wreak havoc. To address this exact issue, researchers at Stanford developed a multi-machine, uncompressed audio streaming system called Jacktrip, along with a Raspberry Pi-based device equipped with Jacktrip that is intended for the general public. Our project explores the feasibility of using a PIC32 instead of a Raspberry Pi as the processing device, which would considerably reduce the cost to the consumer. The main advantages of using a full computer such as a Raspberry Pi include its built-in network peripherals and OS services, which abstract away the details of network functions. On the other hand, the OS may unnecessarily hinder the data transfer rate due to preemptive multitasking and other sources of overhead. With a PIC32, we have to interface the microcontroller with an external Ethernet chip and handle the communication between the chip and the PIC, but we also avoid the overhead of an OS and can devote the full resources of the PIC to the task at hand.
A key requirement is sustaining the bidirectional streaming data rate for
48 kHz uncompressed audio.
(MathML fractions do not format correctly on Chromium-based browsers.)
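As a rough check on that requirement, the sketch below computes the raw audio bit rate and the UDP payload rate for one direction. It assumes mono 16-bit samples at 48 kHz and the 144-byte packets mentioned later in this report. The split of those 144 bytes into a 16-byte Jacktrip header plus 64 samples is an assumption, and the function names are purely illustrative.

```c
#include <stdint.h>

/* Raw audio bit rate in bits per second for one direction. */
uint32_t raw_audio_bps(uint32_t sample_rate_hz, uint32_t bits_per_sample,
                       uint32_t channels)
{
    return sample_rate_hz * bits_per_sample * channels;
}

/* Packets per second needed to carry the stream.
 * Assumes the sample rate is a multiple of the packet size. */
uint32_t packets_per_second(uint32_t sample_rate_hz, uint32_t samples_per_packet)
{
    return sample_rate_hz / samples_per_packet;
}

/* UDP payload bit rate (assumed header plus audio) for one direction. */
uint32_t payload_bps(uint32_t sample_rate_hz, uint32_t samples_per_packet,
                     uint32_t bytes_per_sample, uint32_t header_bytes)
{
    uint32_t packet_bytes = header_bytes + samples_per_packet * bytes_per_sample;
    return packets_per_second(sample_rate_hz, samples_per_packet) * packet_bytes * 8u;
}
```

Under these assumptions, raw_audio_bps(48000, 16, 1) is 768000 bits per second and payload_bps(48000, 64, 2, 16) is 864000 bits per second in each direction, before UDP, IP, and Ethernet headers are added.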
The primary project specifications include the ability to send and receive single-channel Jacktrip audio packets on the network using a PIC32. Since the PIC does not have hardware support for the physical and link layers, we obtained a Microchip ENC28J60 Ethernet controller chip to provide them. The ENC28J60 communicates with the PIC via SPI. Other project requirements include transforming audio signals sampled by the ADC into network packets and converting received packets into audible sound using a DAC. A Raspberry Pi serves as the other Jacktrip-enabled end host and is used to verify that packets are being sent by the PIC correctly. The Pi is also used to transmit packets to the PIC to test the receive functionality on the PIC.
When a microphone or other audio capture device produces an analog signal on AN11 (in our tests, the audio output of a lab desktop computer), we use the ADC, triggered by Timer3, to sample the signal. Since we want to save CPU cycles for critical functions whenever possible, we use DMA to transfer the data from the ADC buffer through SPI to buffers on the Ethernet chip. To achieve maximal performance, the DMA functions had to be modified to be non-blocking, as explained in the next section. Once the data is on the ENC28J60 chip, we call functions from the Microchip network stack to send a UDP packet to the IP address of the Raspberry Pi.
When the Pi sends data to the PIC, packets are accumulated in the ENC28J60's internal receive buffers. Again, we use DMA to move the data from the ethernet chip onto the PIC's memory to save precious CPU cycles. From there, the audio data is written to the MCP4822 DAC, which also communicates with the PIC via SPI.
Tradeoffs between hardware and software implementations affected our design process and results. For instance, as mentioned before, data transfer can be accomplished either by CPU load and store instructions or by the DMA controller. While CPU-driven transfers are easier to code, we likely would not have been able to meet our timing constraints due to wasted cycles. The DMA is more complicated to set up, but once started it moves data without CPU intervention, leaving the CPU free to run higher-level control code.
Our software drew heavily from the Microchip network stack modified by
Alex Whiteway. The network stack has several high-level application examples
available for use, most of which are removable (to gain higher performance)
simply by commenting out a few lines at most in ethernet_entry.c.
That network stack also requires the naming of a few pins using the latch
register bit structs.
Configuring the network stack is straightforward. The stack is set up to
use non-framed SPI mode, so you must select a chip select pin using the
tristate and latch register bit structures. We chose pin B3 as our chip
select pin and placed the following lines in HardwareProfile.h:
#define ENC_CS_TRIS TRISBbits.TRISB3
#define ENC_CS_IO LATBbits.LATB3
#define _ENC_USE_SPI_1
We also included the following lines simply as a memory aid.
#define ENC_SDO_TRIS TRISBbits.TRISB8
#define ENC_SDI_TRIS TRISBbits.TRISB13
#define ENC_SCK_TRIS TRISBbits.TRISB14
If your ENC28J60 uses SPI2, you would have to use
#define _ENC_USE_SPI_2
in the appropriate place.
In TCPIPConfig.h, we chose to use the DHCP client, DNS,
the Berkeley sockets API, and the ICMP server by uncommenting the appropriate
options. We also changed the default IP address to 0.0.0.0 to force our
application to wait for DHCP configuration. That is accomplished by
changing the appropriate define constants.
#define MY_DEFAULT_IP_ADDR_BYTE1 (0ul)//(192ul)
#define MY_DEFAULT_IP_ADDR_BYTE2 (0ul)//(168ul)
#define MY_DEFAULT_IP_ADDR_BYTE3 (0ul)//(1ul)
#define MY_DEFAULT_IP_ADDR_BYTE4 (0ul)//(120ul)
The original network stack blocks on every SPI transfer. Most notably,
it blocks while transferring all of the packet data. That means that at
least 16 CPU cycles (assuming a 40 MHz system clock and a 20 MHz SPI clock)
are wasted while shifting out every packet byte. Considering that the
packets are 144 bytes in size, the network stack easily becomes CPU bound.
With overhead, we could only achieve around 650 kbps in a bidirectional
fashion, and that test did not include any significant data processing
such as reading from the ADC or writing to the DAC.
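The 16-cycle figure follows from the ratio of the two clocks. The sketch below reproduces that arithmetic; the function names are illustrative and not part of the network stack, and the result is only a lower bound because it ignores function-call, command, and stack overhead.

```c
#include <stdint.h>

/* CPU cycles that elapse while one byte (8 bits) is shifted over SPI.
 * Assumes the system clock is an integer multiple of the SPI clock. */
uint32_t spi_cycles_per_byte(uint32_t sysclk_hz, uint32_t spi_clk_hz)
{
    return 8u * (sysclk_hz / spi_clk_hz);
}

/* Lower bound on CPU cycles spent busy-waiting on one packet's bytes. */
uint32_t blocking_cycles_per_packet(uint32_t sysclk_hz, uint32_t spi_clk_hz,
                                    uint32_t packet_bytes)
{
    return spi_cycles_per_byte(sysclk_hz, spi_clk_hz) * packet_bytes;
}
```

For a 40 MHz system clock and a 20 MHz SPI clock this gives 16 cycles per byte, or at least 2304 cycles of pure waiting for each 144-byte packet.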
It was straightforward to boost performance by simply using DMA channels to
read and write the packet data. To do that, we defined and implemented
nonblocking, DMA-based functions in UDP.h/UDP.c and in mac.h/ENC28J60.c.
They use DMA channels 2 (to write to
SPI1BUF) and 3 (to read from SPI1BUF). When sending a packet, we do not care
what is received as the ENC28J60 has no documented full-duplex SPI operation.
So, we can simply use DMA channel 2 to send the packet buffer, saving many
CPU cycles.
When receiving a packet, DMA channel 2 must be set to auto-enable so that
a single dummy byte can be sent repeatedly while the received data is
clocked in, and the channel must be disabled once the transfer is
complete. A global variable blocks other SPI1 transfers while a DMA
transfer to SPI1 is in progress. State machine states poll the appropriate
DCH#CON register to determine whether a packet send or receive has
completed, and they handle the de-assertion of the ENC chip select signal.
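The polling pattern described above can be sketched on a host machine as follows. This is not the project's code: the struct fields stand in for the DMA channel-enable bit and the chip select latch, all names are hypothetical, and the test code clears the enable flag where the DMA hardware would do so at the end of a block transfer.

```c
#include <stdbool.h>

/* Host-side stand-ins for hardware state (hypothetical names). */
typedef struct {
    bool dma_enabled;  /* mirrors the channel-enable bit polled in DCH#CON */
    bool cs_asserted;  /* mirrors the ENC chip select being active */
} fake_hw_t;

typedef enum { XFER_IDLE, XFER_BUSY } xfer_state_t;

typedef struct {
    xfer_state_t state;
    bool spi_locked;   /* plays the role of the global "SPI1 in use" flag */
} xfer_t;

/* Try to start a transfer; returns false if SPI is already in use. */
bool xfer_start(xfer_t *x, fake_hw_t *hw)
{
    if (x->spi_locked) {
        return false;
    }
    x->spi_locked = true;
    hw->cs_asserted = true;
    hw->dma_enabled = true;  /* cleared later by the (simulated) hardware */
    x->state = XFER_BUSY;
    return true;
}

/* Called from the main loop; never spins. Returns true once, on completion. */
bool xfer_poll(xfer_t *x, fake_hw_t *hw)
{
    if (x->state != XFER_BUSY || hw->dma_enabled) {
        return false;        /* idle, or bytes are still being shifted */
    }
    hw->cs_asserted = false; /* de-assert chip select */
    x->spi_locked = false;   /* let other SPI users proceed */
    x->state = XFER_IDLE;
    return true;
}
```

The point of the design is that xfer_poll returns immediately in every case, so the CPU can do other work between polls instead of waiting on each byte.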
We used DMA channels 0 and 1 to load double-buffered packets directly
from the ADC. Since Jacktrip uses s.15 fixed point for 16-bit audio
transport, we simply had to copy the output from the ADC into the
packets. Channel 1 is chained from channel 0. Both channels raise
block transfer done interrupts whose ISRs signal the network routines
that there is a packet available to send. Channel 1's ISR also re-enables
channel 0 because we could not satisfactorily chain the channels to
each other.
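A direct copy works when the ADC result is already in a signed 16-bit fractional layout. For comparison, the sketch below shows the equivalent arithmetic for a plain unsigned reading. It assumes a 10-bit ADC result in the range 0 to 1023; the function names are illustrative and this is not the project's code.

```c
#include <stdint.h>
#include <stddef.h>

/* Convert one unsigned 10-bit ADC reading (0..1023) to s.15 fixed point. */
int16_t adc10_to_s15(uint16_t raw)
{
    int32_t centered = (int32_t)(raw & 0x3FFu) - 512;  /* -512..511 */
    return (int16_t)(centered * 64);                   /* -32768..32704 */
}

/* Fill the audio part of a packet from a block of ADC readings. */
void fill_audio(int16_t *dst, const uint16_t *adc, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        dst[i] = adc10_to_s15(adc[i]);
    }
}
```

A mid-scale reading of 512 maps to 0, and the extremes 0 and 1023 map to -32768 and 32704.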
When a packet is received, the receiver code sets playback start and end
pointers. The Timer3 ISR writes a sample to the DAC at 48000 Hz when
there is a sample available, and does nothing otherwise. Timer3 also
signals the ADC to start a conversion.
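The playback logic can be sketched as below, again as a host-side illustration rather than the project's ISR. The struct and function names are hypothetical. The command word assumes the MCP4822's 16-bit write format with channel A selected, 1x gain, and the output active, and the s.15 sample is reduced to the DAC's 12-bit unsigned range.

```c
#include <stdint.h>
#include <stdbool.h>
#include <stddef.h>

/* MCP4822 command bits: channel A, 1x gain, output active. */
#define DAC_CONFIG_CHAN_A 0x3000u

typedef struct {
    const int16_t *next;  /* next sample to play */
    const int16_t *end;   /* one past the last sample */
} playback_t;

/* Map an s.15 sample to a 16-bit command word with a 12-bit unsigned code. */
uint16_t s15_to_dac_word(int16_t s)
{
    uint16_t code = (uint16_t)(((int32_t)s + 32768) >> 4);  /* 0..4095 */
    return (uint16_t)(DAC_CONFIG_CHAN_A | code);
}

/* Body of one sample-rate timer tick: yields a word only if audio is queued. */
bool playback_tick(playback_t *p, uint16_t *word_out)
{
    if (p->next == NULL || p->next >= p->end) {
        return false;  /* nothing queued: do nothing this tick */
    }
    *word_out = s15_to_dac_word(*p->next++);
    return true;
}
```

On the PIC32, the returned word would be written to the DAC over SPI; here it is simply handed back to the caller so the logic can be exercised without hardware.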
Local audio loopback is achieved by having the DMA 0 and DMA 1 ISRs,
rather than the receiver code, set the playback pointers.
A Raspberry Pi ran Jacktrip and dnsmasq (for DHCP). Since Cornell may change the Pi's eduroam/RedRover address (in the 10.x.x.x private range), the Pi needed a way to share its IP address automatically. We used a cron job that checked every five minutes for a new IP address and pushed updates to a git repo as necessary.
#!/bin/bash
#Example update IP script
#crontab -e
#append the following line (uncommented) to the end of the user crontab to
#update every 5 minutes
#this script assumes that update_ips is the git repo that you want to push to.
#*/5 * * * * /home/pi/update_ips/update.sh
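#stop if the repo directory is missing rather than running git elsewhere
pushd /home/pi/update_ips || exit 1
CURR_IP=$(hostname -I)
#record file is named <hostname>-<wlan0 MAC address>
HOSTNAME=$(hostname)-$(ifconfig wlan0 | grep -o -E '([[:xdigit:]]{1,2}:){5}[[:xdigit:]]{1,2}')
if [[ -e "$HOSTNAME" ]]; then
    REC_IP=$(cat "$HOSTNAME")
else
    REC_IP=0
fi
#quote the right-hand side so it is compared literally, not as a pattern
if [[ "$CURR_IP" != "$REC_IP" ]]; then
    hostname -I > "$HOSTNAME"
    git pull
    git add "$HOSTNAME"
    git commit -m "update $HOSTNAME"
    git push
fi
popd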
It is best to compile Jacktrip from source. You need to install the qt5-qmake
and libjack-jackd2-dev packages (which should install all other dependencies
for building), assuming you are running Raspbian Buster.
Get the source from the Jacktrip GitHub repo. Make a shadow
build directory (e.g. jacktrip/build) and run qmake ../src/jacktrip.pro
in that shadow build directory. Then, run make. Running make install is not required.
If you are running a different flavor of Linux (e.g. Arch Linux, RHEL, CentOS), perhaps
not on a Pi, be aware that there might not be recent JACK or Qt packages
available. JACK, Qt, Qjackctl, and Jacktrip are straightforward to build and
give decent performance with recent versions of GCC. The prebuilt Jacktrip
package on Debian-based Desktop OSes is often out-of-date, but the core functions
implemented in this project have not changed between Jacktrip 1.2 (prepackaged)
and Jacktrip 1.3 (on Github).
JackTrip also runs on Windows and macOS, and there are prebuilt packages
for each. Whatever platform you use, it is very useful to have a DHCP
server on your testing network.
The audio carried by the packets sent over the network is so distorted that it is essentially unusable. However, a clue may reside in the aliasing that occurs at 17:10 in the video above. It seems that not many packets are actually making it out of the device.
2021-05-18T18:29:14 10.253.98.244 send: 0/0 recv: 2077/4 prot: 292302/0/0 tot: 448144 sync: 10/0/0/4/290225 skew: -290229/-290229 bcast: 0/0 autoq: 1.9/29.5
2021-05-18T18:29:17 10.253.98.244 send: 0/0 recv: 3101/4 prot: 436462/0/0 tot: 593530 sync: 10/0/0/4/433361 skew: -433365/-433365 bcast: 0/0 autoq: 1.9/24.8
From the above logging present in Jacktrip on the Raspberry Pi, it was clear
that thousands of packets per second were lost (first number in "prot" section)
which caused thousands of receive buffer underruns (first number in "recv" section).
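To put numbers on the two log lines, the sketch below extracts the first counter after a label and computes its average growth per second. The helper names are illustrative, and it relies only on the field layout visible in the log lines above.

```c
#include <stdio.h>
#include <string.h>

/* Pull the first counter after a label such as "prot: " out of a log line.
 * Returns 1 on success, 0 if the label or number is missing. */
int first_counter(const char *line, const char *label, long *out)
{
    const char *p = strstr(line, label);
    if (p == NULL) {
        return 0;
    }
    return sscanf(p + strlen(label), "%ld", out) == 1;
}

/* Average growth of a counter per second between two log lines.
 * Returns -1 if either line cannot be parsed or seconds is not positive. */
long counter_rate(const char *earlier, const char *later,
                  const char *label, long seconds)
{
    long a = 0, b = 0;
    if (seconds <= 0 || !first_counter(earlier, label, &a) ||
        !first_counter(later, label, &b)) {
        return -1;
    }
    return (b - a) / seconds;
}
```

The two lines are three seconds apart. Over that interval the first "prot" number grows by 144160 (about 48053 per second) and the first "recv" number grows by 1024 (about 341 per second).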
We were unable to rectify the problem in the course of this project, but we
suspect that it arises because the ENC28J60 does not easily support
10 Mbps full-duplex mode. We tried setting a switch in ENC28J60.c and using
ethtool on the Raspberry Pi as
sudo ethtool -s eth0 speed 10 duplex half autoneg off
(and to reverse)
sudo ethtool -s eth0 autoneg on
but were unsuccessful in getting the ENC28J60 to connect as a full duplex
device. The switching overhead is likely quite high, so it is unlikely that
half duplex mode is appropriate for streaming.
We have said several times that plain Jacktrip (which the PIC32 runs) does
not encrypt its audio data. So, that poses a privacy issue that we hope that
we have identified in the spirit of section I-1 of the IEEE Code of Ethics.
Other than that, our device minimizes risk to the public by using low voltages
which are attainable through an isolating power converter.
We do believe that it is ethical to work towards a world where more people have
access to effective technology. Real-time audio transmission is certainly one
such technology.
| Part | Cost ("rental") | Quantity | Vendor | Part Number |
|---|---|---|---|---|
| PicKit 3 | $1 | 1 | | |
| Big Board | $10 | 1 | | |
| Remote Access Board | ??? | 1 | | |
| Power Supply | $5 | 1 | | |
| PIC32MX250F128B | $5 | 1 | | |
| ENC28J60-based Ethernet Board | $7.69 | 1 | Amazon | B00WX1NRO0 |
| Ethernet Cable | ??? | 1 | | |
| Loaner Raspberry Pi 4, 8GB | ??? | 1 | | |
| Loaner 2.5A+ 5V USB-C power supply | ??? | 1 | | |
| Loaner SD Card, 16GB | ??? | 1 | | |
| F-F jumper wire | $0.10 | 8 | | |
| Total | $29.49 | | | |