As part of my project to scale beyond the ~4000 pixels that a Teensy 3.2 supports at 60Hz, I’m looking into ways of using multiple Teensy 3.2s and farming out pixels to each of them. The idea would be to build a simple “Branch Controller” consisting of a Teensy 3.2, a WIZnet W5500 Ethernet adapter, and an OctoWS2811 adapter.
The first part of the project was just making sure that I could use the W5500 with the Teensy 3.2. I had bought the Arduino Shield version of the W5500 ($23), but the shield is large and clunky and includes an SD card slot I don’t need. I really should have used a WIZ850io ($20), which is just the Ethernet adapter in a much more compact form, so I switched to that:
To get the Ethernet to work with the Teensy 3.2, all I had to do was make six connections (see the picture above):
Teensy 3.2 Pins | W5500 Ethernet Shield Pins | WIZ850io Pins (pinout) |
---|---|---|
10 – SCS | D10 | SCNn |
11 – MOSI | D11 | MOSI |
12 – MISO | D12 | MISO |
13 – SCLK | D13 | SCLK |
GND | GND | GND |
Vin (3.6 to 6.0 Volts) | 5V | |
3.3V | | 3.3V |
Then I connected the W5500 to my local area network and plugged the Teensy into the computer. From the Arduino IDE, I loaded the Ethernet > Web Server example, and uncommented the line Ethernet.init(10) in setup(). After running this I had a web server running and everything seemed to be working perfectly.
Next step: I wanted to see how much data I could push down to the Teensy, and how fast. The W5500 is a 10/100 Ethernet chip, but the SPI link it uses to talk to the Teensy will slow that down a lot.
The first experiment I did was just a minimal web server running on the Teensy. For the client, I ran node.js calling request-promise in a loop. This was able to make about 100 HTTP requests per second, which could provide a decent frame rate, but there is no data payload yet.
Next I added some payload to try to figure out the bandwidth of a Teensy. My first attempt got about 100 kilobytes/sec. At 3 bytes per pixel, this could handle pixel-level data for:
555 pixels | at 60 fps |
1111 pixels | at 30 fps |
4416 pixels | at 7.5 fps |
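
The arithmetic behind figures like these can be sketched in a few lines. This is only an illustration, assuming 3 bytes (24-bit RGB) per pixel and 1 kilobyte = 1000 bytes, which roughly reproduces the numbers in the table; the function names `maxPixels` and `maxFps` are made up for this example.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: 3 bytes (24-bit RGB) per pixel, 1 kilobyte = 1000 bytes.
constexpr uint32_t kBytesPerPixel = 3;

// Largest whole number of pixels that can be refreshed at the given
// frame rate with the given throughput.
uint32_t maxPixels(uint32_t bytesPerSecond, uint32_t fps) {
    assert(fps > 0);
    return bytesPerSecond / (kBytesPerPixel * fps);
}

// Frame rate achievable for a fixed pixel count at the given throughput.
double maxFps(uint32_t bytesPerSecond, uint32_t pixels) {
    assert(pixels > 0);
    return static_cast<double>(bytesPerSecond) / (kBytesPerPixel * pixels);
}
```

For example, `maxPixels(100000, 60)` gives the pixel budget at 60 fps, and `maxFps(100000, 4416)` gives the frame rate for a 4416-pixel installation.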
That’s pretty disappointing; it almost defeats the purpose of using the OctoWS2811 to drive 8 separate strands. According to Paul Stoffregen’s benchmark, I ought to be able to get about 958 kilobytes per second, which would be enough for my needs! So there was obviously some kind of optimization I was missing.
Looking closely at the sample web server code I was using, I noticed that it was set up to read one byte at a time, no matter how many bytes were available. Modifying the Teensy code to read blocks of bytes into a 256-byte buffer got much better results: I was able to get up to 500 kilobytes per second, which translates to:
2777 pixels | at 60 fps |
4416 pixels | at 37 fps |
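
The block-read change is roughly the following. This is a sketch rather than my actual code: it assumes a client type with `available()` and `read(buf, len)` in the style of the Arduino `Client` interface, and `drainInBlocks` and `FakeClient` are hypothetical names, the latter existing only so the idea can be tried on a desktop without any hardware.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Drains whatever the client currently has buffered, up to 256 bytes per
// read call instead of one byte per call. Returns the total bytes read.
template <typename ClientT, typename Sink>
size_t drainInBlocks(ClientT& client, Sink&& sink) {
    uint8_t buffer[256];
    size_t total = 0;
    while (client.available() > 0) {
        int n = client.read(buffer, sizeof(buffer));
        if (n <= 0) break;  // nothing read or an error: stop and retry later
        sink(buffer, static_cast<size_t>(n));
        total += static_cast<size_t>(n);
    }
    return total;
}

// Hypothetical stand-in for a network client, for host-side experiments.
struct FakeClient {
    std::vector<uint8_t> data;
    size_t pos = 0;
    int readCalls = 0;

    int available() const { return static_cast<int>(data.size() - pos); }

    int read(uint8_t* buf, size_t len) {
        assert(buf != nullptr);
        ++readCalls;
        size_t n = std::min(len, data.size() - pos);
        std::memcpy(buf, data.data() + pos, n);
        pos += n;
        return static_cast<int>(n);
    }
};
```

With 1000 bytes waiting, this loop makes 4 read calls where the byte-at-a-time version would make 1000, and each call carries its own overhead for talking to the W5500.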
This was a significant improvement and probably adequate, but I wasn’t happy.
I wondered if the fact that the test computer was on WiFi instead of a wired LAN could be the bottleneck. Sure enough, moving my computer to a wired connection dramatically improved the throughput, and I got 1,125,093 bytes across in a second, which was faster than even Paul’s benchmark. This would translate to a frame rate of 85 frames per second with 4416 pixels, well above the 60 fps limit of the WS2812B-type LED strips!
Finally, I tried combining the Ethernet and LED code into one script (branchController) to see how the timing worked out. The ideal architecture, I thought, would be to open a client connection to the branchController every time you have another frame to send. The trouble with this method is that opening each connection carries too much overhead: I only got about 20 frames per second doing it this way. Another option is to send four frames at a time (connecting 15 times per second). This worked fine, but as the code is structured right now, it probably freezes the LED refresh unnecessarily while the TCP connection is being set up.
What I think I’m really going to need to do is open a connection and keep it open, then stuff down one frame of data whenever I have one. That means changing the protocol so that the connection is expected to stay open, and rejiggering the code to match, but I’m pretty confident that it will work and give the performance I expect.
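
With a connection that stays open, the receiver has to work out for itself where one frame ends and the next begins. I haven’t settled on a protocol yet, so the following is only one possible sketch: it assumes every frame is exactly `pixelCount * 3` bytes with no header, so boundaries are implied by the byte count alone. `FrameAssembler` is a hypothetical name, and on the Teensy a fixed-size array might be a better fit than `std::vector`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Collects bytes from a long-lived stream into fixed-size frames.
// Assumption: each frame is pixelCount * 3 bytes, sent back to back.
class FrameAssembler {
public:
    explicit FrameAssembler(size_t pixelCount)
        : frame_(pixelCount * 3), filled_(0) {
        assert(pixelCount > 0);
    }

    // Feeds a chunk of received bytes and calls onFrame(data, size) once
    // per completed frame. A chunk may end mid-frame or span several frames.
    template <typename OnFrame>
    void feed(const uint8_t* data, size_t len, OnFrame&& onFrame) {
        while (len > 0) {
            size_t room = frame_.size() - filled_;
            size_t n = len < room ? len : room;
            std::memcpy(frame_.data() + filled_, data, n);
            filled_ += n;
            data += n;
            len -= n;
            if (filled_ == frame_.size()) {
                onFrame(frame_.data(), frame_.size());
                filled_ = 0;  // start collecting the next frame
            }
        }
    }

    // Bytes of the current, still incomplete frame received so far.
    size_t pending() const { return filled_; }

private:
    std::vector<uint8_t> frame_;
    size_t filled_;
};
```

Each block read from the network would be passed to `feed`, and the callback would copy the finished frame to the LEDs. A real protocol would probably also want a header or sync marker so the receiver can recover if it ever loses its place in the stream.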