Saturday, January 18, 2020

TTL Brainfuck Computer Part 9 - Clocking Back In

Part 1 (first) - Part 8 (prev) - Part 10 (next)

After sitting in a box for two years over two moves, the BFCPU has re-emerged and progress has resumed!

Before the box

Shortly after I posted the last update I incorporated the new clock generator which you can see in the upper-left, though not exactly in the form pictured. There weren't any of the colored wires going over to the right side of the board. I just used a single jumper to connect any of the frequency divider outputs to a final flip-flop at the end.

The flip-flop served a few purposes:
  • Final divide-by-two stage to make a very slow clock since I didn't have a button anymore
  • Buffer so the frequency dividers wouldn't have to drive the clocks for everything else on the board
  • Source of both inverted and non-inverted outputs
After that, I connected the program counter to the program ROM address lines.

Since brainfuck programs only have 8 instructions, I only need 3 bits for the instruction set. I connected the first three data lines of the ROM to some blue LEDs and the unused lines to red ones.

As a quick sanity check to make sure the chips play well together, I connected the high bit of the ROM data and the manual reset button through a NOR gate to the program counter. Since the ROM is filled with FF in the unwritten areas, this means the program I had written onto it:

00 01 02 03   +-><

would repeat indefinitely.
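Here's a rough Python sketch of that sanity check. The ROM size and the function name are made up for illustration; the opcode mapping comes from the listing above, and the reset is modeled as taking effect instantly.

```python
# Toy model of the looping sanity check. ROM_SIZE and run() are
# hypothetical names; only the four opcodes come from the listing above.
ROM_SIZE = 16  # assumed size, just big enough to show the FF fill
ROM = [0x00, 0x01, 0x02, 0x03] + [0xFF] * (ROM_SIZE - 4)
OPCODES = {0x00: "+", 0x01: "-", 0x02: ">", 0x03: "<"}

def run(ticks):
    """Return the instructions seen over `ticks` clock ticks."""
    pc = 0
    seen = []
    for _ in range(ticks):
        data = ROM[pc]
        if data & 0x80:      # bit 7 set (unwritten FF): reset the counter
            pc = 0
            data = ROM[pc]
        seen.append(OPCODES[data])
        pc += 1
    return "".join(seen)

print(run(10))  # +-><+-><+-
```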

I connected a slow clock to the program counter, and ... nothing. The program counter was all zeros and staying that way. The data lines of the ROM were all off, since the first instruction is 00. If I disconnected bit 7 everything would work. I spent a few days trying to diagnose the problem. I checked voltage levels, I added a buffer between the ROM data line and the reset circuitry, and then looked again at the data sheet.

If I'm remembering correctly from two years in the future, I discovered that I had not ordered the part that was TTL-compatible. So I spent a bit of time looking into options for level shifting, but my efforts were quickly put on hold.

I got a new job that was an hour commute each way for only 16 miles of driving, so my available time and mental capacity drained away. We decided to move closer to work, so everything was packed up.

We never really settled into the first place we moved, and our bedroom had water intrusion from the storms last winter. So we moved again when our lease was up. Fast forward several months, I have yet another new job and I finally finish unpacking most of my boxes.

After the box

I got my electronics desk set up and started digging through my neural records to figure out what my next steps would be. At first I went back to looking at logic level shifting solutions, but it would just be a hack for a mistaken purchase. Plus, I was never fully convinced that that was even the problem.

So I went back to investigating the circuit. I tried passing bit 7 through a bus transceiver before going to the reset line. I checked the current passing through the reset lines of the counters. I checked the voltages coming out of the ROM, the transceiver, the NOR gate, etc. Everything was consistent between bit 7 being attached and not.

Then I looked at the absolute voltage levels. On the board with the ROM chip I was seeing it dip down to about 4.7 volts, even though the power supply output was reading 5.0. Eventually I narrowed it down to the power line coming into the board. So I soldered up some new leads with some 0.1" headers to provide more low-resistance paths to the power supply. It was in vain, though. The program counter was still resetting.

Feeling a bit defeated, I hooked things back up to the scope. This time I looked for a pulse on the ROM data line. Lo and behold, for the 150 nanoseconds that the ROM takes to switch between addresses, the data lines are all high. This means the counter would reset immediately after incrementing to the second instruction!

All I had to do was make sure the reset only occurred on the other edge of the clock. Either I misread that the part was incompatible or I'm getting lucky, but everything I'm seeing so far points to the ROM chip working exactly as intended.
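To illustrate why gating the reset helps, here's a simplified Python model. It assumes the ROM outputs read FF for the whole access time after every address change, and that the half clock period is longer than that access time, so the data has settled by the other edge. The function and variable names are hypothetical.

```python
# Simplified model of the reset glitch. Assumes the ROM outputs FF while
# switching addresses and has settled by the opposite clock edge.
ROM = [0x00, 0x01, 0x02, 0x03] + [0xFF] * 12
GLITCH = 0xFF  # what the data lines show during the ROM access time

def simulate(ticks, reset_on_other_edge_only):
    """Return the program counter value at the end of each clock cycle."""
    pc = 0
    trace = []
    for _ in range(ticks):
        pc += 1  # first edge: the counter increments
        # While the ROM is switching, bit 7 reads high. An ungated
        # reset sees that and clears the counter right away.
        if not reset_on_other_edge_only and (GLITCH & 0x80):
            pc = 0
        # Other edge: the ROM output is valid now, so bit 7 is real.
        if ROM[pc] & 0x80:
            pc = 0
        trace.append(pc)
    return trace

print(simulate(8, reset_on_other_edge_only=False))  # [0, 0, 0, 0, 0, 0, 0, 0]
print(simulate(8, reset_on_other_edge_only=True))   # [1, 2, 3, 0, 1, 2, 3, 0]
```

The ungated case reproduces the stuck-at-zero counter, while the gated case only resets when it reaches the real FF at the end of the program.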

With that problem solved, I was officially making progress once more!

Of course, restarting isn't what most programs do when they get to the end. If you write a brainfuck program to compute some result, you want to see that result. Restarting the program from the beginning with non-empty RAM would have unpredictable results. So I decided to use bit 7 as a halt signal instead. This meant revisiting the clock circuitry.

Design Time

Before getting to that, though, I started thinking more about the high level design. I've had a soft goal of one instruction per clock through this entire process, and I wanted to make sure I worked out the details for the four instructions I'm working on.

I was watching Robert Baruch's series Building a 6800 CPU on an FPGA with nMigen and decided to come up with timing diagrams for the clock, ROM outputs, and control signals. As an example, to execute a + instruction, there are three steps:
  1. Read and decode the instruction
    • Increment the program counter
    • Set the RAM to input
    • Set the data register to output
  2. Pulse the data register's count up line
  3. Pulse the RAM's write line
For one instruction per cycle, step 1 needs to happen on the rising edge of every clock. Step 2 has to wait 150 ns for the program ROM to update plus a bit more for that to propagate to the enable signals. Step 3 has to wait however long the count up was pulsed. Then step 1 of the next instruction will have to wait for the RAM write pulse to finish.
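The three steps can be sketched functionally in Python. This ignores the timing entirely and only shows the order of operations; the class name, the 8-bit register width, and the RAM size are assumptions for illustration.

```python
class Machine:
    """Toy model of the parts involved in a + instruction.

    The 8-bit width and 16-cell RAM are assumptions, not the real design.
    """

    def __init__(self, ram_size=16):
        self.pc = 0
        self.pointer = 0
        self.data_reg = 0
        self.ram = [0] * ram_size

    def execute_plus(self):
        # Step 1: read and decode. The program counter moves on, the RAM
        # is set to input, and the data register drives the bus.
        self.pc += 1
        # Step 2: pulse the data register's count up line.
        self.data_reg = (self.data_reg + 1) & 0xFF
        # Step 3: pulse the RAM's write line to store the new value.
        self.ram[self.pointer] = self.data_reg

m = Machine()
for _ in range(3):
    m.execute_plus()
print(m.pc, m.data_reg, m.ram[0])  # 3 3 3
```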

With three different places to wait per cycle, there was no way I could just use the rising and falling edges of the clock. The first obvious thought is to use a three-phase clock, but that would be overkill. I would have six different rising or falling edges per instruction in that case. I only needed one more edge than I already had.

I decided to add a single phase, delayed by the time it takes to decode an instruction (about 160ns). The instruction is read on phase 0, the data register is incremented on phase 1, and the RAM is written on the falling edge of phase 0. Rather than using pulse generators on the clock edges, I logically combined the two phases into an adder clock and a write clock. Here's the timing diagram for the full program:

In order for the phases to overlap correctly, the clock period has to be a bit longer than twice the decode delay (160ns → 320ns). That works out to a maximum clock speed of a bit over 3 MHz (1 / 320ns = 3.125 MHz). Not too shabby! Though if I really want to push the performance in the future, I can add a third phase for the write clock and optimize the adder->write->next instruction timings.
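The arithmetic, as a quick Python check. The only input is the 160 ns decode delay from above; the variable names are just for illustration.

```python
DECODE_DELAY_NS = 160  # time to read and decode an instruction

# The two phases need the clock period to be at least twice the delay.
min_period_ns = 2 * DECODE_DELAY_NS
max_freq_mhz = 1000 / min_period_ns  # 1000 / ns gives MHz

print(min_period_ns, max_freq_mhz)  # 320 3.125
```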

For the actual hardware, I looked up some example circuits and found one built around the 74LS123. This chip is a pair of monostable multivibrators (one-shot pulse generators) that can be wired up to act as a clock phase shifter. All I needed to make it work were some jumper wires, a 5k resistor, and a 10 pF capacitor.

Now that I knew I had a path to success, I started working on the clock circuitry again.

Muxing it up

While I really liked having the ability to select from a wide range of very accurate clocks, it didn't have a convenient interface. You had to know or find out by trial and error which pins on the chips have the clock outputs, which ones are faster than the others, etc. and manually switch a long jumper wire between them. As I mentioned earlier, I also lacked the ability to manually trigger the clock.

I had 26 different clock signals available from 8.000 MHz to 0.238 Hz. I knew it was going to take an entire section to make a selector, so I shifted the entire right half of the board down by one to make room for a new clock section.
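Those two endpoints are consistent with a plain chain of divide-by-two stages: 8 MHz halved 25 times lands on 0.238 Hz. A quick Python check, assuming each signal is exactly half the previous one:

```python
BASE_HZ = 8_000_000  # fastest clock, 8.000 MHz

# 26 signals: the base clock plus 25 successive divide-by-two outputs.
clocks = [BASE_HZ / 2**k for k in range(26)]

print(len(clocks), clocks[0], round(clocks[-1], 3))  # 26 8000000.0 0.238
```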

The largest multiplexers in my kit can each select one of up to 8 signals using a three-bit selection. This means I could connect 24 of the clocks to three chips, then use a fourth chip to select from those. So there need to be three switches, shared by the first three multiplexers, to select an input on each, and two switches to select between multiplexers.

That left 5 inputs open on the final stage and one unused switch. I hooked one up to a wire I can use to select a clock the old way. I hooked the first four inputs up to a manual clock button and the unused switch to the high order selector bit.

So with the pictured DIP switch, the first switch on the left toggles between step and run mode (all four manual inputs vs. any of the four clock inputs). Skipping one switch, the 3rd and 4th select among the three multiplexers (00, 01, and 10, slowest to fastest) and the manual clock selector wire (11). The final three switches drive the selection inputs of the other three multiplexers. This makes it easy to switch between two different clock speeds and manual mode.
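The selector tree can be modeled in a few lines of Python. The function names, the polarity of the step/run switch, and the order of the clocks within each multiplexer are assumptions; the real wiring may differ.

```python
def mux8(inputs, sel):
    """8-to-1 multiplexer: pick one input using a 3-bit selection."""
    return inputs[sel & 0b111]

def select_clock(clocks24, manual, wire, run, group, tap):
    """Model of the two-stage clock selector.

    clocks24: 24 clock signals, assumed ordered slowest to fastest
    manual:   the manual clock button
    wire:     the jumper wire for picking a clock the old way
    run:      step/run switch (assumed: False = step, True = run)
    group:    2 bits choosing a first-stage mux (0-2) or the wire (3)
    tap:      3 bits shared by all three first-stage muxes
    """
    stage1 = [mux8(clocks24[g * 8:(g + 1) * 8], tap) for g in range(3)]
    # Final mux: inputs 0-3 are the button, 4-6 the muxes, 7 the wire.
    final_inputs = [manual] * 4 + stage1 + [wire]
    sel = (int(run) << 2) | (group & 0b11)
    return mux8(final_inputs, sel)

clocks24 = [f"clk{i}" for i in range(24)]
print(select_clock(clocks24, "button", "wire", run=False, group=2, tap=5))  # button
print(select_clock(clocks24, "button", "wire", run=True, group=0, tap=0))   # clk0
print(select_clock(clocks24, "button", "wire", run=True, group=3, tap=0))   # wire
```

Flipping only the run switch moves between the button and whatever clock is currently dialed in, which is the convenient part.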

The final multiplexer output goes into the flip-flop that used to be the last stage of the frequency divider. So this means the clock toggles between high and low each time you press and release the button.

While moving the boards to accommodate the clock, I also discovered the earlier voltage issues were from a single wire that had corroded. Replacing it brought the voltage back up to 4.98 across the board, so I put the old power switch back and gave it both an "on" and "off" LED.

Back to the drawing board

Now that I had a nice, convenient clock selector and single-step mode, I went back to implementing the phase shifter. I reread the 74LS123 design doc and verified the parts I would need: a 5k resistor and a 10 pF capacitor.

Wait... 10 pF? I think the smallest capacitor I have in my kits is 22 pF and they're all in use keeping the high frequency clocks from ringing. Not only that, but this is a freaking breadboard computer! 10 pF is down in the range where all the large, parallel runs of metal in the slots can make a significant difference. There's no way I would be able to get a reliable result from this. Thankfully I RTFMed before starting to wire it up.

So my next thought was to find something in my TTL kit with a lot of gates that could be used for a propagation delay. I figured it would be one of the high numbered ones, and the highest I had was 74LS374. Well what do you know. Eight flip-flops? That's exactly the kind of thing I was looking for. The '374 was edge triggered though, so chaining them together would divide the frequency in half each time. Luckily I also had the '373 which is "transparent" so it would pass along the change immediately.

Note to self: replace the '93s in the frequency divider with '374s

I had to use 12 stages (one and a half chips' worth) to get the 160ns delay I was after, but it worked like a charm.
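As a rough Python sketch: 160 ns over 12 stages is about 13 ns per stage, and a chain of transparent stages just shifts the waveform without changing its frequency. The model below treats every stage as an identical one-sample delay, which is an idealization; the names are hypothetical.

```python
TARGET_DELAY_NS = 160
STAGES = 12
per_stage_ns = TARGET_DELAY_NS / STAGES

def latch_chain(wave, stages):
    """Delay a sampled waveform by one sample per transparent stage.

    Idealized: every stage has the same delay on both edges.
    """
    out = list(wave)
    for _ in range(stages):
        out = [out[0]] + out[:-1]  # output follows input one sample later
    return out

# One sample = one stage delay. Square wave with a 16-sample half period.
phase0 = ([0] * 16 + [1] * 16) * 2
phase1 = latch_chain(phase0, STAGES)

print(round(per_stage_ns, 1))                # 13.3
print(phase1.index(1) - phase0.index(1))     # 12
```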

Next I implemented the clock phase combiner in logisim-evolution and did some boolean algebra to figure out how to use a single kind of gate (NOR or NAND) to reduce the chip count. I wired that up, hooked up a clock fast enough to see the two different phases clearly on the scope, and saw something pretty similar to the timing diagram!

I had to do a bit of tweaking since the '373's flip-flops seem to respond faster on the falling edge than the rising edge. So at first I was getting a lot less overlap than there should have been. After switching around some of the phases and inverting the logic, it was all working as intended.

It's the final count down!

Now that I have the clock pulses I need, the last thing between here and fully executing instructions is the decoder. Figuring out which instruction is loaded is pretty simple. The 74LS155 takes a 3-bit input and makes one of its 8 outputs low depending on the value. So I just hooked that up to the low three bits of the ROM. You can see the outputs of this selector on the right here:

Since I want FF to represent halt, all of the data lines go into an 8-input NAND. I was starting to work out what gates/chips I would need to combine the halt signal with the clocks. Then I realized the flip-flop in the final stage of the clock selector has an input that will stop it from toggling. I just hooked the NAND up to that and halt was implemented!
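The decoder and the halt detection boil down to two small functions. This is a Python sketch of the logic only, with hypothetical function names; it says nothing about how the NAND output is wired into the flip-flop.

```python
def decode(data):
    """3-to-8 decoder with active-low outputs.

    Looks only at the low three bits; the selected output is 0,
    all the others are 1.
    """
    sel = data & 0b111
    return [0 if i == sel else 1 for i in range(8)]

def nand8(data):
    """8-input NAND: output is 0 only when all eight data lines are 1."""
    return 0 if (data & 0xFF) == 0xFF else 1

print(decode(0x02))              # [1, 1, 0, 1, 1, 1, 1, 1]
print(nand8(0x03), nand8(0xFF))  # 1 0
```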

This is where things stand right now. I just need to send the output of the selector and the correct clock signals over to the other side of the board. I'm optimistic that my next update will have a fully executing brainfuck program (albeit without loops or I/O).

Until Part 10!
