For the past year or so, I was fairly convinced that it wasn’t possible to put a fully working AGC into an FPGA. In fact, I even talked about it in Simulation, and the Binary Clock Divider. The main reasons I thought it wouldn’t work were as follows:
- The AGC is built on asynchronous logic. FPGAs very strongly prefer synchronous logic.
- It’s not easy (or possible?) to implement everything with just NOR gates. FPGA tools get freaked out about combinational loops, and the AGC is, from the point of view of an FPGA, one big combinational loop. You might try to implement cross-coupled NOR gates as a register, but then what do you do about this thing?
- Several circuits in the AGC rely on propagation delays for timing. Remember this guy? Delays inside an FPGA are going to be much smaller than with discrete components, and can vary depending on how the fitter decides to lay things out.
A couple of weeks ago I realized that I had somehow managed to think about the problem at too high of a level and too low of a level at the same time. The real answer was to build a clock-driven NOR gate.
The basic idea is that this clocked NOR gate will sample its inputs on the falling edge of its clock, and apply the result to its output on the rising edge. The entire AGC receives the same clock, and so the entire state of the system is propagated in lockstep. The sampling and output application are on opposite clock edges to prevent race conditions. All gates in the system change their outputs at the same time, and then there is a delay before sampling for the next cycle, to ensure all transitions have occurred and everything is settled. The propagation delays for the gates can even be tuned by setting the frequency of the clock!
The Verilog model for this is pretty easy:
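    module nor_2(y, a, b, rst, clk);
        parameter iv = 1'b0;
        ...
        input wire a, b, rst, clk;

    `ifdef TARGET_FPGA
        // Two registers per gate: the visible output and the sampled next value.
        output reg y = iv;
        reg next_val = iv;

        // Rising edge: every gate in the system updates its output together.
        always @(posedge clk) begin
            y <= next_val;
        end

        // Falling edge: sample the inputs once all outputs have settled.
        always @(negedge clk) begin
            next_val <= ~(a|b);
        end
    `else
        ...
    `endif
    endmodule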
This model solves all of the above problems — the design becomes fully synchronous, we’re using clocked registers for everything, and propagation delays are accurately captured at the gate level! The only downside is that it takes a whopping 2 registers per NOR gate, which leads to a rather large final product that needs a kinda beefy FPGA to fit.
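To see the lockstep behavior without an FPGA, here is a small Python model of the same idea. It is only a sketch under assumptions: the `tick` and `run` helpers and the dictionary netlist are made up for illustration and are not part of the AGC sources. It wires two clocked NOR gates into a cross-coupled latch and steps them one clock cycle at a time.

```python
def nor(*inputs):
    """Plain NOR: 1 only when every input is 0."""
    return 0 if any(inputs) else 1


def tick(netlist, state, external):
    """Advance one full clock cycle for every gate at once.

    netlist maps each gate's output net to the nets it reads (hypothetical format).
    """
    values = {**external, **state}
    # Falling edge: every gate samples its inputs from the settled state.
    next_vals = {out: nor(*(values[n] for n in ins)) for out, ins in netlist.items()}
    # Rising edge: every output changes together.
    return next_vals


def run(netlist, state, external, cycles):
    for _ in range(cycles):
        state = tick(netlist, state, external)
    return state


# Cross-coupled NOR latch: q = NOR(r, qn), qn = NOR(s, q).
latch = {"q": ("r", "qn"), "qn": ("s", "q")}
initial = {"q": 0, "qn": 1}  # per-gate initial values, in the spirit of `iv`

held = run(latch, initial, {"s": 0, "r": 0}, 3)     # nothing asserted: state holds
after_set = run(latch, held, {"s": 1, "r": 0}, 2)   # two gate delays to settle
latched = run(latch, after_set, {"s": 0, "r": 0}, 3)  # set released: q stays high

print(held, after_set, latched)
```

Note that the two gates start in opposite states (`q = 0`, `qn = 1`). If both started at 0 with `s` and `r` low, this unit-delay model would flip both outputs every cycle forever, which is presumably what a per-gate initial value like `iv` is useful for.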
After verifying that the computer still worked with this new NOR gate model, I set up a project in Quartus Prime for the FPGA dev board I have, the DE0-Nano.
I set up a PLL that generates both 51.2MHz and 2.048MHz clocks from the 50MHz crystal on the board. The 2.048MHz clock is fed directly into the timer module as the CLOCK input. The 51.2MHz clock serves as the system propagation clock. I chose 51.2MHz because it’s a multiple of 2.048MHz that comes close to the propagation delays of the original gates. The original delays were 20ns, and a 51.2MHz clock gives our gates a delay of 19.53ns. Close enough! I want the system clock to be a multiple of the AGC clock so I don’t get any weird phasing issues.
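The clock numbers are easy to check by hand, but here is the arithmetic as a short Python sketch. The constant names are just labels chosen for this example.

```python
from fractions import Fraction

AGC_CLOCK_HZ = 2_048_000     # CLOCK input to the timer module
PROP_CLOCK_HZ = 51_200_000   # system propagation clock
ORIGINAL_GATE_DELAY_NS = 20  # delay of the original gates

# An integer ratio means the two clocks keep a fixed phase relationship.
ratio = Fraction(PROP_CLOCK_HZ, AGC_CLOCK_HZ)

# One propagation-clock period is one gate delay in the clocked NOR model.
gate_delay_ns = 1e9 / PROP_CLOCK_HZ

print(ratio)                       # 25
print(f"{gate_delay_ns:.2f} ns")   # 19.53 ns
```

So each AGC clock period spans exactly 25 gate delays, and each modeled gate delay is a little under half a nanosecond shorter than the original 20ns.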
With everything ready to go, I pulled in all of the generated Verilog files and hit “Start Compilation”, and very quickly received an error message from synthesis:
Verilog HDL unsupported feature error: can’t synthesize pullup or pulldown primitive
Craaaaap. This makes total sense, of course, and I should have expected it. FPGAs don’t have internal tri-stating, open-drains, pullups, or anything of that sort. This type of thing only exists for the GPIOs. But as I’ve talked about before, the open-drain nature of the AGC gates is extremely important — both for fan-in expansion and cross-module buses.
It seems that the proper way to do this type of thing in an FPGA is to use the Verilog wand (wired-AND) net type. A wand is just like a wire, but it can have multiple drivers, and its value is the logical AND of everything driving it. So in theory, all I had to do was tweak the open-drain buffer model to be a simple passthrough, rather than switching to high-impedance mode when its input is high:
    module od_buf(y, a);
        parameter delay = 2;
        input wire a;
        output wire y;

    `ifdef TARGET_FPGA
        assign y = a;
    `else
        assign #delay y = (a == 1'b1) ? 1'bZ : 1'b0;
    `endif
    endmodule
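Why a plain passthrough is enough can be checked with a quick truth-table comparison. The Python below is a sketch with made-up function names: one function models the simulation case (open-drain buffers that either pull low or float, on a net with a pullup), the other models passthrough buffers driving a wired-AND net.

```python
from itertools import product


def open_drain_net(inputs):
    """Simulation view: each buffer pulls the net low on a 0 input, floats otherwise."""
    drivers = [0 if a == 0 else None for a in inputs]  # None stands in for high-Z
    # The pullup wins only when nobody is pulling the net low.
    return 0 if 0 in drivers else 1


def wand_net(inputs):
    """FPGA view: passthrough buffers whose outputs are ANDed together."""
    return int(all(inputs))


# Compare the two models for every input combination of 1 to 4 drivers.
equivalent = all(
    open_drain_net(bits) == wand_net(bits)
    for n in range(1, 5)
    for bits in product((0, 1), repeat=n)
)
print(equivalent)  # True
```

Both views agree for every combination: the net is low whenever at least one buffer input is low.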
I also had to edit the Verilog generator to leave codegen hints for the backplane generator. If any of a net’s connections is an open-drain pin, it leaves a comment that looks like this:
inout wire RL01_n; //FPGA#wand
The backplane generator looks for things like this, and changes the net type of the line from wire to wand when emitting code for the FPGA.
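As a rough illustration of that hint handling, here is a Python sketch. The function name, the regular expression, and the exact shape of the emitted line are all assumptions made for this example; the real backplane generator may look quite different.

```python
import re

# Matches a declaration that carries the //FPGA#wand hint (assumed line shape).
WAND_HINT = re.compile(r"^(?P<head>.*?)\bwire\b(?P<rest>.*?;)\s*//FPGA#wand\s*$")


def apply_wand_hint(line):
    """Swap `wire` for `wand` on hinted lines; leave every other line alone."""
    match = WAND_HINT.match(line)
    if match is None:
        return line
    return f"{match.group('head')}wand{match.group('rest')}"


hinted = apply_wand_hint("inout wire RL01_n; //FPGA#wand")
untouched = apply_wand_hint("inout wire L01_n;")
print(hinted)     # inout wand RL01_n;
print(untouched)  # inout wire L01_n;
```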
Problem solved, right?
Error (12014): Net “RL01_n”, which fans out to “RL01_n”, cannot be assigned more than one value
What?! That’s the whole point of wands…
After some experimentation, I found that Quartus only supports intramodule wands. As soon as a wand crosses a module boundary, all bets are off. So in order to get this to work, I was going to need to put everything into the same module — the same file.
I decided to draw the line at the component level. I really didn’t want to have the implementations for the components repeated everywhere, and besides, each component (74HC02, 74HC04, …) is sort of an atomic thing from the point of view of this simulation. So, I devised a way to get wands to cooperate across the component module boundaries. I call it “proxy wires”.
Here’s the idea: since the components can be considered atomic, we can have regular wires cross their module boundaries, rather than the main wands. We then assign each of these wires to the wand. So instead of driving the wand directly, the module drives wires which are then combined into the wand.
To accomplish this, the Verilog generator needed more tweaks. When emitting a component declaration, it looks to see if any of the component’s pins are open-drain. If they are, it emits a codegen hint to the backplane generator that looks like this:
U74LVC07 U8007(__A08_NET_191, __A08_2___CO_IN, __A08_NET_185, RL01_n, __A08_NET_198, L01_n, GND, __A08_1___Z1_n, __A08_NET_221, RL01_n, __A08_NET_222, RL01_n, __A08_NET_220, VCC, SIM_RST, SIM_CLK); //FPGA#OD:2,4,6,8,10,12
This says that pins 2, 4, 6, 8, 10, and 12 are open-drain outputs, and should be handled specially.
I made the backplane generator combine all of the AGC modules into one big module (“fpga_agc”) when it’s generating code for an FPGA, rather than just a small module that ties modules together. If, when processing module contents, the backplane generator sees a comment like the one above, it splits up the declaration and determines which nets are affected. For each of these nets, it spits out some proxy wire declarations:
    wire RL01_n_U8007_10;
    wire RL01_n_U8007_12;
    wire RL01_n_U8007_4;
    ...
Each proxy wire takes the name of its original net, and is suffixed with the component number and pin number that it will be attached to. A series of assignments is then emitted to assign the proxy wires to their parent net:
    assign RL01_n = RL01_n_U8007_4;
    assign RL01_n = RL01_n_U8007_10;
    assign RL01_n = RL01_n_U8007_12;
    ...
And finally, the component declaration is reconstructed, with the original nets subbed out for their proxy wire counterparts:
U74LVC07 U8007(__A08_NET_191, __A08_2___CO_IN_U8007_2, __A08_NET_185, RL01_n_U8007_4, __A08_NET_198, L01_n_U8007_6, GND, __A08_1___Z1_n_U8007_8, __A08_NET_221, RL01_n_U8007_10, __A08_NET_222, RL01_n_U8007_12, __A08_NET_220, VCC, SIM_RST, SIM_CLK);
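The whole proxy wire transformation can be sketched in a few lines of Python. This is not the actual backplane generator: the `expand_open_drain` name, the regular expression, and the return shape are assumptions for illustration, and the sketch emits wires in pin order rather than grouped per net. It does reproduce the rewritten U8007 declaration shown above from the hinted one.

```python
import re

# Component instance followed by an open-drain hint (assumed line shape).
INSTANCE = re.compile(
    r"^\s*(?P<type>\w+)\s+(?P<ref>\w+)\((?P<ports>.*)\);\s*//FPGA#OD:(?P<pins>[\d,]+)\s*$"
)


def expand_open_drain(line):
    """Return (proxy wire declarations, wand assignments, rewritten instance)."""
    match = INSTANCE.match(line)
    if match is None:
        return [], [], line  # no hint: pass the line through unchanged
    ref = match.group("ref")
    ports = [p.strip() for p in match.group("ports").split(",")]
    wires, assigns = [], []
    for pin in (int(p) for p in match.group("pins").split(",")):
        net = ports[pin - 1]               # pins are 1-indexed port positions
        proxy = f"{net}_{ref}_{pin}"       # net name + component + pin number
        wires.append(f"wire {proxy};")
        assigns.append(f"assign {net} = {proxy};")
        ports[pin - 1] = proxy             # the component now drives the proxy
    instance = f"{match.group('type')} {ref}({', '.join(ports)});"
    return wires, assigns, instance


hinted = (
    "U74LVC07 U8007(__A08_NET_191, __A08_2___CO_IN, __A08_NET_185, RL01_n, "
    "__A08_NET_198, L01_n, GND, __A08_1___Z1_n, __A08_NET_221, RL01_n, "
    "__A08_NET_222, RL01_n, __A08_NET_220, VCC, SIM_RST, SIM_CLK); "
    "//FPGA#OD:2,4,6,8,10,12"
)
wires, assigns, instance = expand_open_drain(hinted)
print("\n".join(wires + assigns + [instance]))
```

Because RL01_n is attached to pins 4, 10, and 12, it ends up with three proxy wires and three assignments, all inside the one big module where the wand can resolve them.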
For the strong of heart, the final product can be found here: almost 5,000 lines of Verilog fun!
Once everything was in one big happy file, with proxy wires to protect the wands, synthesis went off more or less without a hitch!
And voila! Timepulses 1 and 2 showing up on my oscilloscope! Further testing showed that the circuit I was most worried about, the edge detecting pulse generator, produced exactly the expected width (100ns) based on the simulation. And finally, here’s what the layout in the FPGA looks like:
As it happens, this was destined to not be my last battle trying to get things to work on the FPGA — but that’s a story for another post!