Sat Jun 16 19:10:34 EDT 2018

Push a stack

So, the memory has a 2-cycle delay.  I wonder if it is necessary to
make a 2-step processor?

What is the simplest thing to do?  Speed is not an issue for the task
at hand.  Simplicity is more important.  That means no pipelining:
results of the previous instruction step should be available in the
next step.

- stack architecture (working reg + top of data stack = ALU input)
- no need for pipeline delays

Inputs:
- instruction word
- working reg
- stacks: read register

Outputs:
- instruction pointer
- working reg
- stacks: write address + data

Assume a combinatorial path between those.  What can be implemented
without delays?

- jump
- alu (top of data stack + working reg -> working reg)
- stack pointer (e.g. inc / dec)
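
A rough way to check the list above is to write the whole step as one
pure function from current register values to next register values,
which is what "no delays" amounts to.  This is only a sketch: the
(op, arg) instruction encoding, the op names, the 8-bit width and the
names Regs and comb are all made up for illustration, and the stack
memory itself is left out (its top element is just passed in).

```python
from typing import NamedTuple, Tuple

MASK = 0xFF  # assumption: 8-bit data path and pointers


class Regs(NamedTuple):
    ip: int    # instruction pointer
    sp: int    # data stack pointer
    wreg: int  # working register


def comb(iw: Tuple[str, int], regs: Regs, stack_top: int) -> Regs:
    """Next register values from current ones; no state is kept here,
    so this models a purely combinatorial path (one clock cycle)."""
    op, arg = iw          # hypothetical (op, arg) instruction word
    ip, sp, wreg = regs
    ip_next = (ip + 1) & MASK
    if op == "jump":
        ip_next = arg & MASK
    elif op == "add":     # alu: top of data stack + working reg -> working reg
        wreg = (stack_top + wreg) & MASK
    elif op == "sp_inc":
        sp = (sp + 1) & MASK
    elif op == "sp_dec":
        sp = (sp - 1) & MASK
    else:
        raise ValueError(f"unknown op: {op}")
    return Regs(ip_next, sp, wreg)
```

Anything that needs a value the memory has not delivered yet (e.g. the
new top of stack right after moving the stack pointer) does not fit in
this function, which is where the delay question comes back in.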

Lost it... Start simpler.  Start with:
- ip, wreg
- jump
- conditional jump
- inc / dec / load
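
The simpler starting point can be tried out in software before touching
any hardware.  Below is a minimal sketch of such a machine: state is
just (ip, wreg), and each step computes the next state from the current
state and the instruction word.  The opcode names, the (op, arg)
encoding, the 8-bit register width and "jz" (jump if wreg is zero) as
the conditional jump are assumptions, not a decided design.

```python
from typing import List, NamedTuple, Tuple

MASK = 0xFF  # assumption: 8-bit working register

Instr = Tuple[str, int]  # hypothetical encoding: (op, arg)


class State(NamedTuple):
    ip: int    # instruction pointer
    wreg: int  # working register


def step(state: State, instr: Instr) -> State:
    """One instruction step: the result is visible to the next step."""
    op, arg = instr
    ip, wreg = state
    next_ip = ip + 1
    if op == "jmp":                 # jump
        next_ip = arg
    elif op == "jz":                # conditional jump: taken if wreg == 0
        if wreg == 0:
            next_ip = arg
    elif op == "inc":
        wreg = (wreg + 1) & MASK
    elif op == "dec":
        wreg = (wreg - 1) & MASK
    elif op == "load":              # load immediate into wreg
        wreg = arg & MASK
    else:
        raise ValueError(f"unknown op: {op}")
    return State(next_ip, wreg)


def run(program: List[Instr], cycles: int) -> State:
    """Run from ip=0, wreg=0 for a fixed number of cycles."""
    state = State(0, 0)
    for _ in range(cycles):
        state = step(state, program[state.ip])
    return state


# Count down from 3, then spin at address 4.
COUNTDOWN: List[Instr] = [
    ("load", 3),  # 0
    ("dec", 0),   # 1
    ("jz", 4),    # 2
    ("jmp", 1),   # 3
    ("jmp", 4),   # 4: halt by jumping to self
]
```

With this program, run(COUNTDOWN, 20) ends up spinning at address 4
with wreg at 0.  There is no pipelining anywhere: each call to step
sees the complete result of the previous one.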

So... this looks really interesting, but DO NOT do this for work.
Keep the FPGA circuits simple, and put the control logic in the CPU.