Thu Aug 16 23:49:53 EDT 2018

Removed old notes from CPU.hs

-- Original notes on stack machines:

-- Memory seems to be the most important component.  I'm going to
-- target the iCE40, which has a bunch of individual memories,
-- allowing separate busses for instruction, data and return stack,
-- and the rest bundled as data memory.

-- At every clock, each of the 4 memories has a word sitting on its
-- read port:
-- i: current instruction
-- d: 2nd on stack  (top is in a register)
-- r: return address (ip is the instruction memory's write port)

-- The instruction word drives the decoder, which drives all the
-- muxes.

-- It seems that reading out instructions is the most useful thing to
-- start with.  This could be used for specialized sequencers that are
-- not necessarily general purpose CPUs.  This can then be gradually
-- extended to more abstract operations.


-- The main problem for building a CPU is to properly decompose the
-- decoder.  I'm not sure how to do this exactly, so just start in an
-- ad-hoc way.

-- There is some arbitrariness here: a hierarchy is created in the
-- nesting of the "close" operations.  The guideline is to abstract
-- away a register as soon as possible, i.e. move it to the inner part
-- of the hierarchy.

-- At the very top, there is:
-- . instruction memory access:
--   . read:  program sequencing
--   . write: bootloader
-- . BUS I/O (i.e. containing GPIO)

-- Each hierarchy level is an adaptation.  closeIW will abstract the
-- inner decoder as an iw -> jump operation, and insert the necessary
-- logic to either just advance to the next instruction, or perform a
-- jump.
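
-- A minimal sketch of that sequencing logic, written as pure functions
-- rather than RTL.  The names Jump, nextIP and closeSeq and the 8-bit
-- instruction pointer are assumptions for illustration; this is not
-- the actual closeIW.

```haskell
import Data.Word (Word8)

-- Hypothetical result of the inner decoder: fall through or jump.
data Jump = NoJump | JumpTo Word8
  deriving (Eq, Show)

-- Sequencing logic wrapped around the inner decoder:
-- advance to the next instruction unless the decoder requests a jump.
nextIP :: Word8 -> Jump -> Word8
nextIP ip NoJump     = ip + 1
nextIP _  (JumpTo a) = a

-- Close over an inner decoder (iw -> Jump) to get an ip -> iw -> ip step.
closeSeq :: (iw -> Jump) -> Word8 -> iw -> Word8
closeSeq decode ip iw = nextIP ip (decode iw)
```

-- The inner decoder only has to say whether it wants a jump.  It never
-- sees the instruction pointer, which is the sense in which the
-- register is abstracted away at this level.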


-- The original problem that drove this exploration has meanwhile been
-- implemented on the PRU.  These were the instructions needed:
-- 
-- a) loop n times
-- b) write UART byte, wait until done
-- c) wait
-- d) set I/O
-- e) read I/O into memory and advance pointer

-- To implement loops, it would be useful to have a stack so that loop
-- counters can be nested.  This would mean fewer registers.  I'm not
-- going to be able to make this simpler than a small Forth machine.
-- This way:

-- UART out can be bit-banged.
-- Multiple counters not needed for timing control.
-- No "wait" instruction needed: instruction counting suffices.
-- Add a data stack when needed.  Probably a single top register is enough.
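
-- A sketch of what bit-banged UART output with instruction-counted
-- timing could look like, as a list of line levels.  The 8N1 framing
-- (start bit, 8 data bits LSB first, stop bit) and the cyclesPerBit
-- parameter are assumptions, not something fixed by these notes.

```haskell
import Data.Bits (testBit)
import Data.Word (Word8)

-- Line levels for one assumed 8N1 frame: start bit (low),
-- 8 data bits LSB first, stop bit (high).
uartFrame :: Word8 -> [Bool]
uartFrame b = [False] ++ [testBit b i | i <- [0 .. 7]] ++ [True]

-- Hold each bit for a fixed number of instruction slots, so timing comes
-- from counting instructions rather than from a "wait" instruction.
bitBang :: Int -> Word8 -> [Bool]
bitBang cyclesPerBit = concatMap (replicate cyclesPerBit) . uartFrame
```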

-- The basic instructions seem straightforward.  This is just a
-- decoder that fans out into mux controls.  The unknown part to me is
-- the call/return.

-- Call:   move IP+1 -> rtop write port
--         inc rpointer
--         set ip from instruction word
-- Ret:    dec rpointer
--         move rtop -> IP
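
-- The two sequences above, sketched as state transitions.  The State
-- record, the 8-bit widths and the function-as-memory model are
-- assumptions made to keep the example small.

```haskell
import Data.Word (Word8)

-- Hypothetical machine state: instruction pointer, return stack pointer,
-- and return stack memory modelled as a function from address to word.
data State = State
  { ip   :: Word8
  , rp   :: Word8
  , rmem :: Word8 -> Word8
  }

-- Write one word into the modelled memory.
writeMem :: Word8 -> Word8 -> (Word8 -> Word8) -> (Word8 -> Word8)
writeMem addr val mem a = if a == addr then val else mem a

-- Call: store ip+1 at rp, increment rp, load ip from the instruction word.
call :: Word8 -> State -> State
call target s = s
  { ip   = target
  , rp   = rp s + 1
  , rmem = writeMem (rp s) (ip s + 1) (rmem s)
  }

-- Ret: decrement rp, then load ip from the word now on top.
ret :: State -> State
ret s = s { ip = rmem s rp', rp = rp' }
  where rp' = rp s - 1
```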

-- This could also be microcoded:
-- a) load literal into rdata
-- b) increment rstack
-- c) unconditional jump

-- The operations that can be reused are:
-- write, postinc  (stacks + buffers)
-- read, predec

-- So there is a clear tradeoff between the complexity of the
-- instruction decoder, and the amount of instructions needed.

-- Where to start?  Conditional memory write.

-- So for unidirectional flow, this is easy.  For bi-directional such
-- as a stack, two pointers need to be maintained.  It might be
-- simplest to initialize them such that the write/read operation can
-- happen immediately?  Both will have individual adders.  Maybe not a
-- good idea?
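
-- A sketch of the two-pointer variant, to make the initialization
-- question concrete.  The Stack record, the 8-bit wrap-around and the
-- choice wp = 0, rp = wp - 1 are assumptions for illustration.

```haskell
import Data.Word (Word8)

-- Hypothetical two-pointer stack: wp addresses the next free slot,
-- rp addresses the current top, so rp == wp - 1 is kept as an invariant.
data Stack = Stack
  { wp  :: Word8
  , rp  :: Word8
  , mem :: Word8 -> Word8
  }

-- Initialise so that the first write and the first read need no
-- address arithmetic before the memory access.
emptyStack :: Stack
emptyStack = Stack { wp = 0, rp = maxBound, mem = const 0 }

-- Push: write at wp immediately; both pointers step up afterwards.
push :: Word8 -> Stack -> Stack
push v (Stack w r m) = Stack (w + 1) (r + 1) (\a -> if a == w then v else m a)

-- Pop: read at rp immediately; both pointers step down afterwards.
pop :: Stack -> (Word8, Stack)
pop (Stack w r m) = (m r, Stack (w - 1) (r - 1) m)
```

-- The cost is visible in push and pop: each updates two pointers, so
-- two adders, in exchange for never computing an address before the
-- access.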



-- Perform an operation and wait for it to finish.
-- Let's keep the operation abstract, so what this does is:
--
-- . first time the instruction is executed, the sub-machine is
--   enabled.  the sequencer will wait until the machine provides a
--   "done" flag, which will advance the instruction pointer.
--
-- . it seems simpler to split this into "start" and "wait"
--   instructions.
--
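
-- A sketch of the start/wait split as one combinational step.  The
-- Instr type and the (enable, advance) output pair are hypothetical
-- names, chosen only to show the stall behaviour.

```haskell
-- Hypothetical two-instruction split of "run sub-machine and wait".
data Instr = Start | Wait | Other
  deriving (Eq, Show)

-- One sequencer step.  Inputs: current instruction and the sub-machine's
-- "done" flag.  Outputs: (enable pulse for the sub-machine, advance ip?).
step :: Instr -> Bool -> (Bool, Bool)
step Start _    = (True,  True)   -- kick off the sub-machine, move on
step Wait  done = (False, done)   -- stall until the sub-machine is done
step Other _    = (False, True)   -- everything else takes one cycle
```

-- With the split, the sequencer needs no "first time executed" state:
-- Start always advances, and Wait is a plain conditional stall.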


-- Each instruction can have push/pop/write/nop wrt imm?  It seems
-- possible that stack can be manipulated in parallel with bus
-- transfer.

-- But let's not make this too complicated.  Some observations:

-- . This is for very low level, specialized code.  It will never be
--   necessary to manipulate addresses as data, so the address for a
--   read or write can always come from the immediate word.  The data
--   itself might be manipulated.


-- It's probably ok to instantiate it fully even if certain
-- instructions are not used.  Yosys/abc removes unused logic.

-- Still, this can use some decomposition.  For now, because there are
-- not many instructions, use one-hot encoding to keep the logic
-- simple.
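
-- A sketch of what one-hot decoding buys: each control line is a
-- single bit of the opcode field.  The field layout and the control
-- names in Ctl are made up for illustration.

```haskell
import Data.Bits (popCount, testBit)
import Data.Word (Word8)

-- Hypothetical one-hot opcode field: each bit selects one instruction,
-- so every control line is a single bit test with no decode tree.
data Ctl = Ctl
  { doJump  :: Bool
  , doCall  :: Bool
  , doRet   :: Bool
  , doWrite :: Bool
  } deriving (Eq, Show)

decode :: Word8 -> Ctl
decode op = Ctl
  { doJump  = testBit op 0
  , doCall  = testBit op 1
  , doRet   = testBit op 2
  , doWrite = testBit op 3
  }

-- A well-formed one-hot opcode has exactly one bit set.
isOneHot :: Word8 -> Bool
isOneHot op = popCount op == 1
```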


