This thing should grow toward the main GETTING STARTED documentation
for the PURRR forths, which target the Microchip PIC18F architecture.


HOW TO GET IT RUNNING
---------------------


A. GET A CHIP WITH A PURRR BOOTLOADER

It is possible to order PIC chips with a pre-programmed bootloader and
chip configuration by sending an email to <purrr@goto10.org>.

Currently, only the serial bootloader is working. In the future, a USB
version will be added.


B. PROGRAM IT YOURSELF

Boot images can be built using a command like

     (purrr seed 1220-8) use

These correspond to the seed files in the purrr/seed directory. They
contain all the necessary info to build a boot monitor, and an initial
version of the state database for a purrr project. The monitor file
it produces is in standard Intel Hex format, usable by most PIC
programmers, and readable by the MPLAB suite.

Depending on the name chosen inside the seed file, which here is just
'1220-8', you can load its state by

     (purrr 1220-8) restore
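
The Intel Hex format mentioned above is plain text, one record per
line, so it is easy to sanity check a boot image before handing it to
a programmer. The following is a minimal sketch in Python (not part of
CAT or PURRR; the function name 'parse_hex_record' is made up for this
example). It relies only on the general record layout: a colon, a byte
count, a 16 bit address, a record type, the data bytes and a checksum
byte chosen so that all bytes of the record sum to zero modulo 256.

```python
def parse_hex_record(line):
    """Parse one Intel Hex record and verify its checksum."""
    line = line.strip()
    if not line.startswith(":"):
        raise ValueError("record must start with ':'")
    raw = bytes.fromhex(line[1:])
    if len(raw) < 5:
        raise ValueError("record too short")
    # All bytes of a record, checksum included, sum to 0 modulo 256.
    if sum(raw) & 0xFF != 0:
        raise ValueError("bad checksum")
    count = raw[0]
    if len(raw) != count + 5:
        raise ValueError("length mismatch")
    address = (raw[1] << 8) | raw[2]
    record_type = raw[3]
    data = raw[4:4 + count]
    return address, record_type, data

# The end-of-file record: no data bytes, record type 01.
print(parse_hex_record(":00000001FF"))
```

Record type 00 carries data and type 01 marks the end of the file.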


C. BUILD A DEV STATION

Currently, only the serial version works. This consists of
- 5V power supply
- 18F1220 chip with PURRR monitor
- PC <-> 5V TTL serial connection

The last one can be anything from a plain serial cable + MAX232 based
serial level converter, to a USB chip or cable which does the
same. For example, the FTDI TTL-232R USB-Serial Converter is a
ready-made solution.

My setup consists of a simple 5V mains adapter, a serial cable and a
mini board made using the following components:

- 1x protection diode 1N4001
- 1x DB9 female serial connector
- 1x MAX232
- 4x 1uF/10V capacitor

I put everything on a small board accessible with standard pin
headers, so it connects easily to a breadboard.

An alternative is to use the Maxim DS275 chip, which doesn't need
capacitors.


D. START PROGRAMMING

Once you have a serial connection going with an 18F chip running the
PURRR monitor, and the 'monitor' application is running in CAT, you
are basically set. You can check that it works by typing 'ping', to
which the chip should reply with 'PING'.

Some important commands:

pts            ( -- )         print target stack
ping           ( -- )         check if target is still alive
upload         ( code -- )    upload a chunk of forth code
record r       ( code -- )    same as upload, but with version control
trun t         ( code -- )    interpret a chunk of code
>t             ( x -- )       transfer a number to the target stack
t>             ( -- x )       transfer a number from the target stack
warm           ( -- )         warm reboot target machine


THE SYSTEM
----------

1. LEARNING
-----------

The system consists of two parts: a lowlevel target language PURRR,
and a highlevel metaprogramming language CAT.

How do you get started learning the system? I think the easiest way to
do so is to focus on the target language PURRR first, try to get some
things running, and gently move up the abstraction ladder to write CAT
metaprogramming tools whenever the need arises.

I will explain both at once in the section below. This might be a bit
overwhelming at first, but I think it is the only fair thing to do. I
tried to stay as close as possible to existing ideas when they were
good, but not existing languages or standards. This means PURRR is not
standard FORTH, but a simple subset with extensions specific to the
setup, and CAT is not really Joy, but very similar in spirit.


2. LANGUAGE
-----------

PURRR is a dialect of FORTH, while CAT is similar, but has some very
different properties. Both of them work together as one system. How
exactly and why will become clear later on.

PURRR is a very flexible language for low level programming. Low level
programming means programming 'close to the metal'. In practice this
means that the distance between PURRR and the machine (language) it
will be translated to is not so big, which gives you as a programmer
ultimate control over the hardware.

CAT is a very flexible language for high level programming. High level
programming means programming 'closer to the grey matter', using
constructs that might not be as efficient for a machine to execute,
but are more convenient for a human to work with.

The reason this system uses two languages is to enable you as a
developer to clearly see what you are doing by keeping close to the
metal at the base level. You will be able to express clearly what
exactly the microcontroller will need to do without a layer of pillows
between you and the machine. But, in doing so, you can make use of a
very high level tool to automate this process of expression in cases
where the low level programming language does not have enough
expressive power, and writing a program would result in some tedious
repetition that could be easily done by a computer. The way by which
you will be doing this is called 'metaprogramming': generating low
level code from higher level constructs.

The idea behind the CAT/PURRR combo is to make the language (PURRR)
and metalanguage (CAT) optimized for their respective tasks, without
making them all too different in spirit. Both of them are based on the
principle of 'concatenative programming', which means no more than: 'a
program is a concatenation of two or more other programs.'

Let's illustrate some of these ideas. A concatenative program is a
list of words, separated by white space. For example:

       1 2 +

The meaning of this program is:

    (a) remember 1 
    (b) remember 2 
    (c) add the last two numbers remembered, forget the original
        numbers and remember only the resulting number 3

Some things to take note of. Every word is a command, meaning it
_does_ something, or has an associated action. The action associated
with numbers is _remember_. The action associated with symbols can be
anything; in the case of '+' it is the action of adding two numbers in
the way described above.

Now, what does _remember_ mean? It means: 'put something on top of a
stack'. This stack is very real in CAT/PURRR, and it is called the
data stack, or parameter stack.
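
To make the data stack concrete, here is a minimal sketch in Python
(not CAT or PURRR; the function 'run' is made up for this example, and
it only knows numbers and '+'). Numbers are pushed on a list used as a
stack, and '+' pops two numbers and pushes their sum.

```python
def run(program, stack=None):
    """Execute a whitespace-separated concatenative program."""
    stack = [] if stack is None else stack
    for word in program.split():
        if word == "+":
            b = stack.pop()          # forget the last two numbers...
            a = stack.pop()
            stack.append(a + b)      # ...and remember only the result
        else:
            stack.append(int(word))  # the action of a number is 'remember'
    return stack

print(run("1 2 +"))          # prints [3]
print(run("+", run("1 2")))  # prints [3]
```

The second call shows the concatenative principle: running '1 2' and
then '+' on the resulting stack gives the same result as running the
concatenated program '1 2 +'.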

This is really all you have to know to understand both CAT and
PURRR. The rest are details and special ways of combining sequences of
words that interact by interchanging data on the stack. This mechanism
enables a style of programming characterized by 'extreme
factorization'. It enables you to divide the solution to a problem
into a large collection of small functions that do exactly one thing,
making them 'hard to get wrong' instead of 'hard to get right'. These
functions can all be tested individually and interactively, which
facilitates writing correct programs.

The most important way to facilitate this process of combination is to
give things that occur more than once a name. For example the program

      10    1 +   1 +   1 +  

could be simplified to

      10    count count count

if we could find a way to _define_ the meaning of the word 'count'.
This is the second way of remembering things, which is to give them a
name. Note that we can merely substitute 'count' with '1 +' in the
program above to get the same result.

The way this is done in both CAT and PURRR immediately illustrates the
difference between the two languages. To define 'count' in PURRR would
go like this

      : count 1 + ;

which is a Forth-like syntax, while doing the same in CAT would be

      (count 1 +) :

In both languages, the word ':' means 'define new word'.

In PURRR, the interpretation goes like this

      (a) execute the word ':'
      (b) this word reads the next word 'count' itself, overriding its default meaning
      (c) the word 'count' read by ':' is used to create a new symbol
          label that will refer to the code that will be compiled next.
      (d) '1' generates code that does "remember 1"
      (e) '+' generates code that does "add last 2 numbers remembered, 
          remember only result"
      (f) ';' generates code that does "continue what was going on before
          this code executed"
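
The compilation steps above can be modelled with a small sketch in
Python. This is only an illustration under heavy assumptions: the real
compiler emits PIC18 machine code, while here 'code' is just a list of
Python functions, and the names 'compile_forth' and 'execute' are made
up for this example.

```python
def execute(code, stack):
    """Run compiled code: a list of functions acting on the stack."""
    for op in code:
        op(stack)
    return stack

def compile_forth(source):
    """Compile ': name ... ;' definitions into lists of functions."""
    tokens = iter(source.split())
    words = {}     # symbol label -> compiled code
    code = None    # code of the definition being compiled, if any
    for tok in tokens:
        if tok == ":":
            name = next(tokens)       # ':' reads the next word itself
            code = words[name] = []   # label for the code compiled next
        elif code is None:
            raise SyntaxError("word outside a definition: " + tok)
        elif tok == ";":
            code = None               # definition ends, control returns
        elif tok == "+":
            code.append(lambda s: s.append(s.pop() + s.pop()))
        elif tok in words:
            callee = words[tok]       # call to a previously defined word
            code.append(lambda s, c=callee: execute(c, s))
        else:
            n = int(tok)              # a number compiles to 'remember n'
            code.append(lambda s, n=n: s.append(n))
    return words

words = compile_forth(": count 1 + ;")
print(execute(words["count"], [10]))  # prints [11]
```

Note that after compilation the text '1 +' is gone. Only the code that
behaves like it remains.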


In CAT, the interpretation goes like this

      (a) remember the data item (count 1 +)
      (b) execute the word ':' which will take the last data item, and split
          it into two parts, 'count' and (1 +). Record the name 'count' to be
          replaced by the program '(1 +)' whenever it occurs during
          the interpretation of a CAT program.
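
The CAT side can be modelled in the same spirit. Again this is a
sketch in Python under assumptions, not the real CAT implementation:
the functions 'read' and 'interpret' are made up for this example, and
only numbers, '+', ':' and quoted lists are supported. The point to
notice is that a definition stays stored as a list of symbols.

```python
def read(source):
    """Parse CAT-like text into nested lists of word strings."""
    tokens = source.replace("(", " ( ").replace(")", " ) ").split()
    nesting = [[]]
    for tok in tokens:
        if tok == "(":
            nesting.append([])
        elif tok == ")":
            done = nesting.pop()
            nesting[-1].append(done)
        else:
            nesting[-1].append(tok)
    return nesting[0]

def interpret(program, data, names):
    """Interpret a parsed program against a data stack and name table."""
    for item in program:
        if isinstance(item, list):
            data.append(item)            # a list is data: remember it
        elif item == ":":
            name, *body = data.pop()     # split (count 1 +) into count, (1 +)
            names[name] = body           # stored in symbolic form
        elif item == "+":
            data.append(data.pop() + data.pop())
        elif item in names:
            interpret(names[item], data, names)  # replace name by its program
        else:
            data.append(int(item))
    return data

names = {}
print(interpret(read("(count 1 +) :  10 count count count"), [], names))
# prints [13]
print(names)
# prints {'count': ['1', '+']}
```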


So the main difference between the two languages is this: PURRR is
COMPILED, which means it is always translated to code that, when
executed, will display the behaviour expressed in the code '1 +',
while CAT is INTERPRETED: during execution, symbols are replaced by
the programs they refer to, but the code itself is always stored in
its original symbolic form.

The interesting part for CAT is that the code is always data, until it
is executed (translated to actions on the fly). If we do

        (: count 1 + ;) upload

There are only two actions. The first is to remember a list of symbols,
and the second one is to compile it as PURRR code, and send it to a
live target so it can be executed later on.

In this case we just manually typed the list (: count 1 + ;) as a data
item to CAT, but nothing prevents us from writing a CAT program to
generate this data for us, and only in the end translate it to code
that can be understood by a naked microcontroller by running a word
like 'upload'. This is the basis of the power of the CAT/PURRR
language: you can use CAT to generate PURRR code.
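
To give a feeling for what generating code means, here is a minimal
sketch with Python standing in for CAT. The helper names 'definition'
and 'repeat' and the word 'count3' are made up for this example. The
program builds a list of symbols first, and only turns it into PURRR
source text at the very end.

```python
def definition(name, body):
    """Build the source text of a PURRR definition from a list of words."""
    return ": " + name + " " + " ".join(body) + " ;"

def repeat(body, n):
    """Concatenate n copies of a program given as a list of words."""
    return list(body) * n

source = definition("count3", repeat(["1", "+"], 3))
print(source)  # prints ": count3 1 + 1 + 1 + ;"
```

The repetition is written once, in the metalanguage, and the generated
text is what would eventually be compiled for the target.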

NOTE: it is usually better to use 'record' or the short form 'r'
instead of 'upload', since it will keep the previous microcontroller
state under version control so it is possible to undo an upload.

Now, when you dig deeper, you will see that every PURRR word that is
being compiled will actually execute CAT code that performs the
compilation. Some PURRR words are called _macros_, and do exactly
this. For example the ':' PURRR word is a macro, written in CAT, that
will read the next word and perform the registration of the name to
point to some code on the microcontroller.


3. FORTH MODE
-------------

It is possible to run the system in "Forth Mode" or "Target Mode", as
opposed to "CAT Mode" or "Host Mode". This can be done using the word
'target'. The command interpreter works differently in this mode:


numbers -> load on target stack
symbols -> 1. if one of the special CAT words, execute CAT code
        -> 2. if not, execute live target machine code
        -> 3. none of the above : signal error
lists   -> load on host stack (CAT)

The main benefit of this mode is that you don't have to quote target
words and numbers in lists. For example, the target mode command:

	  1 2 +

would be equivalent to the host mode command

	  (1 2 +) t

which can be convenient. In target mode it is still possible to
compile forth code and upload the resulting machine code snippet to
the target using the 'r' word.
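
The dispatch rules of target mode listed above can be sketched as
follows. This is a Python model under assumptions, not the real
console: 'make_console' is a made up name, both stacks are plain
Python lists, and target words are ordinary Python functions, where
the real system talks to a live chip over the serial line.

```python
def make_console(host_words, target_words):
    """Return a target mode dispatcher plus its host and target stacks."""
    host_stack, target_stack = [], []

    def dispatch(item):
        if isinstance(item, list):
            host_stack.append(item)                     # lists: host stack
        elif isinstance(item, int):
            target_stack.append(item)                   # numbers: target stack
        elif item in host_words:
            host_words[item](host_stack, target_stack)  # 1. special CAT word
        elif item in target_words:
            target_words[item](target_stack)            # 2. live target word
        else:
            raise KeyError("undefined word: " + item)   # 3. signal error

    return dispatch, host_stack, target_stack

def add(target_stack):
    target_stack.append(target_stack.pop() + target_stack.pop())

def pts(host_stack, target_stack):
    print(target_stack)

dispatch, host_stack, target_stack = make_console({"pts": pts}, {"+": add})
for item in [1, 2, "+", "pts", ["1", "2", "+"]]:
    dispatch(item)    # 'pts' prints [3]
```

After this loop the target stack holds 3, and the quoted list sits
untouched on the host stack.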


4. DIFFERENCES TO STANDARDS
---------------------------

It's probably quite clear that this is not a standard (ANS) Forth. In
addition to that, there are some big differences between other
non-standard RAM based "self-aware" forths, and the CAT/PURRR combo.

* 8-bit DATA forth

It is commonly believed that 16 bits is the minimum for a workable
forth. I chose not to go that path, and stay closer to the machine,
which is really 8 bit for the 18F series. Taking into account the
limitations imposed by the Harvard architecture, the only important
feature that is lost is the ability to reference arbitrary _code_ with
a single word. If I need late binding for code, I fill this gap using
explicit interpreter words which use either 'route' or 'case
.. endcase' which amounts to some extra level of byte code.
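
The general idea of late binding through a small index can be sketched
like this. It is a Python illustration of a plain dispatch table, and
it assumes that an explicit interpreter word picks one of several code
entries by an 8 bit number. It is not a description of how 'route' or
'case .. endcase' are actually implemented, and the names 'make_route',
'increment' and 'double' are made up for this example.

```python
def make_route(table):
    """Return a word that pops an 8 bit index and runs table[index]."""
    def route(stack):
        index = stack.pop() & 0xFF   # a data cell is 8 bits wide
        table[index](stack)          # the code to run is chosen at run time
    return route

def increment(stack):
    stack.append((stack.pop() + 1) & 0xFF)

def double(stack):
    stack.append((stack.pop() * 2) & 0xFF)

dispatch = make_route([increment, double])

stack = [200, 1]   # argument 200, then index 1 selecting 'double'
dispatch(stack)
print(stack)       # prints [144], since 400 wraps around modulo 256
```

The index fits in one 8 bit cell, while the code address it stands for
does not have to.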

* Harvard architecture : separate program and data memory

This is a big difference. The usual forth paradigm relies heavily on
the fact that code is residing in RAM. This is no longer the
case. FLASH code is modifiable at run-time, but this operation is
slow and causes memory wear. This limitation is of less importance in
the incremental dev cycle, which is the core idea of CAT/PURRR, but
does manifest itself in the fact that in practice, code is not
mutable. (See above).

* Forth compiler written in metalanguage, in the form
  of a collection of macros.

This might be the biggest difference. Because of the small size of the
chips targeted, it makes a lot of sense to not make the software
"self-aware". The system is dispersed over two parts: the target runs
only native machine code, while the command console, symbol table and
compiler/assembler reside on a host system. The interface presented to
the user mimics as much as possible the original forth
interpret/compile modes. However, interpret mode is always the higher
level host language. For live interaction, a target mode is provided
which acts as if the target chip is fully self-contained: able to
execute and compile code.