Sat Mar 6 14:57:23 CET 2010

GNU Binutils

It looks like the real problem is that I'm underestimating the
arbitrariness of assembler syntax.  This arbitrariness needs to be
encoded somewhere.

Maybe it would be instructive to see how binutils is implemented?

Some intro here[1]:

  opcodes/ contains the opcodes library. This has information on how
           to assemble and disassemble instructions

  cpu/     contains source files for a utility called CGEN. This is a tool
           that can be used to automatically generate target-specific
           source files for the opcodes library, as well as for the
           SIM simulator used by GDB.

I'm looking at the Microchip binutils extension from
mplabalc30v3_01_A.tar.gz, specifically acme/opcodes/pic30-opc.c.

Much of the necessary information is in
  const struct pic30_opcode pic30_opcodes

Looks like that table can be snarfed (lifted out and reused) just fine.
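
To make the idea concrete, here is a minimal sketch of what a
table-driven opcode description can look like and how it might be used
in both directions (mnemonic -> encoding, instruction word ->
mnemonic).  Everything in it is hypothetical: struct toy_opcode, its
fields, the 16-bit toy encodings and the function names are invented
for illustration and are NOT the real layout of struct pic30_opcode,
which I have not reproduced here.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical table entry.  The real struct pic30_opcode has its own
   fields; this only shows the general shape of such a table. */
struct toy_opcode {
    const char *name;   /* mnemonic as written in assembler source */
    uint16_t    match;  /* values of the fixed bits of the encoding */
    uint16_t    mask;   /* which bits of the instruction word are fixed */
};

/* Invented 16-bit encodings: the top nibble selects the instruction,
   except "nop", which fixes every bit. */
static const struct toy_opcode toy_opcodes[] = {
    { "nop", 0x0000, 0xFFFF },
    { "mov", 0x1000, 0xF000 },
    { "add", 0x2000, 0xF000 },
    { "bra", 0x3000, 0xF000 },
};

#define TOY_OPCODE_COUNT (sizeof toy_opcodes / sizeof toy_opcodes[0])

/* Assembler direction: find the entry for a mnemonic, or NULL. */
const struct toy_opcode *toy_find_by_name(const char *name)
{
    for (size_t i = 0; i < TOY_OPCODE_COUNT; i++) {
        if (strcmp(toy_opcodes[i].name, name) == 0)
            return &toy_opcodes[i];
    }
    return NULL;
}

/* Disassembler direction: find the first entry whose fixed bits match
   the instruction word, or NULL if nothing matches. */
const struct toy_opcode *toy_decode(uint16_t word)
{
    for (size_t i = 0; i < TOY_OPCODE_COUNT; i++) {
        if ((word & toy_opcodes[i].mask) == toy_opcodes[i].match)
            return &toy_opcodes[i];
    }
    return NULL;
}
```

The point of the sketch is that the arbitrariness lives in the data:
the two lookup loops stay the same for any target, and only the table
changes.  A real table would also need per-operand information
(count, types, bit positions), which is left out here.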

[1] http://www.linuxforu.com/teach-me/binutils-porting-guide-to-a-new-target-architecture/