Mon May 21 07:33:48 EDT 2018

PRU toggle

http://processors.wiki.ti.com/index.php/PRU_Assembly_Instructions#Set_Bit_.28SET.29

R30 = output (0=O)
R31 = input  (1=I)

How to use CLR/SET on the GPO register R30 if reading R30 does not
return what was last written?  In that case a shadow copy (TEMPLATE)
needs to be kept in another register to be able to use:

          CLR R30, TEMPLATE, BIT


That's not a big problem.  Plenty of registers.

Cycle counter?

http://processors.wiki.ti.com/index.php/Programmable_Realtime_Unit

PRU Cycle Count Register (0x000C)
PRU Stall Count Register (0x0010)



So make a minimal modification to the beaglelogic loop.  Try the one
that has 10MHz/8bit sampling.


; Generic delay loop macro
; Also includes a post-finish op
DELAY	.macro Rx, op
	SUB	R0, Rx, 2
	QBEQ	$E?, R0, 0
$M?:	SUB	R0, R0, 1
	QBNE	$M?, R0, 0
$E?:	op
	.endm


samplexm8:
	MOV    R21.b0, R31.b0
	DELAY  R14, NOP
	MOV    R21.b1, R31.b0
	DELAY  R14, NOP
$samplexm8$2:
	MOV    R21.b2, R31.b0
	DELAY  R14, NOP
	MOV    R21.b3, R31.b0
	DELAY  R14, NOP
	MOV    R22.b0, R31.b0
	DELAY  R14, NOP
	MOV    R22.b1, R31.b0
	DELAY  R14, NOP
	MOV    R22.b2, R31.b0
	DELAY  R14, NOP
	MOV    R22.b3, R31.b0
	DELAY  R14, NOP
	MOV    R23.b0, R31.b0
	DELAY  R14, NOP
	MOV    R23.b1, R31.b0
	DELAY  R14, NOP
	MOV    R23.b2, R31.b0
	DELAY  R14, NOP
	MOV    R23.b3, R31.b0
	DELAY  R14, NOP
	MOV    R24.b0, R31.b0
	DELAY  R14, NOP
	MOV    R24.b1, R31.b0
	DELAY  R14, NOP
	MOV    R24.b2, R31.b0
	DELAY  R14, NOP
	MOV    R24.b3, R31.b0
	DELAY  R14, NOP
	MOV    R25.b0, R31.b0
	DELAY  R14, NOP
	MOV    R25.b1, R31.b0
	DELAY  R14, NOP
	MOV    R25.b2, R31.b0
	DELAY  R14, NOP
	MOV    R25.b3, R31.b0
	DELAY  R14, NOP
	MOV    R26.b0, R31.b0
	DELAY  R14, NOP
	MOV    R26.b1, R31.b0
	DELAY  R14, NOP
	MOV    R26.b2, R31.b0
	DELAY  R14, NOP
	MOV    R26.b3, R31.b0
	DELAY  R14, NOP
	MOV    R27.b0, R31.b0
	DELAY  R14, NOP
	MOV    R27.b1, R31.b0
	DELAY  R14, NOP
	MOV    R27.b2, R31.b0
	DELAY  R14, NOP
	MOV    R27.b3, R31.b0
	DELAY  R14, NOP
	MOV    R28.b0, R31.b0
	DELAY  R14, NOP
	MOV    R28.b1, R31.b0
	DELAY  R14, NOP
	MOV    R28.b2, R31.b0
	DELAY  R14, "ADD    R29, R29, 32"
	MOV    R28.b3, R31.b0
	DELAY  R14, "XOUT   10, &R21, 36"
	MOV    R21.b0, R31.b0
	DELAY  R14, "LDI    R31, PRU1_PRU0_INTERRUPT + 16"
	MOV    R21.b1, R31.b0
	DELAY  R14, "JMP    $samplexm8$2"
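
Rough timing check for the loop above.  This is a sketch under two
assumptions of mine that the notes do not state: the PRU core runs at
200MHz, and every instruction in the loop takes exactly one cycle.
Under those assumptions DELAY costs 2*Rx - 1 cycles (for Rx >= 2), so
one MOV + DELAY pair takes 2*Rx cycles.  The function names are made
up for illustration.

```python
PRU_CLOCK_HZ = 200_000_000  # assumption: AM335x PRU core clock


def delay_cycles(rx):
    """Cycles spent in one DELAY expansion, assuming 1 cycle per instruction."""
    if rx < 2:
        # SUB R0, Rx, 2 would wrap around and the loop would run far too long.
        raise ValueError("DELAY needs Rx >= 2")
    setup = 2            # SUB R0, Rx, 2  +  QBEQ
    loop = 2 * (rx - 2)  # SUB + QBNE per iteration
    post_op = 1          # the op at $E?
    return setup + loop + post_op


def sample_rate_hz(rx):
    """Sample rate of samplexm8: one MOV plus one DELAY per sample."""
    return PRU_CLOCK_HZ / (1 + delay_cycles(rx))


if __name__ == "__main__":
    for rx in (2, 5, 10):
        print(rx, delay_cycles(rx), sample_rate_hz(rx) / 1e6, "MHz")
```

If the assumptions hold, R14 = 10 gives the 10MHz case, and each
sample leaves 18 NOP-equivalent cycles inside DELAY plus the one
post-finish op slot.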



So the first thing to do is to write a generator for this listing
that exposes the 3 tasks that are interleaved in it:

- DELAY

- sample: R21-R28  8 x 4 bytes = 32 bytes

preroll	MOV    R21.b0, R31.b0
	MOV    R21.b1, R31.b0

loop	MOV    R21.b2, R31.b0
	MOV    R21.b3, R31.b0
	MOV    R22.b0, R31.b0
	MOV    R22.b1, R31.b0
	MOV    R22.b2, R31.b0
	MOV    R22.b3, R31.b0
	MOV    R23.b0, R31.b0
	MOV    R23.b1, R31.b0
	MOV    R23.b2, R31.b0
	MOV    R23.b3, R31.b0
	MOV    R24.b0, R31.b0
	MOV    R24.b1, R31.b0
	MOV    R24.b2, R31.b0
	MOV    R24.b3, R31.b0
	MOV    R25.b0, R31.b0
	MOV    R25.b1, R31.b0
	MOV    R25.b2, R31.b0
	MOV    R25.b3, R31.b0
	MOV    R26.b0, R31.b0
	MOV    R26.b1, R31.b0
	MOV    R26.b2, R31.b0
	MOV    R26.b3, R31.b0
	MOV    R27.b0, R31.b0
	MOV    R27.b1, R31.b0
	MOV    R27.b2, R31.b0
	MOV    R27.b3, R31.b0
	MOV    R28.b0, R31.b0
	MOV    R28.b1, R31.b0
	MOV    R28.b2, R31.b0
	MOV    R28.b3, R31.b0
	MOV    R21.b0, R31.b0
	MOV    R21.b1, R31.b0

- transfer

	ADD    R29, R29, 32
	XOUT   10, &R21, 36
	LDI    R31, PRU1_PRU0_INTERRUPT + 16
	JMP    $samplexm8$2


The important constraint between sample and transfer is ordering: XOUT
has to run after the last byte (R28.b3) is sampled and right before
the next sample is written into the first register, R21.

How to abstract this out, so it becomes easy to weave it?
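
One possible shape for such a weaver, as a minimal sketch in Python.
It keeps the three tasks as separate inputs (sample destinations,
delay register, transfer ops) and zips them into the same listing as
above.  The function names and the default arguments are hypothetical;
only the emitted assembly text comes from the beaglelogic loop.

```python
def sample_regs(first=21, last=28):
    """Byte destinations R21.b0 .. R28.b3, in sampling order."""
    return [f"R{r}.b{b}" for r in range(first, last + 1) for b in range(4)]


def weave(transfer, delay_reg="R14",
          entry_label="samplexm8", loop_label="$samplexm8$2"):
    """Return the assembly lines: preroll (2 samples) plus one loop body.

    transfer: ops for the post-finish slots of the last DELAYs,
              one per sample slot, ending with the jump back.
    """
    dests = sample_regs()
    # The loop starts at the third byte.  The first two bytes of the
    # next buffer are sampled at the tail, while the transfer ops run.
    order = dests + dests[:2]
    # Pad with NOPs so the transfer ops land in the last slots.
    ops = ["NOP"] * (len(order) - len(transfer)) + list(transfer)

    lines = [f"{entry_label}:"]
    for i, (dst, op) in enumerate(zip(order, ops)):
        if i == 2:
            lines.append(f"{loop_label}:")
        lines.append(f"\tMOV    {dst}, R31.b0")
        arg = op if op == "NOP" else f'"{op}"'
        lines.append(f"\tDELAY  {delay_reg}, {arg}")
    return lines


TRANSFER = [
    "ADD    R29, R29, 32",
    "XOUT   10, &R21, 36",
    "LDI    R31, PRU1_PRU0_INTERRUPT + 16",
    "JMP    $samplexm8$2",
]

if __name__ == "__main__":
    print("\n".join(weave(TRANSFER)))
```

The 30 NOP slots are the places where other work could be woven in,
as long as the 4 transfer ops keep their positions at the tail.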


Basic principles:
- Try to do this without modifying beaglelogic ARM code
- Keep PRU0 code the same as well, modify PRU1 only


So analysis needs to be limited to PRU1 code.

- Main loop is straightforward:
  - Sample into R21-R28
  - XOUT R21-R29 (36 bytes, device 10) for PRU0
  - Interrupt PRU0


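How large is the code memory?  Does it matter?  Each AM335x PRU has 8KB
of instruction RAM (2K instructions), so as long as delays are loops
rather than unrolled NOPs, I shouldn't run out.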

Main loop register use?

R0      intermediate (DELAY)
R14     delay count
R21-R28 buffer
R29     byte count (ADD R29, R29, 32; sent along in the 36 byte XOUT)
R30,R31 IO
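
A generator could use this table to reject woven-in ops that touch
registers the main loop depends on.  A minimal sketch, assuming the
check only needs to look at register names mentioned in the op text.
It does not distinguish reads from writes, and R29 is included because
the listing increments and transfers it.  The names RESERVED and
conflicts are made up for illustration.

```python
import re

# Registers the main loop relies on, taken from the table above
# (plus R29, which the transfer step uses).
RESERVED = {"R0", "R14"} | {f"R{n}" for n in range(21, 30)}


def registers_used(op):
    """All register names (R0..R31) mentioned in one assembly op."""
    return {f"R{n}" for n in re.findall(r"\bR(\d+)\b", op)}


def conflicts(op):
    """Reserved registers that an op mentions; empty set means OK."""
    return registers_used(op) & RESERVED


if __name__ == "__main__":
    for op in ("SET    R30, R30, 5", "ADD    R0, R0, 1"):
        print(op, "->", sorted(conflicts(op)) or "ok")
```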



Goal:

- write an alternative PRU1 firmware similar to PRUDAQ, keeping the
  rest the same.

- create a weaver

There should be plenty of time in the delay slots for some call/return
functions.

The PRU has no hardware stack, but JAL (which the CALL pseudo
instruction is built on) stores the return address in a register.
That is enough for a single call level, i.e. for a code threader.

JAL REG, ADDR : jump and link
JMP REG.w0    : return






