Do loops revisited

Counted loops are indispensable in Forth. They hide one item,
probably more, that would otherwise clutter up the stack to
the point of no return.
The Forth loops were designed with one eye on primitive
machines; it is now time to design them with one eye on
modern optimisers.

Chuck Moore introduced FOR NEXT a long time ago, undoubtedly
to avoid the complications of Forth's DO LOOP construct.
The introduction of quotations gives us an opportunity
to reconsider the loops.

Simple quotations
-----------------
With quotations I will refer to
simple quotations, that are just a sequence of instructions
where all components are simply compiled in the current context.
In this way a quotations becomes a denotation: a sequence
of symbols that results in a constant, traditionally
called an execution token. This is a compile time constant
and can be subject to all kinds of optimisation,
contrary to the elendhafte sophistication of quotations that
drag a not-so-well-defined environment from compile time
into the run time.
For these simple quotations (s.q.'s) the notation
[: ;] is both clunky and misleading. Clunky because so
basic a notation is worthy of single characters brackets.
Misleading because the [ tells us to switch to compile
mode, while a s.q. does not depend on which STATE we're in.
Misleading also because the : tells us that a name follows,
which is not the case.
As luck will have it neither { nor } are standardized.
The bottom line is that I can and will use { } in the following to
replace both the :NONAME ... ; construct (interpret STATE)
and the [: ... ;] (compile STATE) construct.
This doesn't make a s.q. STATE-smart, constants - some call them
literals - are never STATE-smart.

Loops revisited
---------------
Let us have a look at other programming language, how they
implement loops.
Invariably we have the following components in a counted loop,
which we will give Forthy names:
* ii     \ The loop counter
* lim    \ The inclusive loop limit
* xt     \ What is to be executed in each loop
* inc    \ The loop increment

The C loop construct is
for (ii=1; i<=lim; i=i+inc )
{
  xt
  xt
  xt
}

The FORTRAN loop construct is
DO 1000, i=1, lim, inc
  xt
  xt
  xt
1000 CONTINUE

The algol68 is similar but not line oriented:
'for' i 'from' 1 'to' lim 'by' inc 'do'
   xt
   xt
   xt
'od'

The Python loop construct is
for i in range(1,lim-1,inc):
    xt
    xt
    xt

Python is no good, it requires constructing a data
structure. We will not deviate from the Forth way
and throw some single cell variable haphazardly
into memory, preferably the return stack.

Combining the Fortran/algol and C examples we arrive at
    1 lim inc { xt xt xt } DO[..]
as a reasonably way to express loops in Forth.
From FORTRAN/algol we borrow the notion that the `inc
cannot be changed. Recomputing `inc in the loop
is a bad idea. It flies in the face of the meaning
of a counted loop, and thus is confusing to human
and optimiser alike. This allows another significant
advantage. By making the increment negative we
have a decrementing loop. And if that is known at compile
time, the optimiser will have a field day.

Algol68 features defaults. "'by' inc" is optional
and defaults to "by 1". Likewise the 'from' defaults
to 1.
Where Forth is not afraid of introducing more words
we will have defaults for frequent cases.

Forth loop constructs
---------------------
At last know we know how we want Forth loops to look.

We already had
    low high lim inc { xt xt xt } DO[..]
for the most general case.
The following two words have a default for `inc of 1.
Without a default lower limit:
   low lim { xt xt xt } DO[]
With a default lower limit of 1:
   lim { xt xt xt } DO]

Mathematics knows the [a,b] notation for intervals a<=x<=b and
[a,b) for intervals a<=x<b.
The above notation leans on that, leaving out the left
bracket if the left limit is default.
Non inclusive upper limits are important to Forth
so we add also
   lim { xt xt xt } DO)
Here `lim is a non exclusive upper limit, the increment
is still 1 and the lower limit is - of course - 0, or it
wouldn't make sense to introduce DO)

A word to access the index within the loop body
is essential, but we need to call the index IX
to avoid a clash with the default I.

Examples
--------
In the following examples the part before "S:" is the code
executed, after "S:" you see the stack content.
The output by . is not shown, but they are the same than
the stack content.
It may be disturbing but the do-words are not compiling
words. Simple quotations do all the compilation needed.
For the experts:
 No problems are to be expected by POSTPONE-ing do's
 because the do's have "default compilation behaviour".

EXAMPLE:  : test 3 { 12 } do) ; test S: 12 12 12
EXAMPLE:  : test 0 { 12 } do) ; test S:
EXAMPLE:  : test -1 { 12 } do) ; test S:
EXAMPLE:  : test 3 { 12 } do] ; test S: 12 12 12
EXAMPLE:  : test 3 { IX DUP . } do] ; test S: 1 2 3
EXAMPLE:  : test 3 { IX DUP . } do) ; test S: 0 1 2
EXAMPLE:  : test 2 5 { IX DUP . } do[] ; test S: 2 3 4 5
EXAMPLE:  : test 2 7 2 { IX DUP . } do[..] ; test S: 2 4 6
EXAMPLE:  3 { 12 } do)  S: 12 12 12
EXAMPLE:  0 { 12 } do)  S:
EXAMPLE:  3 { 12 } do]  S: 12 12 12
EXAMPLE:  3 { IX DUP . } do]  S: 1 2 3
EXAMPLE:  3 { IX DUP . } do)  S: 0 1 2
EXAMPLE:  2 5 { IX DUP . } do[]  S: 2 3 4 5
EXAMPLE:  2 7 2 { IX DUP . } do[..]  S: 2 4 6
EXAMPLE:  7 2 -2 { IX DUP . } do[..]  S: 7 5 3
EXAMPLE:  7 1 -2 { IX DUP . } do[..]  S: 7 5 3 1


Implementation of the new do-constructs
---------------------------------------

The implementation is short, at least if you're into VARIABLE's.

: do[..]   xt ! DUP inc ! 0< 1+ + lim ! ii !
    BEGIN ii @ lim @ - inc @ XOR 0< WHILE xt @ EXECUTE inc @ ii +! REPEAT ;

The "XOR 0<" takes care of decrementing loops.

The following is an implementation on ciforth that requires the latest
snapshot release of lina32
https://github.com/albertvanderhorst/ciforth

Notes:
1. Only the examples require s.q.'s. If you don't have them you can replace
them by :NONAME ; or [: ;] constructs with great prejudice.
2. This is a half decent practical implementation that stores everything
on the return stack, in a traditional way. Of course this can only
be accomplished with some assumptions, in this case that the return
stack is addressable, a word RSP@ is available, and that it grows down.
3. :I introduces a macro. This is an immediate word that inlines its content.
Otherwise factorisation of return stack manipulation would be a cream
(i.e. French for crime, this is a Dutch expression).
4. Two's complement is (probably) required for the XOR technique to work. 

\ -----------------------8< --------------------- 8<------------------
WANT [: :I INCLUDE REGRESS ALIAS

'[: ALIAS {
';] ALIAS }
:I R1 RSP@ ;
:I R2 R1 CELL+ ;
:I R3 R2 CELL+ ;
:I R4 R3 CELL+ ;
'RDROP ALIAS R-
'R1 ALIAS ii     \ The loop counter
'R2 ALIAS lim    \ The inclusive loop limit
'R3 ALIAS xt     \ What is to be executed in each loop
'R4 ALIAS inc    \ The loop increment
:I IX R2 @ ;    \ Assumed called from do-body (simple quotation).

\ A macro, it will do all the looping you want.
:I do[..]'   ( R1..R4:  ii lim xt inc )
    BEGIN ii @ lim @ - inc @ XOR 0< WHILE xt @ EXECUTE inc @ ii +! REPEAT ;
\ A simpler macro, it will do almost all the looping you want.
\ The loop increment is one.
:I do'    BEGIN ii @ lim @ < WHILE xt @ EXECUTE 1 ii +! REPEAT ;

\ The looping constructs
: do)  >R >R 0 >R   ( R:  xt lim ix )  do' R- R- R- ;
: do]  >R 1+ >R 1 >R ( R:  xt lim ix )  do' R- R- R- ;
\ " HIGH 1+ LOW DO <BODY> LOOP " translates to "LOW HIGH { <BODY> } DO[] "
: do[]  >R   1+ >R >R ( R:  xt lim ix )  do' R- R- R- ;

: do[..]  OVER >R >R 0< + 1+ >R >R   do[..]'  R- R- R- R- ;
: SPACES 'SPACE SWAP do] ;

\ -----------------------8< --------------------- 8<------------------

optimisation
------------
I got sidetracked into the issue of loops, because the traditional
Forth loops are hard to optimise. On the other hand, if your
loops don't optimise well, you might as well forget about optimisation.
My optimiser is pretty good at inlining simple quotations, and the
idea is that with the 16 registers of a decent processor (did you hear
that, Pentium 32?) a large part of the data stack *and* the return
stack can be kept in registers.
Peep-holing push-pop sequences into oblivion does the rest.

You may remember that I did a theoretical article about optimisation
of high level code, based on knowledge of stack effects.
That has been implemented on ciforth, and the next step is to replace
a high level code sequence of assembler words by collating the assembler
code. This must work even for the run time code of DO LOOP.
So even without the new do's the situation is reasonably bright.

Design criteria
---------------
You may be weary about having quotations all over the place.
Simple quotations are sufficiently lightweight to not increase
the number of characters to type.

Let's see

"
  S[ ] OK 10 0 DO &A EMIT LOOP
  AAAAAAAAAA
  S[ ] OK 10 { &A EMIT } DO]
  AAAAAAAAAA
  S[ ] OK
"

If we had to use :NONAME , it would be clunky and
incomprehensible.
But maybe then we also should forego -scripting- and
the other example would then be
"
  S[ ] OK :NONAME 10 0 DO &A EMIT LOOP ; EXECUTE
  AAAAAAAAAA
"
Other Forth lectures
Go to the home page of Albert van der Horst