Counted loops are indispensable in Forth. They hide one item, probably more, that would otherwise clutter up the stack to the point of no return. The Forth loops were designed with one eye on primitive machines; it is now time to design them with one eye on modern optimisers. Chuck Moore introduced FOR NEXT a long time ago, undoubtedly to avoid the complications of Forth's DO LOOP construct. The introduction of quotations gives us an opportunity to reconsider the loops. Simple quotations ----------------- With quotations I will refer to simple quotations, that are just a sequence of instructions where all components are simply compiled in the current context. In this way a quotations becomes a denotation: a sequence of symbols that results in a constant, traditionally called an execution token. This is a compile time constant and can be subject to all kinds of optimisation, contrary to the elendhafte sophistication of quotations that drag a not-so-well-defined environment from compile time into the run time. For these simple quotations (s.q.'s) the notation [: ;] is both clunky and misleading. Clunky because so basic a notation is worthy of single characters brackets. Misleading because the [ tells us to switch to compile mode, while a s.q. does not depend on which STATE we're in. Misleading also because the : tells us that a name follows, which is not the case. As luck will have it neither { nor } are standardized. The bottom line is that I can and will use { } in the following to replace both the :NONAME ... ; construct (interpret STATE) and the [: ... ;] (compile STATE) construct. This doesn't make a s.q. STATE-smart, constants - some call them literals - are never STATE-smart. Loops revisited --------------- Let us have a look at other programming language, how they implement loops. Invariably we have the following components in a counted loop, which we will give Forthy names: * ii \ The loop counter * lim \ The inclusive loop limit * xt \ What is to be executed in each loop * inc \ The loop increment The C loop construct is for (ii=1; i<=lim; i=i+inc ) { xt xt xt } The FORTRAN loop construct is DO 1000, i=1, lim, inc xt xt xt 1000 CONTINUE The algol68 is similar but not line oriented: 'for' i 'from' 1 'to' lim 'by' inc 'do' xt xt xt 'od' The Python loop construct is for i in range(1,lim-1,inc): xt xt xt Python is no good, it requires constructing a data structure. We will not deviate from the Forth way and throw some single cell variable haphazardly into memory, preferably the return stack. Combining the Fortran/algol and C examples we arrive at 1 lim inc { xt xt xt } DO[..] as a reasonably way to express loops in Forth. From FORTRAN/algol we borrow the notion that the `inc cannot be changed. Recomputing `inc in the loop is a bad idea. It flies in the face of the meaning of a counted loop, and thus is confusing to human and optimiser alike. This allows another significant advantage. By making the increment negative we have a decrementing loop. And if that is known at compile time, the optimiser will have a field day. Algol68 features defaults. "'by' inc" is optional and defaults to "by 1". Likewise the 'from' defaults to 1. Where Forth is not afraid of introducing more words we will have defaults for frequent cases. Forth loop constructs --------------------- At last know we know how we want Forth loops to look. We already had low high lim inc { xt xt xt } DO[..] for the most general case. The following two words have a default for `inc of 1. Without a default lower limit: low lim { xt xt xt } DO[] With a default lower limit of 1: lim { xt xt xt } DO] Mathematics knows the [a,b] notation for intervals a<=x<=b and [a,b) for intervals a<=x<b. The above notation leans on that, leaving out the left bracket if the left limit is default. Non inclusive upper limits are important to Forth so we add also lim { xt xt xt } DO) Here `lim is a non exclusive upper limit, the increment is still 1 and the lower limit is - of course - 0, or it wouldn't make sense to introduce DO) A word to access the index within the loop body is essential, but we need to call the index IX to avoid a clash with the default I. Examples -------- In the following examples the part before "S:" is the code executed, after "S:" you see the stack content. The output by . is not shown, but they are the same than the stack content. It may be disturbing but the do-words are not compiling words. Simple quotations do all the compilation needed. For the experts: No problems are to be expected by POSTPONE-ing do's because the do's have "default compilation behaviour". EXAMPLE: : test 3 { 12 } do) ; test S: 12 12 12 EXAMPLE: : test 0 { 12 } do) ; test S: EXAMPLE: : test -1 { 12 } do) ; test S: EXAMPLE: : test 3 { 12 } do] ; test S: 12 12 12 EXAMPLE: : test 3 { IX DUP . } do] ; test S: 1 2 3 EXAMPLE: : test 3 { IX DUP . } do) ; test S: 0 1 2 EXAMPLE: : test 2 5 { IX DUP . } do[] ; test S: 2 3 4 5 EXAMPLE: : test 2 7 2 { IX DUP . } do[..] ; test S: 2 4 6 EXAMPLE: 3 { 12 } do) S: 12 12 12 EXAMPLE: 0 { 12 } do) S: EXAMPLE: 3 { 12 } do] S: 12 12 12 EXAMPLE: 3 { IX DUP . } do] S: 1 2 3 EXAMPLE: 3 { IX DUP . } do) S: 0 1 2 EXAMPLE: 2 5 { IX DUP . } do[] S: 2 3 4 5 EXAMPLE: 2 7 2 { IX DUP . } do[..] S: 2 4 6 EXAMPLE: 7 2 -2 { IX DUP . } do[..] S: 7 5 3 EXAMPLE: 7 1 -2 { IX DUP . } do[..] S: 7 5 3 1 Implementation of the new do-constructs --------------------------------------- The implementation is short, at least if you're into VARIABLE's. : do[..] xt ! DUP inc ! 0< 1+ + lim ! ii ! BEGIN ii @ lim @ - inc @ XOR 0< WHILE xt @ EXECUTE inc @ ii +! REPEAT ; The "XOR 0<" takes care of decrementing loops. The following is an implementation on ciforth that requires the latest snapshot release of lina32 https://github.com/albertvanderhorst/ciforth Notes: 1. Only the examples require s.q.'s. If you don't have them you can replace them by :NONAME ; or [: ;] constructs with great prejudice. 2. This is a half decent practical implementation that stores everything on the return stack, in a traditional way. Of course this can only be accomplished with some assumptions, in this case that the return stack is addressable, a word RSP@ is available, and that it grows down. 3. :I introduces a macro. This is an immediate word that inlines its content. Otherwise factorisation of return stack manipulation would be a cream (i.e. French for crime, this is a Dutch expression). 4. Two's complement is (probably) required for the XOR technique to work. \ -----------------------8< --------------------- 8<------------------ WANT [: :I INCLUDE REGRESS ALIAS '[: ALIAS { ';] ALIAS } :I R1 RSP@ ; :I R2 R1 CELL+ ; :I R3 R2 CELL+ ; :I R4 R3 CELL+ ; 'RDROP ALIAS R- 'R1 ALIAS ii \ The loop counter 'R2 ALIAS lim \ The inclusive loop limit 'R3 ALIAS xt \ What is to be executed in each loop 'R4 ALIAS inc \ The loop increment :I IX R2 @ ; \ Assumed called from do-body (simple quotation). \ A macro, it will do all the looping you want. :I do[..]' ( R1..R4: ii lim xt inc ) BEGIN ii @ lim @ - inc @ XOR 0< WHILE xt @ EXECUTE inc @ ii +! REPEAT ; \ A simpler macro, it will do almost all the looping you want. \ The loop increment is one. :I do' BEGIN ii @ lim @ < WHILE xt @ EXECUTE 1 ii +! REPEAT ; \ The looping constructs : do) >R >R 0 >R ( R: xt lim ix ) do' R- R- R- ; : do] >R 1+ >R 1 >R ( R: xt lim ix ) do' R- R- R- ; \ " HIGH 1+ LOW DO <BODY> LOOP " translates to "LOW HIGH { <BODY> } DO[] " : do[] >R 1+ >R >R ( R: xt lim ix ) do' R- R- R- ; : do[..] OVER >R >R 0< + 1+ >R >R do[..]' R- R- R- R- ; : SPACES 'SPACE SWAP do] ; \ -----------------------8< --------------------- 8<------------------ optimisation ------------ I got sidetracked into the issue of loops, because the traditional Forth loops are hard to optimise. On the other hand, if your loops don't optimise well, you might as well forget about optimisation. My optimiser is pretty good at inlining simple quotations, and the idea is that with the 16 registers of a decent processor (did you hear that, Pentium 32?) a large part of the data stack *and* the return stack can be kept in registers. Peep-holing push-pop sequences into oblivion does the rest. You may remember that I did a theoretical article about optimisation of high level code, based on knowledge of stack effects. That has been implemented on ciforth, and the next step is to replace a high level code sequence of assembler words by collating the assembler code. This must work even for the run time code of DO LOOP. So even without the new do's the situation is reasonably bright. Design criteria --------------- You may be weary about having quotations all over the place. Simple quotations are sufficiently lightweight to not increase the number of characters to type. Let's see " S[ ] OK 10 0 DO &A EMIT LOOP AAAAAAAAAA S[ ] OK 10 { &A EMIT } DO] AAAAAAAAAA S[ ] OK " If we had to use :NONAME , it would be clunky and incomprehensible. But maybe then we also should forego -scripting- and the other example would then be " S[ ] OK :NONAME 10 0 DO &A EMIT LOOP ; EXECUTE AAAAAAAAAA "