Stack limits are pretty large. It's a natural way to express many problems, such as graphs (and their subset, trees).
When people talk of "Tail Call Optimization" (TCO), they mean a way to recurse without leaving things on the stack. A tail call is one where the recursive call is the last thing the function does, so the function's state doesn't need to be pushed onto the stack; it can be discarded.
> Stack limits are pretty large. It's a natural way to express many problems, such as graphs (and their subset, trees).
Coincidentally, I had a stack overflow while working on some trees a few months ago; it ate all my RAM within about two seconds. Although my stack was sizeable, my program was blind to the stack limit.
Other languages probably do this too, but some Scheme implementations don't have a stack limit, or even a stack at all. So you can define certain functions (map being a notable one) using their natural recursion without worrying about exhausting the stack on large inputs.
That's because of TCO, not because of a stack limit or its absence. A non-tail-call version of map would still blow up when you run out of memory, whether the implementation uses a stack like C or keeps the stack on the heap (as I understand some Scheme implementations do, particularly for dealing with continuations).
Yes, of course you can exhaust the heap, but there are some tricks that can be done to make it not much of an issue. For instance, "Guile arranges to return the unused part of the stack to the operating system, but without causing the stack to shrink." [0]
So, thanks to that, we can define map the clear and easy way:
(define (map f l)
  (if (pair? l)
      (cons (f (car l))
            (map f (cdr l)))
      '()))
Further down in that document, and in the portion you quoted, it's clear that this doesn't prevent the stack from growing up to the limit of the system's memory; it only mitigates the issue. By default, they double the size of the stack whenever the current stack limit is hit, and when a GC happens they free the still-unused portion of stack memory back to the OS. If the recursion is too large for overall system memory, it will still hit that limit; they've just succeeded in delaying it for non-tail-recursive functions.
That said, it is a nice way to reduce the chance or delay the occurrence of hitting the stack limit (whether it's on the heap or an explicit stack that's still what's happening).
Well yeah, if you can make a procedure less than O(n) in space by using tail recursion, you should. But in the case of map, you might as well use the stack, since you have to reverse the list in the tail recursive version anyway.
Oh, I definitely see the elegance of expressing programs or concepts recursively.
Is TCO used in C++? I've only ever seen it mentioned in reference to Scheme and Haskell. It's a compiler optimization, so it would be transparent to the programmer, right?
Most C++ compilers perform TCO when convenient, but the C++ language doesn't guarantee that TCO will be done, so it's unwise to write code that depends on it. In practice, every compiler decides when to do TCO in its own way. Also, it's useful to be able to run code with optimization disabled. Also, it's easy to accidentally have a variable that needs a destructor call which makes TCO impossible. This is all in contrast to Scheme, which does give you a guarantee, and which makes it easy.
It's implementation-dependent; a particular compiler may do it, or may not. Scheme, OTOH, has it written in the spec--you WILL do tail call optimization, so programmers can count on it.
TCO is available in plenty of other languages as well. Scala, Clojure, Erlang, and Elixir support it out of the box. Ruby has a VM setting for it, and there was a proposal to add it to JavaScript. I'm sure there are plenty of others.
Per ch4s3's comment on Clojure, Scala also does not have TCO, at least, last I checked; the JVM doesn't allow it. It has specialized handling of functions that call themselves, which it can optimize, but mutual recursion (A calls B, which calls A, and so on) is not optimized, nor is the more general case where any function ending in a tail call has its frame removed from the stack.
To add a finer point, Clojure doesn't have TCO per se because the JVM doesn't have TCO. Recur optimizes the special case of a "tail-recursive self-call and not the generalized tail call."[0] That is to say, when the tail of A calls A, but not when A calls B which calls A.
That's not accurate. In fact, in the .NET 2.0 timeframe the opposite was the case (the 64-bit JIT would fail to honor the IL tail. prefix in many cases, though it would also make tail calls in other cases even when the prefix wasn't applied)[1]. But as of .NET 4.0, both platforms honor the tail. prefix except in a very limited set of circumstances[2] (e.g. calls from untrusted code to trusted code).
As far as I know, F# is the only reasonably popular .NET language to emit the tail. prefix, though[3].
The problem with TCO in C++ is destructors; unless you're using a very limited set of types, they all but guarantee you're going to have code to execute after returning from what would otherwise be a tail call.
TCO isn't really transparent to the programmer. Some recursive calls cannot be optimized, so the programmer has to know how to write recursive functions that can be.
You can't count on it in the general case even with gcc (even when taking care to heed the restrictions like not passing pointers to data on the stack, and putting all functions into the same compilation unit). I did tests a couple years ago where it would compile up to about a handful of mutually recursive functions with TCO, then when you added one more, it would disable TCO. I guess because some algorithm will stop analyzing the call graph past a certain mark and then disallow TCO from being applied.
Admittedly recursion is not done often, but in some cases it's very useful.
For example, recently I created an application with a tree record structure where the user could delete a node and it needed to delete all sub-nodes.
With recursion it's easy to express: 'call yourself for all subnodes you have, then delete yourself'. This is something that's much more difficult to express with the usual for/while constructs.
Commonly when traversing hierarchical data structures. If you're certain the data structure isn't accidentally cyclic or extremely deep, there should be no problems.
How does the code handle a stack overflow? Stack overflows seem opaque to me, while preallocating memory (malloc) has a mechanism to signal failures for proper handling. Can you point me to some code?
Try it: just write a tiny function that calls itself and has a local variable (make sure to use the variable after the function call so it's not optimized away).
Something like:
#include <stdio.h>

int recurse(int foo) {
    int bar;
    printf("%d\n", foo);
    bar = recurse(foo + 1);
    printf("%d\n", bar); // this is just to avoid bar being optimized away
    return bar;
}

int main(void) {
    return recurse(0);
}
On my machine it segfaults after 174,594 calls. Presumably that has to do with the size of the stack and the amount of memory each call's frame takes.
It is big, yes, but, in writing robust code, you have to anticipate and handle a stack overflow cleanly. How do you do this?
In principle, you have that problem with almost any language that uses the system stack to pass arguments when it makes a function call. It just happens to be easier to trigger it if you use a recursive algorithm, because it’s unlikely that you would write a program with 200,000 distinct functions each of which just called the next in a chain until something fell over.
In practice, recursive algorithms are often used to walk recursive data structures and often have logarithmic complexity in the size of the data structure. Even structures that fill the memory of a supercomputer probably aren’t going to need more than a few hundred levels of recursion in that scenario, while the available stack space is probably several orders of magnitude larger. So to be blunt, unless you really are talking about something where failure really isn’t an option (safety critical systems and the like), you probably just treat this as the same kind of catastrophic failure that running out of system memory would be, and use some OS-level support to abort as gracefully as you can if the sky falls.
If you really are working in a more constrained environment — safety-critical code, say, or perhaps an embedded system with relatively little memory — then it might be the case that you simply avoid the recursion altogether unless you’re using a language that guarantees tail recursion will be dealt with robustly. Some coding standards forbid the use of dynamically allocated memory for much the same reason, and you just have to use different tools to solve those problems if you’re working with such constraints.
> you have to anticipate and handle a stack overflow cleanly. How do you do this?
You use sigaltstack and catch SIGSEGV.
It's probably easier to handle if you hit a soft limit rather than the heap. Then you raise the limit, set a flag to make your functions unwind and reduce the limit again.
> Stack overflows seems opaque to me, while preallocating memory (malloc) has mechanism to signal failures for proper handling.
On Linux (and probably on other OSs) it's not easy to detect that the system is out of heap memory, because of lazy memory allocation policies.
Physical memory is not allocated when you call malloc() but when you first read or write the corresponding memory page.
Because of this, malloc() does not necessarily return NULL even if there isn't enough memory available on the system at the time you call malloc(). But if the system is still out of memory when you first access the allocated memory, your program will crash with a segfault.
So in absence of extra precaution, using heap memory isn't any safer than using stack memory.
It's typically handled by ignoring the issue and bumping up the maximum stack size if it becomes a problem. For example, I once wrote a Java indexer that simply failed on very long chains of function calls.
I've written some nice, very clear C code using recursion; a nice example was a simple copying garbage collector which has something like 2-3 words of overhead on the stack per recursive call (made use of some global variables to reduce the number of arguments passed between functions though). The actual moving code was only maybe 10 lines.
Recursion can be a good fit for anything that's using a naturally-recursive process: for instance, when processing files in a directory along with all the files in subdirectories.
Stack limits may or may not be a concern, depending on the application. In the case of directories, if you've got them nested 10k+ deep you've got other problems.