
Python refcounts, and it also has a real mark+sweep collector for collecting cycles. It's not a dichotomy, you know.


The reference counting causes the problem here. Once multiple threads are involved, incrementing or decrementing a reference count is no longer atomic: two threads can read the same count and each write back a stale value. So you need locks, or all hell breaks loose. Adding a mark&sweep GC isn't going to fix that.
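To make the lost update concrete, here is a minimal sketch that simulates two threads incrementing one count. It is a deterministic toy, not CPython's actual code: `Obj`, `unlocked_incref`, and `run` are names made up for illustration, and generators stand in for threads so the interleaving can be chosen by hand.

```python
class Obj:
    """Stand-in for an object header that carries a reference count."""
    def __init__(self):
        self.refcount = 1


def unlocked_incref(obj):
    """An increment split into its separate load and store steps."""
    loaded = obj.refcount       # step 1: read the current count
    yield                       # a thread switch can happen here
    obj.refcount = loaded + 1   # step 2: write back a possibly stale value
    yield


def run(schedule):
    """Run two simulated threads, 'A' and 'B', in the given step order."""
    obj = Obj()
    threads = {"A": unlocked_incref(obj), "B": unlocked_incref(obj)}
    for name in schedule:
        next(threads[name])
    return obj.refcount


racy = run(["A", "B", "A", "B"])     # both load before either stores
locked = run(["A", "A", "B", "B"])   # what a lock enforces: one at a time
print(racy, locked)                  # prints "2 3"
```

Starting from 1, two increments should give 3. The interleaved schedule ends at 2, so one reference has been lost and the object could later be freed while still in use.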

While I'm not familiar with the specifics of Python's GC, a mark&sweep phase is usually added to reference counting so that garbage which contains references to itself but has no external references will eventually be collected. In other words, it plugs the worst memory leaks caused by reference counting. (_Garbage Collection_ by Richard Jones and Rafael Lins is an excellent resource on GC details, btw. There's also a decent overview in the O'Reilly OCaml book: http://caml.inria.fr/pub/docs/oreilly-book/html/book-ora082.... )
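A small sketch of that leak and its fix, assuming CPython and its standard `gc` and `weakref` modules. The `Node` class and `make_cycle` helper are made up for illustration. Automatic collection is switched off first so that only reference counting is at work until `gc.collect()` is called explicitly.

```python
import gc
import weakref


class Node:
    def __init__(self):
        self.other = None


def make_cycle():
    """Build two objects that reference each other, then drop all outside references."""
    a, b = Node(), Node()
    a.other, b.other = b, a
    return weakref.ref(a)   # a weak reference does not keep `a` alive


gc.disable()                # leave only reference counting active
probe = make_cycle()

# Each node is still held by the other, so refcounting alone never frees them.
alive_after_refcounting = probe() is not None

gc.collect()                # run the cycle collector by hand
alive_after_collect = probe() is not None
gc.enable()

print(alive_after_refcounting, alive_after_collect)
```

On CPython the cycle survives until the explicit `gc.collect()` call, after which the weak reference is dead.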

How to do multiprocessor / multithread GC well is still an area of active research. In the meantime, one simpler solution is to have several independent VM states, each running in its own thread (or process), and communicating via message passing. Lua makes this easy, but its VM is considerably lighter than Python's.
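For Python, a rough equivalent is to run each worker as a separate interpreter process and exchange serialized messages. The sketch below is one way to do that with the standard `subprocess` and `json` modules. The worker script, the message fields (`id`, `numbers`, `total`), and the `ask_worker` helper are all made up for illustration, and JSON lines over pipes is just one possible message format.

```python
import json
import subprocess
import sys

# Source for the worker: a separate interpreter with its own heap and its own GC.
WORKER_SOURCE = r"""
import json, sys
for line in sys.stdin:
    request = json.loads(line)
    reply = {"id": request["id"], "total": sum(request["numbers"])}
    print(json.dumps(reply), flush=True)
"""


def ask_worker(requests):
    """Start one worker interpreter, send it JSON messages, collect its replies."""
    payload = "".join(json.dumps(r) + "\n" for r in requests)
    done = subprocess.run(
        [sys.executable, "-c", WORKER_SOURCE],
        input=payload,
        capture_output=True,
        text=True,
        check=True,
    )
    return [json.loads(line) for line in done.stdout.splitlines()]


replies = ask_worker([
    {"id": 1, "numbers": [1, 2, 3]},
    {"id": 2, "numbers": [10, 20]},
])
print(replies)
```

No object is shared between the two interpreters, so neither one's reference counts are ever touched by the other. The cost is that every message has to be copied and serialized, and starting a Python process is much heavier than creating a Lua state.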


The mark and sweep collector is a little more complicated than that. It doesn't use a single "seen" bit the way a normal GC header would; instead, it uses the refcount itself in very, very clever ways.
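Roughly, the trick is to copy each tracked object's refcount, subtract every reference that comes from another tracked object, and treat whatever stays above zero as referenced from outside. Below is a toy model of that idea over plain dictionaries. It is a simplified sketch, not CPython's implementation: `find_unreachable` and the object names are made up for illustration, and it ignores generations, finalizers, and weak references.

```python
def find_unreachable(refcounts, edges):
    """Return the names of tracked objects that are only kept alive by cycles.

    refcounts: name -> total reference count, including outside references
    edges:     name -> list of tracked objects that this object references
    """
    # Work on a copy of each refcount rather than setting a "seen" bit.
    gc_refs = dict(refcounts)

    # Cancel out every reference that originates inside the tracked set.
    for src, targets in edges.items():
        for dst in targets:
            gc_refs[dst] -= 1

    # A count still above zero means something outside the set holds the object.
    stack = [name for name, count in gc_refs.items() if count > 0]

    # Everything reachable from those externally held objects is alive too.
    reachable = set()
    while stack:
        name = stack.pop()
        if name in reachable:
            continue
        reachable.add(name)
        stack.extend(edges.get(name, ()))

    return set(refcounts) - reachable


# a and b only reference each other; c is held by an outside variable and references d.
refcounts = {"a": 1, "b": 1, "c": 1, "d": 1}
edges = {"a": ["b"], "b": ["a"], "c": ["d"], "d": []}
garbage = find_unreachable(refcounts, edges)
print(sorted(garbage))   # prints ['a', 'b']
```

Here `d` also drops to zero after the subtraction, but it is rescued because it is reachable from `c`, which still has an outside reference.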



