PyPy 1.5 Released: Catching Up

amix · on April 30, 2011

I did a benchmark of Unladen Swallow 1.5 years ago, and at that time PyPy looked like a very academic project. It could not really run any interesting code. Unladen Swallow looked like a great project back then, since improving CPython rather than starting from scratch seemed much easier, especially since Unladen Swallow based their JIT optimization on LLVM's JIT engine.

I also recall reading PyPy's blog, especially a blog post about their Prolog based JIT prototype and I thought "wow, this sure seems like a complicated way to implement a JIT engine, I wonder if they will ever implement something that can run real code".

Fast forward to today and Unladen Swallow is dead and PyPy has delivered a Python implementation that's compatible with Python 2.7 and beats CPython 2.6 on various benchmarks. Pretty impressive and kudos to the PyPy team.

chubot · on May 1, 2011

Exactly my thoughts -- it seemed like a crazy research project, generating an interpreter written in C, and ones for jvm and clr. And it has 3 different pluggable garbage collectors apparently.

But I'm impressed that it actually runs my code, in contrast to the other alternative Python implementations I've tried.

Look forward to seeing more of sandboxed mode, which has been a much desired feature of CPython for many years now. And stackless mode. I just hope they can get the executable size and build times to a reasonable level.

luismgz · on April 30, 2011

Many people seem to ignore that pypy is not a new project. It's been in the works for more than 8 years now.

illumen · on May 1, 2011

It's also had a lot of funding, and more full time workers than CPython (which has zero full time coders).

etal · on May 4, 2011

Unladen Swallow isn't dead, actually -- it just merged into the main CPython code base. The first few quarters of optimizations are in Python 2.7 (it's noticeably faster than Python 2.6) and the more adventurous bits are on separate branches in SVN.

andrewcooke · on May 1, 2011

somewhere on the net are comments i made about european idealists v american pragmatists (the whole "worse is better" thing).

i have never been so glad to be proved completely wrong :o)

sciurus · on April 30, 2011

I'm impressed that they support the features of CPython 2.7 already. PyPy seems to be developing much faster than other alternate python implementations.

kingkilr · on April 30, 2011

IronPython actually beat us to 2.7, however I will note that it was pretty fast. We went from 2.5 to 2.7 support in under 12 months.

kbd · on April 30, 2011

I will love it if in a couple of years the mainstream implementation of Python is PyPy. People are going to be upgrading their Python installations anyway when they move to Python 3. If y'all can get Python 3 support in, maybe people will switch to PyPy at the same time.

I just wish Google would throw some support your way.

sho_hn · on April 30, 2011

Indeed, what are your current plans for Python 3.x?

kingkilr · on April 30, 2011

Future cloudy, ask again later :) Because of the way our JIT is written it would be entirely possible for us to branch for py3k and put JIT improvements on both with relatively little effort, it's likely that at least our next release will still be 2.x and just feature optimizations though.

sho_hn · on April 30, 2011

Here's one vote for focussing on py3k sooner rather than later, fwiw :). I understand you're anxious to reap some rewards and see (more) production use at this point, but as a big fan of much of what was done to the language in py3k and having made the switch already, PyPy vs. CPython 3.x is an unhappy dilemma for me :).

TillE · on April 30, 2011

For me (and I suspect quite a lot of other people), PyPy and CPython 3 both suffer from the same problem: lack of support for third-party libraries. Moving to Py3k would only exacerbate that problem for PyPy, removing their current support for the likes of Django and Twisted.

sho_hn · on April 30, 2011

I'd assume they'd very likely keep up support for Python 2.7.x for quite some time past the first py3k-compatible release, similar to what CPython is doing.

Although personally I actually wouldn't mind giving up one for the other, as that would allow PyPy to act as an extra incentive to drive py3k adoption in third-party libraries. Of course I understand if the PyPy developers don't just want to use their hard work as a gaming piece to drive the py3k adoption agenda, though :).

carlosedp · on April 30, 2011

I wonder how hard is the integration of Stackless and the JIT features. That would be killer.

kingkilr · on April 30, 2011

There's actually a sprint going on right now (where this release was done), and that was a topic of discussion. I don't think anyone has written up the conclusions yet, but the estimate I heard was that it'd be about one person-month of work.

carlosedp · on April 30, 2011

Fantastic... Christian worked hard on Stackless and I think it's a great feature. On PyPy it would be easier to maintain and add functionality. Unfortunately, I struggle with Stackless internals and its low-level C gimmicks, so I can't help with it. It would be great to have something like a good framework in the future to make its adoption easier. We have the Stackless Examples project to address this initial learning curve thing. http://code.google.com/p/stacklessexamples/

carlosedp · on April 30, 2011

Nice, the CCP guys are great at maintaining Stackless... Kristjan and Richard do a great job on it. If there are any clues or tips on how I can help out, it would be a pleasure.

euccastro · on May 1, 2011

Have you asked in the Stackless mailing list?

kingkilr · on April 30, 2011

Yes, unfortunately Christian couldn't make it to the sprint; we would have liked to have him there. However, Kristjian (sp?) from CCP who maintains Stackless was there.

euccastro · on May 1, 2011

Kristján.

ch0wn · on April 30, 2011

Any chance you could update your Ubuntu PPAs soon?

thristian · on May 1, 2011

My sentiments exactly: I would love to have a reasonably up-to-date PyPy around to try things in, but I'm not yet so invested in the project that I'm willing to watch for release announcements and download source-tarballs (or binary tarballs). Having a reliably updated PPA is pretty much exactly what I'm looking for.

stavros · on May 1, 2011

Since you can't see votes, pretty please yes, this.

va_coder · on April 30, 2011

Maybe this is a dumb question but why not just work on making CPython faster?

Edit: I guess this provides insights:

http://codespeak.net/pypy/dist/pypy/doc/architecture.html

kingkilr · on April 30, 2011

I wish I could find the video of the talk where we explained this in great detail, but basically: it's hard, and it's been tried and failed (notably by Armin Rigo, one of the leads of PyPy). Adding a new GC to CPython is insanely hard, and manually writing a JIT for Python even more so.

riobard · on April 30, 2011

Unladen Swallow seems to indicate that making CPython substantially faster in the general way is not feasible as of now.

erez · on April 30, 2011

I'm really interested in this project. The abundance of C as "the new machine language" for language implementation, as well as the JVM-based languages, makes me wonder what it means for the programming language landscape. I would be very interested in learning whether this un-C implementation made any difference to Python, and what can be learned from it for other language implementations.

nickik · on May 1, 2011

I'm not really interested in the Python interpreter that pypy is, but the other part of pypy (the toolchain) is maybe the best thing since sliced bread.

I plan to start working on the scheme implementation.

The thing I don't really like about the project is that it has one name for two things. I can understand how that came about historically, but I think they should make two names out of it.

kingkilr · on May 1, 2011

We've started referring to it as the "RPython translation toolchain" which will hopefully reduce confusion.

nickik · on May 1, 2011

You should think of a snappier name. Other than that keep up the good work. I'm atm reading all the pypy papers, very cool.

fijall · on May 2, 2011

you should maybe have a look at the scheme interpreter that was already written: https://bitbucket.org/pypy/lang-scheme/overview

justincormack · on April 30, 2011

The benchmark here http://attractivechaos.github.com/plb/ suggests they are doing much better on performance than they were a while back. It's a straight numeric benchmark, so not relevant for everything, but it's better than V8, which is good work.

kingkilr · on April 30, 2011

1.5 should be a lot better on these, the Loop Invariant Code Motion really helps on very tight loops, as these benchmarks often have.
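To make the Loop Invariant Code Motion point concrete, here is a minimal hand-written sketch of the effect. It is not PyPy's implementation (the JIT does this on traces, automatically); the function names and the `cos(angle) ** 2` expression are made up purely for illustration.

```python
import math

def scale_naive(values, angle):
    # math.cos(angle) ** 2 does not depend on the loop variable,
    # yet it is recomputed on every iteration.
    out = []
    for v in values:
        out.append(v * math.cos(angle) ** 2)
    return out

def scale_hoisted(values, angle):
    # Same computation with the invariant expression moved out of
    # the loop by hand; this is roughly the effect LICM aims for.
    factor = math.cos(angle) ** 2
    out = []
    for v in values:
        out.append(v * factor)
    return out
```

Both functions return the same list; the second just does less work per iteration, which matters most in very tight loops like the ones in these benchmarks.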

justincormack · on April 30, 2011

That's great. Dynamic languages at, say, Java-like speeds are a real game changer over PHP/Ruby-level speeds. At the moment we only have LuaJIT proving it's possible, but PyPy is the next contender.

swannodette · on April 30, 2011

While this is great news for Python, there are quite a few dynamic languages now beyond LuaJIT showing that it's possible to get stellar perf - JavaScript, Racket, Clojure.

kragen · on May 1, 2011

Last I heard, no JS engine was close to LuaJIT's performance; they were all worse by a factor of five or so. Clojure being in the LuaJIT ballpark would surprise me (do you have benchmarks?) but Racket wouldn't. (But I wouldn't call Scheme a dynamic language!)

swannodette · on May 1, 2011

http://shootout.alioth.debian.org/u64/which-language-is-best...

Clojure 1.2.0 is already faster than Racket. Version 1.3.0 will probably bring it close to Go territory on those benchmarks.

For my own work I've found that getting Clojure in the Java ballpark is certainly possible.

igouy · on May 2, 2011

http://shootout.alioth.debian.org/u64/benchmark.php?test=all...

kragen · on May 2, 2011

Yeah, I was wrong about Racket: http://shootout.alioth.debian.org/u64/benchmark.php?test=all...

kragen · on May 1, 2011

You're right; it looks like Racket is not in the C/LuaJIT ballpark, either. It's too bad the LuaJIT results are no longer on the web site.

swannodette · on May 1, 2011

In the last Alioth results I saw, LuaJIT was slower than Java, and certainly not in the C ballpark at all.

kragen · on May 1, 2011

http://lua-users.org/lists/lua-l/2009-10/msg01098.html shows LuaJIT beating GCC 4.3.2 on some parts of SciMark, and no more than 3× worse on any part. I'm importing the shootout CVS repository into Git, so hopefully we can make some more definitive comparisons soon.

mikemike · on May 3, 2011

That post is ancient ... here are newer SciMark results for LuaJIT:

http://lua-users.org/lists/lua-l/2010-12/msg00924.html

kragen · on May 3, 2011

Thank you very much! Looks like LuaJIT's beating GCC on almost everything now, and the JVM on most things. But that's with LuaJIT-specialized code that won't run on Lua, which is a little less exciting.

swannodette · on May 1, 2011

Those overall results show that LuaJIT doesn't really compete much with the JVM running under server mode.

kragen · on May 2, 2011

On the contrary, although the JVM is always faster, LuaJIT is within about 15% of it on SOR, and never worse by even a factor of 2.

euccastro · on May 1, 2011

> But I wouldn't call Scheme a dynamic language!

Care to elaborate?

kragen · on May 1, 2011

Scheme operations are mostly early-bound, even more so than Common Lisp; you standardly use vector-ref or string-ref, for example, instead of nth. (There are Schemes where this is not true, such as RScheme, but Racket is not among them.) Vectors aren't resizable; for that kind of thing, you must use lists. You can't attach arbitrary properties to some arbitrary object, the way you can in JS or Ruby. You can't introduce new variable bindings in the middle of a block. (What you do instead is make a new nested block.) Finite maps (assoc, make-hash-table) are second-class citizens. Standard Scheme is nonreflective; you're supposed to do your metaprogramming with macros, not reflection and interposition. (I don't know how much reflection Racket supports, but I'd guess almost none.) The object-literal syntax is quasiquote, which is clumsy.

It does have dynamic typing, but that's not what it means to be a "dynamic language". Dynamic languages are late-bound; everything is up for grabs at runtime. In standard Scheme, this is true in the fairly useless sense that everything depends on the bindings in the global namespace, so you could in theory (set! car somethingweird), but this is very rarely practical; it almost serves only to make optimizing Scheme compilers difficult to write. Aside from that, the language leans much further toward early binding, doing things at compile-time, and using efficient data structures at the cost of flexibility.
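A small sketch of what "late-bound" means in this sense, written in Python rather than JS or Ruby. The `Point` class and its names are hypothetical, chosen only to illustrate attaching properties and rebinding behavior at runtime.

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def norm(self):
        # Initial definition: sum of absolute coordinates.
        return abs(self.x) + abs(self.y)

p = Point(3, -4)

# Attach an arbitrary new property to an existing object at runtime.
p.label = "corner"

# Rebind the method on the class; the existing instance picks up the
# new definition because the lookup happens at call time.
before = p.norm()
Point.norm = lambda self: max(abs(self.x), abs(self.y))
after = p.norm()
```

Here `before` and `after` differ even though `p` itself was never recreated, which is the "everything is up for grabs at runtime" property being described.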

euccastro · on May 1, 2011

I don't think the meaning you give to "dynamic" here has much bearing on the difficulty of just-in-time compilation, except for reflection.

In Scheme you can define and create new functions and macros at runtime. There is not much else that is dynamic about the Scheme standard[1] because... there's not much else to the Scheme standard.

As for reflection, I see that mainly as a tools/debugging aid, and therefore very implementation-specific stuff. "Dynamic" languages have reflection in their definition because they are mostly defined by implementation.

That said, the Scheme standard doesn't preclude reflectivity at all, nor seems particularly compile-oriented to me. Nowhere it is assumed that implementations be compilers at all.

I think most of the things you find missing there were not left out for efficiency, but because it was rightly considered that they don't belong in the language spec at all.

[1] By the "Scheme standard" I mean R5RS.

kragen · on May 1, 2011

I agree, none of these features have much to do with the difficulty of just-in-time compilation. The stellar performance results Mike Pall is getting with LuaJIT, which has all of those features, kind of prove that. They do, however, have a lot to do with the usability and dynamic feeling of a language. This makes their absence from most Schemes less understandable.

> As for reflection, I see that mainly as a tools/debugging aid

You can use it that way, but you can also use it for metaprogramming.

> "Dynamic" languages have reflection in their definition because they are mostly defined by implementation.

Plausible.

> That said, the Scheme standard doesn't preclude reflectivity at all, nor seems particularly compile-oriented to me. Nowhere it is assumed that implementations be compilers at all.

True.

swannodette · on May 1, 2011

Most of the things you've said, I imagine, don't apply to Racket; I didn't say anything about Scheme in my original post.

kragen · on May 1, 2011

Well, I imagine they do, but I'd be interested in finding out whether my imagination is misleading me. Which things I said don't apply?

justincormack · on May 1, 2011

Indeed. The LuaJIT interpreter with JIT disabled is a similar speed to V8 in most benchmarks.

dman · on April 30, 2011

Don't leave Common Lisp out - SBCL is no slouch and hasn't been for quite some time.

swannodette · on April 30, 2011

Totally forgot, Common Lisp's performance w/o sacrificing dynamism is what got me excited about Lisp in the first place.

kragen · on May 1, 2011

My very limited experience with SBCL has been that, although you can get performance that's considerably better than CPython, you don't get close to C performance without sacrificing dynamism and safety.

megaman821 · on April 30, 2011

Wow, it looks like Python (PyPy) is within an order of magnitude of C on every benchmark. Nice work!

john7 · on May 1, 2011

Can numpy be used with pypy? Or are there plans for that?

kingkilr · on May 1, 2011

There are plans, but nothing yet, this reddit thread has some decent information: http://www.reddit.com/r/Python/comments/h0uuv/pypy_15_releas...

hyperbovine · on May 1, 2011

I'd like to second the request for a very detailed blog post written by one of the PyPy devs on how interested parties can contribute to the development of Numpy on PyPy. Lack of Numpy support is the showstopper for me and a lot of other people in terms of switching over -- almost everybody who wants faster Python is using Numpy in some way or another :-) This is something I would be really interested in working on.

cdavid · on May 1, 2011

Not really, numpy depends too much on cpython internals. There is interest in rewriting numpy for pypy, but I am not sure whether this is the best way of doing it. From what I understand, the choice is between improving pypy's compatibility with C extensions (easier, but slower because of cpython emulation in pypy) vs rewriting it for pypy (harder, but would benefit more from pypy).

wnoise · on April 30, 2011

Why did they decide on a name that can easily be (mis)pronounced as a synonym for urine? That just seems like a poor branding decision, up there with "blur-ray".

timknauf · on April 30, 2011

I've always pronounced it like "pie pie". (But then again, maybe that's because I already knew the "py" was the first syllable of the word "python"?)

wriq · on April 30, 2011

Worked out ok for Nintendo on the Wii...

wnoise · on April 30, 2011

Fair enough.

Symmetry · on May 1, 2011

I wouldn't say Blu-ray has much to worry about on the "blur-ray" front. Plays on words like that are damaging when they call attention to real or perceived weaknesses in a product, but while I can think of a lot of mean things to accuse Blu-ray of, blurry simply isn't one of them.

njharman · on May 1, 2011

What language uses an "ee" sound for "y"?

And who pronounces blue, blur?

euccastro · on May 1, 2011

Spanish does.

Not agreeing with grandparent, though. Lots of people admit to playing with their Wiis for hours a day.

wnoise · on May 1, 2011

It's not pronouncing "blue" as "blur", but extending or duplicating the r.