A lot of the recent discussion around Python 3 has been about how 2to3 is slow and inconvenient, so this seems like a good time to plug my attempt to improve the process. https://github.com/bdarnell/auto2to3 uses an import hook to run 2to3 on changed files automatically and caches the results on disk between runs, so it doesn't have to run 2to3 over the entire project every time (and the caches are ordinary files, so you can see exactly what 2to3 is producing).
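The mechanism is roughly an import hook plus an on-disk cache. Here is a minimal sketch of that idea, assuming a stand-in `transform()` where the real tool would run the 2to3 fixers; the names (`CACHE_DIR`, `TransformingFinder`, etc.) are illustrative, not auto2to3's actual API:

```python
# Sketch of the auto2to3 idea: an import hook that transforms source on
# import and caches the result on disk. transform() here is a placeholder
# (it just rewrites a marker string), not a real 2to3 run.
import hashlib
import importlib.abc
import importlib.util
import os
import sys
import tempfile

CACHE_DIR = tempfile.mkdtemp(prefix="xform_cache_")

def transform(source):
    # Placeholder: real code would run the lib2to3 fixers here.
    return source.replace("OLD_API", "NEW_API")

class TransformingLoader(importlib.abc.SourceLoader):
    def __init__(self, fullname, path):
        self.fullname, self.path = fullname, path

    def get_filename(self, fullname):
        return self.path

    def get_data(self, path):
        with open(path, encoding="utf-8") as f:
            source = f.read()
        # Cache key is a hash of the original source, so edits invalidate it.
        key = hashlib.sha1(source.encode("utf-8")).hexdigest()
        cached = os.path.join(CACHE_DIR, key + ".py")
        if not os.path.exists(cached):   # only re-run the transform on change
            with open(cached, "w", encoding="utf-8") as f:
                f.write(transform(source))
        with open(cached, encoding="utf-8") as f:
            return f.read().encode("utf-8")

class TransformingFinder(importlib.abc.MetaPathFinder):
    def __init__(self, names, directory):
        self.names, self.directory = names, directory

    def find_spec(self, fullname, path=None, target=None):
        if fullname not in self.names:
            return None
        filename = os.path.join(self.directory, fullname + ".py")
        return importlib.util.spec_from_file_location(
            fullname, filename, loader=TransformingLoader(fullname, filename))

# Demo: write a module using the "old" API and import it through the hook.
src_dir = tempfile.mkdtemp(prefix="xform_src_")
with open(os.path.join(src_dir, "demo_mod.py"), "w") as f:
    f.write('VALUE = "OLD_API"\n')
sys.meta_path.insert(0, TransformingFinder({"demo_mod"}, src_dir))

import demo_mod
print(demo_mod.VALUE)  # -> NEW_API  (the transformed source was imported)
```

Because the cache entries are ordinary `.py` files on disk, you can open them and see exactly what the transform produced, which is the point made above.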
This is how I added Python 3 support to Tornado. If you intend to keep supporting Python 2.5, I definitely recommend using 2to3 instead of the contortions required to use a single codebase across all versions (if you're only supporting 2.6+ it gets much easier to do everything in one codebase without 2to3, although even then 2to3 can help keep the primary (2.x) codebase cleaner).
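For the 2.6+ case, the "single codebase" style usually means a small compat shim rather than a build step. A rough sketch of the pattern (the names `PY3`, `to_unicode`, etc. are common conventions, not Tornado's actual API):

```python
# Sketch of the "single codebase" style for 2.6+: branch on version once,
# then write the rest of the code against the shim. (On real 2.x you would
# also put `from __future__ import print_function` at the top of each module.)
import sys

PY3 = sys.version_info[0] >= 3

if PY3:
    from urllib.parse import urlencode   # moved here in Python 3
    unicode_type = str
else:
    from urllib import urlencode         # Python 2 location
    unicode_type = unicode  # noqa: F821 (name only exists on Python 2)

def to_unicode(value):
    """Decode bytes to text; pass text through unchanged."""
    if isinstance(value, bytes):
        return value.decode("utf-8")
    return value

print(urlencode({"q": "tornado"}))   # -> q=tornado
print(to_unicode(b"caf\xc3\xa9"))    # -> café
```

On 2.5 this gets much harder (no `b''` literals, no `except ... as e`), which is why 2to3 is the recommendation above when 2.5 support is required.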
The entire concept of 2to3, however, was stupid... what you need is 3to2. You want to be encouraging people to write new code and support old versions as best as possible, not write old code and support new versions as best as possible. 2to3 should be used as a one-off migration tool, yet it is seriously being suggested that people use it as part of their builds.
In the long term I think we need both 2to3 and 3to2, but for now 2to3 is more important. As long as both my dependencies and the users of my library are mostly on 2.x, I want that to be my primary development platform. If the only way to support both platforms was to do most of my work in python 3 and then backport to 2.x, I'd have never gotten started because that's too much of a change to take on at once. 2to3 gives me a way to support 3.x now (even as a second class citizen), and then a path to flip things around and make 3.x the primary platform when I'm ready.
I just find the "for now" pretty hilarious: we are already seeing a shipping version of Python 3.2, with Python 3.x having been released for years, and we still need to encourage people to develop code for Python 2.x and use a build step to convert it to Python 3.x. This attitude directly contributes to the behavior described in that recent article by the Jinja2 developer: Python 3.x support is often treated as "look, I did it, and while it is a horrible hack and much slower than the 2.x version, it sort of works; now leave me alone," rather than as a step toward helping people slowly embrace Python 3.x.
I was just thinking of mocking up something like this... so why don't you write up a blog post about it and get a little publicity? If we could add auto2to3 as easily as we build a virtualenv, this has the potential to neutralize the "it's too hard to wait for 2to3" complaint.
The problem is, people have been told for so long not to care about Python 3 that it's hard to imagine what could make them actually start caring about it. Practically the only people who have even tried it are Python fanatics or compiler devs.
I understand the desire to make fundamental language changes and why it's cleaner to make Python 3 backward incompatible. I don't agree that it was the smartest thing to prevent any kind of interoperability, though. The way people could begin using Python 3 is if you could still import Python 2 modules from Python 3 code. OK, there would be some tough nuts to crack to get the interpreter to switch contexts gracefully and translate objects from one side to the other. But at least people could start using Python 3 and then begin switching things over one at a time. To ask everyone to keep running 2to3 to maintain a second codebase that nobody uses is kind of silly and unrealistic, and it causes this standoff where too few people care enough to make the first move.
Also, gripe here. Why the heck is def method(self, actual_first_arg, ...) still all over the place in Python 3? They had a golden opportunity to kick that bizarre holdover from when Python couldn't do OO [1], and they didn't [2]. The only solid argument Guido has against it is about decorators, and it really only applies to @classmethod and @staticmethod, which arises primarily from the stodgy way Python implements these OO features and which could have been nipped in the bud with their own language changes.
I love `self`. If it were just magically there, that would break with the entire rest of how the language works - everything is explicitly from something - even the "built in" functions are right there from __builtins__. What appeals to me about Python is the namespacing model is completely consistent throughout and explicit `self` is part of that.
So, the namespacing is completely consistent, but the number of arguments between method definition and method calling is not? That breaks consistency both internally (because regular functions do have the same number of arguments) and with just about every other language I've used. Something tells me adding one little keyword that pokes into the namespace would not be the end of the world. It would, however, save me (and anyone else that has to juggle multiple languages) from several groaner bugs per day.
Guido can repeat as many times as he likes that calling an instance method is morally equivalent to passing the instance as an additional argument. It may even be implemented that way. The reality is that nobody except compiler devs actually thinks like that; to everyone else it's an implementation detail. It also doesn't naturally explain why the instance is shoved in as the first argument as opposed to appended as the last.
And the fact that self is an idiom, and not a keyword, just feels wrong to me. If nobody would actually ever use any other word in sane code, there is no point in preserving the flexibility of doing so by defining the same word over and over again.
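For concreteness, this is the equivalence being debated, using plain Python semantics (the class names are just illustrative):

```python
# Calling an instance method really is the same as calling the plain
# function with the instance passed as the first argument.
class Greeter:
    def greet(self, name):
        return "%s greets %s" % (type(self).__name__, name)

g = Greeter()
assert g.greet("Ada") == Greeter.greet(g, "Ada") == "Greeter greets Ada"

# And `self` is only a convention, not a keyword: any name works.
class Odd:
    def method(this, x):
        return x * 2

assert Odd().method(21) == 42
```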
Functions implement the descriptor protocol so that when you access them as an attribute of a class they basically return a partially applied function.
If you remove self and introduce methods as a concept a lot of things become a lot more difficult or impossible for the sake of a few people who think that making self implicit is somehow cleaner without any real argument behind it.
> Functions implement the descriptor protocol so that when you access them as an attribute of a class they basically return a partially applied function.
Exactly. You just spoke like a true Pythonista: you used an implementation detail in the language to justify a language design decision. In my experience, people diving into Python will not care (or want to know) about the descriptor protocol (it's not mentioned in the majority of books about the language). They will not be messing with __get__ or __set__ the vast majority of the time. Only much later, when they want to start doing fairly magic things with attributes, will they even realize it exists. There is no analogue for the descriptor protocol in any other currently popular language. How can you consider a magic-method protocol that curries all functions when accessed as an attribute the "explicit" or "simple" way to design this?
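For readers who haven't met the descriptor protocol, here is what the claim means concretely, spelled out by hand (the `Point` class is just an example):

```python
# Looking up a function through an instance invokes the function's __get__,
# which returns a bound method: the function partially applied to the instance.
class Point:
    def __init__(self, x):
        self.x = x

    def scaled(self, factor):
        return self.x * factor

p = Point(2)

raw = Point.__dict__["scaled"]   # a plain function object in the class dict
bound = raw.__get__(p, Point)    # the descriptor protocol, invoked by hand
assert bound(10) == p.scaled(10) == 20
assert p.scaled.__self__ is p    # the bound method remembers its instance
```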
I guess this is why proposing this is so hopeless. Methods do not have to be distinguished from functions to create an implicit self. Instead of having FooClass.bar() and foo_obj.bar() do any of this [1] (please tell me with a straight face that the average Python programmer knows or cares about any of that), have the former add FooClass to the local namespace as self and the latter add foo_obj as self when the function is called, defined as the behavior of the language's syntax. Like, I dunno, JavaScript, or PHP, or Java, or C#, or... For the 1% of people that care, allow it to be overridden by redefining __get__ and so on. Maybe even redefine the deprecated apply() [2] so people can specify a different self in the course of calling a function, like in JavaScript.
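Worth noting: the JavaScript-style `Function.call` behavior asked for here is already expressible in today's Python, if awkwardly, via the same `__get__` hook (the class and function names below are made up for illustration):

```python
# A function's __get__ will bind it to any object, so you can pick the
# `self` at call time, much like describe.call(obj) in JavaScript.
def describe(self):
    return "I am %s" % self.name

class Cat:
    name = "a cat"

class Dog:
    name = "a dog"

assert describe.__get__(Cat())() == "I am a cat"
assert describe.__get__(Dog())() == "I am a dog"

# Or attach it after the fact; attribute access then binds automatically.
Cat.describe = describe
assert Cat().describe() == "I am a cat"
```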
> The problem is, people have been told for so long not to care about Python 3
Who has been saying this "for so long"? I seem to remember Guido saying something along those lines at PyCon 2008 or 2009, but I guarantee that is not his current position, nor has it been in recent times. That original position was to not care about it for your production code - keep on doing your thing, but keep an eye on 3.x enough that you know what's going on and can make the switch when ready.
The rest of your post has nothing to do with the topic.
My 2 cents: stick with 2.7; it works well with all libraries and it's well supported everywhere. 3.0 provides nothing worth wasting millions of dollars on. The current mentality is "avoid adding new features to Python 3 to give people time to port to it." What's happening? "Why would I switch to Python 3 if there's nothing worth switching for?"
That is incorrect; there is no such mentality, and there never has been. Of course no new features are being added to 3.0, because it's unsupported nowadays: Python 3 is up to 3.2, and 3.3 is coming soon.
There was a moratorium for Python 3.2 during which no more features were added to the language, to give implementations other than CPython a chance to catch up. But that was not about porting, and it did not cover new features in general, just changes to the language itself. New features in the standard library and other implementation improvements were made in 3.2, and in 3.3 new language features are being added again.
So the claim of a mentality of "avoid adding new features" is nothing but FUD and complete hogwash.