Note that Python doesn't use green threads. It uses real, native threads. The thing is that the interpreter will prevent any other threads from accessing it until the GIL is released.
The problem with the GIL is nearly the opposite. The lock is dropped for system calls but kept when Python code is running. That's not a problem for IO bound workloads but does not scale well over multiple CPUs for CPU bound tasks that do most of the work in Python-land.
I not sure dropping the GIL is the best solution. Something like Erlang message passing sounds like a better model. I'm not sure if that's a good fit either though since it works best with functional style programming.