For fairness' sake, here is a quote from the same post (Dawkins had changed his mind by 2012):
"So you can think of the protein-coding genes as being sort of the toolbox of subroutines which is pretty much common to all mammals -- mice and men have the same number, roughly speaking, of protein-coding genes and that's always been a bit of a blow to self-esteem of humanity. But the point is that that was just the subroutines that are called into being; the program that's calling them into action is the rest [of the genome] which had previously been written off as junk."
This makes sense to me (as a programmer :). But I wonder if someone managed to figure out the language in which the "caller" code is written. How does it say: call subroutine at position X, then if (condition), call subroutine at position Y, etc... I was not able to find any info on this.
That's a great question! As you know, DNA -> RNA -> Protein where the first arrow is transcription (polymerase) & the second one translation.
Genes are transcribed by a polymerase, and a couple of factors determine if the polymerase can be "recruited" there. Here are a couple of those factors:
1) Is the site accessible? That part of the genome needs to be "open chromatin" for it to be transcribed. The next question, of course, is what makes some chromatin open and some closed.
2) Histone signals: DNA spools around wheel-like things called histones. Histones get marked in a variety of ways to indicate "status" of that part of the genome. Some of them are pro-transcription.
3) Polymerase is usually recruited there by DNA-binding proteins called "transcription factors". These proteins bind to the regulatory / "promoter" region upstream of a protein-coding gene. There are over a thousand different transcription factors, each with a specific "motif" they recognize.
The global state is detected in various ways (receptors on the cellular membrane etc.), which triggers these transcription factors, which then go bind DNA near the genes to activate a new subroutine.
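In programmer terms, the three factors above behave roughly like a guard condition in front of each subroutine. Here is a minimal sketch in Python, just to make that concrete. All names (Gene, SIGNAL_TO_FACTORS, TF1, hormone_A, etc.) are made up for illustration, and real regulation is quantitative and combinatorial rather than a clean boolean AND.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Gene:
    name: str
    chromatin_open: bool         # factor 1: is the site accessible?
    activating_marks: bool       # factor 2: pro-transcription histone marks
    required_factors: frozenset  # factor 3: TFs whose motifs sit in the promoter

# Hypothetical wiring from external signals to the transcription factors
# they switch on (an assumption for this sketch, not real biology).
SIGNAL_TO_FACTORS = {
    "hormone_A": {"TF1"},
    "stress": {"TF2", "TF3"},
}

def active_factors(signals):
    """Collect every transcription factor triggered by the given signals."""
    active = set()
    for signal in signals:
        active |= SIGNAL_TO_FACTORS.get(signal, set())
    return active

def is_transcribed(gene, signals):
    """The 'caller': the gene runs only if all three conditions hold."""
    return (
        gene.chromatin_open
        and gene.activating_marks
        and gene.required_factors <= active_factors(signals)
    )

heat_shock = Gene("heat_shock", True, True, frozenset({"TF2"}))
print(is_transcribed(heat_shock, {"stress"}))     # gene is "called"
print(is_transcribed(heat_shock, {"hormone_A"}))  # wrong signal, not called
```

The point of the sketch is that there is no central program counter: every gene carries its own condition, and the "call" happens wherever the condition is currently satisfied.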
Here's a paper:
http://www.pnas.org/content/107/20/9186
Title: "Comparing genomes to computer operating systems in terms of the topology and evolution of their regulatory control networks"
Thank you, the article is fascinating indeed. But your explanation is not very accessible for a layman. I will tell you the questions a typical programmer will be asking:
Main question: suppose we have to build a heart from scratch. It has a certain shape, and at every point inside it there's a specific type of cell (they are certainly not all the same). E.g., there's a complicated network of blood vessels inside, and what not. There's a process that builds the heart. Question: who tells this process that at point (x,y,z) it should add a cell of type T and connect it to its neighbors in a certain way? Is it something like a cellular automaton, where the property of the cell at (x,y) is determined by the properties of its neighbors only? All neighbors? And how is this function calculated? And what is the CODE for the calculation? Where is it stored? And where does something that was formerly called "JUNK", and now looks like a master program, fit into this picture?
To explain it to a programmer, it's better to avoid the terminology and specifics, but focus only on functional relations - e.g. type of cell at point (x,y) is a FUNCTION of types (or something) of cells (x-1,y), (x+1, y) and maybe something else. I can't find it anywhere.
BTW, I am not a believer in the cellular automata stuff a la Wolfram, just trying to frame the question in a way I could understand.
(just in case you don't want to reply here, my fake email address is in my profile)
While this is not my specialty, "evo-devo" and developmental biology are the fields trying to answer those questions. Honestly, we still know very little about how the genome results in morphology.
It is a lot more like a cellular automaton than anything else. Starting from minute gradient differences inside the fertilized egg, structure spreads out recursively in a very cellular-automaton-like fashion. Cells respond to the bio-mechanical stress around them and to the signals they bathe in to decide their cellular fate, and in turn release new signals.
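To frame it the way you asked, as a function of neighbors, here is a toy one-dimensional sketch in Python. A signal leaks in from one end of a row of cells, each cell updates its level from its two neighbors only, and then each cell picks a fate from the level it sees. The thresholds, decay rate, and fate labels are arbitrary assumptions, and the sketch leaves out the part where cells release new signals of their own.

```python
def diffuse(signal, source=1.0, decay=0.9):
    """One synchronous update: each cell looks only at its two neighbors."""
    n = len(signal)
    updated = []
    for i in range(n):
        left = signal[i - 1] if i > 0 else source    # signal source at the left edge
        right = signal[i + 1] if i < n - 1 else 0.0  # nothing beyond the right edge
        updated.append(decay * (left + right) / 2)
    return updated

def fate(level):
    """Cell fate as a pure function of the local signal level (made-up thresholds)."""
    if level > 0.3:
        return "A"
    if level > 0.1:
        return "B"
    return "C"

def develop(n_cells=10, steps=50):
    """Start with no signal anywhere, let the gradient form, then read fates."""
    signal = [0.0] * n_cells
    for _ in range(steps):
        signal = diffuse(signal)
    return [fate(level) for level in signal]

print("".join(develop()))
```

Nobody tells the cell at position i which type to be. Every cell runs the same rule, and the pattern comes out of where the cell happens to sit in the gradient.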
It's a very different paradigm of computation than the one we usually think of. Each cell is like a Docker container holding the full source code. Perhaps these links explain a bit:
An analog computer switching to digital mode, and back to analog, and back to digital, etc...? Something is extremely fishy here. It looks like a highly intelligent series of operations, far beyond anything we can imagine.