
Perhaps they are trying to explain the clustering illusion? That is the phenomenon that even random data will produce clusters. You can take that further and state that random data WILL produce clusters. If you don't have clusters, then your data is not random and some pattern is at play.

This really trips up our minds, because we try to find patterns everywhere. If you try to plot random dots by hand, you will usually place them without clusters. A truly random plot will have clusters.

https://en.wikipedia.org/wiki/Clustering_illusion

Edit: Note that your professor said "often", which means they did not make an absolute statement.
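To make the point about random dots concrete, here is a minimal sketch (not from the linked article). It assumes "random" means independent, uniformly distributed points in the unit square, and the helper name cell_counts is made up for illustration. It drops as many points as there are grid cells, so a perfectly even spread would put exactly one point in each cell.

```python
import random
from collections import Counter

def cell_counts(n_points, grid, seed=0):
    """Count how many uniform random points land in each cell of a grid x grid square."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    counts = Counter()
    for _ in range(n_points):
        x, y = rng.random(), rng.random()  # each coordinate is in [0, 1)
        counts[(int(x * grid), int(y * grid))] += 1
    return counts

grid = 20
n_points = grid * grid  # one point per cell on average
counts = cell_counts(n_points, grid)

empty = grid * grid - len(counts)  # cells that received no point at all
crowded = max(counts.values())     # largest pile-up in a single cell

print("empty cells:", empty, "of", grid * grid)
print("most points in one cell:", crowded)
```

For independent uniform points, roughly a third of the cells should come out empty while others hold several points, which is the kind of clumping and gaps people tend to avoid when placing "random" dots by hand.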



Ipso facto, all "natural" variables are related to a bounded random walk (a Markovian process), which produces clusters, or otherwise have complex chaotic (e.g. fractal) mechanics, which also produce clusters. This follows from physics.

Maximum entropy as well as zero entropy is a very rare state to observe.


does this imply that the universe somehow rewards structures that engender 'compressibility' (coarse graining)? it does seem like our brains subjectively enjoy identifying it, to the point of over-optimization in the form of phenomena like pareidolia


The universe doesn’t “reward” it so much as it’s just a consequence of random events. For example, if you flip a coin many times, you’ll see long sequences of heads. From the central limit theorem it follows that the sum of sufficiently many independent random events is approximately normally distributed, which exhibits the clustering phenomenon. Take a look at a Galton board in action.
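The coin-flip claim is easy to check with a small simulation. This is only a sketch under the assumption of a fair coin with independent flips; the function name longest_run is made up for illustration.

```python
import random

def longest_run(flips, value=True):
    """Length of the longest unbroken streak of `value` in a sequence of flips."""
    best = current = 0
    for flip in flips:
        current = current + 1 if flip == value else 0  # extend or reset the streak
        best = max(best, current)
    return best

rng = random.Random(1)  # fixed seed so the run is repeatable
flips = [rng.random() < 0.5 for _ in range(1000)]  # True = heads

print("longest run of heads in 1000 flips:", longest_run(flips))
```

For a fair coin the longest streak grows roughly like log2 of the number of flips, so streaks of around ten heads in a thousand flips are ordinary rather than suspicious.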


That's ignoring anything related to the actual life we observe, and Gaussian-distributed data does not have to exhibit clustering either. (But it allows it.)

About the only thing that is naturally uniform within bounds, so far, is the large-scale homogeneity and isotropy of the universe, which is an unsolved mystery potentially involving dark matter.


I would argue that if it didn't cluster, then some sort of pattern/bias was at play that caused it.


>"The phenomenon that even random data will produce clusters."

You don't really mean "random", you mean i.i.d. You can have a statistical model where the probability of something happening is random, but not independent of the past values (e.g., the next step of a Markov chain).
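A rough sketch of that distinction, with made-up helper names and an arbitrarily chosen "stay" probability of 0.9: both sequences below are random, but only the first is i.i.d. The second is a two-state Markov chain whose next value depends on the current one.

```python
import random

def iid_bits(n, rng):
    """n independent fair coin flips."""
    return [rng.random() < 0.5 for _ in range(n)]

def sticky_bits(n, stay, rng):
    """Two-state Markov chain: repeat the previous value with probability `stay`."""
    bits = [rng.random() < 0.5]
    for _ in range(n - 1):
        bits.append(bits[-1] if rng.random() < stay else not bits[-1])
    return bits

def mean_run_length(bits):
    """Average length of the unbroken streaks in a non-empty sequence."""
    runs = 1 + sum(a != b for a, b in zip(bits, bits[1:]))  # a new run starts at each change
    return len(bits) / runs

rng = random.Random(2)  # fixed seed so the run is repeatable
iid_mean = mean_run_length(iid_bits(10_000, rng))
sticky_mean = mean_run_length(sticky_bits(10_000, 0.9, rng))

print("i.i.d. mean run length:", round(iid_mean, 2))
print("Markov mean run length:", round(sticky_mean, 2))
```

The i.i.d. flips already show streaks (about two per run on average), while the dependent chain shows much longer ones, so how much clustering to expect depends on which kind of "random" is meant.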



