
Anthropic may or may not have claimed this was evidence of a world model; I'm not sure. I say this is a world model because it is objectively a model of the world. If your concept of a world model requires something else, the answer is that we don't know whether they're doing that.

Long-term memory and object permanence don't seem necessary for thought. A 1-year-old can think, as can a late-stage Alzheimer's patient. Neither could get through a 400-page book, but that's irrelevant.

Listing human capabilities that LLMs don't have doesn't help unless you demonstrate these are prerequisites for thought. Helen Keller couldn't tell you the weight, direction, or size of a rolling ball, but this is not relevant to the question of whether she could think.

Can you point to laws, analogous to the speed of light, that constrain how LLMs work in a way that excludes the possibility of thought?



> I say this is a world model because it is objectively a model of the world.

A world model in AI has a specific definition: an internal representation that the AI can use to understand and simulate its environment.
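For concreteness, here is a minimal sketch of what "an internal representation used to simulate the environment" can mean in practice. The grid world, the `predict` interface, and the `rollout` helper are hypothetical illustrations of the general idea, not anything specific to Anthropic's models:

```python
# Minimal illustrative sketch (hypothetical): the "world model" here is just a
# transition function that predicts the next state of a tiny grid environment,
# so an agent can simulate the outcome of a plan before acting in the real world.

from dataclasses import dataclass

@dataclass(frozen=True)
class State:
    x: int
    y: int

class GridWorldModel:
    """Internal representation of a 5x5 grid: predicts the result of an action
    without the agent ever touching the real environment."""

    SIZE = 5
    MOVES = {"up": (0, 1), "down": (0, -1), "left": (-1, 0), "right": (1, 0)}

    def predict(self, state: State, action: str) -> State:
        dx, dy = self.MOVES[action]
        # Clamping to the grid bounds is the model encoding the environment's rules.
        return State(
            x=min(self.SIZE - 1, max(0, state.x + dx)),
            y=min(self.SIZE - 1, max(0, state.y + dy)),
        )

    def rollout(self, state: State, plan: list[str]) -> State:
        # Simulate an entire plan "in the head" of the agent.
        for action in plan:
            state = self.predict(state, action)
        return state

if __name__ == "__main__":
    model = GridWorldModel()
    print(model.rollout(State(0, 0), ["up", "up", "right"]))  # State(x=1, y=2)
```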

> Long-term memory and object permanence don't seem necessary for thought. A 1-year-old can think, as can a late-stage Alzheimer's patient

Both of those cases have long-term memory and object permanence; one has a developing memory and the other has memory problems, but neither is constrained by a context window. Children develop object permanence in the first 8 months, and, much like learning to distinguish their own body from their mother's, that is them developing a world model. Toddlers are not really thinking; they are responding to stimulus: they feel hunger, they cry; they hear a loud sound, they cry. It's not really them coming up with a plan to get fed or to get attention.

> Listing human capabilities that LLMs don't have doesn't help unless you demonstrate these are prerequisites for thought. Helen Keller couldn't tell you the weight, direction, or size of a rolling ball

Helen Keller had an understanding in her mind of what different objects were; she started communicating because she understood the word "water" when her teacher traced it into her palm.

Most humans have multiple sensory inputs (sight, smell, hearing, touch); she had only one, which is perhaps closer to an LLM. But capabilities she had that LLMs don't have are agency, planning, long-term memory, etc.

> Can you point to laws, analogous to the speed of light, that constrain how LLMs work in a way that excludes the possibility of thought?

Sure, let me switch the analogy if you don't mind. In the Chinese room thought experiment, we have a man who gets a message, opens a Chinese dictionary, and translates it perfectly word by word, and the person on the other side receives and reads a perfect Chinese message.

The argument usually centers on whether the person inside the room "understands" Chinese if he is capable of producing perfect Chinese messages.

But an LLM is that man, and what you cannot argue is that the man is THINKING. He is mechanically going to the dictionary and returning a message that can pass as human-written because the book is accurate (if the vectors and weights are well tuned). He is not an agent; he simply does, and he is not creating a plan or doing anything beyond transcribing the message as the book demands.

He doesn't have a mental model of the Chinese language; he cannot formulate his own ideas or execute a plan based on predicted outcomes; he can do nothing but perform the job perfectly and boringly as per the book.
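To make the "mechanical" point concrete, here is a toy sketch of the man-plus-book as a pure lookup procedure (the rule book and the phrases in it are made up for illustration): nothing in it represents meaning, it only maps input symbols to output symbols as the book dictates.

```python
# Toy sketch of the Chinese room as pure rule-following (hypothetical rules).
# The procedure matches incoming symbols against the book and copies out the
# paired reply, with no representation of what any of the symbols mean.

RULE_BOOK = {
    "你好吗": "我很好，谢谢",   # the book pairs an incoming message with a reply
    "现在几点": "现在三点",
    "再见": "再见",
}

def chinese_room(message: str) -> str:
    # The "man" only looks up the message and transcribes the book's answer.
    return RULE_BOOK.get(message, "对不起，我不明白")

print(chinese_room("你好吗"))  # a fluent-looking reply, no understanding involved
```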


> But an LLM is that man

And the common rebuttal is that the system -- the room, the rules, the man -- understands Chinese.

The system in this case is the LLM. The system understands.

It may be a weak level of understanding compared to human understanding. But it is understanding nonetheless. Difference in degree, not kind.



