I speculate something like this is happening. Specifically, children are often given tailored curricula to learn from, which are both progressive and interactive - two features that LLM training datasets lack. But I suspect the problem runs deeper than the training regime itself: applying curriculum learning to language models has significantly reduced training time, but nowhere near closed the gap with the learning efficiency of humans. So something structural needs to change to get larger, more human-like gains in learning.
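To make the curriculum-learning idea concrete, here is a minimal sketch of the "progressive" half: training data ordered from easy to hard and released to the model in stages. Everything here is illustrative - the toy corpus, the use of token count as a difficulty proxy, and the staged pools are all assumptions, not a description of how any particular LLM was trained.

```python
def curriculum_stages(examples, num_stages=3):
    """Yield (stage, pool) pairs, easiest examples first.

    Difficulty is naively proxied by token count (an assumption for
    this sketch); each stage widens the pool to include the next
    tranche of harder examples, so early training sees only easy
    data and the final stage sees everything.
    """
    ordered = sorted(examples, key=lambda s: len(s.split()))
    stage_size = (len(ordered) + num_stages - 1) // num_stages  # ceil division
    for stage in range(num_stages):
        yield stage, ordered[: stage_size * (stage + 1)]

# Hypothetical toy corpus.
corpus = [
    "the cat sat",
    "dogs run fast in parks",
    "a complex sentence with several subordinate clauses follows",
    "hi",
    "language models learn statistical regularities from text",
]

for stage, pool in curriculum_stages(corpus):
    # In real training you would sample minibatches from `pool` here.
    print(f"stage {stage}: pool size {len(pool)}")
```

The missing half, of course, is interactivity: nothing in this loop lets the data adapt to what the learner currently struggles with, which is exactly what a human tutor provides.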