
luke-stanley 7 months ago \| on: ETH Zurich and EPFL to release a LLM developed on ...

When I read "from scratch", I assume they are doing pre-training, not just finetuning. Do you have a different take? Do you mean they're using the normal Llama architecture? I'm curious about the benchmarks!