Too much duplication
Some level of competition and parallel development is healthy for innovation, but the current situation looks increasingly wasteful. Multiple organizations are building similar capabilities, each contributing an enormous carbon footprint. This redundancy becomes particularly questionable when many models perform similarly on standard benchmarks and real-world tasks.
The differences in capabilities between LLMs are often subtle; most excel at similar tasks such as language generation, summarization, and coding. Although some models, like GPT-4 or Claude, may slightly outperform others on benchmarks, the gap is usually incremental rather than revolutionary.
Most LLMs are trained on overlapping datasets, including publicly available web content (Wikipedia, Common Crawl, books, forums, news, etc.). This shared foundation leads to similarities in knowledge and capabilities, as models absorb the same factual data, linguistic patterns, and biases. Differences arise from fine-tuning on proprietary datasets or slight architectural adjustments, but the core general knowledge remains highly redundant across models.
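The overlap claim can be made concrete. A crude way to gauge how much two text corpora share is to compare their word n-gram sets; the sketch below is a minimal, hypothetical illustration (the function names and toy samples are invented for this example, not taken from any model's actual data pipeline). Real deduplication and contamination checks operate at web scale and use approximate methods such as MinHash rather than exact set intersection.

```python
def shingles(text: str, n: int = 3) -> set:
    """Split text into lowercase word n-grams ("shingles")."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard_overlap(corpus_a: str, corpus_b: str, n: int = 3) -> float:
    """Jaccard similarity of the two corpora's n-gram sets (0 = disjoint, 1 = identical)."""
    a, b = shingles(corpus_a, n), shingles(corpus_b, n)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    # Toy stand-ins for two pretraining samples drawn from the same public sources.
    sample_a = "The quick brown fox jumps over the lazy dog near the river bank today"
    sample_b = "The quick brown fox jumps over the lazy dog near the old stone bridge"
    print(f"n-gram overlap: {jaccard_overlap(sample_a, sample_b):.2f}")
```

Two corpora assembled independently but scraped from the same public web will score high on this kind of measure, which is precisely why models trained on them end up with largely interchangeable general knowledge.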