OpenAI and Google's AI systems are powerful. Where are they taking us?

Published June 10, 2024


What should we make of large language models (LLMs)? It's quite literally a billion-dollar question.

It's one addressed this week in an analysis by former OpenAI employee Leopold Aschenbrenner, in which he makes the case that we may be only a few years away from large language model-based general intelligence that can be a "drop-in remote worker" that can do any task human remote workers do. (He thinks that we need to push ahead and build it so that China doesn't get there first.)

His (very long but worth reading) analysis is a good encapsulation of one strand of thinking about large language models like ChatGPT: that they are a larval form of artificial general intelligence (AGI) and that as we run larger and larger training runs and learn more about how to fine-tune and prompt them, their notorious errors will largely go away.

It's a view sometimes glossed as "scale is all you need," meaning more training data and more computing power. GPT-2 was not very good, but then the bigger GPT-3 was much better, the even bigger GPT-4 is better yet, and our default expectation ought to be that this trend will continue. Have a complaint that large language models simply aren't good at something? Just wait until we have a bigger one. (Disclosure: Vox Media is one of several publishers that has signed partnership agreements with OpenAI. Our reporting remains editorially independent.)

Among the most prominent skeptics of this perspective are two AI experts who otherwise rarely agree: Yann LeCun, Facebook's head of AI research, and Gary Marcus, an NYU professor and vocal LLM skeptic. They argue that some of the flaws in LLMs (their difficulty with logical reasoning tasks, their tendency toward hallucinations) are not vanishing with scale. They expect diminishing returns from scale in the future and say we probably won't get to fully general artificial intelligence by just doubling down on our current methods with billions more dollars.

Who's right? Honestly, I think both sides are wildly overconfident.

Scale does make LLMs a lot better at a wide range of cognitive tasks, and it seems premature and sometimes willfully ignorant to declare that this trend will suddenly stop. I've been reporting on AI for six years now, and I keep hearing skeptics declare that there's some straightforward task LLMs are unable to do and will never be able to do because it requires true intelligence. Like clockwork, years (or sometimes just months) later, someone figures out how to get LLMs to do precisely that task.

I used to hear from experts that programming was the kind of thing that deep learning could never be used for, and it's now one of the strongest aspects of LLMs. When I see someone confidently asserting that LLMs can't do some complex reasoning task, I bookmark that claim. Reasonably often, it immediately turns out that GPT-4 or its top-tier competitors can do it after all.

I tend to find the skeptics thoughtful and their criticisms reasonable, but their decidedly mixed track record makes me think they should be more skeptical about their skepticism.

As for the people who think it's quite likely we'll have artificial general intelligence inside a few years, my instinct is that they, too, are overstating their case. Aschenbrenner's argument features an illustrative graphic that extrapolates a straight line from GPT-2 through GPT-4 and beyond, with a right-hand axis mapping each model to a level of human capability.

I don't want to wholly malign the "straight lines on a graph" approach to predicting the future; at minimum, "current trends continue" is always a possibility worth considering. But I do want to point out (and other critics have as well) that the right-hand axis here is ... completely invented.

GPT-2 is in no respects particularly equivalent to a human preschooler. GPT-3 is much, much better than elementary schoolers at most academic tasks and, of course, much worse than them at, say, learning a new skill from a few exposures. LLMs are sometimes deceptively human-like in their conversations and engagements with us, but they are fundamentally not very human; they have different strengths and different weaknesses, and it's very challenging to capture their capabilities by straight comparisons to humans.

Furthermore, we don't really have any idea where on this graph "automated AI researcher/engineer" belongs. Does it require as many advances as going from GPT-3 to GPT-4? Twice as many? Does it require advances of the sort that didn't particularly happen when you went from GPT-3 to GPT-4? Why place it six orders of magnitude above GPT-4 instead of five, or seven, or 10?

"AGI by 2027 is plausible ... because we are too ignorant to rule it out ... because we have no idea what the distance is to human-level research on this graph's y-axis," AI safety researcher and advocate Eliezer Yudkowsky responded to Aschenbrenner.

That's a stance I'm far more sympathetic to. Because we have very little understanding of which problems larger-scale LLMs will be capable of solving, we can't confidently declare strong limits on what they'll be able to do before we've even seen them. But that means we also can't confidently declare capabilities they'll have.

Anticipating the capabilities of technologies that dont yet exist is extraordinarily difficult. Most people who have been doing it over the last few years have gotten egg on their face. For that reason, the researchers and thinkers I respect the most tend to emphasize a wide range of possibilities.

Maybe the vast improvements in general reasoning we saw between GPT-3 and GPT-4 will hold up as we continue to scale models. Maybe they won't, but we'll still see vast improvements in the effective capabilities of AI models due to improvements in how we use them: figuring out systems for managing hallucinations, cross-checking model results, and better tuning models to give us useful answers.

Maybe we'll build generally intelligent systems that have LLMs as a component. Or maybe OpenAI's hotly anticipated GPT-5 will be a huge disappointment, deflating the AI hype bubble and leaving researchers to figure out what commercially valuable systems can be built without vast improvements on the immediate horizon.

Crucially, you don't need to believe that AGI is likely coming in 2027 to believe that the possibility, and the surrounding policy implications, are worth taking seriously. I think that the scenario Aschenbrenner outlines in broad strokes (in which an AI company develops an AI system it can use to aggressively automate more of its internal AI research, leading to a world in which small numbers of people wielding vast numbers of AI assistants and servants can pursue world-altering projects at a speed that doesn't permit much oversight) is a real and scary possibility. Many people are spending tens of billions of dollars to bring that world about as fast as possible, and many of them think it's on the near horizon.

That's worth a substantive conversation and a substantive policy response, even if we think those leading the way on AI are too sure of themselves. Marcus writes of Aschenbrenner, and I agree, that "if you read his manuscript, please read it for his concerns about our underpreparedness, not for his sensationalist timelines." The thing is, we should be worried no matter how much time we have.

But the conversation will be better, and the policy response more appropriately tailored to the situation, if we're candid about how little we know, and if we take that confusion as an impetus to get better at measuring and predicting what we care about when it comes to AI.

A version of this story originally appeared in the Future Perfect newsletter. Sign up here!
