Like many people, you may have been blown away recently by the capabilities of ChatGPT and other large language models (LLMs) like the new Bing or Google's Bard.
For anyone who somehow hasn't come across them (unlikely, since ChatGPT is reportedly the fastest-growing app of all time), here's a quick summary:
LLMs are software algorithms trained on huge text data sets, which enable them to understand and respond to human language in a very realistic way.
The best-known example is ChatGPT, a chatbot interface built on the GPT-4 LLM that has taken the world by storm. ChatGPT can converse like a human and generate anything from blog posts, letters and emails to fiction, poetry and even computer code.
Impressive as they are, LLMs have until now been limited in a significant way: they tend to complete only one task, such as answering a question or generating a piece of text, before requiring further human input (known as a prompt).
This means they are not always good at more complicated tasks that require multi-step instructions or depend on external variables.
Enter Auto-GPT, a technology that attempts to overcome this obstacle with a simple solution. Some believe it may even be the next step towards the holy grail of AI, the creation of general or strong AI.
Let’s first look at what this means:
Strong AI versus Weak AI
Current AI applications are typically designed to do one task, getting better and better as they receive more data. Examples include analyzing images, translating languages, or navigating self-driving vehicles. For this reason, they are sometimes referred to as “specialized AI”, “narrow AI”, or “weak AI”.
A generalized AI is one that is theoretically capable of many different types of tasks, even those it was not originally created for, in much the same way as a naturally intelligent entity (such as a human). It is sometimes called strong artificial intelligence or “artificial general intelligence” (AGI).
AGI is perhaps what we traditionally imagined AI would be like, back before machine learning and deep learning made weak/narrow AI an everyday reality around the start of the previous decade. Think of the sci-fi AI embodied by robots like Data in Star Trek, who can do just about anything a human can.
So what is Auto-GPT?
The easiest way to look at it is that Auto-GPT is capable of doing more complex and multi-step procedures than existing LLM-based applications by creating its own prompts and returning them to itself, creating a loop.
Here's one way to think about it: getting the best results from an application like ChatGPT requires careful thought about how you phrase the questions you ask. So why not let the application build the question itself? And while you're at it, have it also ask what the next step should be and how it should go about it, and so on, creating a loop until the task is done.
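The loop described above can be sketched in a few lines of Python. This is a toy illustration only, not Auto-GPT's actual implementation: the real system calls the GPT-4 API, while the `fake_llm` function here is a canned stand-in so the example runs without any API access, and the step names are invented for the sketch.

```python
def fake_llm(prompt: str) -> str:
    """Stand-in for an LLM call: returns a canned 'next step' or DONE."""
    steps = {
        "GOAL: write a blog post": "Research the topic",
        "Research the topic": "Draft an outline",
        "Draft an outline": "Write the post",
        "Write the post": "DONE",
    }
    return steps.get(prompt, "DONE")

def autonomous_loop(goal: str, max_steps: int = 10) -> list[str]:
    """Feed each response back in as the next prompt until the model
    signals completion (or we hit a safety cap on iterations)."""
    history = []
    prompt = goal
    for _ in range(max_steps):
        response = fake_llm(prompt)
        if response == "DONE":
            break
        history.append(response)
        prompt = response  # the output becomes the next prompt
    return history

print(autonomous_loop("GOAL: write a blog post"))
# ['Research the topic', 'Draft an outline', 'Write the post']
```

The key idea is the single line where the output becomes the next prompt; everything else is bookkeeping. A real agent would also need the safety cap, since a model that never says "DONE" would otherwise loop forever.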
It works by splitting a larger task into smaller sub-tasks and then separating independent Auto-GPT instances to work on them. The original instance acts as a kind of “project manager”, coordinating all the work done and compiling it into a final result.
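The "project manager" pattern described above can be illustrated with a simplified sketch. In real Auto-GPT, both the manager and the workers are LLM-driven instances; here `plan` and `worker` are plain functions with hypothetical, hard-coded behavior, purely to show the shape of the coordination.

```python
def plan(goal: str) -> list[str]:
    """Manager step: break the goal into sub-tasks (canned here;
    in Auto-GPT the LLM itself proposes the decomposition)."""
    return [f"{goal}: research", f"{goal}: draft", f"{goal}: review"]

def worker(task: str) -> str:
    """Worker step: a stand-in for an independent agent instance."""
    return f"result of '{task}'"

def manager(goal: str) -> str:
    """Coordinate the workers and compile their output into a final result."""
    results = [worker(task) for task in plan(goal)]
    return "\n".join(results)

print(manager("write report"))
```

Because each sub-task is handled by its own instance, the workers could in principle run independently, with the manager responsible only for decomposition and final assembly.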
In addition to using GPT-4 to construct sentences and prose based on the text it has studied, Auto-GPT is able to browse the Internet and include the information it finds there in its calculations and output. In this respect, it’s more like the new GPT-4-enabled version of Microsoft’s Bing search engine. It also has better memory than ChatGPT, so it can build and remember longer command chains.
Auto-GPT is an open-source application that uses GPT-4 and was created by one person, Toran Bruce Richards. Richards said he was inspired to develop it because traditional AI models, "while powerful, often struggle to adapt to tasks that require long-term planning, or are unable to self-refine their approaches based on real-time feedback."
It is part of a class of applications that are called recursive AI agents because they have the ability to autonomously use the results they generate to create new prompts, chaining these operations together to complete complex tasks.
Another such agent is BabyAGI, which was created by a partner at a venture capital firm to help him with day-to-day tasks that were simply too complex for something like ChatGPT, such as researching new technologies and companies.
What are some applications of Auto-GPT and AI agents?
While apps like ChatGPT have become well known for their ability to generate code, they tend to be limited to relatively short and simple software programming and design. Auto-GPT, and potentially other AI agents that work in a similar way, can be used to develop end-to-end software applications.
Auto-GPT can also help companies autonomously improve their bottom line, by examining their processes and generating intelligent recommendations and insights into how they could be improved.
Unlike ChatGPT, it can also access the internet, which means you can ask it to conduct market research or perform other similar tasks, such as "find me the best set of golf clubs for under $500."
One extremely disruptive task it has been given is to destroy humanity, and the first sub-task it set itself toward that goal was to research the most powerful atomic weapons of all time. Since its output is still limited to generating text, its creator assures us that it won't actually get very far with this task. Hopefully.
Apparently, Auto-GPT can also be used to improve itself: its creator claims it can create, evaluate, review and test updates to its own code that could potentially make it more capable and efficient.
It can also be used to create better LLMs that could form the basis of future AI agents, speeding up the modeling process.
What could this mean for the future of AI?
Ever since the applications of Generative AI started to emerge, it was clear that we were only at the beginning of a very long journey, in terms of how AI will evolve and impact our lives and society.
Are Auto-GPT and other agents following the same principles the next step on that journey? It certainly seems likely. At the very least, we can expect AI tools that allow us to do much more complex tasks than the relatively simple things ChatGPT can do to start becoming commonplace.
Before long, we’ll start to see AI output that’s more creative, sophisticated, diverse, and useful than the simple text and images we’ve become accustomed to. These will no doubt eventually have an even greater impact on the way we work, play and communicate.
Other potential positive impacts include reducing the cost and environmental impact of creating LLMs (and other machine learning related activities) as autonomous and recursive AI agents find ways to make the process more efficient.
However, we also have to consider that by itself it doesn't really solve any of the problems associated with generative AI. These include the varying accuracy (to put it nicely) of the output it creates, the potential for abuse of intellectual property rights, and the possibility that it will be used to disseminate distorted or harmful content. In fact, by spawning and executing far more AI processes to handle larger tasks, it could potentially amplify these problems.
The potential problems don't stop there. Prominent AI expert and philosopher Nick Bostrom recently said that he believes the latest generation of AI chatbots (like GPT-4) is even starting to show signs of sentience, which could create a whole new moral and ethical dilemma if we as a society plan to create and operate them on a large scale.
Follow me on Twitter or LinkedIn. Check out my website or some of my other work here.