175 billion parameters, but GPT-3 is not “intelligent”

With 175 billion parameters, GPT-3 has made remarkable progress, but it is not artificial general intelligence. GPT-3 shows us what language models are capable of. Can we use this ability to build models that better understand the world around us?

Although these models approach human capability in some areas, they do not seem to produce general intelligence. In many ways, GPT-3 is more like AlphaGo.

GPT-3 performs well at imitating humans, but it has no memory of past interactions, cannot carry out multi-turn dialogue, and cannot track goals or develop toward greater potential. However, language modeling is very different from chess or image classification. Natural language inherently encodes information about the world, and its expressiveness is far richer than any other medium.

The goal of a language model is simply to maximize the likelihood of the model on natural language data, and the autoregressive formulation used by GPT-3 means it tries to predict the next word as accurately as possible.

Broadly speaking, GPT-3 attends more to surface features of text, such as grammar and spelling, than to semantic and logical coherence. Yet the latter is the key to intelligence. As an autoregressive model approaches perfection, the only remaining way to improve is semantic understanding and logical construction.

In the extreme case, if a model’s loss reached the Shannon entropy of natural language, its output would be completely indistinguishable from real human writing in every way. And the closer we get to that limit, the harder it becomes to detect the effect of further loss improvements on quality.

Shannon entropy: because of the inherent randomness of language, there is a theoretical minimum loss that any language model can achieve. The lower the loss, the more the output reads like human language.

In other words, stringing words together with a Markov chain gets you 50% of the way there; the other 50% requires understanding grammar, handling topics that span paragraphs, and, more importantly, staying logically consistent.

The important lesson of GPT-3 is that simply increasing the size of the model reduces the loss, until it reaches the Shannon entropy of text. No clever architecture or elaborate hand-crafted heuristics are needed: just scale it up, and you get a better language model.
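To pin down this claim, here is the standard autoregressive objective and the bound it runs up against (the notation is mine; the article states this only in words). The loss is the expected negative log-likelihood of each next token given its prefix, and no model can push it below the entropy of the language itself:

$$
\mathcal{L}(\theta) \;=\; -\,\mathbb{E}_{x \sim p}\left[\sum_{t} \log p_\theta(x_t \mid x_{<t})\right] \;=\; H(p) + D_{\mathrm{KL}}(p \,\|\, p_\theta) \;\ge\; H(p).
$$

The gap between the loss and the Shannon entropy $H(p)$ is exactly the KL divergence from the true distribution of human text, which is why driving the loss toward $H(p)$ makes the model’s output statistically indistinguishable from human writing.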
Some Reddit users have remarked that “various experiments show GPT-3 often failing at world modeling, and solving more problems mainly by adding more parameters.”

Let us assume, then, that a bigger model will develop a better world model, and that as the loss approaches the Shannon entropy, its world modeling ability will become as good as that of an ordinary person on the internet. Whether this holds comes down to two questions.

Even once 1-trillion-, 10-trillion-, and 100-trillion-parameter models are available, it will take a long time to verify whether this hypothesis is correct. But if some GPT-x shows incredible predictive ability about the real world, it may well work.

The paperclip maximizer is a classic thought experiment showing that an AGI, even one that is reasonably designed and apparently harmless, could destroy humanity. It illustrates that a seemingly friendly AI can still pose a threat.

Choosing a paperclip maximizer as the goal captures the contingency of human values: an extremely powerful optimizer can end up with a goal completely different from ours, such as consuming the resources necessary for our survival in order to improve itself.

If you go to Amazon and say “I want to buy a paper clip,” the platform sorts the results by price. If you pick one, how many paper clips can you buy for 100 yuan?

With a language model, “paper clip” is very likely to be followed by “price,” and “price” by a list of prices. We can quickly work out which paper clips are available and how much it costs to buy a particular one.

So now, to estimate the state-action value of any operation, we can simply use Monte Carlo tree search! (A minimal sketch of this idea appears at the end of this section.)

Starting from a given agent state, we use the world model to expand sequences of actions. By aggregating all the outcomes, we can estimate how much expected reward each action would bring the agent.

Each action can be quite high-level, such as “find the cheapest way to buy a paper clip.” Thanks to the flexibility of language, a short token sequence can describe a very complex idea.

Once the agent has decided on an action, in order to actually execute these abstract actions, the action can be decomposed into smaller sub-goals using the language model, for example “find the cheapest paper clip on Amazon.” This is similar to hierarchical reinforcement learning.

Depending on the capability of the model and the level of abstraction of the action, an action can even be decomposed into a detailed list of instructions. We can also express the agent’s state in natural language.

Because the agent’s state is just a compressed representation of its observations, we can let the language model summarize the important information from each observation to represent its internal world state. The language model can also periodically prune information from the state to make room for new observations.

In this way, we get a system that can take in observations from the outside world, spend some time thinking about what to do, and output an action in natural language.

The system starts with an input module that converts various observations into summary text relevant to the current agent state. Web pages, sounds, and images can all be converted to text and mapped into the agent’s state.

(Figure: an example input module, combining a screenshot with the agent’s current state to turn the image information into an observation for the agent.)

Finally, to make the model actually act in the real world, we can use the language model once more to translate natural language into code, shell commands, key sequences, and many other possible forms.

As with input, there are countless ways to handle output, and which is best depends on your specific use case. The most important point is that a pure-text agent can be given input and output in many different forms.
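As an illustration only, here is a minimal Python sketch of the planning idea above, with a language model standing in as the world model. The `WorldModel` class, its `complete` method, the prompt formats, and the self-reported reward are all assumptions of this sketch, not anything specified in the article; and for brevity it does flat Monte Carlo rollouts rather than a full tree search, which would add selection and expansion on top of the same primitive.

```python
class WorldModel:
    """Hypothetical text-completion interface; any language model
    could sit behind it. Stubbed here: this is a sketch, not an API."""

    def complete(self, prompt: str) -> str:
        raise NotImplementedError


def estimate_action_value(world: WorldModel, state: str, action: str,
                          n_rollouts: int = 8, depth: int = 3) -> float:
    """Estimate an action's value by rolling the world model forward
    several times and averaging a reward the model itself reports."""
    total = 0.0
    for _ in range(n_rollouts):
        trajectory = f"State: {state}\nAction: {action}\n"
        for _ in range(depth):
            outcome = world.complete(trajectory + "Outcome:")
            trajectory += f"Outcome: {outcome}\n"
            next_action = world.complete(trajectory + "Next action:")
            trajectory += f"Action: {next_action}\n"
        reward_text = world.complete(trajectory + "Total reward (0-10):")
        try:
            total += float(reward_text.strip().split()[0])
        except (IndexError, ValueError):
            pass  # treat an unparseable reward as 0
    return total / n_rollouts


def choose_action(world: WorldModel, state: str,
                  candidates: list[str]) -> str:
    """Pick the candidate action with the highest estimated value."""
    return max(candidates,
               key=lambda a: estimate_action_value(world, state, a))
```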
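The state-compression step can be sketched in the same spirit, reusing the `WorldModel` stub above. The prompt wording and the character budget are my assumptions about one way to ask a model to summarize; the point is only that the agent's state lives entirely in text.

```python
def update_state(world: WorldModel, state: str, observation: str,
                 max_chars: int = 2000) -> str:
    """Input module: fold a new observation (text extracted from a web
    page, an audio transcript, a screenshot caption) into the agent's
    natural-language state, asking the model to keep only what matters
    so that stale details get pruned over time."""
    prompt = (
        f"Current state:\n{state}\n\n"
        f"New observation:\n{observation}\n\n"
        "Rewrite the state, keeping only the important information:\n"
    )
    return world.complete(prompt)[:max_chars]
```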
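Finally, a similarly hedged sketch of the decomposition and output modules together, again reusing the `WorldModel` stub. The prompts, and the choice of shell commands as the output channel, are illustrative assumptions; generated code or key sequences would slot into the same place.

```python
import subprocess


def decompose(world: WorldModel, action: str) -> list[str]:
    """Break an abstract action into concrete sub-goals, the
    hierarchical-RL-style step described above."""
    prompt = f"Goal: {action}\nList the concrete sub-steps, one per line:\n"
    return [line.strip()
            for line in world.complete(prompt).splitlines()
            if line.strip()]


def execute(world: WorldModel, action: str) -> None:
    """Output module: translate each sub-step into a shell command and
    run it. A real system would need sandboxing and validation before
    executing anything a model writes."""
    for step in decompose(world, action):
        cmd = world.complete(f"Step: {step}\nEquivalent shell command:")
        subprocess.run(cmd.strip(), shell=True, check=False)
```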
This approach leans heavily on one central assumption: that larger future models will have better world modeling capabilities. Still, this may be the closest opportunity we have ever had to AGI: there is now a concrete path to it.