MIT’s Latest AI Research Using Deep Neural Networks Explains How The Brain Processes Language

Source: https://www.biorxiv.org/content/biorxiv/early/2020/10/09/2020.06.26.174482.full.pdf

Language modeling uses statistical and probabilistic techniques to predict the sequence of words in a sentence. These models are widely used in natural language processing applications that generate text. A notable example is an AI model trained to predict the next words in a string of text based on the preceding words. This technology helps search engines and SMS applications suggest the next word before the user types it. It is not limited to prediction: it has also proved useful for answering questions, summarizing documents, and completing stories.
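To make the idea concrete, here is a minimal sketch of next-word prediction in Python, assuming nothing more than bigram counts over a toy corpus (both the corpus and the predict_next helper are invented for illustration; real language models learn far richer statistics):

```python
from collections import Counter, defaultdict

# Toy corpus standing in for the large text collections real models train on.
corpus = "the cat sat on the mat and the cat slept on the mat".split()

# Count how often each word follows each other word (a bigram model).
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def predict_next(word):
    """Rank candidate next words by their estimated probability."""
    counts = following[word]
    total = sum(counts.values())
    return [(w, c / total) for w, c in counts.most_common()]

print(predict_next("the"))  # [('cat', 0.5), ('mat', 0.5)]
```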

Although these models were designed only to predict the next word in a text, a new study by MIT neuroscientists finds that their inner workings resemble those of language-processing centers in the human brain. Computer models that perform other language tasks do not show this resemblance. This suggests that the human brain may use next-word prediction to drive language processing.

The recently developed models belong to a class called deep neural networks, a category of machine learning loosely inspired by the organization and activity of the human brain. These networks contain computational nodes that form connections of varying strengths, arranged in layers that pass information between them in prescribed ways.
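The sketch below illustrates that structure with invented layer sizes and random connection strengths; a real network would learn its weights from data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Connection strengths between layers: 4 input nodes feed 8 hidden nodes,
# which feed 2 output nodes. Here they are random; training would tune them.
W1 = rng.normal(size=(4, 8))
W2 = rng.normal(size=(8, 2))

def forward(x):
    """Transmit information through the layers in a fixed, feedforward order."""
    hidden = np.maximum(0, x @ W1)  # ReLU: each hidden node either activates or stays silent
    return hidden @ W2              # the output layer combines hidden-node activity

print(forward(rng.normal(size=4)))
```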

Over the past decade, scientists have developed models that perform object recognition as well as primate brains do. MIT researchers have also shown that the workings of visual object recognition models resemble the organization of the primate visual cortex.

The researchers compared 43 different language models with the language-processing centers of the human brain. One of these next-word prediction models is the Generative Pre-trained Transformer 3, abbreviated as GPT-3, which generates text similar to what a human would produce in response to a prompt.
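GPT-3 itself is available only through OpenAI's API, so the hedged sketch below uses GPT-2, its openly released predecessor, via the Hugging Face transformers library to show how such a model ranks candidate next words (the prompt is invented for illustration):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# GPT-2 stands in here for GPT-3, which is not openly downloadable.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "The scientists measured the activity of the"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # a score for every vocabulary item at every position

# Turn the scores at the final position into next-word probabilities.
probs = torch.softmax(logits[0, -1], dim=-1)
top = torch.topk(probs, 5)
for p, idx in zip(top.values, top.indices):
    print(f"{tokenizer.decode(idx.item())!r}: {p:.3f}")
```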

Analysis

The researchers presented each model with a string of words and measured the activity of the nodes that make up the deep neural network. The models' behavior was then compared with human brain activity recorded during three language tasks:

  • Listening to stories.
  • Reading one sentence at a time.
  • Reading sentences in which one word is revealed at a time.

The human datasets consisted of functional magnetic resonance imaging (fMRI) data and intracranial electrocorticographic measurements, the latter collected from people undergoing brain surgery for epilepsy. A behavioral measure, the speed at which people read a given text, was also used for benchmarking. The best-performing next-word prediction models were found to show activity patterns similar to those detected in the human brain.
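A common way to quantify this kind of similarity is to fit a regularized linear map from a model's internal activations to the recorded brain responses and score how well it generalizes. The sketch below illustrates the idea with synthetic placeholder arrays; the study's actual analysis pipeline differs in its details:

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Placeholders: network activations for 200 sentences (512 units each) and a
# brain response per sentence; in practice these come from the model and
# from fMRI or electrocorticographic recordings.
model_activations = rng.normal(size=(200, 512))
brain_response = model_activations[:, :3].sum(axis=1) + rng.normal(scale=0.5, size=200)

# Higher held-out scores mean the model's activity predicts brain activity better.
scores = cross_val_score(Ridge(alpha=1.0), model_activations, brain_response, cv=5)
print(f"mean cross-validated R^2: {scores.mean():.2f}")
```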

One of the main features of the GPT-3 predictive model is its unidirectional predictive transformer, which makes predictions based only on preceding sequences. An essential property of this transformer is that it can base its predictions on a very long past context, not just on the previous word.
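A minimal sketch of what unidirectionality means in practice, assuming the standard causal attention mask used in such transformers (the attention scores here are random placeholders):

```python
import numpy as np

def causal_mask(seq_len):
    """Allow position i to attend only to positions 0..i, never to the future."""
    return np.tril(np.ones((seq_len, seq_len), dtype=bool))

# Toy attention scores for a 5-word sequence.
scores = np.random.default_rng(0).normal(size=(5, 5))
scores = np.where(causal_mask(5), scores, -np.inf)  # block access to future words

# Row-wise softmax: each word's prediction draws only on its past context.
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)
print(np.round(weights, 2))
```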

One of the main lessons of this study is that language processing is a highly constrained problem, and a significant part of the difficulty is its real-time nature. The AI networks were not designed to mimic the workings of the human brain, yet they ended up resembling it, suggesting a kind of convergent evolution between AI and nature.

The researchers propose to build variants of these models in the future and assess how small design changes affect their performance and their fit to human neural data. The idea is to use them to understand how the human brain works. The next step would be to integrate these high-performing language models with previously developed computer models so they can carry out complex tasks such as building perceptual representations of the physical world.

The goal is to move toward more capable AI models that give an accurate account of how other parts of the brain work, and ultimately to understand how intelligence emerges.

