Let's answer this question seriously:
How does artificial intelligence understand natural language? To answer this question in a focused way, we first need to define its boundaries. If the question is taken to mean how computers can be used to process and analyze natural language so that humans and machines communicate effectively through it, we arrive at a relatively narrow answer; if instead we unpack the concepts involved, such as "artificial intelligence", "understanding", and "natural language", we can explore the question in a broader sense.
In the narrow sense, using computers for linguistic analysis is an interdisciplinary field between linguistics and computer science, which the academic community calls "Computational Linguistics" or "Natural Language Processing" (NLP). If a program can be understood as "data structures + algorithms", then NLP can be analogized as "linguistic categories + computational models". Here, linguistic categories are the concepts and standards defined by linguists (such as words, parts of speech, grammar, semantic roles, discourse structure, etc.), and most NLP tasks derive from them; the concrete computational models and algorithms are usually developed by computer scientists.
Generally speaking, basic NLP research relates directly to categories from the field of linguistics. It includes: stemming, lemmatization, word segmentation, part-of-speech (POS) tagging, Named Entity Recognition (NER), Word Sense Disambiguation (WSD), chunking, syntactic analysis (e.g. phrase structure parsing, dependency parsing), Semantic Role Labeling (SRL), coreference resolution, discourse analysis, and so on. Other NLP research is not directly tied to linguistic categories but is oriented toward text processing applications, such as machine translation, text summarization, information extraction, sentiment classification, information retrieval, and question answering systems. These applications more or less depend on the kinds of basic NLP research introduced above; for example, text summarization generally involves word segmentation and named entity recognition.
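To make one of these basic tasks concrete, here is a minimal sketch of word segmentation using forward maximum matching, one classic dictionary-based approach. The tiny dictionary and example string are invented for illustration; real segmenters use large lexicons and statistical models.

```python
# A minimal sketch of dictionary-based forward maximum matching,
# one classic approach to Chinese word segmentation.
# The toy dictionary below is illustrative, not a real lexicon.

def forward_max_match(text, dictionary, max_len=4):
    """Greedily match the longest dictionary word at each position."""
    words = []
    i = 0
    while i < len(text):
        match = text[i]  # fall back to a single character
        for length in range(min(max_len, len(text) - i), 1, -1):
            candidate = text[i:i + length]
            if candidate in dictionary:
                match = candidate
                break
        words.append(match)
        i += len(match)
    return words

toy_dict = {"自然", "语言", "自然语言", "处理", "理解"}
print(forward_max_match("自然语言处理", toy_dict))  # ['自然语言', '处理']
```

Note that the greedy longest-match choice is itself a hand-written rule; ambiguous strings that a longer match handles wrongly are exactly where statistical segmenters gain their advantage.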
In terms of computational models, there are two research lines, rationalism and empiricism, namely the so-called "rule-based method" and "statistical method". Since natural language is essentially a symbol system produced by human society out of the need to communicate, and its rules and reasoning characteristics appear explicit, early NLP research mainly adopted the rule-based method. However, on the one hand, human language is not a formal language; its rule patterns often exist only implicitly (Chinese grammar rules, for example, are rather vague and imprecise), so formulating rules is not easy. On the other hand, the complexity of natural language makes it difficult for rule sets to remain conflict-free while covering all linguistic phenomena. As a result, this rationalist, rule-based approach kept NLP research stuck for a long time at the stage of small toy systems. Only with the construction of large-scale corpora and the spread of statistical machine learning methods did NLP research gradually move toward practicality. The statistical method removes much of the burden of manually compiling rules and automatically estimates feature weights during model training, giving it better robustness. Even so, obtaining good natural language processing results, in terms of designing model structures that reflect insight into linguistic phenomena and choosing appropriate features, is still inseparable from a deep understanding of language and the ingenuity of NLP researchers.
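The contrast with the rule-based approach can be sketched with a minimal statistical baseline: instead of hand-writing tagging rules, learn each word's most frequent tag from annotated data. The toy corpus and tag names below are invented for illustration.

```python
# A minimal sketch of the statistical approach: learn each word's most
# frequent part-of-speech tag from a (toy, invented) annotated corpus.
from collections import Counter, defaultdict

toy_corpus = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("dog", "NOUN"), ("bites", "VERB"), ("man", "NOUN")],
]

def train_most_frequent_tag(corpus):
    counts = defaultdict(Counter)
    for sentence in corpus:
        for word, tag in sentence:
            counts[word][tag] += 1
    # Map each word to its single most frequent tag.
    return {w: c.most_common(1)[0][0] for w, c in counts.items()}

def tag(sentence, model, default="NOUN"):
    # Unknown words fall back to a default open-class tag.
    return [(w, model.get(w, default)) for w in sentence]

model = train_most_frequent_tag(toy_corpus)
print(tag(["the", "dog", "sleeps"], model))
# [('the', 'DET'), ('dog', 'NOUN'), ('sleeps', 'VERB')]
```

No linguist wrote a rule saying "the" is a determiner; the weightings fall out of the data, which is exactly the labor-saving property the paragraph above describes.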
It can be seen that NLP treats the process of understanding natural language as mathematical modeling of linguistic phenomena. Researchers are required, on the one hand, to have solid knowledge of linguistics and, on the other, a deep mathematical background and machine learning experience, so that when facing a specific natural language processing problem they can decompose it into operable modeling tasks. From this perspective, NLP does not really understand natural language; it simply treats language processing as a computational task.
If understanding natural language is not simply regarded as mathematical modeling, then what does it mean, in a broad sense, for artificial intelligence to understand natural language? First, we need to clarify the terms "natural language", "artificial intelligence", and "understanding".
The meaning of "natural language" is relatively clear: it generally refers to the languages gradually invented and evolved in human society for communication, such as speech, sign language, and written language. For convenience of discussion, we restrict the scope here to language carried in textual form, primarily written natural language, but also transcriptions of spoken expression.
"Artificial intelligence" is not a clearly defined concept. Broadly speaking, it refers to machine agents simulating the intelligent activities of humans, including the human abilities to perceive the outside world, to make decisions and reason, and even to possess emotion and will. In terms of its extension, artificial intelligence comprises two aspects: research content and methodology. The research content consists of the topics familiar from scientific research institutions, including knowledge representation and reasoning, speech recognition, computer vision, natural language processing, automatic planning and scheduling, robotics, and so on. Methodology refers to the perspectives and guiding principles by which artificial intelligence simulates human intelligence. There are three main schools of thought: symbolism, connectionism, and behaviorism. Symbolism holds that human intelligent activity should be simulated from the perspective of deduction in mathematical logic; the development of theories and technologies such as knowledge engineering and expert systems was shaped by this school. Connectionism originates in bionic research on models of the human brain: the McCulloch-Pitts (MP) neuron model, Hebb's learning rule for neurons, and Rosenblatt's concept of the perceptron all tried to simulate the structure of the human brain from a bionic perspective. Later, the backpropagation (BP) algorithm and the introduction of restricted Boltzmann machines into deep learning greatly expanded the large-scale application of neural network models from the perspective of computability.
Behaviorism focuses on observable human behavior, holding that humans adapt through interaction between behavior and the external environment and thereby acquire intelligence. Common implementation techniques in behaviorist research include evolutionary computation (genetic algorithms), reinforcement learning, and so on. In the current mainstream of natural language processing, which combines rules and statistics, the rule side is consistent with the deductive reasoning of symbolism, while the statistical side, which focuses on mining general linguistic regularities from data, belongs to inductive thinking. In recent years, distributed representations of linguistic knowledge such as word vectors (e.g. word2vec) have become popular. Such distributed representations connect naturally to neural network models for inductive learning from data, which has to some extent promoted the connectionist approach to language processing.
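The idea behind distributed representations can be illustrated with a minimal sketch: words become numeric vectors, and semantic relatedness becomes geometric proximity, measured here by cosine similarity. The 4-dimensional vectors below are made up; real word2vec embeddings typically have hundreds of dimensions and are learned from large corpora.

```python
# A minimal sketch of comparing distributed word representations.
# The vectors are invented for illustration, not learned by word2vec.
import math

vectors = {
    "king":  [0.8, 0.6, 0.1, 0.0],
    "queen": [0.7, 0.7, 0.1, 0.1],
    "apple": [0.0, 0.1, 0.9, 0.8],
}

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Semantically related words end up closer in the vector space.
print(cosine(vectors["king"], vectors["queen"]) >
      cosine(vectors["king"], vectors["apple"]))  # True
```

Because these vectors are just arrays of numbers, they plug directly into a neural network's input layer, which is the natural bridge to connectionist models mentioned above.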
When it comes to "understanding", the consensus of most people is that machines cannot really understand natural language, while humans can. A typical piece of evidence comes from the American philosopher John Searle's refutation of the Turing test through the "Chinese Room" thought experiment. The Turing test is meant to determine whether a machine has human intelligence. Its setup is as follows: an unsuspecting interrogator poses questions to both a computer and a human volunteer. If, after multiple rounds, the interrogator still cannot tell which is the computer and which is the volunteer, then the computer has passed the Turing test, meaning it possesses human-like intelligence and understanding. Searle rejected this with the "Chinese Room" thought experiment: a person who speaks only English sits in a room and communicates with outsiders in Chinese by looking up a table of Chinese symbols. From the outside, the person in the room appears to speak fluent Chinese, yet in fact he does not understand Chinese at all. In the opinion of this author, it is not necessary to worry too much about whether a machine can really understand natural language. In fact, people themselves do not always do well in language communication. For example, when talking about the "Lantern Festival", people in different parts of the country understand it differently; when talking about "luxury houses", people in Hong Kong and on the mainland have different notions of house size; for temperature concepts such as "cold/hot", people living at different latitudes have different reference temperatures in mind. From the perspective of cognitive linguistics, the semantics of a concept are not the static meanings listed in a dictionary; each person's understanding of a concept is tied to his or her specific experience and environment.
Even for the same concept, different people have different interpretations; in most cases, people with similar life experiences are more likely to feel what is called "empathy" when talking about a common topic. And beyond these difficulties in semantic understanding, people also face difficulties in pragmatic understanding in daily conversation. Consider this exchange. A: "Shall we go to KTV tonight?" B: "My dad is back from Tianjin." Looking only at the literal semantics, B's answer is impossible to understand. In fact, by telling A that "my dad is back from Tianjin", B implied that he could not accept A's invitation; this indirect refusal involves pragmatics and reflects the true intention of the verbal exchange. Fully understanding the pragmatic intentions of both parties in a conversation requires contextual reasoning, and the factors affecting such inference include not only the physical context, such as the surrounding conversation and its time and place, but also the shared knowledge, personalities, and cultural backgrounds of the parties. It is not easy even for people to understand each other in language communication, let alone for machines to achieve true understanding. So when we use a machine to process natural language, we need not worry too much about whether it can really understand; instead we should focus on how to make the agent simulate human intelligence as far as possible, so that the machine exhibits human-like capabilities.
In natural language understanding, although the neural network models representative of connectionism try to mimic the structure of the human brain at the level of physical representation, they still differ enormously from the brain in certain processing mechanisms. Three issues are discussed here.
I. How does the human brain automatically form inferential symbolic computation on top of its underlying connectionist computation? The basic structure of the human brain is hundreds of millions of neurons and their connections, and information enters as continuous numerical signals. Yet through layer-by-layer higher-order processing, the brain can eventually conceptualize information and form efficient symbolic computation and reasoning. New knowledge can then be obtained through conceptual composition or inference, without being driven by large-scale data. For example, if the human brain has learned the sentence pattern "subject (noun) + predicate (verb) + object (noun)" from a large amount of text, then on seeing the sentence "a1a2b1b2c1" and knowing that "b1b2" is a verb and "c1" is a noun, it is very likely that "a1a2" is a noun and the subject of the sentence. Further, if it is known that "b1b2" is an action that only persons can perform, it can be inferred that "a1a2" is probably a named entity, even if we do not know in advance the internal composition of "a1a2". In the field of image processing, current deep learning techniques can abstract image information layer by layer, learn high-level features spontaneously, and form high-level semantic patterns. This is suggestive for automated simulation of natural language understanding, but in practice natural language is much harder to process. At present, how to make a machine, starting from raw text input and proceeding through layer-by-layer information processing like the human brain, automatically generate high-level discrete linguistic symbols and their pattern rules remains unclear.
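The symbolic inference step in the "a1a2b1b2c1" example can be sketched as a toy program. Everything here (the pattern, the tag names, the notion of "person-only verbs") is an invented illustration of the reasoning described above, not a real NLP component; the open question in the text is precisely how such symbols could emerge automatically rather than be hand-coded like this.

```python
# A toy sketch of symbolic inference over the learned pattern
# NOUN + VERB + NOUN. All names and tags are invented for illustration.
PATTERN = ["NOUN", "VERB", "NOUN"]

def infer_unknown(tokens, known_tags, person_only_verbs):
    """known_tags maps token -> tag; unknown tokens are inferred from PATTERN."""
    inferred = {}
    for token, slot in zip(tokens, PATTERN):
        if token in known_tags:
            inferred[token] = known_tags[token]
        else:
            tag = slot  # the pattern says this slot is a noun/verb
            # If the predicate can only be performed by a person,
            # guess that the unknown noun is a named entity.
            verb = tokens[1]
            if slot == "NOUN" and verb in person_only_verbs:
                tag = "NAMED_ENTITY"
            inferred[token] = tag
    return inferred

known = {"b1b2": "VERB", "c1": "NOUN"}
print(infer_unknown(["a1a2", "b1b2", "c1"], known, {"b1b2"}))
# {'a1a2': 'NAMED_ENTITY', 'b1b2': 'VERB', 'c1': 'NOUN'}
```

Note that no large-scale data is needed once the pattern and the lexical facts are available: the new conclusion about "a1a2" follows by composition, which is the data-free inference property the paragraph contrasts with neural models.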
II. How can a machine realize feedback-driven natural language understanding like the human brain? The mainstream approach in NLP research is to encapsulate each natural language task in a module and connect the modules in order from low-level to high-level tasks. For the syntactic analysis of a sentence, for example, the usual practice is to segment words, tag parts of speech, recognize named entities, recognize chunks, and so on, and to use this information as features for higher-level syntactic analysis. However, the errors of low-level linguistic analysis are also propagated to the high-level tasks: if the word segmentation is wrong, it will degrade the performance of the final syntactic analysis. In contrast, when the human brain understands natural language, it does not always follow a serial pass through each analysis module. Take the sentence "a1a2b1b2c1" given earlier: when we cannot tell whether "a1a2" is a named entity, we can set that question aside and consider the rest of the sentence. Once we gradually work out that the whole sentence may fit the syntactic pattern "noun + verb + noun", this higher-level information acts as positive feedback supporting the conjecture that "a1a2" is a named entity. In this example, named entity recognition uses higher-level syntactic information as its clue. The existing natural language processing pipelines of artificial intelligence are fixed, while the human brain's processing flow can change according to the situation.
III. Automatic learning of semantic change. A large number of words carry different meanings in different historical periods of a society, forming semantic change. For example, the connotation of the Chinese word "Miss" (xiaojie) has shifted over time. In Chinese feudal society, "Miss" usually referred to a well-educated unmarried young woman. After the founding of New China, as that traditional social role gradually disappeared, the word was used less and less; later, when "Miss" came to be used for women engaged in the sex trade, the word acquired a corresponding new meaning. Because lexical semantic change objectively exists, it is impossible to design, once and for all, a complete machine-readable semantic dictionary to support natural language understanding. When a new sense of a word appears in society, the semantic dictionary generally has to be maintained and updated manually. If the updating of machine concepts and knowledge stays at the stage of manual input, then machines will never achieve automatic learning and evolution the way humans do.
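One direction researchers have explored for detecting semantic change automatically, rather than by manual dictionary updates, is to compare a word's distributional profile across time-sliced corpora. The sketch below is a toy illustration of that idea using raw co-occurrence counts; the two tiny "corpora" are invented, and real diachronic studies use large corpora and learned embeddings.

```python
# A toy sketch of detecting semantic change: compare a target word's
# co-occurrence vectors in two time-sliced (invented) corpora.
import math
from collections import Counter

def context_vector(corpus, target, vocab):
    """Count words co-occurring with `target` in the same sentence."""
    counts = Counter()
    for sentence in corpus:
        if target in sentence:
            counts.update(w for w in sentence if w != target)
    return [counts[w] for w in vocab]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

old_corpus = [["miss", "young", "educated"], ["miss", "unmarried", "young"]]
new_corpus = [["miss", "bar", "night"], ["miss", "night", "club"]]
vocab = sorted({w for s in old_corpus + new_corpus for w in s} - {"miss"})

# Drift = 1 - similarity between the two periods' context profiles.
drift = 1 - cosine(context_vector(old_corpus, "miss", vocab),
                   context_vector(new_corpus, "miss", vocab))
print(drift)  # high drift signals a possible semantic shift
```

A monitoring process built on this idea could flag high-drift words for (human or automatic) sense induction, rather than waiting for a lexicographer to notice the new usage.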
On the one hand, research on language and thought in the human brain is still insufficient; on the other hand, existing artificial intelligence differs enormously from the human brain in its mechanisms for processing natural language. Does this mean there is an insurmountable bottleneck for artificial intelligence in natural language understanding? The answer need not be so pessimistic. If we look at human beings ourselves, we find that the emergence of language and thought is related not only to the neural connection structure of the brain but also to the influence of the external language environment. If the ideas of connectionism and behaviorism are combined to "raise" a machine agent through bionic simulation, much as children are taught language in an interactive, stimulus-driven way, then after many generations of updates and iterations the machine intelligence might evolve its own language acquisition device and generate specific language patterns. These patterns would be stored in distributed form across the weights of a neural network, making them extremely difficult for humans to interpret, just as AlphaGo defeated the world's top Go players with strategies its makers have found hard to understand. Using bionic simulation to let the machine evolve means the creator gives up some control over the machine. As Kevin Kelly argued in Out of Control, once machines evolve intelligence, the price is that humans will eventually lose control of them. Humans may end up understanding neither the mechanism of language and thought in the human brain nor the formation of language and thought in machine intelligence.
The above is some of the experience and thinking of the Yunzhisheng NLP team. Due to space limitations, we will stop here. Everyone is welcome to discuss with us in the comments.
About the respondent:
As a high-tech enterprise focused on AI services for the IoT, with world-class intelligent speech recognition technology, Yunzhisheng's NLP team has been working to improve the human-machine conversational experience, from improving semantic understanding to breaking through in pragmatic understanding, letting the machine generate more human-like responses and gradually become a "knowledge expert" that can answer questions automatically. This continuous, unremitting process of upgrading is both challenging and interesting. Our goal is to enable people to interact with machines in natural language and conduct multi-turn dialogue smoothly, with powerful pragmatic computing capabilities, making the machine knowledgeable, able to make decisions, self-learning, and possessed of character and emotion.
In 2013 we opened the industry's first semantic cloud supporting parsing, question answering, and multi-turn dialogue at the same time, and in 2016 we launched the industry's first pragmatic computing engine framework, which supports the understanding, generation, and interaction of contextual information. Deep learning is our main "magic weapon" for improving machine understanding. At present our machines can understand more than 60 vertical domains (such as healthcare and smart home), with an average semantic parsing accuracy of 93%. There are many areas in NLP worth cultivating, and we hope to explore them further with everyone interested in NLP.