Google Docs can now automatically generate suggested summaries of document content when they are available. While all users can add summaries manually, auto-generated suggestions are currently available only to Google Workspace business customers.
This is made possible by machine learning models for natural language understanding (NLU) and natural language generation (NLG), in particular Transformer and Pegasus. A popular approach that combines NLU and NLG is to train a machine learning model with sequence-to-sequence learning, where the input is the document's words and the output is the summary's words. A neural network then learns to map input tokens to output tokens. Early applications of the sequence-to-sequence paradigm used recurrent neural networks (RNNs) for both the encoder and the decoder.
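As a concrete illustration, here is a minimal sketch of the RNN encoder-decoder pattern described above, written in PyTorch with hypothetical vocabulary and layer sizes: the encoder reads the document tokens, and the decoder learns to emit the summary tokens.

```python
import torch
import torch.nn as nn

class Seq2SeqRNN(nn.Module):
    def __init__(self, vocab_size=32000, embed_dim=256, hidden_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.encoder = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        self.decoder = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, doc_tokens, summary_tokens):
        # Encode the document; the final hidden state seeds the decoder.
        _, hidden = self.encoder(self.embed(doc_tokens))
        # Teacher forcing: the decoder conditions on the gold summary tokens.
        dec_out, _ = self.decoder(self.embed(summary_tokens), hidden)
        return self.out(dec_out)  # per-step logits over the vocabulary

model = Seq2SeqRNN()
doc = torch.randint(0, 32000, (1, 128))   # document token ids (toy data)
summ = torch.randint(0, 32000, (1, 32))   # summary token ids (toy data)
logits = model(doc, summ)                 # shape (1, 32, 32000)
```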
The introduction of Transformers offered a promising alternative to RNNs thanks to self-attention, which models long-range dependencies in the input and output better, which is crucial for document summarization. Nevertheless, these models require large amounts of manually labeled data to train sufficiently, so the arrival of Transformers alone was not enough to significantly advance the field of document summarization.
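For reference, the self-attention operation at the heart of the Transformer can be sketched in a few lines; the shapes and weights here are purely illustrative. The key property is that every token position attends directly to every other position, rather than passing information through a recurrent bottleneck.

```python
import torch
import torch.nn.functional as F

def self_attention(x, w_q, w_k, w_v):
    # Project tokens into queries, keys, and values.
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    # Scaled dot-product scores: every position against all others.
    scores = q @ k.transpose(-2, -1) / (k.size(-1) ** 0.5)
    return F.softmax(scores, dim=-1) @ v

x = torch.randn(128, 64)                  # 128 tokens, 64-dim embeddings
w = [torch.randn(64, 64) for _ in range(3)]
out = self_attention(x, *w)               # shape (128, 64)
```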
Combining Transformers with self-supervised pre-training (BERT, GPT, T5) led to a major breakthrough in many NLU tasks for which only limited labeled data is available. In self-supervised pre-training, a model uses large amounts of unlabeled text to learn general language understanding and generation capabilities. In a subsequent fine-tuning stage, the model learns to apply these abilities to a specific task, such as summarization or question answering.
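The pre-train-then-fine-tune recipe can be sketched as follows, using the open-source Hugging Face transformers library and a public Pegasus checkpoint as stand-ins; the post does not say which toolkit or checkpoints Google used internally. The pre-trained checkpoint supplies the general language ability, and fine-tuning on (document, summary) pairs adapts it to summarization.

```python
from transformers import PegasusForConditionalGeneration, PegasusTokenizer

model = PegasusForConditionalGeneration.from_pretrained("google/pegasus-xsum")
tokenizer = PegasusTokenizer.from_pretrained("google/pegasus-xsum")

document = "Long document text to be summarized ..."
reference = "A short gold summary."

inputs = tokenizer(document, return_tensors="pt", truncation=True)
labels = tokenizer(reference, return_tensors="pt", truncation=True).input_ids

# Sequence-to-sequence cross-entropy loss against the reference summary;
# a real fine-tuning run would wrap this in an optimizer loop over a dataset.
loss = model(**inputs, labels=labels).loss
loss.backward()
```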
The Pegasus work took this idea one step further, introducing a pre-training objective tailored to abstractive summarization. In Pegasus pre-training, also called Gap Sentence Prediction (GSP), full sentences from unlabeled news articles and web documents are masked from the input, and the model is required to reconstruct them conditioned on the remaining, unmasked sentences. In particular, GSP tries to mask sentences that are considered important to the document, using various heuristics, to make pre-training resemble the summarization task as closely as possible. Pegasus achieved state-of-the-art results on a wide variety of summarization datasets.
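A toy sketch of how a GSP training example might be constructed from raw text is shown below. The real Pegasus heuristics score sentence importance with ROUGE against the rest of the document; a simple word-overlap proxy stands in for that scoring here, and the mask token name is illustrative.

```python
def make_gsp_example(sentences, mask_ratio=0.3, mask_token="<mask_1>"):
    def importance(i):
        # Proxy for importance: word overlap between sentence i and the
        # rest of the document (Pegasus itself uses ROUGE-based heuristics).
        rest = set(w for j, s in enumerate(sentences) if j != i
                   for w in s.lower().split())
        words = set(sentences[i].lower().split())
        return len(words & rest) / max(len(words), 1)

    n_mask = max(1, int(len(sentences) * mask_ratio))
    ranked = sorted(range(len(sentences)), key=importance, reverse=True)
    masked = set(ranked[:n_mask])

    # Input: document with important sentences replaced by a mask token.
    inputs = " ".join(mask_token if i in masked else s
                      for i, s in enumerate(sentences))
    # Target: the model must regenerate the masked sentences.
    targets = " ".join(sentences[i] for i in sorted(masked))
    return inputs, targets

sents = ["The new model sets a state of the art.",
         "It was trained on web documents.",
         "Results were reported on twelve datasets."]
src, tgt = make_gsp_example(sents)
```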
Building on the strengths of Transformer and Pegasus, Google AI researchers carefully cleaned and filtered the fine-tuning data so that it contained training examples that were more consistent and represented a coherent definition of a summary. Despite reducing the amount of training data, this produced a higher-quality model. The next problem to solve was serving this high-quality model in production. Although the Transformer encoder-decoder architecture is the dominant approach to training models for sequence-to-sequence tasks such as abstractive summarization, it can be inefficient and impractical for real-world applications. The main inefficiency comes from the Transformer decoder, where the output summary is generated token by token through autoregressive decoding. Decoding becomes noticeably slower as summaries grow longer, because the decoder attends to all previously generated tokens at each step. An RNN is a more efficient architecture for decoding because, unlike a Transformer, it applies no self-attention over the previous tokens.
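The difference in decoding cost can be sketched as follows (illustrative PyTorch, not the production decoder): the Transformer-style loop attends over the entire generated prefix at every step, so per-step work grows with summary length, while the RNN loop carries only a fixed-size state forward.

```python
import torch
import torch.nn as nn

vocab, d = 32000, 256
embed = nn.Embedding(vocab, d)
proj = nn.Linear(d, vocab)

# Transformer-style decoding: keys/values are the whole prefix so far,
# so the attention cost at each step grows with the prefix length.
attn = nn.MultiheadAttention(d, num_heads=4, batch_first=True)
prefix = embed(torch.tensor([[0]]))                   # start token
for _ in range(5):
    out, _ = attn(prefix[:, -1:], prefix, prefix)     # attend over all tokens
    next_id = proj(out[:, -1]).argmax(-1)
    prefix = torch.cat([prefix, embed(next_id).unsqueeze(1)], dim=1)

# RNN-style decoding: a fixed-size state, constant work per step,
# regardless of how many tokens have already been generated.
cell = nn.GRUCell(d, d)
state = torch.zeros(1, d)
tok = torch.tensor([0])                               # start token
for _ in range(5):
    state = cell(embed(tok), state)                   # cost independent of length
    tok = proj(state).argmax(-1)
```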
Using knowledge distillation to transfer knowledge from the large model to a smaller, more efficient one, the Pegasus model was distilled into a hybrid architecture with a Transformer encoder and an RNN decoder, and the number of RNN decoder layers was reduced to further improve efficiency. The resulting model had significantly better latency and memory footprint while preserving the quality of the original model.
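A minimal sketch of knowledge distillation as it is commonly formulated (the post does not detail the exact recipe used here): the small hybrid student is trained to match the large Pegasus teacher's softened per-step output distribution, rather than only the hard reference tokens.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    # Soften both distributions with a temperature, then penalize the
    # KL divergence between student and teacher predictions.
    s = F.log_softmax(student_logits / temperature, dim=-1)
    t = F.softmax(teacher_logits / temperature, dim=-1)
    return F.kl_div(s, t, reduction="batchmean") * temperature ** 2

# Toy tensors standing in for real model outputs: (batch, steps, vocab).
student_logits = torch.randn(8, 32, 32000, requires_grad=True)
teacher_logits = torch.randn(8, 32, 32000)   # frozen teacher, no gradient
loss = distillation_loss(student_logits, teacher_logits)
loss.backward()
```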