When I joined Google more than twenty years ago, we were focused on a single problem: how to provide a high-quality, comprehensive information search service for the many different kinds of computers connecting to the web. Today, despite all the technical challenges we face, Google has largely achieved that overall goal of organizing the world's information and making it universally accessible. In 2020, as COVID-19 raged around the world, we saw how research and technology can help billions of people communicate better, understand what is happening, and find new ways to work. I am proud of what we have achieved and excited about the new possibilities ahead.
Google Research's goal is to work on a range of long-term, broadly important problems, from forecasting the spread of COVID-19, to designing algorithms, to building ever more capable automatic translation, to mitigating bias in machine learning models. This is the fourth installment of our annual review, in which we look back on an unusual 2020. For more detail, see the more than 800 research papers we published in 2020. The article is long, but it is divided into clearly marked sections that you can navigate using the table of contents below.
COVID-19 and health
As the COVID-19 pandemic disrupted the daily lives of people around the world, researchers and developers everywhere came together to build tools and technologies that help public health officials and policymakers understand and respond to the disease.
The Exposure Notifications System (ENS), developed in 2020 by Apple and Google, is a Bluetooth-based, privacy-preserving technology that alerts users if they may have been exposed to someone who has tested positive for COVID-19. ENS is an effective complement to traditional contact tracing and has been deployed by public health authorities in more than 50 countries, states/provinces, and regions to help curb the spread of the epidemic.
Early in the pandemic, public health officials signaled that they needed more comprehensive data to fight the virus's rapid spread. Our Community Mobility Reports provide anonymized insights into movement trends, which help researchers understand the impact of policies such as shelter-in-place orders and social distancing, and also inform economic forecasting.
Google researchers have also explored using this anonymized data to forecast the spread of COVID-19 with graph neural networks rather than traditional time-series models.
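As a rough illustration of that idea (a hypothetical sketch, not Google's actual model), the snippet below treats each region as a node in a graph whose edge weights come from anonymized inter-region mobility, and lets one graph-convolution step mix a region's recent case counts with those of its neighbors before a small readout predicts the next day's cases.

```python
# Toy spatio-temporal GNN step for epidemic forecasting (illustrative only).
import numpy as np

rng = np.random.default_rng(0)

num_regions, history = 5, 14
cases = rng.poisson(100, size=(num_regions, history)).astype(np.float32)

# Mobility-derived adjacency (hypothetical values), row-normalized.
mobility = rng.random((num_regions, num_regions)).astype(np.float32)
np.fill_diagonal(mobility, 1.0)
adj = mobility / mobility.sum(axis=1, keepdims=True)

# One graph-convolution layer: combine a region's own history with a
# mobility-weighted aggregate of its neighbors' histories.
w_self = rng.normal(scale=0.1, size=(history, 32)).astype(np.float32)
w_neigh = rng.normal(scale=0.1, size=(history, 32)).astype(np.float32)
hidden = np.maximum(cases @ w_self + adj @ cases @ w_neigh, 0.0)  # ReLU

# Linear readout: predicted new cases per region for the next day.
w_out = rng.normal(scale=0.1, size=(32, 1)).astype(np.float32)
next_day_prediction = hidden @ w_out
print(next_day_prediction.ravel())
```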
Although the research community initially knew little about the disease and its secondary effects, we are learning more every day. Our COVID-19 symptoms Search Trends dataset lets researchers explore the relationship between symptom-related searches, such as anosmia (the loss of smell caused by the viral infection), and the spread of the disease. To support the research community more broadly, we also launched the Google Health Studies app, which gives the public a way to take part in research studies.
Google teams are also providing tools and resources to the broader scientific community as it works to address the health and economic impacts of the virus.
Accurate information is crucial for responding to public health threats. We worked with many product teams across Google to improve the quality of COVID-19 information in Google News and Search by supporting fact checking and by directing users toward authoritative content on YouTube.
In addition, by sponsoring weekly localized outbreak reports on Nextstrain.org and collaborating with Translators Without Borders on an open-source COVID-19 parallel corpus, we helped multilingual communities gain equal access to critical COVID-19 information.
Modeling complex global events is challenging, and more comprehensive epidemiological datasets, novel interpretable models, and agent-based simulators all help the public health community respond more effectively. Machine learning also supports researchers through natural language understanding for rapid screening of the COVID-19 scientific literature, anonymization techniques that protect privacy, and the release of rich datasets. With the support of these technical contributions, public health authorities have been able to pursue their own explorations in fighting the epidemic.
These are just a few of the many efforts across Google to help users and public health authorities respond to COVID-19. For more details, see Using Technology to Help Take on COVID-19.
Machine Learning Research for Medical Diagnostics
We continue to work on helping clinicians harness machine learning to provide better care for more patients. This year we made significant progress in applying computer vision to help doctors diagnose and manage cancer, including helping ensure that doctors do not miss potentially cancerous polyps during colonoscopy. We showed that machine learning systems can achieve accuracy comparable to pathologists at Gleason grading of prostate tissue, and can help radiologists significantly reduce both false-negative and false-positive rates when examining X-rays for signs of breast cancer.
We have also been studying how to help identify skin disease, detect age-related macular degeneration (the leading cause of blindness in the United States and the United Kingdom, and the third-largest cause of blindness worldwide), and explore new non-invasive diagnostics (such as detecting signs of anemia from retinal images).
This year we also showed how these techniques can be applied to the human genome. Google's open-source tool DeepVariant uses convolutional neural networks to identify genomic variants in sequencing data, and this year it won the highest award in the FDA Challenge, with the best accuracy in three of the four categories. Using the same tool, a study by the Dana-Farber Cancer Institute improved the diagnostic yield by 14% for genetic variants that cause prostate cancer and melanoma across 2,367 cancer patients.
Our research is not only about measuring accuracy. Ultimately, to truly help patients receive better care, we need to understand how machine learning tools affect people in the real world. This year we began working with Mayo Clinic to develop a machine learning system to assist in planning radiotherapy and to explore how such technology can be deployed into clinical practice. Through our collaboration with partners in Thailand on screening for diabetic eye disease, we are learning how to build people-centered systems and coming to appreciate the fundamental role of diversity, equity, and inclusion in improving health for everyone.
Weather, environment and climate change
Machine learning can help us better understand the environment and make useful predictions, assisting people both in everyday life and during natural disasters. For weather and precipitation forecasting, computationally intensive physics-based models such as NOAA's HRRR have long been the industry standard. We have shown, however, that machine-learning-based forecasting systems can predict current precipitation at finer spatial resolution (answering "Is it raining in the park near my home?" rather than just "Is it raining in my city?") and can produce short-term forecasts for up to eight hours ahead that are considerably more accurate than HRRR's. The model is also faster to run, with higher spatial and temporal resolution.
We also developed an improved technique called HydroNets, which uses neural networks to model real river systems and accurately capture the interactions between upstream water levels and downstream flooding, yielding more accurate water-level predictions and flood forecasts. Using these techniques, we expanded the coverage of flood alerts in India and Bangladesh by 20x, helping to better protect more than 200 million people across 250,000 square kilometers.
Better analysis of satellite imagery also lets Google users track the impact and extent of wildfires (which had devastating effects in California and Australia this year). We showed that automatic analysis of up-to-date satellite imagery can effectively assess damage from natural disasters even when prior imagery is limited. The same technology can assess tree-canopy coverage across cities and inform new tree-planting plans, and we also showed how machine learning can improve the monitoring of ecology and wildlife.
Based on this work, we are excited to work with NOAA to expand NOAA's environmental monitoring, weather forecasting and climate research through Google Cloud infrastructure using AI and machine learning.
Accessibility
Machine learning has also shown remarkable promise for improving accessibility, because it can learn to translate one kind of sensory input into another. For example, we released Lookout, an Android app that helps visually impaired users by identifying packaged foods, both in grocery stores and in the kitchen cupboards at home. The machine learning system behind Lookout shows that a powerful yet compact model can recognize nearly two million products in real time on a phone.
Similarly, people who communicate in sign language find it difficult to use video conferencing systems, because audio-based speaker-detection systems do not register them as actively speaking. We therefore developed a real-time, automatic sign-language detection model for video conferencing, so that users who are signing are correctly identified as active speakers.
We also delivered powerful Android accessibility features, including Voice Access and Sound Notifications for important sounds around the home.
Live Caption was also expanded to support calls on Pixel phones, generating captions for voice and video calls. This grew out of the Live Relay research project, which helps deaf users make phone calls without assistance.
Applications of Machine Learning to Other Fields
Machine learning keeps proving its value across many important scientific fields. In 2020, in collaboration with the FlyEM team at HHMI Janelia Research Campus, we released the Drosophila hemibrain connectome, a large synapse-resolution map of brain connectivity reconstructed from tissue images captured with high-resolution electron microscopy using large machine learning models. This connectome information will let neuroscientists run a wide variety of queries and better understand how the brain works. We encourage you to explore the interactive 3-D UI.
The application of machine learning in systems biology is also expanding rapidly. Our Google Accelerated Science team, together with colleagues at Calico, applied machine learning to yeast experiments to better understand how genes work together as a whole system. We have also been exploring how to use model-based reinforcement learning to design biological sequences such as DNA or proteins with desirable properties for medical or industrial uses. Model-based RL improves sample efficiency: in each round of experiments, the policy is trained offline against a simulator of the property measurements fit on data from earlier rounds. On tasks such as designing DNA transcription-factor binding sites, designing antimicrobial proteins, and optimizing the energy of Ising models based on protein structure, we found that model-based RL is an attractive alternative to existing approaches.
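A minimal sketch of such a design loop, assuming a toy fitness function and a ridge-regression surrogate standing in for the real property simulator (the actual system and its reward models are far more sophisticated):

```python
# Toy model-based sequence-design loop: fit a surrogate on measured sequences,
# optimize candidates offline against it, then "measure" only the best ones.
import numpy as np

rng = np.random.default_rng(0)
BASES, LENGTH = "ACGT", 8

def true_fitness(seq):            # stand-in for a slow wet-lab measurement
    return sum(2.0 if b == "G" else (1.0 if b == "C" else 0.0) for b in seq)

def one_hot(seq):
    return np.array([[b == c for c in BASES] for b in seq], float).ravel()

def random_seq():
    return "".join(rng.choice(list(BASES), LENGTH))

data = [(s, true_fitness(s)) for s in (random_seq() for _ in range(20))]

for round_idx in range(3):
    X = np.stack([one_hot(s) for s, _ in data])
    y = np.array([f for _, f in data])
    # Ridge-regression surrogate fit on all measurements gathered so far.
    w = np.linalg.solve(X.T @ X + 0.1 * np.eye(X.shape[1]), X.T @ y)
    surrogate = lambda s: one_hot(s) @ w
    # Offline optimization: propose many mutants, keep the surrogate's favorites.
    candidates = sorted((random_seq() for _ in range(500)),
                        key=surrogate, reverse=True)[:5]
    data += [(s, true_fitness(s)) for s in candidates]   # "run the experiment"
    print(f"round {round_idx}: best measured fitness = "
          f"{max(f for _, f in data):.1f}")
```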
In collaboration with X-Chem Pharmaceuticals and ZebiAI, we have also been developing machine learning techniques for computationally "virtually screening" promising molecular compounds. Previous work in this field has tended to focus on small sets of related compounds; in this study we used DNA-encoded small-molecule libraries to generalize and find "hits" across a much wider range of chemistry. This new approach avoids slow, inefficient processes in the physical laboratory and promises to identify viable drug candidates largely through computation.
We have also seen success in applying machine learning to core problems in computer science and computer systems, a trend that has given rise to new venues such as MLSys. In learning-based memory allocation for C++ server workloads, a neural-network-based language model predicts context-sensitive, per-allocation-site object lifetimes and uses them to organize the heap so as to reduce fragmentation; this approach cuts fragmentation by up to 78% while using only huge pages (which are better for TLB behavior). In end-to-end, transferable deep reinforcement learning for graph optimization, a learned optimizer achieves 33% to 60% faster convergence on three graph optimization tasks compared with TensorFlow's default optimization, substantially outperforming prior computational-graph optimization methods.
As stated in "Chip Design with Deep Reinforcement Learning", we have also been using reinforcement learning technology to solve the line layout problems in computer chip design. This has long been a time-consuming and laborious task, and it also seriously restricts the speed of chip products from design inspiration to establish a complete design to tablet manufacturing. Unlike previous methods, our new methods can learn ideas from past experience and continuously improve design effects over time. Specifically, the more chip design results we use in training, the better our approach is to produce highly optimized layout solutions through unprecedented design methods. This system can generate overall layouts that are better than human chip design experts, and we have been using this system (running on TPU) to design the main layout for the next generation of TPUs. Menger is our latest infrastructure built specifically for large distributed reinforcement learning and shows exciting performance levels in solving reinforcement learning challenges such as chip design.
Responsible AI
Google's AI Principles guide our development of advanced technologies. We continue to invest in responsible AI research, to update our technical practices in this area, and to regularly share updates on our progress; the various blog posts and reports we published in 2020 are an important part of this.
To help everyone better understand the behavior of language models, we developed the Language Interpretability Tool (LIT), a toolkit for improving the interpretability of language models that enables interactive exploration and analysis of their decisions. We developed techniques for measuring gendered correlations in pre-trained language models, as well as scalable techniques for reducing gender bias in Google Translate. We proposed a simple method based on kernel techniques for estimating the influence of individual training examples on an individual prediction. To help non-specialists interpret machine learning results, we extended the TCAV technique introduced in 2019 so that it now provides a complete and sufficient set of concepts. In the original TCAV work, we could establish that "fur" and "long ears" are important concepts for the prediction "rabbit"; with this work, we can also establish that these two concepts are sufficient to fully explain the prediction and that no other concepts are needed. Concept bottleneck models are another technique that makes models easier to interpret by construction: one layer of the model is trained to match predefined expert concepts (such as "bone spurs" or "wing color") before the final prediction is made. This lets us not only explain why the model made a prediction, but also turn individual concepts on or off at prediction time.
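A minimal concept-bottleneck sketch in Keras may make the idea concrete; the concept names, data, and architecture below are hypothetical stand-ins, not the paper's code:

```python
# Concept bottleneck: the label head only sees a small layer of named,
# human-interpretable concepts, which we can read off or override.
import numpy as np
import tensorflow as tf

CONCEPTS = ["bone_spur_present", "joint_space_narrowed", "cyst_present"]

inputs = tf.keras.Input(shape=(64,))                       # image features
h = tf.keras.layers.Dense(32, activation="relu")(inputs)
concepts = tf.keras.layers.Dense(len(CONCEPTS), activation="sigmoid",
                                 name="concepts")(h)
# The label head sees *only* the concepts: that is the bottleneck.
label_head = tf.keras.Sequential([tf.keras.layers.Dense(1, activation="sigmoid")])
label = label_head(concepts)

model = tf.keras.Model(inputs, [concepts, label])
model.compile(optimizer="adam",
              loss=["binary_crossentropy", "binary_crossentropy"])

# Train against both concept annotations and final labels (random stand-ins).
x = np.random.rand(128, 64).astype("float32")
c = np.random.randint(0, 2, size=(128, len(CONCEPTS))).astype("float32")
y = np.random.randint(0, 2, size=(128, 1)).astype("float32")
model.fit(x, [c, y], epochs=1, verbose=0)

# Intervention: force a concept on and re-run only the label head.
pred_concepts, _ = model.predict(x[:1], verbose=0)
pred_concepts[0, CONCEPTS.index("bone_spur_present")] = 1.0
print("prediction with concept overridden:", label_head(pred_concepts).numpy())
```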
In collaboration with other institutions, we also studied the memorization effects of language models, showing that extracting training data is a realistic threat to state-of-the-art large language models. This finding, together with the result that embedding models can leak information, has significant implications for privacy (especially for models trained on private data). In "Thieves of Sesame Street: Model Extraction on BERT-based APIs", we showed that an attacker with only query access to a language model (even a modest number of API queries against the original model) can build a model whose outputs correlate strongly with the original; later work showed that attackers can extract smaller models with arbitrary accuracy. In line with AI security principles, we also showed that adversaries can circumvent 13 published defenses against adversarial examples once adaptive attacks are used in the evaluation. Our future work will focus on methods for performing adaptive attack evaluations, which we hope will help the community make further progress toward more robust models.
Auditing machine learning systems is itself an important area of study. Working with partners, we defined a framework for auditing the use of machine learning in software products that draws on lessons and best practices from the aerospace, medical device, and finance industries. In collaboration with the University of Toronto and MIT, we examined ethical concerns that can arise when auditing the performance of facial recognition systems. In collaboration with the University of Washington, we identified criteria for selecting data subsets when evaluating algorithmic fairness with diversity and inclusion goals in mind. To make responsible AI genuinely serve users everywhere, and to help the field understand whether notions of fairness transfer across the world, we analyzed and created an algorithmic fairness framework for India, covering datasets, fairness optimization, infrastructure, and ecosystems.
The Model Cards project, launched by Google and the University of Toronto in 2019, is steadily growing in influence. Many well-known models, such as OpenAI's GPT-2 and GPT-3, Google's MediaPipe models, and several Google Cloud APIs, use Model Cards to give users of a machine learning model information about how the model was developed and how it behaves under different conditions. To make it easier for others to adopt Model Cards for their own models, we also released the Model Card Toolkit, which simplifies model transparency reporting. To improve the transparency of ML development practices more broadly, we demonstrated best practices and concrete use cases across the dataset development lifecycle, including data requirements specification and data acceptance testing.
We collaborated with the National Science Foundation (NSF) to announce and help fund a National AI Research Institute for Human-AI Interaction and Collaboration. We also released the MinDiff framework, a new regularization technique available in the TF Model Remediation library that efficiently and conveniently mitigates bias when training machine learning models, along with ML-fairness-gym, which uses simple simulations to explore the long-term, downstream effects of deployed machine learning decision systems.
Beyond fairness frameworks, we developed methods for identifying and improving the experience and quality of recommender systems, including using reinforcement learning techniques to surface safer recommendation trajectories. We are also committed to improving the reliability of machine learning systems, and we have found that several approaches, including the generation of adversarial examples, improve robustness and in turn lead to better fairness.
Differential privacy provides a clear, quantifiable form of privacy protection, and it requires us to rethink basic algorithms so that they do not reveal information about any specific individual. In particular, differential privacy helps address the memorization and information-leakage problems mentioned above. 2020 brought many exciting developments, including more efficient ways to compute differentially private clusterings that minimize risk to individuals while maximizing accuracy. We also open-sourced the differential privacy library at the core of our internal tools, and paid careful attention to preventing leakage caused by the underlying floating-point representations. Google uses these tools to produce the differentially private COVID-19 mobility reports, which have become a valuable source of anonymized data for researchers and policymakers.
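The core mechanism is easy to sketch. Assuming a toy per-user statistic, the snippet below bounds each person's contribution and adds Laplace noise calibrated to that bound, which is the basic recipe behind differentially private aggregate releases (real deployments add much more machinery):

```python
# Laplace mechanism for a differentially private sum.
import numpy as np

rng = np.random.default_rng(0)

def dp_sum(per_user_values, clip, epsilon):
    """Release a sum with epsilon-differential privacy."""
    clipped = np.clip(per_user_values, 0.0, clip)   # bound each user's contribution
    sensitivity = clip                              # removing one user changes the sum by at most `clip`
    noise = rng.laplace(loc=0.0, scale=sensitivity / epsilon)
    return clipped.sum() + noise

visits = rng.poisson(3, size=10_000)                # hypothetical per-user visit counts
print("noisy total visits:", dp_sum(visits, clip=10, epsilon=1.0))
```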
To help developers assess the privacy properties of their classification models, we released a machine learning privacy testing library in TensorFlow. We hope this library will serve as the starting point for even more powerful privacy testing suites, and it is now available to machine learning developers around the world.
Beyond advancing the state of the art in privacy algorithms, we also work to build privacy directly into the foundations of our products. Chrome's Privacy Sandbox is a prime example: it changes how the advertising ecosystem fundamentally operates and helps systematically protect individual privacy. As part of the project, we published and evaluated a variety of APIs, including federated learning of cohorts (FLoC) for interest-based audiences, and aggregation APIs for differentially private measurement.
Federated learning, introduced in 2017, has grown into a complete research field of its own, with more than 3,000 papers on federated learning published in 2020 alone. Our 2019 survey paper "Advances and Open Problems in Federated Learning", written with many other institutions, has been cited 367 times in the past year, and an updated version will soon appear in Foundations and Trends in Machine Learning. In July we held a workshop on federated learning and analytics, and we made all the research talks and a TensorFlow Federated tutorial publicly available.
We continue to push federated learning forward, including new federated optimization algorithms such as adaptive learning algorithms, posterior averaging, and techniques for mimicking centralized algorithms in federated settings, as well as substantial improvements to the complementary cryptographic protocols. We announced and deployed federated analytics, which enables data science over raw data that stays on users' local devices. Google's own products provide important applications of federated learning, including contextual emoji suggestions in Gboard and privacy-preserving medical research through Google Health Studies. In addition, in our study of privacy amplification via random check-ins, we introduced the first privacy accounting mechanism designed for federated learning.
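For readers unfamiliar with the setup, here is a minimal sketch of federated averaging, the baseline that these newer federated optimization algorithms build on; the linear-regression clients and hyperparameters are toy stand-ins:

```python
# Federated averaging: clients do local SGD on private data and only share
# model deltas, which the server averages into the global model.
import numpy as np

rng = np.random.default_rng(0)
DIM = 5
true_w = rng.normal(size=DIM)

def make_client_data(n=50):
    x = rng.normal(size=(n, DIM))
    return x, x @ true_w + 0.1 * rng.normal(size=n)

clients = [make_client_data() for _ in range(8)]
global_w = np.zeros(DIM)

def client_update(w_init, data, lr=0.05, local_steps=5):
    x, y = data
    w = w_init.copy()
    for _ in range(local_steps):                  # local SGD on private data
        grad = 2 * x.T @ (x @ w - y) / len(y)
        w -= lr * grad
    return w - w_init                             # only the delta leaves the device

for round_idx in range(20):
    deltas = [client_update(global_w, d) for d in clients]
    global_w += np.mean(deltas, axis=0)           # server averages client deltas

print("error vs. true weights:", np.linalg.norm(global_w - true_w))
```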
User security is another research area we care deeply about. In 2020 we continued to deploy new machine learning document scanners that protect against malicious documents and further strengthen protection for Gmail users, increasing the daily detection rate of malicious Office documents by 10%. Because the tool generalizes well, it also plays an important role in blocking adversarial, bursty malware campaigns, raising detection rates by up to 150% in some scenarios.
For account protection, we released fully open-source security key firmware, advancing the state of the art in two-factor authentication. In the face of wave after wave of phishing, security keys remain the best way to protect accounts.
Natural Language Understanding
This year we made great progress in natural language understanding. Most NLP work at Google and elsewhere now relies on Transformers, a particular style of neural network model originally developed for language understanding (with growing evidence that it also applies to images, video, speech, protein folding, and many other domains).
An important advance in dialogue systems is that today's systems can chat with users about topics they care about and sustain the conversation over multiple turns. So far, however, most successes in this area have required building for specific topics, as with Duplex, and therefore cannot support general, open-ended conversation. To build a system with more open-ended dialogue ability, in 2020 we released Meena, a conversational agent that can chat with users about any topic. Meena scores highly on the SSA metric for dialogue systems, which measures both the sensibleness and the specificity of responses. We observed that as the Meena model is scaled up, its conversational ability keeps improving, and as the paper explains, lower conversational perplexity corresponds directly to higher SSA scores.
There is a well-known problem with generative language models and dialogue systems: when discussing factual topics, the model's capacity is usually not enough to memorize every specific detail, so the model produces responses that are plausible but incorrect. (This is of course not unique to machines; people make similar mistakes.) To address this in conversational systems, we are exploring ways to augment conversational agents with access to external information sources (such as large document corpora or search engine APIs), and developing learning techniques so that the generated language stays consistent with the retrieved text. Work in this area includes integrating retrieval into language representation models; a key enabling technology is the efficient vector similarity search used in ScaNN and similar systems, which effectively matches the desired information against the information in a text corpus. Once appropriate content has been found, it can be understood better with approaches such as using neural networks to find answers in tables and extracting structured data from templatic documents. Our work on PEGASUS, a state-of-the-art model for abstractive text summarization, can automatically create summaries for any passage of text, a capability that benefits conversations, retrieval systems, and many other applications.
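The retrieve-then-generate pattern can be sketched in a few lines. A real system would use learned embeddings and an approximate nearest-neighbor library such as ScaNN for the similarity search; the hashed bag-of-words embeddings, brute-force dot products, and documents below are purely illustrative stand-ins:

```python
# Toy retrieve-then-read flow: embed, search by vector similarity, then let a
# generator condition on the retrieved evidence.
import numpy as np

DOCS = [
    "The Amazon river is about 6,400 km long.",
    "Mount Everest is 8,849 metres above sea level.",
    "The Pacific Ocean is the largest ocean on Earth.",
]

def embed(text, dim=64):
    """Toy embedding: hash each token into a fixed-size vector."""
    vec = np.zeros(dim)
    for token in text.lower().split():
        vec[hash(token) % dim] += 1.0
    return vec / (np.linalg.norm(vec) + 1e-9)

doc_matrix = np.stack([embed(d) for d in DOCS])

def retrieve(query, k=1):
    scores = doc_matrix @ embed(query)            # vector similarity search
    return [DOCS[i] for i in np.argsort(-scores)[:k]]

query = "how long is the amazon river"
context = retrieve(query)
# A generative dialogue model would now condition on both the query and the
# retrieved passages so its answer stays consistent with the evidence.
print("retrieved context:", context)
```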
Another major focus in 2020 was improving the efficiency of natural language processing (NLP) models. Techniques such as transfer learning and multi-task learning help general NLP models handle new tasks with only modest amounts of computation. Work in this area includes explorations of transfer learning in T5, sparsely activated models (see the GShard discussion below), and more efficient model pre-training with ELECTRA. Several other threads aim to improve the basic Transformer architecture: Reformer uses locality-sensitive hashing and reversible computation to support much larger attention windows; Performers use an attention mechanism that scales linearly rather than quadratically (explored in the context of protein modeling); and ETC and BigBird use global and sparse random connections to scale linearly to larger structured sequences. We also explored techniques for creating extremely lightweight NLP models that are one hundredth the size of BERT yet perform nearly as well on some tasks, making them well suited to running on edge devices. In "Encode, Tag and Realize", we explored a new way to generate text using editing operations rather than fully general text generation, which offers advantages in lower computation, more control over the generated text, and smaller training data requirements.
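To illustrate why such attention variants scale better, the sketch below contrasts standard quadratic attention with a kernelized linear form; the simple feature map used here is a stand-in, not the actual random-feature (FAVOR+) mechanism used by Performers:

```python
# Quadratic softmax attention vs. kernelized linear attention (illustrative).
import numpy as np

rng = np.random.default_rng(0)
n, d = 1024, 64                                  # sequence length, head size
q, k, v = (rng.normal(size=(n, d)) for _ in range(3))

# Standard attention: O(n^2) time and memory in the score matrix.
scores = q @ k.T / np.sqrt(d)
weights = np.exp(scores - scores.max(axis=1, keepdims=True))
weights /= weights.sum(axis=1, keepdims=True)
out_quadratic = weights @ v

# Kernelized linear attention: phi(q) (phi(k)^T v) avoids the n x n matrix.
phi = lambda x: np.maximum(x, 0) + 1e-6          # simple positive feature map
kv = phi(k).T @ v                                # d x d summary of keys/values
normalizer = phi(q) @ phi(k).sum(axis=0)         # per-query normalization
out_linear = (phi(q) @ kv) / normalizer[:, None]

print(out_quadratic.shape, out_linear.shape)     # both (1024, 64)
```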
Language Translation
An effective translation service helps people with different native languages communicate smoothly, connecting the world more closely. To date, more than a billion people around the world use Google Translate. Last year we added five new languages (Kinyarwanda, Odia, Tatar, Turkmen, and Uyghur), spoken by a combined 75 million people. We also kept improving translation quality through better model architectures and training, better handling of noise in the data, multilingual transfer and multi-task learning, and better use of monolingual data to improve low-resource languages (those with relatively little written content on the web). This is part of our broader emphasis on ML fairness: delivering comparable quality to different groups of users as much as possible. From May 2019 to May 2020, translation quality across the more than 100 languages in Google Translate improved by an average of +5 BLEU.
We firmly believe that continuing to scale up multilingual translation models will further improve translation quality and ultimately give billions of speakers of low-resource languages around the world a better experience. In GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding, Google researchers showed that training sparsely activated multilingual translation models with up to 600 billion parameters delivers translation quality, measured in BLEU, far beyond the baselines across 100 languages. Three major trends stood out in this work:
- Multilingual training improves BLEU scores for all languages, with the biggest gains for low-resource languages. These languages are spoken mainly by marginalized communities around the world, yet together they represent billions of speakers.
- The larger the model and the more layers it has, the larger the BLEU improvements across almost all languages.
- Large sparse models are also far more efficient to train: they require 10 to 100 times less training computation than comparably sized large dense models, while matching or significantly exceeding the dense models' BLEU scores (the paper discusses computational efficiency in detail).
We have been actively working to bring the results of the GShard research into Google Translate, and to train a single model covering 1,000 languages, including Dhivehi and Sudanese Arabic, while sharing the challenges we encountered and had to solve along the way.
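A toy sketch of the sparsely activated mixture-of-experts idea at the heart of GShard: a learned router sends each token to only its top-2 experts, so adding experts grows the parameter count without growing the per-token compute. The real system also shards experts across accelerators and adds load-balancing losses; everything below is illustrative:

```python
# Top-2 mixture-of-experts routing (toy, dense implementation for clarity).
import numpy as np

rng = np.random.default_rng(0)
num_tokens, d_model, num_experts, top_k = 16, 32, 8, 2

tokens = rng.normal(size=(num_tokens, d_model))
router_w = rng.normal(size=(d_model, num_experts))
# One tiny feed-forward "expert" per slot.
experts = [rng.normal(scale=0.1, size=(d_model, d_model)) for _ in range(num_experts)]

logits = tokens @ router_w
gates = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

output = np.zeros_like(tokens)
for t in range(num_tokens):
    for e in np.argsort(-gates[t])[:top_k]:       # route each token to 2 experts
        output[t] += gates[t, e] * np.maximum(tokens[t] @ experts[e], 0)

print(output.shape)                               # (16, 32): same shape as input
```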
We also developed a technique for creating language-agnostic sentence representations for BERT models, which supports stronger translation models. To evaluate translation quality more efficiently, we introduced BLEURT, a new metric for language generation tasks such as translation that takes into account the actual semantics of the generated text rather than only the amount of word overlap with the reference.
Machine Learning Algorithms
We continue to develop new machine learning algorithms and training techniques that let systems learn faster from less supervised data. By replaying intermediate results during neural network training, we found we can fill idle time on machine learning accelerators and thereby speed up training. By dynamically changing the connectivity of neurons during training, we found solutions that outperform statically connected networks. We developed SimCLR, a new self-supervised and semi-supervised learning technique that maximizes agreement between differently transformed views of the same image while minimizing agreement between transformed views of different images. This approach significantly outperforms the previous best self-supervised learning techniques.
We also extended contrastive learning to the supervised setting, yielding a loss function that substantially improves on cross-entropy for supervised classification problems.
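For concreteness, here is a small NumPy sketch of the contrastive (NT-Xent-style) objective behind SimCLR; the random embeddings stand in for what an encoder network would produce from two augmented views of each image:

```python
# NT-Xent contrastive loss over two views of a batch of images (toy data).
import numpy as np

rng = np.random.default_rng(0)
batch, dim, temperature = 8, 16, 0.1

def normalize(x):
    return x / np.linalg.norm(x, axis=1, keepdims=True)

z1 = normalize(rng.normal(size=(batch, dim)))               # view-1 embeddings
z2 = normalize(z1 + 0.1 * rng.normal(size=(batch, dim)))    # correlated view 2

z = np.concatenate([z1, z2])                      # 2N embeddings
sim = z @ z.T / temperature
np.fill_diagonal(sim, -np.inf)                    # never match an item with itself

# For entry i, its positive is the other view of the same image.
positives = np.concatenate([np.arange(batch, 2 * batch), np.arange(batch)])
log_probs = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
nt_xent_loss = -log_probs[np.arange(2 * batch), positives].mean()
print("NT-Xent loss:", nt_xent_loss)
```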
Reinforcement Learning
The essence of reinforcement learning (RL) is learning to make good long-term decisions from limited experience. A key challenge in RL is learning to make decisions from very few data points; we have been improving the sample efficiency of RL algorithms by learning from fixed datasets, learning from other agents, and improving exploration.
In 2020 our main focus was offline RL, which relies only on fixed, previously collected datasets (for example, from earlier experiments or human demonstrations), extending RL to applications where training data cannot be collected on the fly. We introduced a duality approach to RL and developed improved algorithms for off-policy evaluation, for estimating confidence intervals, and for offline policy optimization. We also worked with the broader community on these problems by releasing open-source benchmark datasets and the Atari DQN dataset.
Another line of work improves sample efficiency through apprenticeship learning, learning from the experience of other agents. We developed new methods for learning from already-trained agents, for learning by matching the distributions of other agents, and for learning from adversarial examples of other agents. To improve exploration in RL, we explored bonus-based exploration methods, including imitating the structured exploration produced by agents that already have prior knowledge of their environment.
We also made significant progress in the mathematical theory of reinforcement learning. One of our main research areas is understanding RL as an optimization process. We found connections between RL and the Frank-Wolfe algorithm, momentum methods, KL-divergence regularization, operator theory, and convergence analysis; these insights led to new algorithms that achieve state-of-the-art performance on challenging RL benchmarks, and showed that polynomial transfer functions avoid the convergence problems softmax causes in both RL and supervised learning. We also made exciting progress on safe reinforcement learning, including ways to discover optimal control rules while respecting important experimental constraints and a framework for safe policy optimization. And we studied efficient RL algorithms for mean-field games, a class of models that can help decision makers with everything from mobile network deployment to power grid design.
Our breakthroughs in generalizing to new tasks and new environments are also helping RL scale toward complex real-world problems. In 2020 we focused on population-based learning-to-learn methods, in which another RL or evolutionary agent trains a population of RL agents, building a curriculum of increasingly complex emergent behaviors and ultimately discovering new RL algorithms. The ability to estimate the importance of each data point in the training set, and to selectively attend to specific parts of the visual input, also gives us stronger RL algorithms.
Overview of the method used in AttentionAgent and its data-processing stages. Top: input transformation. A sliding window divides the input image into smaller patches, which are then "flattened" and reduced in dimension for later processing. Middle: patch election. The modified self-attention module votes on each patch to produce a patch importance vector. Bottom: action generation. AttentionAgent selects the most important patches, extracts the corresponding features, and makes decisions based on them.
In addition, we showed that learning predictive behavior models accelerates reinforcement learning and enables decentralized cooperative multi-agent tasks across different teams, and that learning long-term behavior models advances model-based RL. By looking for skills that cause predictable changes in the environment, skills can be discovered without supervision. Better representations make RL more stable, while hierarchical latent spaces and value-improvement paths lead to better performance.
We also shared open-source tools for scaling up and productionizing RL. To help users tackle a wider range of scenarios and problem classes, we launched SEED, a massively parallel RL agent; released a library for measuring the reliability of RL algorithms; and published a new version of TF-Agents that includes distributed RL, TPU support, and a full set of bandit algorithms. In addition, we carried out a large empirical study of RL algorithms to improve hyperparameter selection and algorithm design.
Finally, in collaboration with Loon, we trained and deployed a reinforcement learning model to control stratospheric balloons more efficiently, improving both the power usage and the navigation capability of each balloon in the network.
AutoML
Using learning algorithms to develop new machine learning techniques and solutions, also known as meta-learning, is a very active and exciting area of research. In most of our earlier work, we created search spaces that combine sophisticated hand-designed components in novel ways. In "AutoML-Zero: Evolving Code that Learns", we took a different approach: we gave an evolutionary algorithm a search space made of primitive operations (such as addition, subtraction, variable assignment, and matrix multiplication) to see whether modern machine learning algorithms could be developed from scratch. Learning algorithms that actually work are vanishingly rare in this space, yet the system gradually evolved increasingly sophisticated ML algorithms, rediscovering many of the most important ML ideas of the past thirty years, such as linear models, gradient descent, rectified linear units, effective learning-rate settings and weight initializations, and gradient normalization.
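The flavor of this search can be conveyed with a drastically simplified sketch: candidate "programs" are short sequences of primitive vector operations over a small register file, and evolution keeps mutations that fit a toy regression task better. The operation set, task, and evolution loop below are illustrative only, not the AutoML-Zero setup:

```python
# Toy evolutionary search over programs built from primitive operations.
import numpy as np

rng = np.random.default_rng(0)
DIM = 4
true_w = rng.normal(size=DIM)
X = rng.normal(size=(256, DIM))
y = X @ true_w

OPS = ["add", "sub", "mul_const", "mix_x"]        # primitive operations

def run(program, x, registers):
    r = registers.copy()
    for op, dst, src, const in program:
        if op == "add":        r[dst] = r[dst] + r[src]
        elif op == "sub":      r[dst] = r[dst] - r[src]
        elif op == "mul_const": r[dst] = r[dst] * const
        elif op == "mix_x":    r[dst] = x * const + r[src]
    return float(np.sum(r[0] * x))                # register 0 acts as the weights

def fitness(program):
    registers = [np.zeros(DIM) for _ in range(3)]
    preds = np.array([run(program, x, registers) for x in X])
    return -np.mean((preds - y) ** 2)             # higher is better

def random_instr():
    return (rng.choice(OPS), rng.integers(3), rng.integers(3), float(rng.normal()))

population = [[random_instr() for _ in range(4)] for _ in range(20)]
for generation in range(30):
    population.sort(key=fitness, reverse=True)
    parents = population[:5]
    children = []
    for p in parents:
        child = list(p)
        child[rng.integers(len(child))] = random_instr()   # point mutation
        children.append(child)
    population = parents + children + [[random_instr() for _ in range(4)] for _ in range(10)]

print("best fitness (negative MSE):", fitness(max(population, key=fitness)))
```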
We also used meta-learning to discover a variety of efficient architectures for detecting objects in still images and video. Last year's EfficientNet work showed how to significantly improve image classification accuracy while reducing computational cost; in the follow-up work "EfficientDet: Scalable and Efficient Object Detection", we built on EfficientNet to derive new object detection and localization architectures that deliver remarkable improvements in both absolute accuracy and computational cost, requiring 13x to 42x less computation than previous models at the same level of accuracy.
In SpineNet we described a meta-learned architecture that retains spatial information more effectively, enabling detection at finer resolution. We also focused on learning effective new architectures for video classification problems: "AssembleNet: Searching for Multi-Stream Neural Connectivity in Video Architectures", "AssembleNet++: Assembling Modality Representations via Attention Connections", and "AttentionNAS: Spatiotemporal Attention Cell Search for Video Classification" show how evolutionary algorithms can create novel, state-of-the-art video-processing machine learning architectures.
The same approach can also be used to develop effective model architectures for time-series forecasting. "Using AutoML for Time Series Forecasting" describes a system that automatically searches over a space built from many low-level building blocks to discover new forecasting models; its effectiveness was demonstrated in the Kaggle M5 forecasting competition, where the automatically generated algorithm placed 138th out of 5,558 entries (the top 2.5%). Unlike the top competing forecasting models, which took months of manual effort to build, our AutoML solution found its model in a short time with moderate compute cost (500 CPUs for 2 hours) and no human intervention.
Better understanding of machine learning algorithms and models
A deep understanding of machine learning algorithms and models is crucial both for designing and training more effective models and for understanding when models will fail. Over the past year we focused on fundamental questions about representational power, optimization, generalization, and label noise. As mentioned earlier, Transformer networks have had a huge impact on modeling language, speech, and vision problems, but what class of functions do these models represent? We recently showed that Transformers are universal approximators of sequence-to-sequence functions, and that even sparse Transformers, which use only a linear number of interactions between tokens, remain universal approximators. We have also been developing new optimization techniques based on layer-wise adaptive learning rates to speed up the convergence of Transformers, for example "Large Batch Optimization for Deep Learning (LAMB): Training BERT in 76 minutes".
As neural networks grow deeper and wider, they tend to train faster and generalize better. Classical learning theory says such large networks should overfit, and it is precisely this departure from classical theory that lets deep learning work so well in practice. We have been working to understand neural networks in the over-parameterized regime. In the limit of infinite width, neural networks take a surprisingly simple form and are described by a neural network Gaussian process (NNGP) or a neural tangent kernel (NTK). We studied this phenomenon theoretically and empirically, and released Neural Tangents, an open-source library written in JAX that lets researchers build and train infinite-width neural networks (i.e., ultra-wide deep networks).
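Using the library looks roughly like the following (API as of its early releases, so details may have changed): define an architecture with the stax-style combinators and query the kernel function of the corresponding infinite-width network.

```python
# Infinite-width kernels with Neural Tangents (illustrative usage).
import jax.numpy as jnp
from neural_tangents import stax

init_fn, apply_fn, kernel_fn = stax.serial(
    stax.Dense(512), stax.Relu(),
    stax.Dense(512), stax.Relu(),
    stax.Dense(1),
)

x_train = jnp.linspace(-3.0, 3.0, 20).reshape(-1, 1)
x_test = jnp.linspace(-3.0, 3.0, 50).reshape(-1, 1)

# Kernels of the corresponding infinite-width network; these can then be used
# for exact Bayesian inference (NNGP) or kernel regression (NTK).
k_train_train = kernel_fn(x_train, x_train, "ntk")
k_test_train = kernel_fn(x_test, x_train, "ntk")
print(k_train_train.shape, k_test_train.shape)    # (20, 20) and (50, 20)
```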
As finite-width networks become wider, they also exhibit a distinctive double-descent phenomenon: generalization first improves, then gets worse, and then improves again as width increases. We showed that this phenomenon can be explained by a new bias-variance decomposition, and that it can manifest as triple descent as networks grow larger still.
Finally, practical problems often involve significant label noise; in large-scale learning settings, for instance, we may only have weakly labeled data with very noisy labels. We developed new techniques for extracting effective supervision from severe label noise, achieving state-of-the-art results. We further analyzed the effects of training neural networks with random labels, showing that such training aligns the network parameters with the input data, which makes subsequent downstream training faster than initializing from scratch. We also examined whether label smoothing or gradient clipping can mitigate label noise, yielding new guidance for training models with noisy labels.
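As a quick reminder of what label smoothing does in this context, the sketch below mixes each one-hot target with a uniform distribution, so a single mislabeled example pulls the model less strongly toward its wrong class; the numbers are illustrative:

```python
# Label smoothing: soften one-hot targets before computing cross-entropy.
import numpy as np

def smooth_labels(one_hot, epsilon=0.1):
    num_classes = one_hot.shape[-1]
    return one_hot * (1.0 - epsilon) + epsilon / num_classes

def cross_entropy(targets, probs):
    return -np.sum(targets * np.log(probs + 1e-12), axis=-1).mean()

hard = np.eye(3)[[0, 2, 1]]                       # three one-hot labels
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.1, 0.8],
                  [0.2, 0.6, 0.2]])

print("loss with hard labels:    ", cross_entropy(hard, probs))
print("loss with smoothed labels:", cross_entropy(smooth_labels(hard), probs))
```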
Algorithmic Foundations and Theory
In 2020 we also achieved significant results in algorithmic foundations and theory, publishing a number of impactful papers. In optimization, our paper on edge-weighted online bipartite matching developed a new technique for online competitive algorithms and solved the thirty-year-old open problem for the edge-weighted variant, with applications to online ad allocation. We also developed dual mirror descent techniques that generalize to a variety of models with diversity and fairness constraints, and published a series of papers on using machine learning to aid online optimization in online scheduling, online learning, and online linear optimization. Another study produced the first breakthrough on the classic bipartite matching problem in dense graphs. Finally, in another paper we settled the long-standing open problem of chasing convex bodies online, with, as it happens, an algorithm from The Book.
We also continued our research in scalable graph mining and graph-based learning, hosting the Graph Mining & Learning at Scale workshop at NeurIPS 2020, which covered scalable graph algorithms for problems including graph clustering, graph embedding, causal inference, and graph neural networks. There we showed how to extend standard synchronous computation frameworks such as MapReduce with distributed hash tables similar to BigTable, improving the theoretical and practical speed of several basic graph problems. An extensive empirical study validated the practical potential of the AMPC model, which was inspired by the distributed hash tables we use in massively parallel algorithms for hierarchical clustering and connected components; the theoretical results show that many such problems can be solved in a constant number of distributed rounds, greatly improving efficiency. We also obtained exponential speedups for PageRank and random-walk computations. In graph learning, we presented Grale, our framework for learning graphs, introduced ways to build more scalable graph neural network models, and showed that PageRank can significantly speed up inference in GNNs.
In market algorithms, an area at the intersection of computer science and economics, we continued studying how to improve online marketplaces, including measuring incentive properties of ad auctions, two-sided markets, and optimizing order statistics in ad selection. For repeated auctions, we developed frameworks that make dynamic mechanisms robust to prediction or estimation errors about current and/or future markets, yielding dynamic mechanisms that are more accurate and verifiable, and we characterized, via geometric criteria, when asymptotically optimal objectives can be achieved. We also compared the equilibrium outcomes of a range of budget-management strategies used in practice, showed how these strategies affect revenue and buyer-optimal equilibria, and clarified their incentive properties. We further studied optimal auction parameterizations, resolved complexity and revenue-loss questions in batch learning, designed mechanisms that are optimal with respect to regret, studied combinatorial optimization for contextual auction pricing, and developed a new active-learning framework for auctions that improves the approximation of bid landscapes. Finally, motivated by the importance of incentives in bidding, and to help advertisers study the incentive properties of bidding campaigns in depth, we introduced a data-driven metric that quantifies how far a given mechanism deviates from incentive compatibility.
Machine Perception
Perceiving the world around us, understanding, modeling, and acting upon visual, auditory, and multimodal input, remains an important research area with great potential; breakthroughs here can significantly improve our daily lives.
In 2020, deep learning drove new approaches that tightly couple 3D computer vision with computer graphics: CvxNet, deep implicit functions for 3D shapes, neural voxel rendering, and CoReNet are representative results in this area. Our work on representing scenes as neural radiance fields (NeRF) is another good example of how Google Research's academic collaborations have pushed forward neural volume rendering.
In the paper "Learning to Factorize and Relight a City" in collaboration with UC Berkeley , we propose a learning framework that can decompose outdoor scenes into time-varying lighting conditions and permanent scene factors. Based on this, we can generate all "street view" panoramic lighting effects and scene geometry at will, and even generate videos that are time-lapsed shots throughout the day.
We are also exploring generative models of human shape and articulated pose, aiming to bring a statistical, articulated 3D human shape modeling pipeline into a fully trainable, modular deep learning framework. Such models can reconstruct a person's 3D pose and shape from a single photo, leading to a better understanding of the scene in the picture.
Attempts to use neural networks for media compression also grew in 2020: beyond image compression, these techniques began to deliver good results in video compression, volumetric compression of depth data, and deep distortion-agnostic image watermarking. Other important topics in perception research include:
- Making better use of data (for example, noisy-student self-training, learning from simulated data, learning from noisy labels, and contrastive learning).
- Cross-modal reasoning (for example, the Open Images V6 update with cross-modal supervision, audio-visual speech enhancement, language grounding, and Localized Narratives, which connect vision and language through multimodal annotations).
- Developing perception methods that are more efficient to run, especially on edge devices (such as fast sparse convolutions and structured multi-hashing for model compression).
- Better representation of and reasoning about objects and scenes (such as detecting 3D objects and predicting 3D shapes, reconstructing 3D scenes from a single RGB image, using temporal context for object detection, and learning to see transparent objects and estimate their pose from stereo cues).
- Supporting human creativity with AI (such as automatically creating videos from web pages, intelligent video reframing, creating fantastical creatures with GANs, and portrait relighting).
We also engage with the wider research community through open-source solutions and datasets, hoping to advance perception research together. In 2020 we open-sourced several new perception capabilities and solutions in MediaPipe, including on-device face, hand, and pose prediction; real-time body-pose tracking; real-time iris tracking and depth estimation; and real-time 3D object detection.
With the support of machine learning, we also keep improving the experience on mobile devices. Being able to run more sophisticated natural language processing on-device enables more natural conversational experiences. In 2020 we expanded Call Screen and released Hold for Me to help users get everyday tasks done faster, and we added natural-language-based actions and navigation to the Recorder app to improve productivity.
We also used Google's Duplex technology to call businesses and confirm things such as temporary closures. This has allowed us to make 3 million updates to business information globally, which have been seen more than 20 billion times in Maps and Search. We also used text-to-speech technology to make web pages easier to access, letting Google Assistant read them aloud in 42 languages.
We keep improving our camera applications as well. In Google Photos, more innovative controls and features for lighting adjustment, editing, enhancement, and relighting help users easily preserve precious memories on Pixel. Starting with the Pixel 4 and 4a, we introduced Live HDR+ in the camera app, which uses machine learning to approximate, in real time in the viewfinder, the dynamic range, exposure, and tone of the final HDR+ burst result. We also developed dual exposure controls, letting users adjust the brightness of shadows and highlights in a scene in real time in the viewfinder.
More recently, we launched Portrait Light, a new post-capture feature for the Pixel Camera and Google Photos apps that adds a simulated directional light source to portraits. This feature also uses machine learning, trained on photos of more than 70 people captured in a Light Stage computational illumination system with 331 LED light sources.
Over the past year, Google researchers have also explored many other product applications, including:
- Using augmented reality to make homework help and the exploration of 3D concepts easier, enhancing learning.
- Implementing background blur in the browser to improve virtual meetings, a feature that has been launched in Google Meet.
- Providing new ways for users to virtually try out products at home.
- Helping users quickly find the most relevant moments in a video through key moments.
- Helping users find songs they have heard by humming.
- Helping YouTube identify potentially harmful content for further human review.
- Helping YouTube creators make better videos through automatic audio enhancement and background-noise reduction.
Robotics
In robotics research, we used several of the reinforcement learning techniques introduced earlier in this article to learn more complex, safer and more robust robot behaviors from less data, and made considerable progress.
Transporter Networks is a new learning method that represents robot manipulation tasks as spatial displacements. Rather than reasoning about absolute positions in the environment, Transporter Networks efficiently relate the representation of an object to that of the robot's end effector, helping the robot quickly learn to act within its current workspace.
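To make the idea of "tasks as spatial displacements" more concrete, below is a minimal, hedged sketch of the placement step as we understand it: a feature crop around the picked object is cross-correlated against the scene feature map, and the highest-scoring offset is proposed as the place location. The learned feature extractor, rotations, and the pick network of the actual method are omitted, and all names are illustrative.

```python
import numpy as np

def place_score_map(scene_feats: np.ndarray, pick_crop: np.ndarray) -> np.ndarray:
    """Cross-correlate a feature crop (around the picked object) with the
    scene feature map to score every candidate placement offset."""
    H, W, C = scene_feats.shape
    h, w, _ = pick_crop.shape
    scores = np.empty((H - h + 1, W - w + 1))
    for i in range(H - h + 1):
        for j in range(W - w + 1):
            window = scene_feats[i:i + h, j:j + w, :]
            scores[i, j] = float(np.sum(window * pick_crop))
    return scores

# Toy usage with random "features" standing in for a learned encoder.
rng = np.random.default_rng(0)
scene = rng.normal(size=(64, 64, 8))
crop = scene[20:30, 40:50, :]            # pretend this is the picked object
best = np.unravel_index(np.argmax(place_score_map(scene, crop)), (55, 55))
print("proposed place offset:", best)    # should recover (20, 40)
```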
In Grounding Language in Play, we show how to teach robots to follow natural language instructions (in multiple languages). Clearly, a scalable way to collect paired data of natural language instructions and robot behavior is needed. We found that we can have human operators interact with the robot through free-form play, then label the resulting behavior with instructions after the fact, gradually guiding the robot to learn how to execute instructions correctly.
We also explored how to make data collection more scalable by gathering data without a robot at all (using a grasping tool equipped with a camera), and how to transfer visual representations more easily across different robot types.
We also studied how to draw inspiration from nature to derive highly agile robot locomotion strategies, using evolutionary meta-learning, human demonstrations, and data-efficient controllers trained with deep reinforcement learning.
This year also brought heightened attention to safety: How can we safely deploy delivery drones in the real world? How can we ensure that robots exploring the world do not end up in irreversible situations? How do we certify the stability of learned behaviors? This is a key research area, and we will continue to explore it actively in the future.
Quantum Computing
Our Quantum AI team continues to explore practical applications of quantum computing. We ran experimental algorithms on the Sycamore processor to simulate systems relevant to chemistry and physics. These simulations are approaching the scale at which classical computers become infeasible, and they essentially validate Feynman's original idea of using quantum computers to simulate systems with important quantum effects. We also published new quantum algorithms, for example for performing precise processor calibration, demonstrating advantages of quantum machine learning, and testing quantum-enhanced optimization. We also released qsim, an efficient simulation tool that allows quantum algorithms with up to 40 qubits to be developed and tested on Google Cloud.
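For readers who want to experiment locally, below is a minimal sketch of simulating a small circuit with qsim through its Cirq interface. The three-qubit GHZ circuit is only an illustrative example, not one of the experiments described above.

```python
import cirq
import qsimcirq  # qsim's Cirq integration

# Build a small illustrative circuit: a 3-qubit GHZ state, then measure.
qubits = cirq.LineQubit.range(3)
circuit = cirq.Circuit(
    cirq.H(qubits[0]),
    cirq.CNOT(qubits[0], qubits[1]),
    cirq.CNOT(qubits[1], qubits[2]),
    cirq.measure(*qubits, key="m"),
)

# qsim acts as a drop-in, high-performance simulator backend for Cirq.
simulator = qsimcirq.QSimSimulator()
result = simulator.run(circuit, repetitions=1000)
print(result.histogram(key="m"))  # expect roughly equal counts of 0 and 7
```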
We are continuing along our roadmap toward building a general-purpose error-corrected quantum computer. Our next milestone is to show that quantum error correction works in practice. To achieve this, we need to demonstrate that, even though individual components such as qubits, couplers and I/O devices are imperfect, larger grids of qubits can achieve exponential improvements in how long logical information can be stored. We are also excited that we now have our own cleanroom, which should greatly improve the speed and quality of processor fabrication.
Supporting the Broader Developer and Researcher Community
In 2020, TensorFlow celebrated its fifth birthday and passed 160 million downloads. The TensorFlow community continues to grow at a remarkable pace, with new special interest groups, TensorFlow user groups, TensorFlow certificates, AI service partners, and inspiring #TFCommunitySpotlight demos. We also brought significant improvements to TF 2.x, with seamless TPU support, out-of-the-box high performance (best-in-class results on MLPerf 0.7), data preprocessing, distribution strategies, and a new NumPy API.
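As one small illustration, the NumPy-style API ships as tf.experimental.numpy. The snippet below is a minimal sketch of mixing NumPy-style calls with regular TensorFlow ops; the specific arrays are arbitrary.

```python
import tensorflow as tf
import tensorflow.experimental.numpy as tnp

# NumPy-style calls that run on TensorFlow tensors and devices.
x = tnp.ones((3, 3), dtype=tnp.float32)
y = tnp.reshape(tnp.arange(9, dtype=tnp.float32), (3, 3))
z = tnp.matmul(x, y)

# The results interoperate directly with ordinary TensorFlow ops.
z = z + tf.ones((3, 3))
print(z)
```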
We also introduced more new capabilities to the TensorFlow ecosystem to help developers and researchers work efficiently: Sounds of India was trained with TFX and deployed in the browser with TF.js, completing the whole journey from research to production in just 90 days. With Mesh TensorFlow, we pushed the boundaries of model parallelism to provide ultra-high-resolution image analysis. We also open sourced the new TF runtime, TF Profiler for model performance debugging, and a variety of responsible AI tools, such as the Model Card Toolkit for model transparency and a privacy testing library. With TensorBoard.dev, you can host, track and share your machine learning experiments for free.
In addition, we further increased our investment in JAX, a research-focused machine learning system that has developed rapidly over the past two years. Researchers at Google and beyond are now using JAX widely, in areas including differential privacy, neural rendering, physics-informed networks, fast attention, molecular dynamics, tensor networks, neural tangent kernels and neural ODEs. JAX has also accelerated research at DeepMind, powering a growing library ecosystem and work on GANs, meta-gradients, reinforcement learning and more. We also used JAX and the Flax neural network library to build record-setting MLPerf benchmark results, which we demonstrated at NeurIPS alongside the next generation of Cloud TPU Pods. Finally, we are working to ensure that JAX works seamlessly with TF ecosystem tools such as TF.data for data preprocessing, TensorBoard for experiment visualization and TF Profiler for performance debugging, with more new features to come in 2021.
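To give a feel for why researchers like JAX, here is a minimal, self-contained sketch of its composable function transformations (grad, jit, vmap) on a toy least-squares loss. It is purely illustrative and not tied to any of the projects above.

```python
import jax
import jax.numpy as jnp

def loss(w, x, y):
    """Mean squared error of a linear model."""
    pred = x @ w
    return jnp.mean((pred - y) ** 2)

grad_fn = jax.jit(jax.grad(loss))                    # compiled gradient w.r.t. w
per_example = jax.vmap(loss, in_axes=(None, 0, 0))   # loss for each example

w = jnp.zeros(3)
x = jnp.arange(12.0).reshape(4, 3)
y = jnp.array([1.0, 2.0, 3.0, 4.0])

print(grad_fn(w, x, y))       # gradient vector of length 3
print(per_example(w, x, y))   # one loss value per example
```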
Access to ever-greater computing power has also driven a series of major breakthroughs. Through the TFRC program, we provide researchers around the world with free access to more than 500 petaflops of Cloud TPU computing power to help the academic community explore machine learning research topics. To date, the academic community has published more than 120 papers supported by TFRC, and a considerable number of these results would not have been possible without the massive computing resources provided by the program. For example, TFRC researchers recently developed a simulation model of wildfire spread, analyzed COVID-19 discourse and changes in vaccine sentiment on social media, and deepened our collective understanding of the lottery ticket hypothesis and neural network pruning. Members of the TFRC community have also published experiments with Persian poetry, won a Kaggle challenge on fine-grained fashion image segmentation, and, just as importantly, shared tutorials and open-source tools. In 2021, Cloud TPUs will add support for JAX and PyTorch alongside TensorFlow, so we intend to rename the TFRC program the TPU Research Cloud program to more clearly reflect its broad and inclusive scope.
Finally, 2020 was also an important year for Colab. Usage doubled, and we launched several production-level features to help users get their work done efficiently, including improved Drive integration and terminal access to Colab virtual machines. We also launched Colab Pro, which gives users access to faster GPUs, longer runtimes and more memory.
Open Datasets and Dataset Search
Open datasets with clear and measurable goals have always played a crucial role in advancing machine learning. To help the research community find interesting datasets, we continue to index open datasets published by many different organizations through Google Dataset Search. We also believe it is important to create new datasets for the community to use in developing new techniques, while ensuring that this open data is shared responsibly. In 2020, in addition to the open datasets that helped address the COVID-19 crisis, we released open datasets in many other fields:
- An analysis of online datasets using Dataset Search: a study based on the metadata of the many datasets indexed there.
- Google compute cluster trace data: in 2011, Google released a 29-day trace of computing activity on one internal compute cluster. That release helped the computer systems community explore job scheduling strategies and understand cluster resource utilization more deeply. In 2020, we released a new, larger version covering eight internal compute clusters with more detailed information.
- The Objectron dataset: this dataset contains 15,000 short, object-centric video clips, each annotated with 3D bounding boxes, capturing a larger set of common objects from multiple angles. The dataset also includes 4 million annotated images collected from a geographically diverse sample (covering 10 countries across five continents).
- Open Images V6 now includes localized narratives: in addition to the 9 million annotated images, 36 million image-level labels, 15.8 million bounding boxes, 2.8 million instance segmentations and 391,000 visual relationships carried over from V5, the new version adds localized narratives, an entirely new form of multimodal annotation consisting of synchronized voice, text and mouse traces over the objects being described. In Open Images V6, localized narratives cover 500,000 images; to facilitate comparison with previous work, we also released localized narratives for the full 123,000-image COCO dataset.
- We collaborated with researchers at the University of Washington and Princeton University to create the Efficient Open-Domain Question Answering challenge and workshop, inviting participants to build systems that can answer arbitrary questions efficiently. For more details on the competition and its findings, see the technical report.
- TyDi QA: a question answering benchmark covering typologically diverse languages (most benchmarks in this field currently support only a single language, and we believe multilingual capabilities deserve far more attention).
- Wiki-40B: a multilingual language model dataset. This new benchmark covers more than 40 languages spanning several scripts and language families. With approximately 40 billion characters, we hope this resource accelerates research in multilingual modeling. We also trained and released high-quality language models on this dataset, making it easy for researchers to compare different techniques on the benchmark (a short loading sketch follows this list).
- XTREME: a large-scale multilingual, multi-task benchmark for evaluating cross-lingual generalization, helping researchers assess how well models generalize across languages in a multi-task setting.
- How to Ask Better Questions? A large-scale dataset for rewriting ill-formed questions, providing 427,719 question pairs across 3,030 domains that can be used to train models to rewrite ill-formed questions into higher-quality forms.
- Open-sourcing Big Transfer (BiT): open-source pre-trained models for exploring the effects of large-scale pre-training in computer vision, which serve as a strong starting point for a wide variety of image tasks (see the loading sketch after this list).
- A benchmark created in collaboration with the University of Victoria, the Czech Technical University and EPFL that poses benchmarking challenges over a set of datasets for recovering 3D structure from motion, using video or still images captured from multiple viewpoints.
- Meta-Dataset: a dataset of datasets for few-shot learning. A long-standing goal in machine learning is to build systems that can generalize from only a few examples of a new task without much additional training; this collection of datasets helps measure progress toward that goal.
- Google Landmarks Dataset v2: a large-scale benchmark for instance-level recognition and retrieval, aimed at large-scale, fine-grained instance recognition and image retrieval over human-made and natural landmarks. GLDv2 is the largest dataset of its kind to date, containing more than 5 million images and 200,000 distinct instance labels. Its test set of 118,000 images carries ground-truth annotations and can be used for a variety of retrieval and recognition tasks.
- Improved research access to Street View panoramas for natural-language navigation tasks: a new open dataset that gives researchers the Street View panoramas needed to reproduce and compare natural-language navigation approaches and other tasks that rely on such data.
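As a hedged illustration of working with one of these resources, the Wiki-40B dataset can be loaded through TensorFlow Datasets. The catalog name "wiki40b", the "en" config, and the availability of prepared data on GCS are assumptions here; check the TFDS catalog to confirm.

```python
import tensorflow_datasets as tfds

# Load a small slice of the English portion of Wiki-40B.
# Dataset/config names and GCS hosting are assumptions; see the TFDS catalog.
ds = tfds.load("wiki40b/en", split="train[:1%]", try_gcs=True)

for example in ds.take(2):
    text = example["text"].numpy().decode("utf-8")
    print(text[:200], "...")
```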
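Similarly, the BiT pre-trained models can be pulled from TF Hub as feature extractors. The exact module handle below is an assumption based on the published naming scheme; browse tfhub.dev for the current BiT model paths.

```python
import tensorflow as tf
import tensorflow_hub as hub

# Assumed module handle for a BiT ResNet-50x1 feature extractor.
BIT_HANDLE = "https://tfhub.dev/google/bit/m-r50x1/1"
bit_model = hub.KerasLayer(BIT_HANDLE)

# Extract features from a batch of (arbitrary) 224x224 RGB images,
# which can then be fine-tuned with a small downstream head.
images = tf.random.uniform((2, 224, 224, 3))
features = bit_model(images)
print(features.shape)
```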
Research Community Interaction
We are passionate about supporting and participating broadly in the research community. In 2020, Google researchers published more than 500 papers at top research venues and also served on program committees and organized workshops, tutorials and other activities. For our specific contributions to major 2020 conferences, see our blog posts on ICLR 2020, CVPR 2020, ACL 2020, ICML 2020, ECCV 2020 and NeurIPS 2020.
In 2020, we invested $37 million in external research, including $8.5 million in COVID research funding, $8 million in inclusion and equity research funding, and $2 million in responsible AI research funding. Last February, we announced the 2019 Google Faculty Research Award recipients, funding research for 150 faculty members around the world; 27% of the recipients came from groups historically underrepresented in technology. We also announced a new Research Scholar Program that supports early-career academics doing research relevant to Google through unrestricted awards. For more than a decade, we have also encouraged doctoral students to apply for Google PhD Fellowships, which provide funding and research mentorship as well as opportunities to connect with other Google PhD Fellows.
We are also continually expanding our efforts to bring new voices into computer science. In 2020, we established the new Award for Inclusion Research program, which supports academic research in computing and technology that benefits groups that have historically received too little attention. The first round of awards funded 16 proposals involving 25 principal researchers, focusing on topics such as diversity and inclusion, algorithmic bias, education innovation, health tools, accessibility, gender bias, AI for social good, security and social justice. We also partnered with the Computing Alliance of Hispanic-Serving Institutions (CAHSI) and the CMD-IT Diversifying Future Leadership in the Professoriate (FLIP) Alliance to support doctoral students from traditionally underrepresented groups in completing the final year of their dissertations.
In 2019, Google's CS Research Mentorship Program (CSRMP) provided mentorship to 37 undergraduate students, helping them gain insight into the research process in computer science. Building on the success of the 2019/2020 school year, we decided to significantly expand the program for 2020/2021, organizing hundreds of Google mentors to provide one-on-one guidance to undergraduates and encouraging more young students from historically marginalized communities to enter computer science research. Finally, in October we gave exploreCSR awards to 50 institutions around the world, recognizing faculty who hold workshops for undergraduates from historically marginalized groups and guide more young people toward computer science research.
Looking to the future
From developing the next generation of AI models to growing the community of researchers, we look forward to what lies ahead.
We will continue to use our AI Principles as a guiding framework, paying close attention to the broad societal impact that technical work can have, so that AI technology is applied responsibly and has a positive impact. The responsible AI work mentioned above is only the tip of the iceberg of Google's research over the past year. Going forward, we will focus on:
- Improving research integrity: ensuring that Google continues to carry out a broad range of research in appropriate ways and offers a comprehensive scientific perspective on a wide variety of interesting and challenging topics.
- Advancing responsible AI: we will continue to focus on hard problems. Google will keep creating new machine learning algorithms to make machine learning more efficient and accessible, finding new ways to counter unfair bias in language models, designing new approaches to protect privacy in learning systems, and more. Just as importantly, beyond our enthusiasm for the technology itself, we will pay close attention to the community's efforts to mitigate potential risks and to ensure that new technologies have an equitable and positive impact on society as a whole.
- Promoting diversity, equity and inclusion: we care deeply about how products and computing systems are built, and want the results to better reflect the experiences and vital interests of people around the world. Within Google Research and beyond, we call on academic and industry partners to work with us on this. Personally, I have invested hundreds of hours in this goal over the past few years, supporting inclusion efforts at Berkeley, Carnegie Mellon, Cornell, Georgia Tech, Howard, the University of Washington and many other organizations. This work matters greatly to me personally, to Google and to the computer science community as a whole.
Finally, looking ahead, I hope to see the emergence of more general machine learning models that depend less on huge amounts of data, can handle multiple modalities, and can flexibly solve new tasks. Advances in machine learning will bring more powerful products to people, including better translation quality, speech recognition, language understanding and creative support for billions of people around the world.
Original link:
https://ai.googleblog.com/2021/01/google-research-looking-back-at-2020.html
Extended reading:
1.6 trillion parameters! Google trains a super artificial intelligence language model, equivalent to 9 GPT-3-InfoQ