2023 FasterSortingAlgorithmsDiscover

From GM-RKB
Jump to navigation Jump to search

Subject Headings: Automated Algorithm Design, Program Synthesis, Code Optimization, Machine Learning for Code Generation, Sorting Algorithms.

Notes

Cited By

Quotes

Abstract

Fundamental algorithms such as sorting or hashing are used trillions of times on any given day. As demand for computation grows, it has become critical for these algorithms to be as performant as possible. Whereas remarkable progress has been achieved in the past, making further improvements on the efficiency of these routines has proved challenging for both human scientists and computational approaches. Here we show how artificial intelligence can go beyond the state of the art by discovering hitherto unknown routines. To realize this, we formulated the task of finding a better sorting routine as a single-player game. We then trained a new deep reinforcement learning agent, AlphaDev, to play this game. AlphaDev discovered small sorting algorithms from scratch that outperformed previously known human benchmarks. These algorithms have been integrated into the LLVM standard C++ sort library. This change to this part of the sort library represents the replacement of a component with an algorithm that has been automatically discovered using reinforcement learning. We also present results in extra domains, showcasing the generality of the approach.

Main

Human intuition and know-how have been crucial in improving algorithms. However, many algorithms have reached a stage whereby human experts have not been able to optimize them further, leading to an ever-growing computational bottleneck. The work in classical program synthesis literature, spanning many decades, aims to generate correct programs and/or optimize programs using proxies for latency. These include enumerative search techniques and stochastic search as well as the more recent trend of using deep learning in program synthesis for generating correct programs. Using deep reinforcement learning (DRL), we can take this a step further by generating correct and performant algorithms by optimizing for actual measured latency at the CPU instruction level, by more efficiently searching and considering the space of correct and fast programs compared to previous work.

One of the fundamental questions in computer science is how to sort a sequence. This is taught in elementary computer science classes around the world and is used ubiquitously by a vast range of applications. Decades of computer science research have focused on discovering and optimizing sorting algorithms. A key component of practical solutions is a small sort over a short sequence of elements; this algorithm is called repeatedly when sorting large arrays that use divide-and-conquer approaches. In this work, we focus on two types of small sort algorithm: (1) the fixed sort and (2) the variable sort. Fixed sort algorithms sort sequences of a fixed length (for example, sort 3 can only sort sequences of length 3), whereas variable sort algorithms can sort a sequence of varying size (for example, variable sort 5 can sort sequences ranging from one to five elements).

We formulate the problem of discovering new, efficient sorting algorithms as a single-player game that we refer to as AssemblyGame. In this game, the player selects a series of low-level CPU instructions, which we refer to as assembly instructions, to combine to yield a new and efficient sorting algorithm. This is challenging as the player needs to consider the combinatorial space of assembly instructions to yield an algorithm that is both provably correct and fast. The hardness of the AssemblyGame arises not only from the size of the search space, which is similar to extremely challenging games such as chess and Go, but also from the nature of the reward function. A single incorrect instruction in the AssemblyGame can potentially invalidate the entire algorithm, making exploration in this space of games incredibly challenging.

To play the game, we introduce AlphaDev, a learning agent that is trained to search for correct and efficient algorithms. This agent is comprised of two core components, namely (1) a learning algorithm and (2) a representation function. The AlphaDev learning algorithm can incorporate both DRL as well as stochastic search optimization algorithms to play AssemblyGame. The primary learning algorithm in AlphaDev is an extension of AlphaZero, a well-known DRL algorithm, in which a neural network is trained to guide a search to solve AssemblyGame. The representation function is interchangeable and captures the underlying structure of assembly programs. The primary AlphaDev representation is based on Transformers.

Using AlphaDev, we have discovered fixed and variable sort algorithms from scratch that are both new and more efficient than the state-of-the-art human benchmarks. The fixed sort solutions for sort 3, sort 4 and sort 5 discovered by AlphaDev have been integrated into the standard sort function in the LLVM standard C++ library. This library is used by several million users including universities and numerous international companies. In addition, we analyse the new algorithm discoveries, compare AlphaDev to stochastic search optimization approaches and apply AlphaDev to further domains to showcase the generality of the approach.

...

References

;

 AuthorvolumeDate ValuetitletypejournaltitleUrldoinoteyear
2023 FasterSortingAlgorithmsDiscoverDavid Silver
Martin Riedmiller
Demis Hassabis (1976-)
Oriol Vinyals
Julian Schrittwieser
Thomas Hubert
Yujia Li
Jean-Baptiste Lespiau
Daniel J. Mankowitz
Andrea Michi
Anton Zhernov
Marco Gelmi
Marco Selvi
Cosmin Paduraru
Edouard Leurent
Shariq Iqbal
Alex Ahern
Thomas Köppe
Kevin Millikin
Stephen Gaffney
Sophie Elster
Jackson Broshear
Chris Gamble
Kieran Milan
Robert Tung
Minjae Hwang
Taylan Cemgil
Mohammadamin Barekatain
Amol Mandhane
Pushmeet Kohli
Faster Sorting Algorithms Discovered Using Deep Reinforcement Learning2023