2015 GRaMScalingGraphComputationtoth

From GM-RKB
Jump to navigation Jump to search

Subject Headings: Graph Database Engine, GʀᴀM.

Notes

Cited By

2015

Quotes

Abstract

GʀᴀM is an efficient and scalable graph engine for a large class of widely used graph algorithms. It is designed to scale up to multicores on a single server, as well as scale out to multiple servers in a cluster, offering significant, often over an order-of-magnitude, improvement over existing distributed graph engines on evaluated graph algorithms. GʀᴀM is also capable of processing graphs that are significantly larger than previously reported. In particular, using 64 servers (1, 024 physical cores), it performs a PageRank iteration in 140 seconds on a synthetic graph with over one trillion edges, setting a new milestone for graph engines.

GʀᴀM's efficiency and scalability comes from a judicious architectural design that exploits the benefits of multi-core and RDMA. GʀᴀM uses a simple message-passing based scaling architecture for both scaling up and scaling out to expose inherent parallelism. It further benefits from a specially designed multi-core aware RDMA-based communication stack that preserves parallelism in a balanced way and allows overlapping of communication and computation. A high degree of parallelism often comes at the cost of lower efficiency due to resource fragmentation. GʀᴀM is equipped with an adaptive mechanism that evaluates the cost and benefit of parallelism to decide the appropriate configuration. Combined, these mechanisms allow GʀᴀM to scale up and out with high efficiency.

1. Introduction

Graph structures are prevalent and offer valuable information about relations and connections. For examples, social graphs help reveal the influence of each individual and the formation of communities. Distributed graph computation engines (e.g., Pregel [24], PowerGraph [13], GraphX [14], and PowerLyra [8]) have been proposed to mine insights from graphs through multiple iterations of vertex-centric computation and communication along the edges, with iterations often separated by barriers. Graph computation is particularly challenging to scale because graphs are notoriously hard to partition and small random accesses are dominant in such a workload. Yet, we are seeing the real needs to process increasingly large graphs on the order of hundreds of billions and even trillions of edges [10, 19]. Even with detailed performance studies [15] on the existing graph engines, there remains a lack of understanding on the scaling of such graph engines and its cost.

We present GRAM, a new graph engine that takes advantage of the modern hardware available in data centers, with multi-core servers, abundant memory in a cluster, and high-speed network with RDMA (Remote Direct Memory Access) support [17]. GRAM’s design and implementation are further guided by a careful study on performance, scalability, and cost, with the following key design choices. First, GRAM adopts a deceivingly simple model for both scaling up and scaling out, where we affinitize a thread with each core, with threads (even within a server) communicating through message passing. We choose message passing over shared-memory primitives even for the intra-server case because our evaluation results (Section 5) reveal that, with sufficient batching, message passing exhibits better performance than shared-memory primitives due to better locality and fewer inter-core communications. This model fully exposes inherent hardware-level parallelism. The simplicity of the model makes it easy to understand any potential scalability bottleneck introduced by the software stack, while allowing an efficient implementation that scales.

Second, to scale out, GRAM incorporates a multicoreaware RDMA-based communication stack. The communication stack maximizes parallelism through fine-granularity message buffer pools to reduce interference among …

References

  • 1. The Lemur Project: Clueweb12 Web Graph. http://www.lemurproject.org/clueweb12/webgraph.php/.
  • 2. Daniel J. Abadi, Adam Marcus, Samuel R. Madden, Kate Hollenbach, Scalable Semantic Web Data Management Using Vertical Partitioning, Proceedings of the 33rd International Conference on Very Large Data Bases, September 23-27, 2007, Vienna, Austria
  • 3. Andrew Baumann, Paul Barham, Pierre-Evariste Dagand, Tim Harris, Rebecca Isaacs, Simon Peter, Timothy Roscoe, Adrian Schüpbach, Akhilesh Singhania, The Multikernel: A New OS Architecture for Scalable Multicore Systems, Proceedings of the ACM SIGOPS 22nd Symposium on Operating Systems Principles, October 11-14, 2009, Big Sky, Montana, USA doi:10.1145/1629575.1629579
  • 4. P. Boldi, S. Vigna, The Webgraph Framework I: Compression Techniques, Proceedings of the 13th International Conference on World Wide Web, May 17-20, 2004, New York, NY, USA doi:10.1145/988672.988752
  • 5. Paolo Boldi, Bruno Codenotti, Massimo Santini, Sebastiano Vigna, UbiCrawler: A Scalable Fully Distributed Web Crawler, Software — Practice & Experience, v.34 n.8, p.711-726, 10 July 2004 doi:10.1002/spe.587
  • 6. Paolo Boldi, Massimo Santini, Sebastiano Vigna, A Large Time-aware Web Graph, ACM SIGIR Forum, v.42 n.2, December 2008 doi:10.1145/1480506.1480511
  • 7. Rishan Chen, Mao Yang, Xuetian Weng, Byron Choi, Bingsheng He, Xiaoming Li, Improving Large Graph Processing on Partitioned Graphs in the Cloud, Proceedings of the Third ACM Symposium on Cloud Computing, p.1-13, October 14-17, 2012, San Jose, California doi:10.1145/2391229.2391232
  • 8. Rong Chen, Jiaxin Shi, Yanzhe Chen, Haibo Chen, PowerLyra: Differentiated Graph Computation and Partitioning on Skewed Graphs, Proceedings of the Tenth European Conference on Computer Systems, April 21-24, 2015, Bordeaux, France doi:10.1145/2741948.2741970
  • 9. Raymond Cheng, Ji Hong, Aapo Kyrola, Youshan Miao, Xuetian Weng, Ming Wu, Fan Yang, Lidong Zhou, Feng Zhao, Enhong Chen, Kineograph: Taking the Pulse of a Fast-changing and Connected World, Proceedings of the 7th ACM European Conference on Computer Systems, April 10-13, 2012, Bern, Switzerland doi:10.1145/2168836.2168846
  • 10. A. Ching. Scaling Apache Giraph to a Trillion Edges. Https://www.facebook.com/notes/facebook-engineering/scaling-apache-giraph-to-atrillion-edges/10151617006153920.
  • 11. Michael Curtiss, Iain Becker, Tudor Bosman, Sergey Doroshenko, Lucian Grijincu, Tom Jackson, Sandhya Kunnatur, Soren Lassen, Philip Pronin, Sriram Sankar, Guanghao Shen, Gintaras Woss, Chao Yang, Ning Zhang, Unicorn: A System for Searching the Social Graph, Proceedings of the VLDB Endowment, v.6 n.11, p.1150-1161, August 2013 doi:10.14778/2536222.2536239
  • 12. Aleksandar Dragojević, Dushyanth Narayanan, Orion Hodson, Miguel Castro, FaRM: Fast Remote Memory, Proceedings of the 11th USENIX Conference on Networked Systems Design and Implementation, April 02-04, 2014, Seattle, WA
  • 13. Joseph E. Gonzalez, Yucheng Low, Haijie Gu, Danny Bickson, Carlos Guestrin, PowerGraph: Distributed Graph-parallel Computation on Natural Graphs, Proceedings of the 10th USENIX Conference on Operating Systems Design and Implementation, October 08-10, 2012, Hollywood, CA, USA
  • 14. Joseph E. Gonzalez, Reynold S. Xin, Ankur Dave, Daniel Crankshaw, Michael J. Franklin, Ion Stoica, GraphX: Graph Processing in a Distributed Dataflow Framework, Proceedings of the 11th USENIX Conference on Operating Systems Design and Implementation, October 06-08, 2014, Broomfield, CO
  • 15. Minyang Han, Khuzaima Daudjee, Khaled Ammar, M. Tamer Özsu, Xingfang Wang, Tianqi Jin, An Experimental Comparison of Pregel-like Graph Processing Systems, Proceedings of the VLDB Endowment, v.7 n.12, p.1047-1058, August 2014 doi:10.14778/2732977.2732980
  • 16. Wentao Han, Youshan Miao, Kaiwei Li, Ming Wu, Fan Yang, Lidong Zhou, Vijayan Prabhakaran, Wenguang Chen, Enhong Chen, Chronos: A Graph Engine for Temporal Graph Analysis, Proceedings of the Ninth European Conference on Computer Systems, April 14-16, 2014, Amsterdam, The Netherlands doi:10.1145/2592798.2592799
  • 17. InfiniBand Trade Association. Supplement to Infiniband Architecture Specification Volume 1 Release 1.2.2 Annex A16: RDMA over Converged Ethernet (RoCE), 2010.
  • 18. Anuj Kalia, Michael Kaminsky, David G. Andersen, Using RDMA Efficiently for Key-value Services, Proceedings of the 2014 ACM Conference on SIGCOMM, August 17-22, 2014, Chicago, Illinois, USA doi:10.1145/2619239.2626299
  • 19. Raimondas Kiveris, Silvio Lattanzi, Vahab Mirrokni, Vibhor Rastogi, Sergei Vassilvitskii, Connected Components in MapReduce and Beyond, Proceedings of the ACM Symposium on Cloud Computing, p.1-13, November 03-05, 2014, Seattle, WA, USA doi:10.1145/2670979.2670997
  • 20. Haewoon Kwak, Changhyun Lee, Hosung Park, Sue Moon, What is Twitter, a Social Network Or a News Media?, Proceedings of the 19th International Conference on World Wide Web, April 26-30, 2010, Raleigh, North Carolina, USA doi:10.1145/1772690.1772751
  • 21. Aapo Kyrola, Guy Blelloch, Carlos Guestrin, GraphChi: Large-scale Graph Computation on Just a PC, Proceedings of the 10th USENIX Conference on Operating Systems Design and Implementation, October 08-10, 2012, Hollywood, CA, USA
  • 22. Oliver Lehmberg, Robert Meusel, Christian Bizer, Graph Structure in the Web: Aggregated by Pay-level Domain, Proceedings of the 2014 ACM Conference on Web Science, June 23-26, 2014, Bloomington, Indiana, USA doi:10.1145/2615569.2615674
  • 23. Yucheng Low, Danny Bickson, Joseph Gonzalez, Carlos Guestrin, Aapo Kyrola, Joseph M. Hellerstein, Distributed GraphLab: A Framework for Machine Learning and Data Mining in the Cloud, Proceedings of the VLDB Endowment, v.5 n.8, p.716-727, April 2012 doi:10.14778/2212351.2212354
  • 24. Grzegorz Malewicz, Matthew H. Austern, Aart J.C Bik, James C. Dehnert, Ilan Horn, Naty Leiser, Grzegorz Czajkowski, Pregel: A System for Large-scale Graph Processing, Proceedings of the 2010 ACM SIGMOD International Conference on Management of Data, June 06-10, 2010, Indianapolis, Indiana, USA doi:10.1145/1807167.1807184
  • 25. F. McSherry, M. Isard, and D. G. Murray. Scalability! But at What Cost? In 15th Workshop on Hot Topics in Operating Systems, HotOS '15, 2015.
  • 26. Christopher Mitchell, Yifeng Geng, Jinyang Li, Using One-sided RDMA Reads to Build a Fast, CPU-efficient Key-value Store, Proceedings of the 2013 USENIX Conference on Annual Technical Conference, June 26-28, 2013, San Jose, CA
  • 27. Derek G. Murray, Frank McSherry, Rebecca Isaacs, Michael Isard, Paul Barham, Martín Abadi, Naiad: A Timely Dataflow System, Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles, November 03-06, 2013, Farminton, Pennsylvania doi:10.1145/2517349.2522738
  • 28. Jacob Nelson, Brandon Holt, Brandon Myers, Preston Briggs, Luis Ceze, Simon Kahan, Mark Oskin, Latency-tolerant Software Distributed Shared Memory, Proceedings of the 2015 USENIX Conference on Usenix Annual Technical Conference, p.291-305, July 08-10, 2015, Santa Clara, CA
  • 29. Thomas Neumann, Gerhard Weikum, The RDF-3X Engine for Scalable Management of RDF Data, The VLDB Journal — The International Journal on Very Large Data Bases, v.19 n.1, p.91-113, February 2010 doi:10.1007/s00778-009-0165-y
  • 30. Donald Nguyen, Andrew Lenharth, Keshav Pingali, A Lightweight Infrastructure for Graph Analytics, Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles, November 03-06, 2013, Farminton, Pennsylvania doi:10.1145/2517349.2522739
  • 31. Russell Power, Jinyang Li, Piccolo: Building Fast, Distributed Programs with Partitioned Tables, Proceedings of the 9th USENIX Conference on Operating Systems Design and Implementation, p.1-14, October 04-06, 2010, Vancouver, BC, Canada
  • 32. Vijayan Prabhakaran, Ming Wu, Xuetian Weng, Frank McSherry, Lidong Zhou, Maya Haridasan, Managing Large Graphs on Multi-cores with Graph Awareness, Proceedings of the 2012 USENIX Conference on Annual Technical Conference, p.4-4, June 13-15, 2012, Boston, MA
  • 33. Amitabha Roy, Ivo Mihailovic, Willy Zwaenepoel, X-Stream: Edge-centric Graph Processing Using Streaming Partitions, Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles, November 03-06, 2013, Farminton, Pennsylvania doi:10.1145/2517349.2522740
  • 34. A. Roy, L. Bindschaedler, J. Malicevic, and W. Zwaenepoel. Chaos: Scale-out Graph Processing from Secondary Storage. In Proceedings of the 25th ACM Symposium on Operating Systems Principles, SOSP '15, 2015.
  • 35. Bin Shao, Haixun Wang, Yatao Li, Trinity: A Distributed Graph Engine on a Memory Cloud, Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data, June 22-27, 2013, New York, New York, USA doi:10.1145/2463676.2467799
  • 36. Julian Shun, Guy E. Blelloch, Ligra: A Lightweight Graph Processing Framework for Shared Memory, Proceedings of the 18th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, February 23-27, 2013, Shenzhen, China doi:10.1145/2442516.2442530
  • 37. D. A. Spielmat, Spectral Partitioning Works: Planar Graphs and Finite Element Meshes, Proceedings of the 37th Annual Symposium on Foundations of Computer Science, p.96, October 14-16, 1996
  • 38. Leslie G. Valiant, A Bridging Model for Parallel Computation, Communications of the ACM, v.33 n.8, p.103-111, Aug. 1990 doi:10.1145/79173.79181
  • 39. Kaiyuan Zhang, Rong Chen, Haibo Chen, NUMA-aware Graph-structured Analytics, Proceedings of the 20th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, February 07-11, 2015, San Francisco, CA, USA doi:10.1145/2688500.2688507;


 AuthorvolumeDate ValuetitletypejournaltitleUrldoinoteyear
2015 GRaMScalingGraphComputationtothMing Wu
Fan Yang
Jilong Xue
Wencong Xiao
Youshan Miao
Lan Wei
Haoxiang Lin
Yafei Dai
Lidong Zhou
G Ra M: Scaling Graph Computation to the Trillions10.1145/2806777.28068492015