GraphX Graph Processing System

From GM-RKB
 
A [[GraphX Graph Processing System]] is a [[graph data processing system]].
* <B>Context:</B>
** It can be developed by [[Berkeley's AMPLab]].
** It can support: [[PageRank]], [[Connected components]], [[Label propagation]], [[SVD++]], [[Strongly connected components]], [[Triangle count]].
* <B>Counter-Example(s):</B>
** [[Titan DBMS]].
** [[GraphX DBMS]].
** [[GraphBuilder]].
** [[Giraph]].
* <B>See:</B> [[REST Web API]].
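As an illustration of one of the supported algorithms listed above, the following is a minimal single-machine sketch of [[PageRank]] by power iteration. It is plain Python and does not use or reproduce the GraphX API; the function name <code>pagerank</code>, the damping factor of 0.85, the iteration count, and the sample edge list are all assumptions chosen for the example.

```python
from collections import defaultdict


def pagerank(edges, damping=0.85, iterations=20):
    """Power-iteration PageRank over a list of (src, dst) edges.

    Illustrative only: not GraphX's distributed implementation.
    """
    vertices = sorted({v for edge in edges for v in edge})
    out_degree = defaultdict(int)
    for src, _ in edges:
        out_degree[src] += 1

    n = len(vertices)
    rank = {v: 1.0 / n for v in vertices}
    for _ in range(iterations):
        # Rank held by vertices without out-edges is spread evenly.
        dangling = sum(rank[v] for v in vertices if out_degree[v] == 0)
        incoming = defaultdict(float)
        for src, dst in edges:
            # Each vertex splits its rank equally among its out-edges.
            incoming[dst] += rank[src] / out_degree[src]
        rank = {
            v: (1.0 - damping) / n + damping * (incoming[v] + dangling / n)
            for v in vertices
        }
    return rank


# Hypothetical four-vertex graph: a cycle 1 -> 2 -> 3 -> 1 plus 4 -> 1.
edges = [(1, 2), (2, 3), (3, 1), (4, 1)]
ranks = pagerank(edges)
```

In this sketch the ranks always sum to one, and vertex 4, which has no incoming edges, receives only the baseline share.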
----
----
== References ==
 
=== 2017 ===
* https://spark.apache.org/graphx/
** QUOTE: [[GraphX]] is [[Apache Spark's API]] for [[graph processing|graph]]s and [[graph-parallel computation]].
 
=== 2015 ===
* http://en.wikipedia.org/wiki/Graph_database#Distributed_Graph_Processing
** [[GraphX]] is built on the [[Spark cluster computing system]]. Dr. [[Joseph Gonzalez]], the creator of [[GraphLab]], is the project lead.
 
=== 2015 ===
* https://amplab.cs.berkeley.edu/projects/graphx/
** Increasingly, [[data-science application]]s require the [[graph data creation|creation]], [[graph data manipulation|manipulation]], and [[analysis of large graphs]] ranging from [[social network]]s to [[language model]]s. While existing [[graph system]]s (e.g., [[GraphBuilder]], [[Titan]], and [[Giraph]]) address specific stages of a typical [[graph-analytics pipeline]] (e.g., [[graph construction]], [[graph querying|querying]], or [[graph computation|computation]]), they do not address the entire pipeline, forcing the user to deal with multiple systems, complex and brittle file interfaces, and inefficient data movement and duplication. <BR>    The GraphX project unifies graphs and tables, enabling users to express an entire graph analytics pipeline within a single system. The GraphX interactive API makes it easy to build, query, and compute on large distributed graphs. In addition, GraphX includes a growing repository of graph algorithms for a range of analytics tasks. By casting recent advances in graph processing systems as distributed join optimizations, GraphX is able to achieve performance comparable to specialized graph processing systems while exposing a more flexible API. By building on top of recent advances in data-parallel systems, GraphX is able to achieve fault-tolerance while retaining in-memory performance and without the need for explicit checkpoint recovery. <BR>    GraphX is available as part of the Spark Apache Incubator project as of version 0.9.0, and the active research version of GraphX can be obtained from the GitHub project page.
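The idea of unifying graphs and tables described above can be sketched on a single machine: a graph is held as a vertex table and an edge table, and graph operations become joins and group-by aggregations over those tables. The following is a plain Python illustration under that assumption, not GraphX code; the tables, attribute values, and function names (<code>triplets</code>, <code>in_degrees</code>) are hypothetical.

```python
from collections import Counter

# Hypothetical vertex table: vertex id -> attribute.
vertices = {1: "alice", 2: "bob", 3: "carol"}
# Hypothetical edge table: (source id, destination id, attribute).
edges = [(1, 2, "follows"), (2, 3, "follows"), (1, 3, "follows")]


def triplets(vertices, edges):
    """Join each edge row with its source and destination vertex rows."""
    return [(vertices[src], attr, vertices[dst]) for src, dst, attr in edges]


def in_degrees(vertices, edges):
    """Group edges by destination and count them, defaulting to zero."""
    counts = Counter(dst for _, dst, _ in edges)
    return {vid: counts.get(vid, 0) for vid in vertices}


triplet_rows = triplets(vertices, edges)
degree_by_vertex = in_degrees(vertices, edges)
```

In a distributed setting the same join and aggregation would run over partitioned tables, which is where the join optimizations mentioned in the quoted text become relevant.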
 
----
__NOTOC__
[[Category:Concept]]

Latest revision as of 21:36, 12 November 2018