2009 TheBerlinSparqlBenchmark

From GM-RKB

Subject Headings: Berlin SPARQL Benchmark (BSBM).

Notes

Cited By

Quotes

Author Keywords

Abstract

The SPARQL Query Language for RDF and the SPARQL Protocol for RDF are implemented by a growing number of storage systems and are used within enterprise and open Web settings. As SPARQL is taken up by the community, there is a growing need for benchmarks to compare the performance of storage systems that expose SPARQL endpoints via the SPARQL protocol. Such systems include native RDF stores as well as systems that rewrite SPARQL queries to SQL queries against non-RDF relational databases. This article introduces the Berlin SPARQL Benchmark (BSBM) for comparing the performance of native RDF stores with the performance of SPARQL-to-SQL rewriters across architectures. The benchmark is built around an e-commerce use case in which a set of products is offered by different vendors and consumers have posted reviews about products. The benchmark query mix emulates the search and navigation pattern of a consumer looking for a product. The article discusses the design of the BSBM benchmark and presents the results of a benchmark experiment comparing the performance of four popular RDF stores (Sesame, Virtuoso, Jena TDB, and Jena SDB) with the performance of two SPARQL-to-SQL rewriters (D2R Server and Virtuoso RDF Views) as well as the performance of two relational database management systems (MySQL and Virtuoso RDBMS).
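To give a feel for the workload described above, the sketch below shows the kind of parameterized product-search query the BSBM query mix revolves around. It is an illustrative template, not a verbatim BSBM query: the bsbm: prefix and property names are assumed shorthands for the benchmark vocabulary, and the %ProductType%, %ProductFeature1% and %x% placeholders stand for parameters that the test driver substitutes with concrete values before each run.

    PREFIX bsbm: <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

    # Product search as a consumer might issue it: find products of a given
    # type that carry a given feature and exceed a numeric property threshold.
    # The %...% placeholders are replaced with concrete values for every
    # query mix, so the template is not executable SPARQL until substitution.
    SELECT DISTINCT ?product ?label
    WHERE {
      ?product rdfs:label ?label .
      ?product a %ProductType% .
      ?product bsbm:productFeature %ProductFeature1% .
      ?product bsbm:productPropertyNumeric1 ?value1 .
      FILTER (?value1 > %x%)
    }
    ORDER BY ?label
    LIMIT 10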

5. Related Work

A benchmark is only a good tool for evaluating a system if the benchmark dataset and the workload are similar to the ones expected in the target use case (Gray, 1993; Guo et al., 2007). As Semantic Web technologies are used within a wide range of application scenarios, a variety of different benchmarks for Semantic Web technologies have been developed.

A widely used benchmark for comparing the performance, completeness and soundness of OWL reasoning engines is the Lehigh University Benchmark (LUBM) (Guo et al., 2005). In addition to the experiment in the original paper, Rohloff et al. (2007) present the results of benchmarking DAML DB, SwiftOWLIM, BigOWLIM and AllegroGraph on a LUBM(8000) dataset consisting of roughly one billion triples. The LUBM benchmark has been extended in (Ma et al., 2006) into the University Ontology Benchmark (UOBM) by adding axioms that make use of all OWL Lite and OWL DL constructs. As both benchmarks predate the SPARQL query language, they do not support benchmarking SPARQL-specific features such as OPTIONAL patterns or the DESCRIBE and UNION operators. Neither benchmark employs benchmarking techniques such as system warm-up, simulating concurrent clients, or executing mixes of parameterized queries in order to test the caching strategy of a system under test (SUT).
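As a rough illustration of the SPARQL-specific constructs mentioned above (using a made-up ex: vocabulary rather than terms from either benchmark): UNION merges alternative graph patterns, OPTIONAL keeps solutions even when a sub-pattern fails to match, and DESCRIBE returns an RDF graph about a resource instead of a bindings table.

    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    PREFIX ex:   <http://example.org/vocab#>   # hypothetical vocabulary

    SELECT ?product ?label ?reviewTitle
    WHERE {
      # UNION: match products that are either laptops or tablets.
      { ?product a ex:Laptop } UNION { ?product a ex:Tablet }
      ?product rdfs:label ?label .
      # OPTIONAL: keep products even if they have no review attached.
      OPTIONAL {
        ?review ex:reviewFor ?product ;
                ex:title     ?reviewTitle .
      }
    }

    # DESCRIBE (a separate request) returns an RDF graph describing the
    # matched resources rather than a table of variable bindings:
    # DESCRIBE ?product WHERE { ?product a ex:Laptop }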

An early SPARQL-specific performance benchmark is the DBpedia Benchmark (Becker, 2008). It measures the execution time of five queries that are relevant in the context of DBpedia Mobile (Becker & Bizer, 2008) against parts of the DBpedia dataset. Compared to the BSBM benchmark, the DBpedia Benchmark has the drawbacks that its dataset cannot be scaled to different sizes and that its queries only test a relatively narrow set of SPARQL features.

A recent SPARQL benchmark is SP2Bench (Schmidt et al., 2008a, 2008b). SP2Bench uses a scalable dataset that reflects the structure of the DBLP Computer Science Bibliography. The benchmark queries are designed for comparing different RDF store layouts and RDF data management approaches. The SP2Bench queries are not parameterized and are not ordered within a use-case-motivated sequence. As the primary interest of the authors is the “basic performance of the approaches (rather than caching or learning strategies of the systems)” (Schmidt et al., 2008a), they opted for cold runs instead of executing queries against warmed-up systems. Because of these differences, the SP2Bench benchmark is likely to be more useful to RDF store developers who want to test “the generality of RDF storage schemes” (Schmidt et al., 2008a), while the BSBM benchmark aims to support application developers in choosing systems that are suitable for mixed query workloads.

A first benchmark for comparing the performance of relational-database-to-RDF mapping tools with that of native RDF stores is presented in (Svihala & Jelinek, 2007). The benchmark focuses on the production of RDF graphs from relational databases and thus tests only SPARQL CONSTRUCT queries. In contrast, the BSBM query mix also contains various SELECT queries.
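To make the distinction concrete (again with a hypothetical ex: vocabulary, not queries taken from either benchmark): a CONSTRUCT query returns an RDF graph, which is what a mapping tool producing RDF from relational data is asked for, whereas a SELECT query returns a table of variable bindings.

    PREFIX ex: <http://example.org/vocab#>   # hypothetical vocabulary

    # CONSTRUCT: build an RDF graph from the matched data.
    CONSTRUCT { ?product ex:hasReview ?review }
    WHERE     { ?review  ex:reviewFor ?product }

    # A SELECT query over the same pattern (shown as a comment, since a
    # single request carries one query) would instead return bindings:
    # SELECT ?product ?review WHERE { ?review ex:reviewFor ?product }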

A benchmarking methodology for measuring the performance of Ontology Management APIs is presented in (García-Castro & Gómez-Pérez, 2005). Like BSBM, this methodology also employs parameterized queries and requires systems to be warmed up before their performance is measured.

Ongoing initiatives in the area of benchmarking Semantic Web technologies are the Ontology Alignment Evaluation Initiative (Caracciolo et al., 2008), which compares ontology matching systems, and the Billion Triple track of the Semantic Web Challenge, which evaluates the ability of Semantic Web applications to process large quantities of RDF data that is represented using different schemata and has partly been crawled from the public Web. Further information about RDF benchmarks and current benchmark results can be found on the ESW RDF Store Benchmarking wiki page.

References

Christian Bizer, and Andreas Schultz. (2009). "The Berlin SPARQL Benchmark."