SparqlFilterFlow: SPARQL Query Composition for Everyone

Florian Haag, Steffen Lohmann & Thomas Ertl

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8798)

Included in the following conference series:

European Semantic Web Conference

Abstract

SparqlFilterFlow provides a visual interface for composing SPARQL queries, in particular SELECT and ASK queries. It is based on the intuitive and empirically well-founded filter/flow model, which has been extended to address the specifics of SPARQL and RDF. In contrast to related work, no structured text input is required; queries can be created entirely with graphical elements. This allows even users without expertise in Semantic Web technologies to create complex SPARQL queries with little training. SparqlFilterFlow is implemented in C#, supports a large number of SPARQL constructs, and can be applied to any SPARQL endpoint.

1 Introduction

SPARQL is currently the de facto standard for querying RDF data. It is supported by most triplestores, and many RDF datasets provide SPARQL endpoints [4, 8]. However, writing SPARQL queries is not an easy task and requires knowledge about Semantic Web concepts and technologies. Since average users cannot be expected to have the necessary skills, visual interfaces are needed that hide the SPARQL syntax and provide graphical support for query building.

We present SparqlFilterFlow, a novel approach for visual SPARQL querying based on the filter/flow model (Note 1). It is implemented in C# and uses the Windows Presentation Foundation (WPF) for the graphical user interface. In contrast to related work, no structured text input is required. Instead, queries can be created entirely with graphical elements. SparqlFilterFlow covers most features of SPARQL and can hence also be used to construct complex query expressions. In particular, it enables the creation of SELECT and ASK queries, though it may also be used for other query forms (i.e., CONSTRUCT and DESCRIBE queries) with little variation (Note 2).

2 Related Work

Several attempts to assist in the creation of SPARQL queries have been presented in the last couple of years. For instance, SPARQLViz [9] provides a form-based wizard that guides the user through the query building process. Other form-based approaches are the Graph Pattern Builder of the DBpedia project [6] and Konduit VQB [5]. However, these tools represent the queries in a way that is closely related to the triple syntax of RDF and SPARQL. They do not relieve users of the need to know how SPARQL queries are structured.

An alternative is the use of visual query languages that provide graphical representations for the different SPARQL elements and combine them into node-link diagrams. NITELIGHT [16], iSPARQL [2], and RDF-GL [13] are examples of tools based on visual query languages. A slightly higher degree of abstraction is provided by approaches that use UML-like diagrams to compose SPARQL queries [7]. While these attempts help to lower the barrier for creating correct queries, they still require knowledge of the structure and syntax of SPARQL.

SparqlFilterFlow is more closely related to the idea of using visual pipes to process RDF data. This approach is implemented in the tools DERI pipes [15] and MashQL [14], both of which are inspired by the mashup framework Yahoo! Pipes [3]. However, these attempts focus on rearranging, sorting and transforming data and not on the composition of SPARQL queries.

3 Filter/Flow Model

SparqlFilterFlow is based on the filter/flow model originally introduced by Young and Shneiderman in the context of relational databases and SQL querying [17]. The filter/flow model provides an intuitive representation of Boolean expressions that can be used for data filtering. The expressions are visualized as directed acyclic graphs, where the nodes define the filter criteria and the edges depict the flow of data. The thickness of the edges indicates the number of data items contained in the flow. Conjunctions are modeled as sequential paths and disjunctions as parallel paths.
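
The mapping from graph shape to Boolean logic can be illustrated with a minimal Python sketch. It is not taken from the original prototype or from SparqlFilterFlow; the helper names (`sequence`, `parallel`, `flow`) and the sample records are assumptions made purely for illustration.

```python
from typing import Callable, Iterable, List

Item = dict
Filter = Callable[[Item], bool]


def sequence(*filters: Filter) -> Filter:
    """Filters placed one after another on a path: an item must pass all (AND)."""
    return lambda item: all(f(item) for f in filters)


def parallel(*paths: Filter) -> Filter:
    """Alternative paths that merge again: passing any one of them suffices (OR)."""
    return lambda item: any(p(item) for p in paths)


def flow(items: Iterable[Item], path: Filter) -> List[Item]:
    """Return the items that make it through the given path."""
    return [item for item in items if path(item)]


# Hypothetical sample data, not taken from any real dataset.
papers = [
    {"title": "A", "year": 2011, "venue": "ESWC"},
    {"title": "B", "year": 2012, "venue": "ISWC"},
    {"title": "C", "year": 2013, "venue": "ESWC"},
    {"title": "D", "year": 2010, "venue": "ESWC"},
]

is_eswc = lambda p: p["venue"] == "ESWC"
in_2011 = lambda p: p["year"] == 2011
in_2013 = lambda p: p["year"] == 2013

# venue = ESWC AND (year = 2011 OR year = 2013)
query = sequence(is_eswc, parallel(in_2011, in_2013))
result = flow(papers, query)

# The edge thickness in the model reflects how many items remain in a flow.
remaining = len(result)
```

In this sketch, `remaining` plays the role of the value that an edge's thickness would encode.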

Several improvements to the original filter/flow idea have been proposed over the years. We developed an extended filter/flow model that incorporates the most common ones [11]. In that model, flows are linked to explicit connection points on the filter nodes called receptors and emitters. This allows the filter nodes to receive data from several inbound flows that can be processed in different ways. Likewise, there can be several outbound flows, each representing another filter function. Along with these changes, filter nodes in the extended model are not restricted to atomic operations but can consider several filtering parameters. Finally, the extended model defines special nodes that display the result of filtering and can be placed at arbitrary positions in the graph, like any other filter node. This way, not only the final result set but also intermediate results can be shown.
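
A rough data-structure sketch of such a node is given below. It is an assumption-laden simplification rather than the structure used in [11]: the class name `FilterNode`, the single merging receptor, and the sample records are hypothetical, and the extended model allows several receptors whose flows are processed in different ways.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Item = dict


@dataclass
class FilterNode:
    """A filter node with one receptor and several named emitters (hypothetical)."""
    name: str
    # Each emitter applies its own filter function to the received items.
    emitters: Dict[str, Callable[[Item], bool]]
    inbound: List[Item] = field(default_factory=list)

    def receive(self, items: List[Item]) -> None:
        # Simplification: all inbound flows end in one receptor and are merged.
        self.inbound.extend(items)

    def emit(self, emitter: str) -> List[Item]:
        # The outbound flow of an emitter holds the items passing its filter.
        return [item for item in self.inbound if self.emitters[emitter](item)]


year_node = FilterNode(
    name="year",
    emitters={
        "2011": lambda p: p["year"] == 2011,
        "2012": lambda p: p["year"] == 2012,
        "2011-2012": lambda p: 2011 <= p["year"] <= 2012,
    },
)

# Two inbound flows with hypothetical sample records.
year_node.receive([{"title": "A", "year": 2011}, {"title": "B", "year": 2012}])
year_node.receive([{"title": "C", "year": 2013}])

# A result node attached to any emitter could display such intermediate counts.
counts = {name: len(year_node.emit(name)) for name in year_node.emitters}
```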

Overall, the filter/flow graphs of the extended model have a smaller size and complexity, with positive effects on their readability, as we found in a comparative user study [12].

4 SparqlFilterFlow

SparqlFilterFlow implements our approach of applying the extended filter/flow model to SPARQL querying [10]. Users can visually compose queries by adding filter nodes and using drag-and-drop to connect them with flows.

Example. Figure 1 shows a screenshot of a filter/flow graph created with SparqlFilterFlow on the RDF dataset of Faceted DBLP [1]. Examples like this one can be created for different RDF datasets and will be shown in the ESWC demo.

The filtering starts in the initial nodes of the graph, which are the ones without inbound flows, in this case the two type nodes selecting all authors (foaf:Agent) and proceedings papers (swrc:InProceedings). Both sets are then gradually reduced by the subsequent filter nodes. Following the filter/flow metaphor, the thickness of the flows indicates the relative size of the sets. This helps users determine whether a given filter node has a significant effect on the data—which is the case if the thickness of the outbound flows is visibly reduced compared to the inbound flows—or even blocks the whole set.

In the example of Fig. 1, only papers presented at ESWC (dblp-conf:esws) in the years 2011 to 2013 (dcterms:issued) are considered. This set of papers is then used to filter the set of authors by selecting only those people that co-authored one of the papers (dc:creator). The data stream is additionally split up into four sets, with the first three containing the ESWC authors from the individual years, and the fourth containing the ESWC authors from all three years. Finally, the four sets of authors are bundled with a corresponding node.

Filter Nodes. The example illustrates some of the filter nodes provided by SparqlFilterFlow. Their settings can be directly manipulated by users. The basic group of filters compares IRIs or literals, such as strings, numbers or dates, with operators like equality, greater or less than. Certain attributes of a literal can also be restricted in these filters, such as its language tag or its length in characters. Another group of filters examines the RDF graph structure, for instance the existence of a given property. There are also filters that help organize the structure of the filter/flow graph, including filters that bundle different sets to run in a single flow. Finally, there are specializations of general filter nodes that predefine frequently applied restrictions to ease query composition. An example is the type filter in Fig. 1, which is a specialization of a comparison filter.

Result Nodes. Once the desired restrictions have been defined by combining filter nodes, users can add result nodes that apply the restrictions in SELECT or ASK queries and display the result. In Fig. 1, two result nodes have been inserted: one showing the number of authors in each of the sets (by using a SPARQL COUNT function along with the SELECT query), and one showing whether there are any results at all (by applying a SPARQL ASK query) (Note 3).
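
The difference between the two kinds of result nodes can be sketched in Python as two query templates wrapped around the same graph pattern. This is an illustration only and not the query text generated by SparqlFilterFlow: the function names, the use of DISTINCT, and the simplified pattern (authors of proceedings papers, without the conference and year restrictions) are assumptions.

```python
# Prefixes for the vocabularies named in the text (foaf, swrc, dc).
PREFIXES = (
    "PREFIX foaf: <http://xmlns.com/foaf/0.1/>\n"
    "PREFIX swrc: <http://swrc.ontoware.org/ontology#>\n"
    "PREFIX dc: <http://purl.org/dc/elements/1.1/>\n"
)

# Assumed, simplified graph pattern: agents that authored a proceedings paper.
PATTERN = (
    "?author a foaf:Agent . "
    "?paper a swrc:InProceedings ; dc:creator ?author ."
)


def select_count_query(pattern: str, variable: str = "?author") -> str:
    """SELECT query that counts the distinct bindings of one variable."""
    return (
        PREFIXES
        + f"SELECT (COUNT(DISTINCT {variable}) AS ?count)\n"
        + f"WHERE {{ {pattern} }}"
    )


def ask_query(pattern: str) -> str:
    """ASK query that only reports whether the pattern has any solution."""
    return PREFIXES + f"ASK {{ {pattern} }}"


count_query = select_count_query(PATTERN)
exists_query = ask_query(PATTERN)
```

Because both templates share one pattern, the ASK query here yields true exactly when the count is greater than zero.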

The results reveal that the total number of ESWC authors was lower in 2012 than in the other two years. In addition, it becomes apparent that the total number of authors across the three considered years is barely lower than the sum of the author counts per year, indicating that many of the authors contributed in only one of the years.

SPARQL Queries. Several SPARQL queries are generated and processed during the composition of the graph. Most obviously, the result nodes issue one or more SPARQL queries when they are inserted into the graph to retrieve the values to be displayed. As an example, the SPARQL query generated to get the number of ESWC authors for the year 2011, as given by the first value in the left result node, is shown in Fig. 1.

However, SPARQL queries are also generated at other points in the graph, in particular for every emitter to determine the thickness of the outbound flows. The query expression generator of SparqlFilterFlow traverses the graph in the upstream direction, starting at the emitter that issued the SPARQL query. It gradually constructs the query that comprises the conjunctions, disjunctions and filter functions defined by the partial graph reachable upstream, usually but not exclusively by adding statements to the WHERE clause of the query. Whenever any part of the graph structure or filter node settings changes, all nodes reachable downstream from the changed graph part may be affected and are thus notified, whereupon they reissue their SPARQL queries.
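
The two traversals can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the generator implemented in SparqlFilterFlow: the `Node` class and both functions are hypothetical, every node is assumed to have a single receptor, several inbound flows are treated as a disjunction (UNION), the literal form of the year values is a guess, and PREFIX declarations are omitted.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(eq=False)  # identity-based hashing, so nodes can be stored in a set
class Node:
    label: str
    pattern: str  # SPARQL fragment this node contributes (may be empty)
    inputs: List["Node"] = field(default_factory=list)   # upstream neighbours
    outputs: List["Node"] = field(default_factory=list)  # downstream neighbours
    stale: bool = False  # True once the node has to reissue its queries


def connect(source: Node, target: Node) -> None:
    source.outputs.append(target)
    target.inputs.append(source)


def where_clause(node: Node) -> str:
    """Walk upstream from `node` and assemble the body of a WHERE clause."""
    if not node.inputs:
        upstream = ""
    elif len(node.inputs) == 1:
        # Sequential path: conjunction, patterns are simply concatenated.
        upstream = where_clause(node.inputs[0])
    else:
        # Parallel paths merging here: disjunction, expressed with UNION.
        upstream = " UNION ".join(
            "{ " + where_clause(source) + " }" for source in node.inputs
        )
    return (upstream + " " + node.pattern).strip()


def notify_downstream(changed: Node) -> List[str]:
    """Mark the changed node and everything reachable downstream as stale."""
    seen, order, stack = set(), [], [changed]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        node.stale = True
        order.append(node.label)
        stack.extend(node.outputs)
    return order


papers = Node("papers", "?paper a swrc:InProceedings .")
y2011 = Node("2011", '?paper dcterms:issued "2011" .')
y2012 = Node("2012", '?paper dcterms:issued "2012" .')
merge = Node("merge", "")
authors = Node("authors", "?paper dc:creator ?author .")

for source, target in [(papers, y2011), (papers, y2012),
                       (y2011, merge), (y2012, merge), (merge, authors)]:
    connect(source, target)

query = "SELECT ?author WHERE { " + where_clause(authors) + " }"
affected = notify_downstream(y2011)
```

In this sketch a change to the `2011` node marks only that node, `merge` and `authors` as stale, while `papers` and `2012` keep their cached results.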

5 Conclusion and Future Work

SparqlFilterFlow enables the composition of SPARQL queries using exclusively graphical elements and simple text strings, while avoiding any structured text input. It requires no knowledge of Semantic Web concepts beyond a basic understanding of the RDF idea. It can be applied to any SPARQL endpoint and allows for creating complex SPARQL queries with little training. Results from a qualitative user study indicate that the approach is comparatively usable and easy to learn [10].

Future work includes support for the creation of DESCRIBE and CONSTRUCT queries besides SELECT and ASK queries. This will require the integration of additional visualization and interaction concepts, such as an intuitive way to specify the graph structure for the result of the CONSTRUCT query. Another goal of future work is the development of features that suggest appropriate filter nodes and values based on the schema information available in the RDF data.

Notes

1. While this demo paper presents the interactive implementation, the concept of applying the filter/flow model to SPARQL querying is described in more depth in [10].
2. A screencast of SparqlFilterFlow and a lightweight web demo with limited functionality are publicly available at http://www.sparql.visualdataweb.org.
3. The result node for the ASK query is only added for illustration purposes in this case, as it is somewhat redundant with the result node applying the COUNT function.

References

1. Faceted DBLP. http://www.dblp.l3s.de
2. OpenLink iSPARQL. http://www.oat.openlinksw.com/isparql/
3. Pipes: Rewire the web. http://www.pipes.yahoo.com/pipes/
4. SPARQL endpoints status. http://www.sparqles.okfn.org
5. Ambrus, O., Möller, K., Handschuh, S.: Konduit VQB: a visual query builder for SPARQL on the social semantic desktop. In: Proceedings of VISSW ’10, CEUR-WS, vol. 565 (2010)
6. Auer, S., Bizer, C., Kobilarov, G., Lehmann, J., Cyganiak, R., Ives, Z.G.: DBpedia: a nucleus for a web of open data. In: Aberer, K., Choi, K.-S., Noy, N., Allemang, D., Lee, K.-I., Nixon, L.J.B., Golbeck, J., Mika, P., Maynard, D., Mizoguchi, R., Schreiber, G., Cudré-Mauroux, P. (eds.) ASWC 2007 and ISWC 2007. LNCS, vol. 4825, pp. 722–735. Springer, Heidelberg (2007)
7. Barzdins, G., Rikacovs, S., Zviedris, M.: Graphical query language as SPARQL frontend. In: Proceedings of 13th East-European Conference (ADBIS ’09), pp. 93–107 (2009)
8. Bizer, C., Heath, T., Berners-Lee, T.: Linked data - the story so far. Int. J. Semant. Web Inf. Syst. 5(3), 1–22 (2009)
9. Borsje, J., Embregts, H.: Graphical query composition and natural language processing in an RDF visualization interface. Bachelor thesis, EUR (2006)
10. Haag, F., Lohmann, S., Bold, S., Ertl, T.: Visual SPARQL querying based on extended filter/flow graphs. In: Proceedings of AVI ’14 (To appear)
11. Haag, F., Lohmann, S., Ertl, T.: Simplifying filter/flow graphs by subgraph substitution. In: Proceedings of VL/HCC ’12, pp. 145–148. IEEE (2012)
12. Haag, F., Lohmann, S., Ertl, T.: Evaluating the readability of extended filter/flow graphs. In: GI ’13, pp. 33–36. CIPS (2013)
13. Hogenboom, F., Milea, V., Frasincar, F., Kaymak, U.: RDF-GL: a SPARQL-based graphical query language for RDF. In: Emergent Web Intelligence: Advanced Information Retrieval, pp. 87–116. Springer, Heidelberg (2010)
14. Jarrar, M., Dikaiakos, M.D.: MashQL: a query-by-diagram topping SPARQL. In: Proceedings of ONISW ’08, pp. 89–96. ACM (2008)
15. Morbidoni, C., Polleres, A., Phuoc, D.L., Tummarello, G.: Semantic web pipes. Technical report 2007-11-07, DERI (2007)
16. Russell, A., Smart, P., Braines, D., Shadbolt, N.: NITELIGHT: a graphical tool for semantic query construction. In: Proceedings of SWUI ’08, CEUR-WS, vol. 543 (2008)
17. Young, D., Shneiderman, B.: A graphical filter/flow representation of boolean queries: a prototype implementation and evaluation. J. Am. Soc. Inf. Sci. 44(6), 327–339 (1993)

Author information

Authors and Affiliations

Institute for Visualization and Interactive Systems (VIS), University of Stuttgart, Universitätsstr. 38, 70569, Stuttgart, Germany
Florian Haag, Steffen Lohmann & Thomas Ertl

Corresponding author

Correspondence to Steffen Lohmann.

Editor information

Editors and Affiliations

ISTC-CNR, Rome, Italy
Valentina Presutti
Linköping University, Linköping, Sweden
Eva Blomqvist
EURECOM, Biot, France
Raphael Troncy
Hasso-Plattner-Institut, Potsdam, Brandenburg, Germany
Harald Sack
Ionian University, Corfu, Greece
Ioannis Papadakis
Elsevier B.V., Amsterdam, The Netherlands
Anna Tordai

About this paper

Cite this paper

Haag, F., Lohmann, S., Ertl, T. (2014). SparqlFilterFlow: SPARQL Query Composition for Everyone. In: Presutti, V., Blomqvist, E., Troncy, R., Sack, H., Papadakis, I., Tordai, A. (eds) The Semantic Web: ESWC 2014 Satellite Events. ESWC 2014. Lecture Notes in Computer Science, vol 8798. Springer, Cham. https://doi.org/10.1007/978-3-319-11955-7_49

DOI: https://doi.org/10.1007/978-3-319-11955-7_49
Published: 16 October 2014
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11954-0
Online ISBN: 978-3-319-11955-7
eBook Packages: Computer Science, Computer Science (R0)
