SPARQL

From BITPlan Wiki
Jump to navigation Jump to search

Content

What is SPARQL

SPARQL is a query language for semantic databases using the Resource Description Framework (RDF) format

Tutorial

There are quite a few tutorials out there for SPARQL e.g.

  1. W3C SPARQL By Example
  2. Apache Jena SPARQL
  3. Wikidata SPARQL tutorial
  4. SPARQL playground (broken link)
  5. source code for SPARQL playground

The W3C tutorial is somewhat outdated and mostly didn't work for me.

The Apache Jena tutorial mostly also works with the Blazegraph database which we'll use in this tutorial. Just the output looks different and some examples won't work as shown in the tutorial.

The Wikidata tutorial is recommended - you might want to try it out after you followed this initial tutorial or if you are not interested to run your own blazegraph installation.

This tutorial is for people which are new to semantic concepts but would like to use an example with a fair amount of data but not too much of complexity in the structure of the data.

Semantic Concepts

Personally I learned Semantic Concepts using Semantic MediaWiki see

  1. Wolfgang Fahl's User page at www.semantic-mediawiki.org
  2. Semantic Concepts Talk at SMWCon 2015

When using SPARQL a tutorial needs to get a slightly different touch, so for those who know the talk above I'll explain some key concepts based on an example using:

  1. Countries
  2. Towns
  3. Municipal Units

Triples

A semantic statement has the form

<subject> <predicate> <object>

e.g.

Dubai is-located-in AE

is such a semantic statement which is also called a Triple.

The natural language statement "Dubai is located in United Arab Emirates" is purposely slightly modified to a more "computer-ready" form. The predicate has been written as is-located-in to make it a proper Identifier. The country-name "United Arab Emirates" has been replaced by its two letter United Nations Location Code AE. A triple like this has a natural graph representation:

TripleStore

A Triplestore is a database that can store and query triples. In fact for educational purposes I have written a simple Triplestore myself:

For that simple triplestore the triples are supplied in Simple Data Interchange Format. Again that format is mostly for educational purposes although it can also be used for small usecases with just a few thousand triples. Please also note that there is no SPARQL support in that project.

For more than a non-educational use a Triplestore is needed that can handle larger amounts of data and support SPARQL. The Wikipedia List of Subject-Predicate-Object Databases shows you some options. For this tutorial we'll use Blazegraph and Apache Jena

Setting up the Blazegraph Triple Store

blazegraph_logo.png

You need Java to be installed on you machine.

Download the blazegraph.jar file from https://github.com/blazegraph/database/releases and start it with

java -jar blazegraph.jar

In fact it's better if you start the jar file with an option to allow bigger xml files to be handled:

java -Djdk.xml.entityExpansionLimit=0 -jar blazegraph.jar

otherwise you might later run into the error:

org.openrdf.rio.RDFParseException: JAXP00010001: The parser has encountered more than "64000" entity expansions in this document; this is the limit imposed by the JDK

you should see

Welcome to the Blazegraph(tm) Database.

Go to http://localhost:9999/blazegraph/ to get started.

And you might want to do just that and click that link.

Where Blazegraph stores it's data

The default setting for Blazegraphs journal file is to use blazegraph.jnl in the directory where you started the jar file. On my Mac OS Laptop the initial file size is some 200 MBytes.

ls -l blazegraph.jnl 
-rw-r--r--  1 wf  staff  209715200  4 Jan 11:50 blazegraph.jnl

The Blazegraph Web UI

Blazegraph1.png

The Web-UI shows the Tabs:

  1. WELCOME
  2. QUERY
  3. UPDATE
  4. EXPLORE
  5. NAMESPACES
  6. STATUS
  7. PERFORMANCE

Let's start with the UPDATE tab to load some sample data.

The sample Data

The human readable form of some of our sample data and their description is available at:

RDF Version of the data


You might want to download and unzip http://unlocode.rkbexplorer.com/models/dump.tgz. The result should be a directory with the following content:

pan:models wf$ls -l
total 31832
-rw-r--r--  1 wf  staff       265  4 Jan 07:27 catalog-v001.xml
-rw-r--r--@ 1 wf  staff     42194 18 Feb  2009 unlocode-countries.rdf
-rw-r--r--@ 1 wf  staff    228389 18 Feb  2009 unlocode-municipalunits.rdf
-rw-r--r--@ 1 wf  staff  16017733 18 Feb  2009 unlocode-towns.rdf

Now drag and drop the three files:

  1. unlocode-countries.rdf
  2. unlocode-municipalunits.rdf
  3. unlocode-towns.rdf

one after another into the field with the text

(Type in or drag a file containing RDF data, ...

and click the update button below the field after each drag&drop operation. The output will be

Modified: 484
Milliseconds: 430
Modified: 1917
Milliseconds: ...
Running update: 287
Modified: 239567
Milliseconds: 2260

The Milliseconds may vary on your machine. If you run into the 64000 entity limit you may need to restart your blazegraph.jar file with the Java VM options outlined above.

Structure of the Example Data

The following diagram shows the structure of the Example Data. Please note the use of UML diagrams which are usually not used to show Ontology structures.

UML Diagram

Example Data

Direct Town - Country Link

Two step - Town - Municipal-Unit - Country

SPARQL Queries

Select all Triples

Now our environment should be ready to hit the "QUERY" tab and enter our first SPARQL query. You might want to simply cut&paste the code from the SPARQL Query descriptions for each example below into the field with the text

(Input a SPARQL query)

and then hit the "Execute" Button below this field.

SPARQL Query

SELECT * 
WHERE {
  ?subject ?predicate ?object
}

Result

Query running ...

Will be visible shortly than you'll see the result table, which will have total results of 242375 triples, displaying the first 50:

subject	                                        predicate	                                        object
<http://unlocode.rkbexplorer.com/id/AEDHF>	<http://www.aktors.org/ontology/portal#has-longitude>	54.5333333
<http://unlocode.rkbexplorer.com/id/AEDHF>	<http://www.aktors.org/ontology/portal#is-located-in>	<http://unlocode.rkbexplorer.com/id/AE>
<http://unlocode.rkbexplorer.com/id/AEDHF>	<http://www.aktors.org/ontology/support#has-pretty-name>	Al Dhafra
<http://unlocode.rkbexplorer.com/id/AEDHF>	rdf:type	<http://www.aktors.org/ontology/portal#Town>
<http://unlocode.rkbexplorer.com/id/AEDUY>	<http://www.aktors.org/ontology/portal#has-latitude>	25.7780637
<http://unlocode.rkbexplorer.com/id/AEDUY>	<http://www.aktors.org/ontology/portal#has-longitude>	55.9310912
<http://unlocode.rkbexplorer.com/id/AEDUY>	<http://www.aktors.org/ontology/portal#is-located-in>	<http://unlocode.rkbexplorer.com/id/AE>
<http://unlocode.rkbexplorer.com/id/AEDUY>	<http://www.aktors.org/ontology/support#has-pretty-name>	Ras Zubbaya (Ras Dubayyah)
<http://unlocode.rkbexplorer.com/id/AEDUY>	rdf:type	<http://www.aktors.org/ontology/portal#Town>
<http://unlocode.rkbexplorer.com/id/AEDXB>	<http://www.aktors.org/ontology/portal#has-latitude>	25.2500000
<http://unlocode.rkbexplorer.com/id/AEDXB>	<http://www.aktors.org/ontology/portal#has-longitude>	55.2666666
<http://unlocode.rkbexplorer.com/id/AEDXB>	<http://www.aktors.org/ontology/portal#is-located-in>	<http://unlocode.rkbexplorer.com/id/AE>
<http://unlocode.rkbexplorer.com/id/AEDXB>	<http://www.aktors.org/ontology/support#has-pretty-name>	Dubai
<http://unlocode.rkbexplorer.com/id/AEDXB>	rdf:type	<http://www.aktors.org/ontology/portal#Town>

Explanation

SELECT *

asked for a selection

WHERE {
  ?subject ?predicate ?object
}

specified a condition. Since we used question marks for the three triple parts we made all three parts of the triple variable so any/each triple in the database will fulfill the condition.

The query shows all triples you uploaded from the RDF files "as is".

Now you can see that RDF unlike SiDiF mostly uses lenghty URLs to express things. So the Triple for Dubai being in AE gets to be:

<http://unlocode.rkbexplorer.com/id/AEDXB>	<http://www.aktors.org/ontology/portal#is-located-in>	<http://unlocode.rkbexplorer.com/id/AE>

And there are multiple triples for the subject <http://unlocode.rkbexplorer.com/id/AEDXB> So lets select only those.

Select by subject

SPARQL Query

SELECT * 
WHERE {
  <http://unlocode.rkbexplorer.com/id/AEDXB> ?predicate ?object
}

Result

We get 5 results:

predicate	                                          object
<http://www.aktors.org/ontology/portal#has-latitude>	  25.2500000
<http://www.aktors.org/ontology/portal#has-longitude>	  55.2666666
<http://www.aktors.org/ontology/portal#is-located-in>	  <http://unlocode.rkbexplorer.com/id/AE>
<http://www.aktors.org/ontology/support#has-pretty-name>  Dubai
rdf:type	                                          <http://www.aktors.org/ontology/portal#Town>

Explanation

This time the subject in the condition was not variable anymore but fixed to <http://unlocode.rkbexplorer.com/id/AEDXB>. We already new that such a subject existed from the query that selected all triples.

Select multiple predicates of one subject in one query

SPARQL Query

SELECT ?lat ?lon
WHERE {
  <http://unlocode.rkbexplorer.com/id/AEDXB> <http://www.aktors.org/ontology/portal#has-latitude> ?lat.
  <http://unlocode.rkbexplorer.com/id/AEDXB> <http://www.aktors.org/ontology/portal#has-longitude> ?lon.                                            
}

result

We get one result.

lat	        lon
25.2500000	55.2666666

explanation

Instead of the asterisk * we had used for the SELECT so far this time we specified two variables:

SELECT ?lat ?lon

and these where used for two conditions:

  <http://unlocode.rkbexplorer.com/id/AEDXB> <http://www.aktors.org/ontology/portal#has-latitude> ?lat.
  <http://unlocode.rkbexplorer.com/id/AEDXB> <http://www.aktors.org/ontology/portal#has-longitude> ?lon.

keeping the subject fixed at <http://unlocode.rkbexplorer.com/id/AEDXB> but varying the predicate:

  1. <http://www.aktors.org/ontology/portal#has-latitude> assigning the result to the variable lat
  2. <http://www.aktors.org/ontology/portal#has-longitude> assigning the result to the variable lon

Using prefixes

Specifiying subject, predicate and object of triples as fully qualified IRIs is cumbersome and not readable at all. It is one of the major obstacles in working with raw RDF. Prefixes are necessary to remedy the situation. With a prefix the full IRI can be abbreviated with a user choosen name.

SPARQL Query

PREFIX unlocode:<http://unlocode.rkbexplorer.com/id/>
SELECT *
WHERE {
  unlocode:AEDXB ?predicate ?object                                           
}

result

We get five results again

predicate	                                          object
<http://www.aktors.org/ontology/portal#has-latitude>	  25.2500000
<http://www.aktors.org/ontology/portal#has-longitude>	  55.2666666
<http://www.aktors.org/ontology/portal#is-located-in>	  <http://unlocode.rkbexplorer.com/id/AE>
<http://www.aktors.org/ontology/support#has-pretty-name>  Dubai
rdf:type	                                          <http://www.aktors.org/ontology/portal#Town>

explanation

The Prefix specification:

PREFIX unlocode:<http://unlocode.rkbexplorer.com/id/>

will replace each prefix use of:

unlocode:

with the specified url and then appending the value after the colon

AEXDB

so

unlocode:AEXDB

is as if we had written:

<http://unlocode.rkbexplorer.com/id/AEXDB>

Using PREFIX is very useful to make your SPARQL queries a lot more readable.

Specifying the type and selecting values

SPARQL Query

PREFIX unlocode:<http://unlocode.rkbexplorer.com/id/>
PREFIX portal: <http://www.aktors.org/ontology/portal#>
PREFIX support: <http://www.aktors.org/ontology/support#>

SELECT *
WHERE {
  ?subject rdf:type portal:Town.
  ?subject portal:has-latitude   ?lat.
  ?subject portal:has-longitude  ?lon.
  ?subject portal:is-located-in  ?locatedIn.
  ?subject support:has-pretty-name ?name.
}

result

We get 48517 results

predicate	                                          object
subject	lat	lon	locatedIn	name
<http://unlocode.rkbexplorer.com/id/ARANA>	-28.4666666	-62.8333333	<http://unlocode.rkbexplorer.com/id/AR-G>	Anatuya
<http://unlocode.rkbexplorer.com/id/ARAND>	-27.6000000	-66.3166666	<http://unlocode.rkbexplorer.com/id/AR-K>	Andalgala
<http://unlocode.rkbexplorer.com/id/ARCCP>	-27.3333333	-65.5833333	<http://unlocode.rkbexplorer.com/id/AR-T>	Conception
<http://unlocode.rkbexplorer.com/id/ARCTC>	-26.8265906	-65.2203670	<http://unlocode.rkbexplorer.com/id/AR-K>	Catamarca
<http://unlocode.rkbexplorer.com/id/ARELB>	-27.9166666	-65.8833333	<http://unlocode.rkbexplorer.com/id/AR-K>	El Bolson
...

explanation

The condition

?subject rdf:type portal:Town.

made sure we only get the triples for subject of the rdf type "Town". Basically this is the set of triples that we imported from unlcode-towns.rdf in the first place.

LIMIT, ORDER and "a" keyword"

SPARQL Query

PREFIX unlocode:<http://unlocode.rkbexplorer.com/id/>
PREFIX portal: <http://www.aktors.org/ontology/portal#>
PREFIX support: <http://www.aktors.org/ontology/support#>

SELECT *
WHERE {
  ?subject a portal:Town.
  ?subject portal:has-latitude   ?lat.
  ?subject portal:has-longitude  ?lon.
  ?subject portal:is-located-in  ?locatedIn.
  ?subject support:has-pretty-name ?name.
}
ORDER BY ?name
LIMIT 10

result

We get 10 results

<http://unlocode.rkbexplorer.com/id/TOEUA>	-21.3666666	-174.9333333	<http://unlocode.rkbexplorer.com/id/TO>	'Eua Island
<http://unlocode.rkbexplorer.com/id/BEGVO>	50.7500000	5.7500000	<http://unlocode.rkbexplorer.com/id/BE-VLI>	's Gravenvoeren
<http://unlocode.rkbexplorer.com/id/NLSGL>	52.2333333	5.1166666	<http://unlocode.rkbexplorer.com/id/NL>	's-Graveland
<http://unlocode.rkbexplorer.com/id/NLGRA>	51.7833333	4.6166666	<http://unlocode.rkbexplorer.com/id/NL>	's-Gravendeel
<http://unlocode.rkbexplorer.com/id/NLHAG>	52.0833333	4.3000000	<http://unlocode.rkbexplorer.com/id/NL>	's-Gravenhage (Den Haag)
<http://unlocode.rkbexplorer.com/id/NLSGM>	51.6666666	4.8000000	<http://unlocode.rkbexplorer.com/id/NL>	's-Gravenmoer
<http://unlocode.rkbexplorer.com/id/NLSGP>	51.4500000	3.9000000	<http://unlocode.rkbexplorer.com/id/NL>	's-Gravenpolder
<http://unlocode.rkbexplorer.com/id/BEGWE>	51.2666666	4.5500000	<http://unlocode.rkbexplorer.com/id/BE-VAN>	's-Gravenwezel
<http://unlocode.rkbexplorer.com/id/NLGRZ>	52.0000000	4.1666666	<http://unlocode.rkbexplorer.com/id/NL>	's-Gravenzande
<http://unlocode.rkbexplorer.com/id/NLSHB>	52.5333333	6.0166666	<http://unlocode.rkbexplorer.com/id/NL>	's-Heerenbroek

explanation

?subject a portal:Town.

is a shortcut for

?subject rdf:type portal:Town.
ORDER BY ?name

sorts the result by the support-has-pretty-name predicate that we put in the ?name variable

LIMIT 10

Sets the maximum amount of results we get to 10.

lookup relation information

SPARQL Query

PREFIX unlocode:<http://unlocode.rkbexplorer.com/id/>
PREFIX portal: <http://www.aktors.org/ontology/portal#>
PREFIX support: <http://www.aktors.org/ontology/support#>

SELECT *
WHERE {
  ?town a portal:Town.
  ?town support:has-pretty-name ?townname.
  ?town portal:has-latitude ?lat.
  ?town portal:has-longitude ?lon.
  ?town portal:is-located-in ?loc.
  ?loc support:has-pretty-name ?locname.
}
ORDER by ?locname ?townname
LIMIT 7

result

town	                                        townname	lat	        lon	        loc	                           locname
<http://unlocode.rkbexplorer.com/id/AFBAG>	Bagram	        34.9500000	69.2500000	<http://unlocode.rkbexplorer.com/id/AF>	AFGHANISTAN
<http://unlocode.rkbexplorer.com/id/AFBIN>	Bamian	        34.8200748	67.8112337	<http://unlocode.rkbexplorer.com/id/AF>	AFGHANISTAN
<http://unlocode.rkbexplorer.com/id/AFBST>	Bost	        31.5832023	64.3602926	<http://unlocode.rkbexplorer.com/id/AF>	AFGHANISTAN
<http://unlocode.rkbexplorer.com/id/AFCCN>	Chakcharan	33.6500012	62.3166972	<http://unlocode.rkbexplorer.com/id/AF>	AFGHANISTAN
<http://unlocode.rkbexplorer.com/id/AFDAZ>	Darwaz	        31.8372189	67.7843650	<http://unlocode.rkbexplorer.com/id/AF>	AFGHANISTAN
<http://unlocode.rkbexplorer.com/id/AFISQ>	Eslam Qal'eh	34.6666666	61.0666666	<http://unlocode.rkbexplorer.com/id/AF>	AFGHANISTAN
<http://unlocode.rkbexplorer.com/id/AFFBD>	Faizabad	37.1214418	70.5785208	<http://unlocode.rkbexplorer.com/id/AF>	AFGHANISTAN

explanation

The information in the ?loc variable is taken to lookup the has-pretty-name information of the country code "pointer" supplied.

double lookup of relation information with FILTER

SPARQL query

PREFIX unlocode:<http://unlocode.rkbexplorer.com/id/>
PREFIX portal: <http://www.aktors.org/ontology/portal#>
PREFIX support: <http://www.aktors.org/ontology/support#>

SELECT ?townname ?lat ?lon ?regionname ?countryname
WHERE {
  ?town a portal:Town.
  ?town support:has-pretty-name ?townname.
  ?town portal:has-latitude ?lat.
  ?town portal:has-longitude ?lon.
  ?town portal:is-located-in ?region.

  ?region support:has-pretty-name ?regionname.
  ?region portal:is-part-of ?country.

  ?country support:has-pretty-name ?countryname.
 
   FILTER regex(?regionname,"bayern","i")
}
ORDER by ?townname
LIMIT 7

result

townname	lat	        lon	        regionname  countryname
Abensberg	48.8157530	11.8473832	Bayern	    GERMANY
Abtswind	49.7707076	10.3745558	Bayern	    GERMANY
Adelsdorf	49.7000000	10.8833333	Bayern	    GERMANY
Affalterbach	49.6166666	11.2000000	Bayern	    GERMANY
Ahorn	        50.2166666	10.9333333	Bayern	    GERMANY
Aichach	        48.4579674	11.1301884	Bayern	    GERMANY
Ainring	        47.8166666	12.9333333	Bayern	    GERMANY

explanation

This time we have left out the linking variables and only selected the has-pretty-name lookup results. Also we have filtered the result with a regular expression that matches any region that has "bayern" in it's pretty name. "i" is regular expression option to ignore upper and lower case letters.

Group by

PREFIX unlocode:<http://unlocode.rkbexplorer.com/id/>
PREFIX portal: <http://www.aktors.org/ontology/portal#>
PREFIX support: <http://www.aktors.org/ontology/support#>

SELECT ?type (COUNT(?type) AS ?typecount)
WHERE {
  ?subject a ?type.
}
GROUP by ?type

result

type	                                                typecount
<http://www.aktors.org/ontology/portal#Town>	        47998
<http://www.aktors.org/ontology/portal#Country>	          237
<http://www.aktors.org/ontology/portal#Municipal-Unit>	  969
owl:Ontology	                                            2

explanation

SELECT ?type (COUNT(?type) AS ?typecount)

Is an aggregate selection together with a corresponding:

GROUP by ?type

If you do not specify the GROUP by clause you get a

MalformedQueryException: Bad aggregate

BIND and String function replace

select ?result {
  BIND(REPLACE("abc","b","x") as ?result)
}

explanation

  BIND(expression as ?result)

will bind the result of an expression to the variable result

  REPLACE(in,find,replace)

replaces the occurances of the string "find" in the string "in" with the string "replace". Our example uses literals not variables as expressions, though. see https://en.wikibooks.org/wiki/SPARQL/Expressions_and_Functions#REPLACE

result

axc

WikiData

The WikiData Project offers a SPARQL based query service at https://query.wikidata.org/

See the

for a multitude of further examples.

See also https://github.com/componavt/wd_book

Query Examples

Whisky Distilleries worldwide

SPARQL Query

# Find World wide whisky distilleries using the Wikidata Query service
# taken from
# http://wiki.bitplan.com/index.php/SPARQL#Whisky_Distilleries_worldwide
# Created 2018-01 by Wolfgang Fahl BITPlan GmbH
# 
# display the results as a map
#defaultView:Map
#
# select the distillery and its coordinates
SELECT ?distillery ?coord 
WHERE 
{
  # any subject
  # which is an instance of
  # https://www.wikidata.org/wiki/Property:P31
  # Whisky distillery
  # https://www.wikidata.org/wiki/Q10373548
  ?distillery wdt:P31 wd:Q10373548.
  # get the coordinate location value for any found distillery
  # https://www.wikidata.org/wiki/Property:P625
  ?distillery wdt:P625 ?coord.
}

link to this query

result

Wdworld2018-01.png Clipped to Scotland

explanation

#defaultView:Map

A comment that instructs the results of the query to be shown as a map by the Wikidata Query service

?item wdt:P31 wd:Q10373548.

Whisky Distilleries in Scotland

SPARQL Query

#defaultView:Map
SELECT ?item ?coord 
WHERE 
{
  ?item wdt:P31 wd:Q10373548.
  ?item wdt:P131 ?located_in .
  ?located_in wdt:P131* wd:Q22.
  ?item wdt:P625 ?coord.
}

Link to this query

result

Wdscotland2018-01.png

explanation

The condition

?located_in wdt:P131* wd:Q22.

is a transitive link condition. The asterisk symbol "*" states that the property https://www.wikidata.org/wiki/Property:P131 (located in the administrative territorial entity) should eventually lead to scotland. As of 2018-01 the map shows less entries then in the previous query since some distilleries do not have a link-chain via https://www.wikidata.org/wiki/Property:P131 (located in the administrative territorial entity) to the entity https://www.wikidata.org/wiki/Q22 (Scotland).

As an example look at https://www.wikidata.org/wiki/Q49646 (Jura distillery). It has a statement https://www.wikidata.org/wiki/Property:P131 (located in the administrative territorial entity) linking to https://www.wikidata.org/wiki/Q111509 (Jura - island in the inner Hebrides of Scotland). Another https://www.wikidata.org/wiki/Property:P131 (located in the administrative territorial entity) links to https://www.wikidata.org/wiki/Q202174 (Argyll and Bute - unitary authority council area in Scotland) which eventually links to https://www.wikidata.org/wiki/Q22 (Scotland - country in North-West Europe, part of the United Kingdom)

Link-Chain

The following graph explains the link chain from the Jura distillery Entity to the Scotland Entity. The nodes are clickable.

The link from https://www.wikidata.org/wiki/Q111509 to https://www.wikidata.org/wiki/Q202174 was added manually by me on 2018-01-04. Quite a few other links to make the query work are still missing.

Whisky Distilleries in Scotland by geographic coordinates

SPARQL Query

# Whisky distilleries in the box between
# Hardoldswick / Unst / Shetland Islands
# and Sligo Northisland which is a rough estimate of a scotland bounding box
# and might capture some Irish distilleries as well
#defaultView:Map
SELECT ?item ?coord 
WHERE 
{
  wd:Q2642797 wdt:P625 ?NECorner .
  wd:Q190002  wdt:P625 ?SWCorner .
  ?item wdt:P31 wd:Q10373548.
  SERVICE wikibase:box {
      ?item wdt:P625 ?coord .
      bd:serviceParam wikibase:cornerSouthWest ?SWCorner .
      bd:serviceParam wikibase:cornerNorthEast ?NECorner .
    }
}

link to this query

result

Wdscotland2018-01bycoordinates.png

explanation

The Northeast corner of Scotland might be defined by Haroldswick on Unst, Shettland Islands https://www.wikidata.org/wiki/Q2642797

The Southwest corner of Scotland would in fact be in Ireland if you try to create a rectangle so in the western province Connacht of Ireland might qualify https://www.wikidata.org/wiki/Q190002

A Wikidata query may use the wikibase box service to filter coordinates that lie within a box. We are using this to find distilleries, where the location is in this box.

Given that the south west corner of the box is in Ireland we might catch some Irish distilleries but unti of 2018-01-05 none showed. Until I added the Bushmills distillery in Country Antrim, Norhern Island https://www.wikidata.org/wiki/Q268267 :-)

French Castles

try it!

# Find french castles using the Wikidata Query service
# taken from
# http://wiki.bitplan.com/index.php/SPARQL#French_Castles
# Created 2018-12 by Wolfgang Fahl BITPlan GmbH
# 
# display the results as a map
#defaultView:Map
#
# select the castle and its coordinates
SELECT ?castle ?coord ?range WHERE {
  # instance of castle
  ?castle wdt:P31 wd:Q23413.
  # located in france
  ?castle ?range wd:Q142.
  # get the coordinates
  ?castle wdt:P625 ?coord.
}


Hotsprings in Italy

try it!

# Find Italian hot springs using the Wikidata Query service
# taken from
# http://wiki.bitplan.com/index.php/SPARQL#Hotsprings_in_Italy
# Created 2018-11 by Wolfgang Fahl BITPlan GmbH
# 
# display the results as a map
#defaultView:Map
#
# select the hot spring and its coordinates
SELECT ?hotspring ?coord ?range WHERE {
  ?hotspring wdt:P31 wd:Q177380.
  ?hotspring ?range wd:Q38.
  ?hotspring wdt:P625 ?coord.
}

List of Gravitational Wave events Wikipedia Articles

see also https://en.wikipedia.org/wiki/Talk:List_of_gravitational_wave_observations#Wikidata_Query_/_List_of_Gravitational_Wave_events

# Find Gravitational wave events
# Created 2018-10-16 by Wolfgang Fahl BITPlan GmbH
#
# select the events
SELECT ?event ?catalogCode ?eventLabel ?lang ?article 
WHERE 
{
  # any subject
  # which is an instance of
  # https://www.wikidata.org/wiki/Property:P31
  # gravitational wave event
  # https://www.wikidata.org/wiki/Q24748034
  ?event wdt:P31 wd:Q24748034.
  OPTIONAL {
    ?event wdt:P528 ?catalogCode
  }
  SERVICE wikibase:label {               # ... include the labels
        bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en"
  }
  OPTIONAL {
      ?article schema:about ?event .
      ?article schema:inLanguage ?lang .
      FILTER (SUBSTR(str(?article), 1, 25) = concat("https://",?lang,".wikipedia.org/"))
  }
}
order by ?lang

try it!

World Wide Nuclear Power Plants

Query

# Find World wide nuclear power plants using the Wikidata Query service
# taken from
# Created 2018-08 by Wolfgang Fahl BITPlan GmbH
# 
# display the results as a map
#defaultView:Map
#
# select the powerplants and its coordinates
SELECT ?plant ?coord 
WHERE 
{
  # any subject
  # which is an instance of
  # https://www.wikidata.org/wiki/Property:P31
  # Nuclear Power Plant
  # https://www.wikidata.org/wiki/Q134447
  ?plant wdt:P31 wd:Q134447.
  # get the coordinate location value for any found distillery
  # https://www.wikidata.org/wiki/Property:P625
  ?plant wdt:P625 ?coord.
}

PopulationGermanProvinces

# Find population of german provinces using the Wikidata Query service
# taken from
# http://wiki.bitplan.com/index.php/SPARQL#PopulationGermanProvinces
# Created 2020-03-22 by Wolfgang Fahl BITPlan GmbH
#
# select the province and it's population
SELECT ?province ?provinceLabel ?pop 
WHERE 
{
  # any subject
  # which is an instance of
  # https://www.wikidata.org/wiki/Property:P31
  # German Bundesland
  # https://www.wikidata.org/wiki/Q1221156
  ?province wdt:P31 wd:Q1221156.
  # get the population
  # https://www.wikidata.org/wiki/Property:P1082
  ?province wdt:P1082 ?pop.
  SERVICE wikibase:label {               # ... include the labels
        bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en"
  }
}

try it!

COVID-19 - background data Population and Location of Countries and Provinces

# Find population of COVID regions using the Wikidata Query service
# taken from
# http://wiki.bitplan.com/index.php/SPARQL#PopulationGermanProvinces
# Created 2020-03-22 by Wolfgang Fahl BITPlan GmbH
#
# select the province and it's population
SELECT ?region ?isocc ?isocode4 ?regionLabel ?pop ?location
WHERE 
{
  # any subject
  # which is an instance of
  # https://www.wikidata.org/wiki/Property:P31
  
  # German Bundesland
  # https://www.wikidata.org/wiki/Q1221156
  { ?region wdt:P31 wd:Q1221156. }
  UNION 
  # Province of China 
  { ?region wdt:P31 wd:Q1615742. }
  #  autonomous region of the People's Republic of China (Q57362) 
  UNION 
  { ?region wdt:P31 wd:Q57362. }
  #  special administrative region of China (Q779415) 
  UNION 
  { ?region wdt:P31 wd:Q779415. }  
  # US state
  UNION
  { ?region wdt:P31 wd:Q35657. }
  #  province of Canada (Q11828004) 
  UNION
  { ?region wdt:P31 wd:Q11828004. } 
  #  territory of Canada (Q9357527) 
  UNION
  { ?region wdt:P31 wd:Q9357527. }    
  # Australian state  
  UNION
  { ?region wdt:P31 wd:Q5852411. }
  #  mainland territory of Australia (Q14192234) 
  UNION
  { ?region wdt:P31 wd:Q14192234. }  
  #  autonomous country within the Kingdom of Denmark (Q66724388) 
  UNION
  { ?region wdt:P31 wd:Q66724388. }   
  # country of the Kingdom of the Netherlands (Q15304003) 
  UNION
  { ?region wdt:P31 wd:Q15304003. }   
  #  overseas department of France (Q202216) 
  UNION
  { ?region wdt:P31 wd:Q202216. }       
  #  overseas territory (Q2327385) former top-level subdivision of France applied to certain overseas entities
  UNION
  { ?region wdt:P31 wd:Q2327385. }  
  # Overseas France (Q203396) 
  UNION
  { ?region wdt:P31 wd:Q203396. }
  #  overseas collectivity (Q719487) type of French territorial collectivity, used for several overseas islands or archipelagos
  UNION
  { ?region wdt:P31 wd:Q719487. }     
  # British Overseas Territories (Q46395) 
  UNION
  { ?region wdt:P31 wd:Q46395. }        
  # Crown dependency (Q185086) self-governing possession of the British crown
  UNION
  { ?region wdt:P31 wd:Q185086. }        
  #  administrative territorial entity of a single country (Q15916867) 
  UNION
  { ?region wdt:P31 wd:Q15916867 . }
  #  dependent territory (Q161243) 
  UNION
  { ?region wdt:P31 wd:Q161243 . }  
  #  unitary state (Q179164) 
  UNION
  { ?region wdt:P31 wd:Q179164 . }
  # sovereign state
  UNION 
  { ?region wdt:P31 wd:Q3624078 . }
  # get the population
  # https://www.wikidata.org/wiki/Property:P1082
  OPTIONAL { ?region wdt:P1082 ?pop. }
  # # https://www.wikidata.org/wiki/Property:P297
  OPTIONAL { ?region wdt:P297 ?isocc. }
  # isocode state/province
  OPTIONAL { ?region wdt:P300 ?isocode4. }
  # https://www.wikidata.org/wiki/Property:P625
  OPTIONAL { ?region wdt:P625 ?location. }
  SERVICE wikibase:label {               # ... include the labels
        bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en"
  }
}

try it!

Provinces with Population and ISOCODE4

# Find population of regions/provinces using the Wikidata Query service (e.g. for COVID-19 data analysis by Johns Hopkins University)
# taken from
# http://wiki.bitplan.com/index.php/SPARQL#COVID-19_-_background_data_Population_and_Location_of_Countries_and_Provinces
# Created 2020-03-22 by Wolfgang Fahl BITPlan GmbH
# simplified / improved by Martin Schibel / Heidelberg 2020-07-09
# select the province and its population
SELECT ?region ?isocc ?isocode4 ?regionLabel ?pop ?location
WHERE 
{
  ?region wdt:P31/wdt:P279* wd:Q10864048. 
  
  FILTER NOT EXISTS {?region wdt:P576 ?end}
  # get the population
  # https://www.wikidata.org/wiki/Property:P1082
  OPTIONAL { ?region wdt:P1082 ?pop. }
  # # https://www.wikidata.org/wiki/Property:P297
  OPTIONAL { ?region wdt:P297 ?isocc. }
  # isocode state/province
  ?region wdt:P300 ?isocode4. 
  # https://www.wikidata.org/wiki/Property:P625
  OPTIONAL { ?region wdt:P625 ?location. }
  SERVICE wikibase:label {               # ... include the labels
        bd:serviceParam wikibase:language "en"
  }
}
order by (?isocode4)

Try it!

Berlin

# Find boroughs of Berlin using the Wikidata Query service
# taken from
# http://wiki.bitplan.com/index.php/SPARQL#Berlin
# Created 2020-04-30 by Wolfgang Fahl RWTH Aachen
#
# select the borough and its coordinates
SELECT ?borough ?boroughLabel ?pop WHERE {
  # instance of borough of berlin
  ?borough wdt:P31 wd:Q821435.
  # get the population
  # https://www.wikidata.org/wiki/Property:P1082
  ?borough wdt:P1082 ?pop.
  SERVICE wikibase:label {               # ... include the labels
        bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en"
  }
}
ORDER BY DESC(?pop)

try it!

Descendants of Queen Victoria

see also http://royal-family.bitplan.com/index.php/Main_Page

# WikiData SPARQL Query
# see http://wiki.bitplan.com/index.php/SPARQL#Descendants_of_Queen_Victoria
#
# Wolfgang Fahl 2018-01-06
#
# get childen of queen victoria
SELECT ?child ?childLabel ?genderLabel ?dob
WHERE {
#  
# child
# https://www.wikidata.org/wiki/Property:P40
# Queen Victoria
# https://www.wikidata.org/wiki/9439
  wd:Q9439 wdt:P40+ ?child.
# gender
# https://www.wikidata.org/wiki/Property:P21
  ?child wdt:P21 ?gender.
# date of birth
# https://www.wikidata.org/wiki/Property:P569         
  ?child wdt:P569 ?dob
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY ?dob

try it!

Conference Series

# WF 2020-06-07

SELECT ?item ?itemLabel 
WHERE 
{
  #  scientific conference series (Q47258130) 
  ?item wdt:P31 wd:Q47258130.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}

try it!

Conferences

# WF 2021-05-26
# scientific conferences
SELECT ?event ?acronymLabel ?eventLabel ?start_time ?end_time ?countryLabel ?locationLabel ?homepage
WHERE 
{
  #  scientific conference  (Q2020153) 
  ?event wdt:P31 wd:Q2020153.
  ?event wdt:P580 ?start_time.
  
  optional {    ?event wdt:P1813 ?acronym }.
 #    ?event wdt:P1545 ?ordinal.
  optional {   ?event wdt:P17 ?country }. 
  optional {   ?event wdt:P276 ?location }.
  optional {   ?event wdt:P856 ?homepage }.
  optional {   ?event wdt:P582 ?end_time }. 
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
} ORDER BY DESC(?start_time)

try it!

battles

Example: Battle of Waterloo

dbpedia

SELECT DISTINCT ?person ?personProp ?work ?year ?country WHERE {
	
	VALUES ?personProp { dbp:commander }
	?work ?personProp ?person;
		  dbo:date ?year;
		  dbo:place ?country .	
	FILTER (isURI(?person))
	FILTER (isURI(?country))
	FILTER (strstarts(str(?person), 'http://dbpedia.org'))
}
ORDER BY ?person

try it!

wikidata

# Battles
# WF 2021-05-07
SELECT ?battle ?battleLabel ?countryLabel ?when WHERE {
  ?battle wdt:P31 wd:Q178561.  
  ?battle wdt:P17 ?country.
  ?battle wdt:P585 ?when.
  SERVICE wikibase:label {
     bd:serviceParam wikibase:language "en" .
  }
}
ORDER BY ?countryLabel ?when

try it!

mines

# truly tabular naive query for 
# Q820477:mine
# generated by trulytabular.py version 0.4.7 on 2023-01-29T17:35:39.675500
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX schema: <http://schema.org/>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
# display the results as a map
#defaultView:Map
SELECT ?mineItem ?mine
  ?coordinate_location
WHERE {
  # instanceof Q820477:mine
  ?mineItem wdt:P31 wd:Q820477.
  # label
  ?mineItem rdfs:label ?mine.  
  FILTER (LANG(?mine) = "en").
  # coordinate location (P625)
  OPTIONAL { 
    ?mineItem wdt:P625 ?coordinate_location. 
  }
}

try it!

Generate Java Code - Descendants of Queen Victoria

# WikiData SPARQL Query
#
# Wolfgang Fahl 2018-01-06
#
# get childen of queen victoria
SELECT ?child ?new ?gender ?dob
WHERE {
#  
# child
# https://www.wikidata.org/wiki/Property:P40
# Queen Victoria via VIAF id
  ?subject wdt:P214 "95738652".
# https://www.wikidata.org/wiki/Q9439
  ?subject wdt:P40+ ?child.
# BIND(CONCAT(STR(?child), "=new Person(",STR($genderLabel),",",STR($dob),")") AS ?new)
  BIND(CONCAT("Person ",
              REPLACE(STR(?child),"http://www.wikidata.org/entity/",""),
              "=new Person(\"",
              STR(?child),"\",\"",
              STR(?birthname),"\",\"",
              STR(?gender),"\",\"",
              STR(?dob),"\");") 
  AS ?new)
#  BIND(IRI(?child) as ?new)
# birthname
# https://www.wikidata.org/wiki/Property:P1477
  ?child wdt:P1477 ?birthname.  
# gender
# https://www.wikidata.org/wiki/Property:P21
  ?child wdt:P21 ?gender.
# date of birth
# https://www.wikidata.org/wiki/Property:P569         
  ?child wdt:P569 ?dob
}
ORDER BY ?dob

try it!

Generated Java Code

Person Q116728=new Person("http://www.wikidata.org/entity/Q116728","Victoria Adelaide Mary Louisa","http://www.wikidata.org/entity/Q6581072","1840-11-21T00:00:00Z");	
Person Q20875=new Person("http://www.wikidata.org/entity/Q20875","Albert Edward","http://www.wikidata.org/entity/Q6581097","1841-11-09T00:00:00Z");	
Person Q155566=new Person("http://www.wikidata.org/entity/Q155566","Alice of Saxe-Coburg and Gotha","http://www.wikidata.org/entity/Q6581072","1843-04-25T00:00:00Z");	
Person Q158140=new Person("http://www.wikidata.org/entity/Q158140","Beatrice Mary Victoria Feodore of the United Kingdom","http://www.wikidata.org/entity/Q6581072","1857-04-14T00:00:00Z");

Wikidata Result Views

Wikidata by default shows results in tabular format.

Some of the examples above where presented as maps. There are further alternatives for Result Views

Other Visualizations of WikiData

Links

Stackoverflow Question Comments and Answers

WikiMedia Languages

see https://stackoverflow.com/questions/48235464/validate-language-code-for-wikimedia-languages

# comment to https://stackoverflow.com/questions/48235464/validate-language-code-for-wikimedia-languages
# Wolfgang Fahl http://www.bitplan.com
# 2018-01-15
SELECT  ?subject ?subjectLabel ?code ?isocode ?speakers  { 
  # Wikimedi language code
  ?subject wdt:P424 ?code.
  # Iso language code
  ?subject wdt:P220 ?isocode.
  # number of speakers
  # https://www.wikidata.org/wiki/Property:P1098
  ?subject wdt:P1098 ?speakers         
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
} 
ORDER by desc(?speakers)

try it!

Author: Wolfgang Fahl