Rock Band Data Modeling Competition


DBPedia is a semi-structured knowledge graph of facts derived from the infoboxes on Wikipedia. Being semi-structured, it has no actual schema. The goal of this competition is to develop a strong schema that turns this data into a well-structured database representing bands.

What you will have when done: a beautiful, checked schema that lets you explore bands, their members and other data about them.

Step 0: DBPedia

First, install the TerminusDB desktop client. Open the console and connect to Hub. If you don't already have an account, sign up for a free one. Click on "Clone" at the top of the screen and you will be presented with a list of options for download. Make a clone or a fork of DBPedia.

Once the download has finished, scroll to DBPedia in your list of databases and click on it. This will bring you to the main overview. Click on the "Query" tab and you are ready for Step 1.

Step 1: Discovery

Since DBPedia is semi-structured, we do not necessarily know what is in the database. Luckily, the names of objects in the database are based on Wikipedia, so we can easily look an object up by its Wikipedia name. For instance, the following query asks for everything recorded about the band The Police:

triple("doc:The_Police", "v:P", "v:Z")

This will give you a list of every property and every value that is available for The Police.
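To make clear what such a triple pattern computes, here is a minimal Python sketch of pattern matching over an in-memory list of triples. It is not how TerminusDB evaluates queries. The sample triples, the property names and the helper names `is_var` and `match` are all assumptions made for illustration and are not taken from DBPedia.

```python
# Hypothetical sample triples; property names are illustrative, not DBPedia's.
TRIPLES = [
    ("doc:The_Police", "scm:genre", "doc:New_wave_music"),
    ("doc:The_Police", "scm:formerBandMember", "doc:Sting_(musician)"),
    ("doc:The_Police", "scm:formerBandMember", "doc:Stewart_Copeland"),
    ("doc:Sting_(musician)", "scm:birthPlace", "doc:Wallsend"),
]


def is_var(term):
    """Terms prefixed with 'v:' are variables, as in the queries in this guide."""
    return term.startswith("v:")


def match(pattern, triples):
    """Yield one variable binding per stored triple that fits the pattern."""
    for stored in triples:
        binding = {}
        for term, value in zip(pattern, stored):
            if is_var(term):
                # A repeated variable must bind to the same value each time.
                if binding.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield binding


rows = list(match(("doc:The_Police", "v:P", "v:Z"), TRIPLES))
properties = sorted({row["v:P"] for row in rows})
print(properties)  # ['scm:formerBandMember', 'scm:genre']
```

Collecting the distinct values bound to "v:P" in this way gives a first list of candidate properties for the schema.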

Step 2: The Schema

Now that you have some idea of which properties exist, you can start formulating a schema. Create a new database and use its schema builder. This database will be the scaffold into which we copy data from DBPedia.

NOTE: When creating the new database, you should select [x] Show Advanced Settings and make sure that you enter the following document and schema prefixes:

Base URL for Data:

Base URL for Schema:

The schema builder lets you choose data-types and objects. All data-types can safely be specified as strings, but objects that have further properties of their own should be given their own types in the schema. For instance, the following query follows a band's former members to see what is recorded about them:

and(triple("doc:The_Police", "scm:formerBandMember", "v:FormerMember"),
    triple("v:FormerMember", "v:P", "v:Z"))

This will give us all former band members, and all properties and values that these former band members have. By exploring the graph you should be able to find many of the aspects of interest.
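One way to turn the results of such a two-step query into schema decisions is to count how often each property appears on the linked objects. The Python sketch below imitates the join on the shared variable "v:FormerMember" over a small in-memory sample. The sample triples and property names are assumptions for illustration only, not actual DBPedia content.

```python
from collections import Counter

# Hypothetical sample triples; property names are illustrative, not DBPedia's.
TRIPLES = [
    ("doc:The_Police", "scm:formerBandMember", "doc:Sting_(musician)"),
    ("doc:The_Police", "scm:formerBandMember", "doc:Stewart_Copeland"),
    ("doc:Sting_(musician)", "scm:birthPlace", "doc:Wallsend"),
    ("doc:Sting_(musician)", "scm:instrument", "doc:Bass_guitar"),
    ("doc:Stewart_Copeland", "scm:birthPlace", "doc:Alexandria,_Virginia"),
]

# First pattern: bind v:FormerMember to each former member of the band.
members = {
    o for s, p, o in TRIPLES
    if s == "doc:The_Police" and p == "scm:formerBandMember"
}

# Second pattern: every property and value of each bound member.
member_facts = [(s, p, o) for s, p, o in TRIPLES if s in members]

# Properties carried by many members are good candidates for the schema.
property_counts = Counter(p for _, p, _ in member_facts)
print(property_counts.most_common())
# [('scm:birthPlace', 2), ('scm:instrument', 1)]
```

A property that shows up on most members is a strong candidate for the member type, while one that shows up rarely may be better left optional.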

Step 3: Populating the Schema (Bonus Round)

Once you have a well-structured schema, we can populate it with data from DBPedia. To do this, the new database must already exist. Suppose we called our new database "bands". We can then copy a specific property for two objects already defined in our schema (scm:Band and scm:BandMember) into the database as follows:

and(triple("v:Band", "scm:formerBandMember", "v:FormerMember"),
    and(add_triple("v:Band", "rdf:type", "scm:Band"),
        add_triple("v:Band", "scm:formerBandMember", "v:FormerMember"),
        add_triple("v:FormerMember", "rdf:type", "scm:BandMember")))
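The effect of the copy query can be pictured as follows: for every match of the first pattern, three triples are written to the target. This is a rough Python sketch over in-memory data, not TerminusDB's implementation. The function name `copy_former_members` and the sample triples are assumptions for illustration.

```python
def copy_former_members(source):
    """Build the target triples produced by the copy, one match at a time."""
    target = set()  # a set, so a band's type triple is stored only once
    for band, prop, member in source:
        if prop != "scm:formerBandMember":
            continue  # only this one property is copied
        target.add((band, "rdf:type", "scm:Band"))
        target.add((band, "scm:formerBandMember", member))
        target.add((member, "rdf:type", "scm:BandMember"))
    return target


# Hypothetical source data; the genre triple is deliberately not copied.
source = [
    ("doc:The_Police", "scm:formerBandMember", "doc:Sting_(musician)"),
    ("doc:The_Police", "scm:formerBandMember", "doc:Stewart_Copeland"),
    ("doc:The_Police", "scm:genre", "doc:New_wave_music"),
]

bands_db = copy_former_members(source)
print(len(bands_db))  # 5
```

Two matches yield five triples rather than six, because both matches assert the same type triple for the band.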

NOTE: Cardinality constraints require everything to be done carefully in a single transaction to avoid inconsistencies, so we recommend not introducing any until after copying. You can impose them later by altering the schema.
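To see why a partially copied database can clash with a cardinality constraint, consider a hypothetical rule that every scm:Band must have at least one former member. The Python sketch below checks that rule against two in-memory states. The rule, the function name `bands_missing_members` and the data are assumptions for illustration and do not reflect how TerminusDB enforces constraints.

```python
def bands_missing_members(triples):
    """Return bands violating a hypothetical 'at least one former member' rule."""
    bands = {s for s, p, o in triples if p == "rdf:type" and o == "scm:Band"}
    with_members = {s for s, p, o in triples if p == "scm:formerBandMember"}
    return sorted(bands - with_members)


# State if only the band's type triple has been written so far.
partial = [("doc:The_Police", "rdf:type", "scm:Band")]

# State once the whole copy has been applied.
complete = partial + [
    ("doc:The_Police", "scm:formerBandMember", "doc:Stewart_Copeland"),
    ("doc:Stewart_Copeland", "rdf:type", "scm:BandMember"),
]

print(bands_missing_members(partial))   # ['doc:The_Police']
print(bands_missing_members(complete))  # []
```

The intermediate state breaks the rule even though the final state satisfies it, which is why the constraint is easier to add once all data is in place.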