COMP60411: Modelling Data on the Web

This is the web page where you will find news and information about COMP60411 for 2020/21. There is also the course unit's page in the syllabus.


The course is taught by Andre Freitas and Uli Sattler. All material is found in Blackboard, in weekly chunks, and all coursework is to be submitted there, too.

If you have any questions that might be of interest to others, please feel free to post them on the Blackboard Discussion Board for that week. Feel free to share test cases, join in discussions, and answer questions there, too.

Coursework and Timing:

The course starts on Monday, October 26th, 2020, 9:00-10:00 Manchester time, in Blackboard Collaborate.

The deadline for handing in the coursework for the

...but for the quiz: this is due every Wednesday at 20:00 (i.e., first deadline is Wednesday, Oct 28th, 2020).

Coursework, announcements, feedback, and discussions will be handled via Blackboard.

For most coursework, it can be helpful to use the <oXygen/> XML editor: they have given us a free group licence, which is available in Blackboard: check the Week 2 Forum.

Late coursework:

If you have mitigating circumstances (either for lateness or for any other issue), you should fill out the mitigating circumstances form and hand it in to the student support office. The instructors and teaching assistants do not grant extensions or resits for coursework directly: you need to go through the mitigating circumstances committee. (Feel free to come and talk to us about problems you are having as early as possible. We will help you navigate the system, but we will adhere to it.) If you do not have mitigating circumstances, then you will receive 0 marks for work handed in late, regardless of the reason (because we want to discuss the coursework after the deadline).

Concerning Literature:

Please note that all books used for this course are available in the resource center or online (see below in the table for the readings of each week). If you prefer to buy them, please note that

Schedule and materials:

Please note that this course unit is a severely modified version of previous editions: we have dropped a substantial number of topics, and added other topics, so please do not rely too heavily on exams, tales, or additional material from previous years!

Teaching assistants (TAs) will be available to help with your understanding and coursework. We will announce the times later. Also, subscribe to and keep an eye on the Blackboard Discussion Boards: your question has probably already been discussed there!

Below is a table with the material from last year for your orientation. Of course this year things will be a little different.

Week Date Topic(s) Resources/Reading Slides
1 Sept. 23, 2019 Course organisation
Tables and Relations:
  1. Data Structure: tables
  2. Schema Language: CSVW & SQL
  3. Data Manipulation: Python & SQL
Learning SQL
Week 1 (also available as a PDF document)
2 Sept. 30, 2019 Tree data models:
  1. Data Structure formalisms: JSON
  2. Schema Language: JSON Schema
  3. Data Manipulation: Python
Self-Describing, Variability, Nesting
JSON, XML, and Schemas
The Essence of XML

JSON Schema
A JSON Schema validator

Early Clark on XML Namespaces
Later Clark on XML Namespaces
Namespace Myths
Namespaces FAQ (Very extensive!)
Week 2
3 Oct. 7, 2019 Tree data models:
  1. Data Structure formalisms: XML
  2. Schema Language: RelaxNG
  3. Data Manipulation: DOM, XPath
Variability, Nesting, Self-Describing

JSON, XML, and Schemas
Learning XML
XML in a Nutshell
XML Specification
XML Namespaces 1.0 Recommendation
Designing Extensible, Versionable XML Formats

Taxonomy of XML schema languages using formal language theory
XPath Rec
XPath Functions

Week 3
4 Oct. 14, 2019 Tree data models, XML continued:
  1. Data Structure formalisms: XML
  2. Schema Language: Schematron
  3. Data Manipulation: SAX
Tree data models, JSON:
  1. Data Structure formalisms: JSON
  2. Schema Language: JSON Schema
  3. Data Manipulation: Python
Robustness and Error Handling
XQuery formal semantics (heavy going)
The Essence of XML
Influence on the Design of XQuery

Comparing XML Schema Languages
Refining the Taxonomy of XML Schema Languages. A new Approach for Categorizing XML Schema Languages in Terms of Processing Complexity.
Week 4 Slides
5 Oct. 21, 2019 Graph data models:
  1. Data Structure formalisms: RDF
  2. Schema Language: RDFS (and Schematron for XML)
  3. Data Manipulation: SPARQL
Robustness, Retrospective

RDF - Resource Description Framework
RDFS - RDF Schema 1.1 Specification
SPARQL Property Path Expressions

SPARQL EndPoint to DBPedia
SPARQL EndPoint to WikiData
Learning SPARQL

Schematron: validating XML using XSLT
Error handling and Web language design

Week 5 Slides
Oct. 28-Nov. 1: Reading Week.