Student: Aíquis Rodrigues Gomes
Title: Towards Publication of Open Government Data as Linked Open Data using an ontology-based approach
Advisor: Kele Teixeira Belloze
Committee: Kele Teixeira Belloze (president), Laura Silva de Assis (CEFET/RJ), Maria Claudia Reis Cavalcanti (IME)
Day/Time: August 12, 2020 / 14h
Governments are significant producers and publishers of data and have sought to use it to increase transparency and generate more value for society. However, the level of maturity in publishing government data is still low: data sets are often published in formats that are hard for machines to read and hard to connect to other data sets, and some publications are not in truly open formats at all. Linked Open Data is a set of Semantic Web technologies and standards that allows different open data sets published on the web to be connected. Through Linked Open Data, governments can reach a high degree of maturity in data publication by using a truly open, machine-readable format, which can increase the value that data initiatives generate for society. However, there are barriers to publishing data with these technologies and standards. One of them is the lack of an implementation guide that lays out, in a structured way, the steps to follow to publish a data set as Linked Open Data. This work presents an ontology-based methodology for publishing traditional data sets as Linked Open Data. The methodology consists of four stages: (i) identification, analysis, and integration of data; (ii) ontology development; (iii) publication of the data as Linked Open Data; and (iv) publication of a SPARQL endpoint. Two experiments with real government data sets, from the electoral and health domains, were carried out following the proposed methodology. They produced two ontologies, one on Brazilian elections and one on the Basic Health Units operating in Brazil, and made the two corresponding data sets available as RDF files, with some resources linked to other data sets.
The experiments show that a structured process makes it possible to advance the maturity of open data publication, and that the proposed steps can be applied regardless of the data domain.
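As a rough illustration of stage (iii), the sketch below turns a small tabular data set into RDF triples in N-Triples syntax and links each record to a resource in an external data set. It is not the implementation used in this work: the `example.org` namespaces, the `BasicHealthUnit` class, the `locatedIn` property, the column names, and the sample rows are all hypothetical, and DBpedia is used only as an example of a link target.

```python
import csv
import io

# Hypothetical namespaces, chosen only for this illustration.
BASE = "http://example.org/health/"
ONT = "http://example.org/ontology/health#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

# Made-up sample rows standing in for a traditional tabular data set.
SAMPLE_CSV = """unit_id,name,city_dbpedia
1001,UBS Centro,Rio_de_Janeiro
1002,UBS Norte,Curitiba
"""


def escape_literal(text):
    """Escape backslashes, double quotes, and newlines for an N-Triples literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def row_to_triples(row):
    """Map one tabular record to three N-Triples statements."""
    subject = f"<{BASE}unit/{row['unit_id']}>"
    # Link to an external data set (here, a DBpedia resource for the city).
    city = f"<http://dbpedia.org/resource/{row['city_dbpedia']}>"
    return [
        f"{subject} <{RDF_TYPE}> <{ONT}BasicHealthUnit> .",
        f'{subject} <{RDFS_LABEL}> "{escape_literal(row["name"])}" .',
        f"{subject} <{ONT}locatedIn> {city} .",
    ]


def csv_to_ntriples(csv_text):
    """Convert CSV text into an N-Triples document, one triple per line."""
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        lines.extend(row_to_triples(row))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(csv_to_ntriples(SAMPLE_CSV), end="")
```

In this sketch the ontology terms are plain URI strings; in practice they would come from the ontology developed in stage (ii), and the resulting file would be loaded into a triple store to support the SPARQL endpoint of stage (iv).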