STIRData provides a user-friendly interface for exploring business registry data from 13 European countries in a uniform manner.
The STIRData approach to technical interoperability is based on linked data, and the approach to semantic interoperability is based on a common data specification that reuses the European Core Vocabularies.
The platform adopts a fully decentralised architecture: it assumes that each dataset resides in a separate remote SPARQL endpoint. Centrally, it stores only some basic information about each dataset and copies of the shared NUTS, LAU and NACE vocabularies. In addition, to improve the performance of the user-facing platform, centrally stored precomputed statistics and indexes have been added as extensions to the basic architecture, making the platform less dependent on the performance characteristics of the source SPARQL endpoints.
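The role of the precomputed statistics can be illustrated with a minimal sketch. It is not the platform's implementation: the class `StatisticsStore`, the key layout, and the numbers are all hypothetical, and a plain Python callable stands in for a COUNT query against a remote SPARQL endpoint.

```python
from typing import Callable, Dict, Tuple

# Key: (dataset id, region code, activity code). All values are hypothetical.
StatKey = Tuple[str, str, str]


class StatisticsStore:
    """Serve company counts from a central cache, falling back to the source endpoint."""

    def __init__(self, query_endpoint: Callable[[StatKey], int]) -> None:
        # query_endpoint stands in for a slow COUNT query against a remote SPARQL endpoint.
        self._query_endpoint = query_endpoint
        self._precomputed: Dict[StatKey, int] = {}
        self.remote_calls = 0  # how often the remote endpoint was actually contacted

    def precompute(self, key: StatKey) -> None:
        """Run the remote query ahead of time and store the result centrally."""
        self._precomputed[key] = self._fetch(key)

    def count(self, key: StatKey) -> int:
        """Return the cached value if present; otherwise ask the endpoint."""
        if key in self._precomputed:
            return self._precomputed[key]
        return self._fetch(key)

    def _fetch(self, key: StatKey) -> int:
        self.remote_calls += 1
        return self._query_endpoint(key)


# Toy stand-in for a remote endpoint; the numbers are made up.
fake_endpoint_data = {("no", "NO08", "C"): 120, ("cz", "CZ01", "C"): 340}
store = StatisticsStore(lambda key: fake_endpoint_data.get(key, 0))

store.precompute(("no", "NO08", "C"))
store.count(("no", "NO08", "C"))  # served from the central cache
store.count(("cz", "CZ01", "C"))  # falls through to the endpoint
```

Once a statistic has been precomputed, user requests for it no longer touch the source endpoint, which is the sense in which the platform becomes less dependent on endpoint performance.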
For most of the supported countries, the business registry data have been transformed to the common linked-data-based representation using the SAGE tool, which retrieves the data in their original format from the business registries, maps them to the common model, enriches them using the NUTS, LAU and NACE vocabularies, and finally publishes the resulting linked data representation in the SPARQL endpoints.
The STIRData-compliant business registry datasets offered by the platform are discovered automatically by scheduled tasks that periodically check the official portal for European data for new datasets, as well as for updates of datasets already included. Datasets not yet available in the portal can be registered manually; in either case, the only required information is a link to the respective SPARQL endpoint.
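A minimal sketch of how such a periodic refresh might merge discovered datasets into a registry keyed by endpoint URL is given below. The names `DatasetRecord` and `refresh_registry` are hypothetical, the `modified` date is an assumed way of detecting updates, and the endpoint URLs use the reserved example.org domain; none of this is taken from the platform itself.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class DatasetRecord:
    endpoint: str  # SPARQL endpoint URL: the only required piece of information
    modified: str  # assumed ISO 8601 date of the last update, as reported by the source


def refresh_registry(
    registry: Dict[str, DatasetRecord],
    discovered: Iterable[DatasetRecord],
) -> List[str]:
    """Merge discovered records into the registry; return endpoints that are new or updated."""
    changed = []
    for record in discovered:
        known = registry.get(record.endpoint)
        # ISO 8601 dates in the same format compare correctly as strings.
        if known is None or record.modified > known.modified:
            registry[record.endpoint] = record
            changed.append(record.endpoint)
    return changed


# Hypothetical endpoints; manual registration and automatic discovery
# go through the same function in this sketch.
registry: Dict[str, DatasetRecord] = {}
refresh_registry(registry, [DatasetRecord("https://example.org/no/sparql", "2023-01-10")])
changed = refresh_registry(
    registry,
    [
        DatasetRecord("https://example.org/no/sparql", "2023-02-01"),  # updated
        DatasetRecord("https://example.org/cz/sparql", "2023-01-20"),  # new
    ],
)
```

Keying the registry by endpoint URL reflects the statement above that the endpoint link is the only information needed to include a dataset.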
The platform enables the end user to make complex search queries to retrieve lists of companies that satisfy conditions based on location, economic activity, and registration date. For example, a user may request all companies registered in the Oslo area in Norway and in the Prague area in Czechia after a certain date that perform one of a specific set of economic activities.
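Because each country's data resides in its own endpoint, a search like the one in the example could be translated into one SPARQL query per endpoint. The sketch below only builds the query strings. The `ex:` properties and the base IRIs are placeholders rather than the terms of the STIRData specification, the region and activity codes are illustrative, and the date filter assumes the endpoint supports comparing `xsd:date` values.

```python
from datetime import date
from typing import Iterable

# Placeholder namespaces: NOT the terms of the STIRData specification.
MODEL_PREFIX = "PREFIX ex: <https://example.org/model#>\n"
XSD_PREFIX = "PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>\n"
NUTS_BASE = "https://example.org/nuts/"
NACE_BASE = "https://example.org/nace/"


def build_company_query(
    nuts_codes: Iterable[str],
    nace_codes: Iterable[str],
    registered_after: date,
    limit: int = 100,
) -> str:
    """Return a SPARQL SELECT for companies in any given region AND with any given activity."""
    regions = " ".join(f"<{NUTS_BASE}{code}>" for code in nuts_codes)
    activities = " ".join(f"<{NACE_BASE}{code}>" for code in nace_codes)
    return (
        MODEL_PREFIX
        + XSD_PREFIX
        + "SELECT ?company ?registered WHERE {\n"
        + f"  VALUES ?region {{ {regions} }}\n"
        + f"  VALUES ?activity {{ {activities} }}\n"
        + "  ?company ex:region ?region ;\n"
        + "           ex:activity ?activity ;\n"
        + "           ex:registrationDate ?registered .\n"
        + f'  FILTER (?registered > "{registered_after.isoformat()}"^^xsd:date)\n'
        + "}\n"
        + f"LIMIT {limit}\n"
    )


# One query per country endpoint; the codes below are illustrative.
activities = ["62.01", "62.02"]
since = date(2020, 1, 1)
queries = {
    "NO": build_company_query(["NO081"], activities, since),
    "CZ": build_company_query(["CZ010"], activities, since),
}
```

The `VALUES` clauses express "one of a specific set" of regions and activities, while the `FILTER` expresses the registration date condition; the results from the individual endpoints would then be combined by the platform.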
Statistical views are also provided, showing how the companies matching a query are distributed across the subregions and subactivities of the locations and activities it specifies. Statistics provide useful, compact overviews of the underlying data and allow users to browse through the location and/or the economic activity hierarchies, displaying the corresponding statistical information.
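The idea of browsing a hierarchy through statistics can be sketched as a roll-up of company counts to the direct subregions of a selected region. The hierarchy fragment, the region codes, and the company list below are made up for illustration, and the function names are hypothetical; the same approach would apply to the economic activity hierarchy.

```python
from collections import Counter
from typing import Dict, Iterable, Optional

# Toy fragment of a region hierarchy (child -> parent); the codes are made up.
PARENT: Dict[str, Optional[str]] = {
    "X": None,
    "X1": "X",
    "X2": "X",
    "X11": "X1",
    "X12": "X1",
    "X21": "X2",
}


def child_under(region: str, ancestor: str) -> Optional[str]:
    """Return the direct child of `ancestor` on the path up from `region`, or None."""
    current: Optional[str] = region
    while current is not None:
        parent = PARENT.get(current)
        if parent == ancestor:
            return current
        current = parent
    return None


def distribution(company_regions: Iterable[str], ancestor: str) -> Counter:
    """Count companies per direct subregion of `ancestor`."""
    counts: Counter = Counter()
    for region in company_regions:
        child = child_under(region, ancestor)
        if child is not None:
            counts[child] += 1
    return counts


# One entry per company: the most specific region it is registered in (made-up data).
companies = ["X11", "X11", "X12", "X21", "X2"]
top_level = distribution(companies, "X")    # overview of the whole region
drill_down = distribution(companies, "X1")  # the user descends into subregion X1
```

Each step down the hierarchy reuses the same function with a different `ancestor`, which mirrors how a user would move from a compact overview to more detailed subregions.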