Confidential Semantic Data Integration Platform
Public Sector Organization
Duration: 10 months
Team Size: 6 specialists
Industry: Public Data Infrastructure
Impact
Automated domain detection accuracy
Reduction in manual data modeling effort
Faster dataset integration workflows
Supported ontologies across domains
The Challenge
A public-sector organization needed to integrate heterogeneous open datasets from multiple European sources. The data arrived as CSV files with inconsistent formats and no metadata, which made it difficult to identify each dataset's domain, ensure interoperability, and integrate the data into existing semantic infrastructures.
Our Solution
We designed and developed an AI-driven semantic data integration system capable of automatically analyzing tabular datasets, detecting their domain, selecting the most appropriate ontology, and converting the data into RDF knowledge graphs. The platform combines natural language processing, semantic similarity models, and ontology-based reasoning to transform raw datasets into interoperable linked data.
Our Approach
Ontology Discovery & Evaluation
Analyzed hundreds of European ontologies across multiple domains and evaluated them using semantic coverage, interoperability, and freshness criteria to identify the most suitable ontologies for each domain.
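As an illustration of how the three criteria named above could be combined into a single ranking, here is a minimal sketch. The weights, the 0 to 1 score scale, and every name in the code are assumptions made for this example; the page does not describe the actual scoring scheme.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OntologyScore:
    """Per-criterion scores for one candidate ontology, each assumed to be in [0, 1]."""
    name: str
    coverage: float          # semantic coverage of the target domain
    interoperability: float  # alignment with other vocabularies and standards
    freshness: float         # how recently the ontology was maintained


# Hypothetical weights; a real evaluation would have to justify and tune these.
WEIGHTS = {"coverage": 0.5, "interoperability": 0.3, "freshness": 0.2}


def overall(score: OntologyScore) -> float:
    """Weighted sum of the three criteria."""
    return (
        WEIGHTS["coverage"] * score.coverage
        + WEIGHTS["interoperability"] * score.interoperability
        + WEIGHTS["freshness"] * score.freshness
    )


def rank(candidates: list[OntologyScore]) -> list[OntologyScore]:
    """Return candidates ordered from most to least suitable."""
    return sorted(candidates, key=overall, reverse=True)
```

Calling `rank` on the candidates for one domain returns them best-first, so the top entry would be the ontology proposed for that domain under these assumed weights.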
Ontology Hub Infrastructure
Developed a centralized ontology hub with a triplestore database, SPARQL endpoint, and a web interface for ontology exploration, semantic search, and visualization of knowledge graphs.
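A client of such a hub would typically talk to the SPARQL endpoint over HTTP. The sketch below only builds a request URL in the style of the SPARQL 1.1 Protocol (a GET with a `query` parameter) and sends nothing. The endpoint address, the function names, and the plain label-substring search are assumptions for illustration; the hub's real semantic search is not described on this page.

```python
from urllib.parse import urlencode

# Hypothetical endpoint address; the real hub URL is not given in the case study.
HUB_ENDPOINT = "https://ontology-hub.example.org/sparql"


def label_search_query(term: str, limit: int = 10) -> str:
    """Build a SELECT query that finds resources whose rdfs:label contains `term`."""
    # Escape characters that are not allowed raw inside a SPARQL string literal.
    safe = (
        term.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT ?resource ?label WHERE {\n"
        "  ?resource rdfs:label ?label .\n"
        f'  FILTER(CONTAINS(LCASE(STR(?label)), LCASE("{safe}")))\n'
        "}\n"
        f"LIMIT {int(limit)}"
    )


def build_request_url(query: str, endpoint: str = HUB_ENDPOINT) -> str:
    """Encode the query as a GET request URL (SPARQL 1.1 Protocol, `query` parameter)."""
    return endpoint + "?" + urlencode({"query": query})
```

The resulting URL could be fetched with any HTTP client, with an `Accept` header such as `application/sparql-results+json` to request JSON results.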
Domain Detection Pipeline
Built an AI-powered pipeline that preprocesses CSV datasets, extracts lexical, structural, and semantic features, and applies zero-shot transformer models to automatically detect the most probable domain.
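The shape of such a pipeline can be sketched with the standard library alone. In the example below, feature extraction covers header tokens (lexical) and column count plus numeric ratio (structural), and a simple keyword-overlap scorer stands in for the zero-shot transformer model, which is not reproduced here. The domain labels, hint vocabularies, and function names are all hypothetical.

```python
import csv
import io
import re
from itertools import islice

# Hypothetical domain labels and hint vocabularies, for illustration only.
DOMAIN_HINTS = {
    "transport": {"route", "stop", "vehicle", "departure", "line"},
    "environment": {"pm10", "no2", "station", "emission", "temperature"},
    "health": {"hospital", "patient", "diagnosis", "bed", "ward"},
}

NUMERIC = re.compile(r"-?\d+(?:[.,]\d+)?")


def extract_features(csv_text: str, sample_rows: int = 50) -> dict:
    """Extract simple lexical and structural features from a CSV sample."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])
    rows = list(islice(reader, sample_rows))

    # Lexical: lower-cased alphanumeric tokens from the column names.
    tokens = set()
    for name in header:
        tokens.update(t for t in re.split(r"[^a-z0-9]+", name.lower()) if t)

    # Structural: column count and the share of cells that look numeric.
    cells = [cell.strip() for row in rows for cell in row]
    numeric = sum(1 for cell in cells if NUMERIC.fullmatch(cell))
    return {
        "tokens": tokens,
        "n_columns": len(header),
        "numeric_ratio": numeric / len(cells) if cells else 0.0,
    }


def detect_domain(features: dict) -> tuple:
    """Stand-in scorer: pick the domain whose hints overlap the header tokens most."""
    scores = {d: len(features["tokens"] & hints) for d, hints in DOMAIN_HINTS.items()}
    total = sum(scores.values())
    if total == 0:
        return None, 0.0
    best = max(scores, key=scores.get)
    return best, scores[best] / total
```

In a real pipeline, the extracted features would be turned into a text description of the dataset and passed to the zero-shot classifier with the domain names as candidate labels, replacing `detect_domain`.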
Semantic Mapping Engine
Implemented a hybrid mapping system combining entity linking, semantic similarity models, and rule-based matching to associate dataset columns with ontology properties.
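A hybrid matcher of this kind usually tries cheap deterministic rules first and falls back to a similarity score. The sketch below follows that order under clear assumptions: `difflib.SequenceMatcher` string similarity stands in for the semantic similarity models, entity linking is omitted, and the rule table, property labels, and threshold are hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical rule table: normalized column name -> ontology property.
RULES = {"lat": "geo:lat", "lon": "geo:long"}

# Hypothetical candidate properties with human-readable labels.
PROPERTY_LABELS = {
    "dct:title": "title",
    "dct:issued": "date issued",
    "schema:postalCode": "postal code",
}


def normalize(name: str) -> str:
    """Lower-case a column name and turn separators into single spaces."""
    return " ".join(name.lower().replace("_", " ").replace("-", " ").split())


def map_column(column: str, threshold: float = 0.6) -> tuple:
    """Return (property, score, method) for one dataset column."""
    key = normalize(column)

    # 1. Rule-based matching: exact hit in the rule table.
    if key in RULES:
        return RULES[key], 1.0, "rule"

    # 2. Similarity fallback: string similarity as a stand-in for embeddings.
    best_prop, best_score = None, 0.0
    for prop, label in PROPERTY_LABELS.items():
        score = SequenceMatcher(None, key, label).ratio()
        if score > best_score:
            best_prop, best_score = prop, score

    if best_score >= threshold:
        return best_prop, best_score, "similarity"
    return None, best_score, "unmapped"
```

Returning the method alongside the score keeps low-confidence or unmapped columns easy to route to manual review.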
Knowledge Graph Generation
Developed a transformation module that converts tabular datasets into RDF knowledge graphs with persistent URIs, semantic typing, and interoperability with European data standards.
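The core of such a transformation can be sketched as a function that turns CSV rows into N-Triples. Here, persistent URIs are approximated by minting each subject IRI deterministically from a stable key column, so that re-running the conversion yields the same identifiers. The base IRI, function names, and the choice to emit every value as a plain string literal are assumptions for this example, not the platform's actual design.

```python
import csv
import io
from urllib.parse import quote

# Hypothetical base IRI for minted resources.
BASE = "https://data.example.org/resource/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def literal(value: str) -> str:
    """Serialize a value as an N-Triples plain string literal."""
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def rows_to_ntriples(csv_text: str, key_column: str, class_iri: str,
                     column_to_property: dict) -> str:
    """Convert CSV rows to N-Triples, one typed resource per row."""
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Deterministic subject IRI: same key value always gives the same IRI.
        subject = f"<{BASE}{quote(row[key_column].strip(), safe='')}>"
        lines.append(f"{subject} <{RDF_TYPE}> <{class_iri}> .")
        for column, prop in column_to_property.items():
            value = (row.get(column) or "").strip()
            if value:  # skip empty cells rather than emit empty literals
                lines.append(f"{subject} <{prop}> {literal(value)} .")
    return "\n".join(lines)
```

The `column_to_property` argument is where the output of a column-to-property mapping step would plug in; typed literals (dates, numbers) and language tags would be natural extensions.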
