Jungletech
Public Data Infrastructure · 2025

Confidential Semantic Data Integration Platform

Public Sector Organization

Duration: 10 months

Team Size: 6 specialists

Industry: Public Data Infrastructure

Impact

Automated domain detection accuracy

Reduction in manual data modeling effort

Faster dataset integration workflows

Supported ontologies across domains

The Challenge

A public-sector organization needed to integrate heterogeneous open datasets from multiple European sources. The data arrived as inconsistently formatted CSV files with no metadata, which made it hard to identify each dataset's domain, ensure interoperability, and integrate the data into existing semantic infrastructure.

Our Solution

We designed and developed an AI-driven semantic data integration system capable of automatically analyzing tabular datasets, detecting their domain, selecting the most appropriate ontology, and converting the data into RDF knowledge graphs. The platform combines natural language processing, semantic similarity models, and ontology-based reasoning to transform raw datasets into interoperable linked data.

Technologies Used

Python, FastAPI, React, Semantic Web (RDF/OWL), SPARQL, Knowledge Graphs, Sentence-BERT, Transformer Models, Apache Jena / GraphDB, Docker

Our Approach

01. Ontology Discovery & Evaluation

Analyzed hundreds of European ontologies across multiple domains and evaluated them using semantic coverage, interoperability, and freshness criteria to identify the most suitable ontologies for each domain.
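
As a rough illustration of how the three criteria could be combined into a ranking, the sketch below computes a weighted score per ontology. The weights, the linear freshness decay, and the `OntologyCandidate` fields are all assumptions made for this example; the case study does not say how the criteria were actually measured or weighted.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class OntologyCandidate:
    name: str
    coverage: float          # assumed: share of domain terms modeled, 0..1
    interoperability: float  # assumed: share of terms aligned to other vocabularies, 0..1
    last_updated: date


# Hypothetical weights, chosen only for illustration.
WEIGHTS = {"coverage": 0.5, "interoperability": 0.3, "freshness": 0.2}


def freshness(last_updated: date, today: date, horizon_years: float = 10.0) -> float:
    """Linear decay from 1.0 (updated today) to 0.0 (horizon_years old or older)."""
    age_years = (today - last_updated).days / 365.25
    return max(0.0, 1.0 - age_years / horizon_years)


def score(candidate: OntologyCandidate, today: date) -> float:
    """Weighted sum of the three criteria, in the range 0..1."""
    return (
        WEIGHTS["coverage"] * candidate.coverage
        + WEIGHTS["interoperability"] * candidate.interoperability
        + WEIGHTS["freshness"] * freshness(candidate.last_updated, today)
    )


def rank(candidates: List[OntologyCandidate], today: date) -> List[OntologyCandidate]:
    """Return candidates ordered from best to worst score."""
    return sorted(candidates, key=lambda c: score(c, today), reverse=True)
```

A per-domain shortlist would then be the top few entries of `rank(...)` for that domain's candidates.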

02. Ontology Hub Infrastructure

Developed a centralized ontology hub with a triplestore database, SPARQL endpoint, and a web interface for ontology exploration, semantic search, and visualization of knowledge graphs.
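
To show what querying such a hub might look like from a client, here is a minimal sketch that builds a standard SPARQL Protocol GET request and parses the SPARQL JSON results format, using only the Python standard library. The endpoint URL, the query, and the function names are hypothetical; the actual hub's address and schema are not described in this case study.

```python
import json
from typing import List, Tuple
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Hypothetical endpoint; the real hub URL is not given in the source.
ENDPOINT = "https://ontology-hub.example.org/sparql"

# Finds OWL classes whose label contains a search term (case-insensitive).
QUERY_TEMPLATE = """
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?cls ?label WHERE {
  ?cls a owl:Class ; rdfs:label ?label .
  FILTER(CONTAINS(LCASE(STR(?label)), "%s"))
} LIMIT %d
"""


def build_request(term: str, limit: int = 20, endpoint: str = ENDPOINT) -> Request:
    """Build a SPARQL Protocol GET request asking for JSON results."""
    # Escape backslashes and quotes so the term stays inside the string literal.
    safe = term.lower().replace("\\", "\\\\").replace('"', '\\"')
    query = QUERY_TEMPLATE % (safe, limit)
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def parse_bindings(payload: str) -> List[Tuple[str, str]]:
    """Extract (class URI, label) pairs from a SPARQL JSON results document."""
    data = json.loads(payload)
    return [
        (b["cls"]["value"], b["label"]["value"])
        for b in data["results"]["bindings"]
    ]


def search_classes(term: str) -> List[Tuple[str, str]]:
    """Send the query to the endpoint (requires a reachable SPARQL service)."""
    with urlopen(build_request(term), timeout=10) as response:
        return parse_bindings(response.read().decode("utf-8"))
```

Both Apache Jena Fuseki and GraphDB expose endpoints that speak this protocol, so the same client shape should apply to either, with only the URL changing.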

03. Domain Detection Pipeline

Built an AI-powered pipeline that preprocesses CSV datasets, extracts lexical, structural, and semantic features, and applies zero-shot transformer models to automatically detect the most probable domain.
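
The sketch below illustrates one way the preprocessing and feature-extraction stage could feed a zero-shot classifier: it turns a CSV sample into a short text description (header tokens plus a crude column type) and hands that to a classifier supplied by the caller. The function names, the type heuristics, and the `classify` callable are assumptions for illustration; the model itself is deliberately left out, so this is not the production pipeline.

```python
import csv
import io
import re
from typing import Callable, Dict, List


def infer_type(values: List[str]) -> str:
    """Crude structural feature: 'numeric', 'date', 'text', or 'empty'."""
    non_empty = [v.strip() for v in values if v.strip()]
    if not non_empty:
        return "empty"
    if all(re.fullmatch(r"-?\d+([.,]\d+)?", v) for v in non_empty):
        return "numeric"
    if all(re.fullmatch(r"\d{4}-\d{2}-\d{2}", v) for v in non_empty):
        return "date"
    return "text"


def split_header(name: str) -> List[str]:
    """Lexical feature: split snake_case and camelCase headers into lowercase tokens."""
    spaced = re.sub(r"(?<=[a-z])(?=[A-Z])", " ", name)
    return [t.lower() for t in re.split(r"[^A-Za-z0-9]+", spaced) if t]


def describe_csv(text: str, sample_rows: int = 5) -> str:
    """Summarize the header and a few rows as one sentence for a text classifier."""
    rows = list(csv.reader(io.StringIO(text)))
    header, body = rows[0], rows[1 : 1 + sample_rows]
    parts = []
    for i, name in enumerate(header):
        column = [row[i] for row in body if i < len(row)]
        parts.append(f"{' '.join(split_header(name))} ({infer_type(column)})")
    return "Table with columns: " + ", ".join(parts)


def detect_domain(
    text: str,
    labels: List[str],
    classify: Callable[[str, List[str]], Dict[str, float]],
) -> str:
    """Return the label with the highest score from the supplied classifier."""
    scores = classify(describe_csv(text), labels)
    return max(scores, key=scores.get)
```

In practice `classify` could wrap a zero-shot model, for example a Hugging Face `zero-shot-classification` pipeline, converting its returned labels and scores into a dictionary. Keeping it as a parameter lets the feature extraction be tested without downloading a model.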

04. Semantic Mapping Engine

Implemented a hybrid mapping system combining entity linking, semantic similarity models, and rule-based matching to associate dataset columns with ontology properties.
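
A simplified sketch of the hybrid idea follows: exact rules are tried first, and a similarity function is used as a fallback with a confidence threshold. Entity linking is omitted, and the similarity function here is a plain character-based stand-in from Python's `difflib`, not a Sentence-BERT embedding model. The rule table, the candidate property list, and the threshold are hypothetical values picked for the example.

```python
from difflib import SequenceMatcher
from typing import Callable, Optional, Tuple

# Hypothetical rule table: normalized column name -> ontology property.
RULES = {"lat": "geo:lat", "lon": "geo:long", "email": "foaf:mbox"}

# Hypothetical candidate properties with human-readable labels.
PROPERTY_LABELS = {
    "schema:name": "name",
    "schema:birthDate": "birth date",
    "schema:addressLocality": "address locality",
}


def lexical_similarity(a: str, b: str) -> float:
    """Stand-in similarity (character overlap); an embedding model would go here."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def normalise(column: str) -> str:
    """Lowercase a column name and treat underscores as spaces."""
    return column.strip().lower().replace("_", " ")


def map_column(
    column: str,
    similarity: Callable[[str, str], float] = lexical_similarity,
    threshold: float = 0.6,
) -> Tuple[Optional[str], float, str]:
    """Return (property, score, method) for one dataset column."""
    key = normalise(column)
    # 1. Rule-based matching wins outright when a rule exists.
    if key in RULES:
        return RULES[key], 1.0, "rule"
    # 2. Otherwise pick the most similar property label.
    best, best_score = max(
        ((prop, similarity(key, label)) for prop, label in PROPERTY_LABELS.items()),
        key=lambda pair: pair[1],
    )
    if best_score >= threshold:
        return best, best_score, "similarity"
    # 3. Below the threshold the column is left for manual review.
    return None, best_score, "unmapped"
```

Because `similarity` is a parameter, a cosine similarity over sentence embeddings can be swapped in without touching the rule or threshold logic.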

05. Knowledge Graph Generation

Developed a transformation module that converts tabular datasets into RDF knowledge graphs with persistent URIs, semantic typing, and interoperability with European data standards.
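
To make the transformation step concrete, the sketch below converts CSV rows into N-Triples using only the standard library. Subject URIs are derived deterministically from a stable key column, so re-running the conversion yields the same identifiers, which is one simple way to approximate persistent URIs. The base URI, the function names, and the column-to-property mapping are assumptions; all values are emitted as plain string literals, whereas a real module would likely add datatypes and use an RDF library.

```python
import csv
import io
from typing import Dict
from urllib.parse import quote

# Hypothetical base URI for minted resources.
BASE = "https://data.example.org/resource/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def escape_literal(value: str) -> str:
    """Escape characters that are not allowed raw inside an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def row_uri(dataset: str, key: str) -> str:
    """Deterministic URI: the same dataset and key always give the same subject."""
    return f"{BASE}{quote(dataset, safe='')}/{quote(key, safe='')}"


def csv_to_ntriples(
    text: str,
    dataset: str,
    key_column: str,
    class_uri: str,
    mapping: Dict[str, str],
) -> str:
    """Emit one rdf:type triple per row plus one triple per mapped, non-empty cell."""
    lines = []
    for row in csv.DictReader(io.StringIO(text)):
        subject = f"<{row_uri(dataset, row[key_column])}>"
        lines.append(f"{subject} <{RDF_TYPE}> <{class_uri}> .")
        for column, prop in mapping.items():
            value = (row.get(column) or "").strip()
            if value:
                lines.append(f'{subject} <{prop}> "{escape_literal(value)}" .')
    return "\n".join(lines)
```

The resulting N-Triples text can be loaded into a triplestore such as Apache Jena or GraphDB through their standard import mechanisms.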
