Linked data for the enterprise: Focus on Bayer's corporate asset register

An overview of COLID, the data asset management platform built using semantic technologies.

By Anton Vasetenkov

Organisations that know how to make the best use of their data position themselves for long-term success. As companies large and small collect and generate more and more data assets, managing and organising those assets gets harder without modern tools and the necessary infrastructure.

Although a number of enterprise data management products are available, some innovative companies choose to build their own solutions that better suit their needs. The Corporate Linked Data Catalog, or COLID for short, is one such solution: an end-to-end cloud-native platform for centrally storing and managing company assets. Built by Bayer, one of the world's largest pharmaceutical companies, COLID has been used by all of Bayer's divisions to organise drug research data, prescriber and sales information, and other company assets.

The case for linked and FAIR data in the enterprise

What makes COLID stand out is its use of semantic web technologies. Its main datastore is an RDF graph database, and it makes the data accessible through SPARQL, a query language for RDF. Graph structures are a natural way to describe the interdependencies between things such as company assets, and using RDF and SPARQL makes the data interoperable and reusable, whether it is published on the web or only shared and used internally within an organisation. Together with findability and accessibility, interoperability and reusability constitute the four FAIR properties that make data more valuable to an organisation. FAIRness of the data has been the key requirement in building COLID, differentiating it from most other data management solutions.
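To make the idea of a graph of assets more concrete, here is a minimal sketch in plain Python. It does not use COLID, a real RDF store, or SPARQL itself; it only mimics how a SPARQL-style query matches triple patterns and joins them. All identifiers (the `ex:` names, `TRIPLES`, `match`, `datasets_used_by_models`) are hypothetical and chosen purely for illustration.

```python
from typing import Iterable, Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]

# Hypothetical asset descriptions as (subject, predicate, object) triples.
# The "ex:" identifiers are invented for this sketch; they are not COLID's vocabulary.
TRIPLES: List[Triple] = [
    ("ex:trial-results", "rdf:type", "ex:Dataset"),
    ("ex:trial-results", "ex:derivedFrom", "ex:raw-measurements"),
    ("ex:raw-measurements", "rdf:type", "ex:Dataset"),
    ("ex:dose-model", "rdf:type", "ex:Model"),
    ("ex:dose-model", "ex:trainedOn", "ex:trial-results"),
]


def match(
    triples: Iterable[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield triples that fit the pattern; None is a wildcard, like a SPARQL variable."""
    for triple in triples:
        if all(want is None or want == got for want, got in zip((s, p, o), triple)):
            yield triple


def datasets_used_by_models(triples: List[Triple]) -> List[Tuple[str, str]]:
    """Join two patterns, roughly: ?m rdf:type ex:Model . ?m ex:trainedOn ?d"""
    pairs = []
    for model, _, _ in match(triples, p="rdf:type", o="ex:Model"):
        for _, _, dataset in match(triples, s=model, p="ex:trainedOn"):
            pairs.append((model, dataset))
    return pairs


if __name__ == "__main__":
    for model, dataset in datasets_used_by_models(TRIPLES):
        print(model, "was trained on", dataset)
```

In a real deployment the same question would be expressed as a SPARQL query against the graph database, and the identifiers would be IRIs drawn from shared ontologies, which is where the interoperability comes from.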

The metadata and semantic links stored in COLID's RDF graph also help categorise the assets and catalogue their business, technical, and operational characteristics. This aids in the discovery of data inside COLID's data marketplace and its reuse within the organisation.
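As a rough illustration of how catalogued metadata can support discovery, the sketch below filters a tiny in-memory catalogue by asset type and metadata values. It is not COLID's data model or API: `AssetRecord`, `find_assets`, the metadata keys, and the sample entries are all assumptions made up for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class AssetRecord:
    """A hypothetical catalogue entry: an identifier, a type, and free-form metadata."""
    identifier: str
    asset_type: str
    metadata: Dict[str, str] = field(default_factory=dict)


def find_assets(
    catalogue: List[AssetRecord],
    asset_type: Optional[str] = None,
    **required: str,
) -> List[AssetRecord]:
    """Return entries of the given type whose metadata contains every required key/value."""
    hits = []
    for asset in catalogue:
        if asset_type is not None and asset.asset_type != asset_type:
            continue
        if all(asset.metadata.get(key) == value for key, value in required.items()):
            hits.append(asset)
    return hits


# Invented sample entries mixing business (owner), technical (format),
# and operational (refresh) characteristics.
CATALOGUE = [
    AssetRecord("sales-2020", "Dataset",
                {"owner": "commercial", "format": "parquet", "refresh": "monthly"}),
    AssetRecord("trial-results", "Dataset",
                {"owner": "research", "format": "csv", "refresh": "never"}),
    AssetRecord("compound-ontology", "Ontology",
                {"owner": "research", "format": "turtle", "refresh": "quarterly"}),
]

if __name__ == "__main__":
    for asset in find_assets(CATALOGUE, asset_type="Dataset", owner="research"):
        print(asset.identifier)
```

A flat dictionary of metadata is the simplest possible stand-in here. In an RDF-based catalogue these characteristics would themselves be links in the graph, so a search can follow relationships rather than only compare values.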

Bottom line

Bayer's COLID is an end-to-end technological solution for centrally storing and organising company assets such as documents, datasets, ontologies, models, and reference data. An enterprise-grade linked data register, it enables FAIR data within the corporate environment and can inform a wide range of complex decisions.

Recognising the benefits of metadata and semantic technologies puts innovative companies like Bayer at a significant advantage. Ontology-based solutions like COLID are likely to continue to play a major role in shaping enterprise data management and governance.

See also

Data exploration on linked COVID-19 datasets
An overview of the available RDF datasets and discovery tools for COVID-19.
What does a knowledge engineer do?
An overview of knowledge engineering and the core competencies and responsibilities of a knowledge engineer.
AstraZeneca's knowledge graph: Drug discovery is a lot about connections
The biomedical knowledge graph built by AstraZeneca helps the company find new drugs and drug targets.
Data discovery at Uber: The continued success of Databook
How Uber's in-house platform powers discovery, exploration, and knowledge at scale.
How a custom solution helps Facebook's engineers discover the data they need
The story of Nemo, Facebook's internal data discovery engine.
