Turning data into a tangible asset
Nathi Dube, Data Engineer at PBT Group
Data modelling is the critical enabler for data science in an organisation. It sits at the heart of the entire enterprise data management process. Without it, business processes cannot be mapped, and value cannot be derived from the available data. As such, there are several things to consider before data can be used as a tangible asset.
Effectively, a data model is a visual representation of the data elements within an organisation as well as the relationships between those elements. It therefore documents both the content and the structure of that data.
Instead of taking a big bang approach and trying to model the organisation all at once, decision-makers need to break the company into specific subject areas, for instance the HR function, the finance function, the billing function, and so on. By focusing on these individual models, a company can take a more agile approach to project management.
Even though data modelling is required at the beginning of any project, data modellers continue to play an important role as things change and evolve according to new requirements. They are instrumental in updating the models to reflect the changes that are occurring.
Multi-layered approach
Typically, a data model consists of three layers – conceptual, logical, and physical.
The conceptual layer is very high level. It uses language that business users understand. For example, the HR department will look at employees as entities with different attributes (their role, employee number, qualifications, and so on).
The logical layer takes that employee information and transforms it into a table. Effectively, this represents the employee in a logical fashion. All the attributes become columns in a table. Still, it is very generic and not database or environment specific.
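As a minimal sketch of that step from the conceptual to the logical layer, the Python below turns an employee entity into a generic table whose columns are its attributes. The class names (Entity, Column, Table), the to_logical helper, and the generic type names are hypothetical, chosen only for illustration; they do not come from any particular modelling tool.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Entity:
    """Conceptual layer: a business entity described in business language."""
    name: str
    attributes: List[str]


@dataclass
class Column:
    """Logical layer: a column with a generic type, not tied to any database."""
    name: str
    logical_type: str


@dataclass
class Table:
    """Logical layer: a table made up of columns."""
    name: str
    columns: List[Column] = field(default_factory=list)


def to_logical(entity: Entity, types: Dict[str, str]) -> Table:
    """Turn each attribute of the entity into a column of a generic table.

    Attributes without an entry in `types` default to the generic type "text"
    (an assumption made for this sketch).
    """
    columns = [
        Column(attr.lower().replace(" ", "_"), types.get(attr, "text"))
        for attr in entity.attributes
    ]
    return Table(entity.name.lower(), columns)


# The HR example from the text: an employee with a few attributes.
employee = Entity("Employee", ["Employee number", "Role", "Qualifications"])
employee_table = to_logical(employee, {"Employee number": "integer"})
```

Nothing in this sketch refers to a specific database product, which is what keeps it at the logical level.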
Finally, the physical data layer comes in. This is very detailed and specific to an environment. In Oracle, for instance, this is where relationships are specified and where metadata and other system-specific information must be accounted for.
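To make the physical layer concrete, here is a small sketch that uses SQLite through Python's standard sqlite3 module as a stand-in for a production database such as Oracle. The department table and the column definitions are assumptions added for illustration. The sketch shows environment-specific types, a relationship declared as a foreign key, and metadata read back from the database itself.

```python
import sqlite3

# In-memory SQLite database, used here only as a stand-in environment.
conn = sqlite3.connect(":memory:")

# SQLite-specific detail: foreign key enforcement must be switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Physical definitions: concrete types, keys, and a declared relationship.
conn.executescript("""
CREATE TABLE department (
    department_id INTEGER PRIMARY KEY,
    name          TEXT NOT NULL
);
CREATE TABLE employee (
    employee_number INTEGER PRIMARY KEY,
    role            TEXT NOT NULL,
    qualifications  TEXT,
    department_id   INTEGER NOT NULL REFERENCES department(department_id)
);
""")

# Metadata held by the database: each row describes one column of the table,
# with the column name in the second position.
column_names = [row[1] for row in conn.execute("PRAGMA table_info(employee)")]
```

With the relationship declared, the database itself rejects an employee row that points at a department that does not exist, which is the kind of detail the conceptual and logical layers leave open.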
Managing best practice
Unfortunately, most companies do not put much emphasis on their data models. And yet it is important to have a person dedicated to data modelling. The data model is the blueprint and specification the organisation will use as the foundation on which to build its entire data solution.
The data modeller will gather business and data requirements upfront, before the model is built, by engaging with business analysts to understand what the model must support. The model is then developed iteratively and incrementally. It is important to break the model down to a subject level to ensure nothing in the organisation is missed.
Furthermore, the modeller will use specific tools that can interface with other systems while also determining the level of granularity to ensure the traceability of the data. Ultimately, a data model is a vital communications tool to ensure the company can refer to a single source of truth across the enterprise. It sits at the centre with all the business processes built around it. In effect, a data model carries the entire organisation on its shoulders.