A data model is a description of how data should be used to meet the requirements given by the end user (Ponniah). Data modeling is the act of exploring data-oriented structures, and it can be used for a variety of purposes. One of its most important functions is that it helps both developers and end users understand the information requirements: end users can define their requirements, and developers can build a system that meets them. Data modeling differs according to the type of business, because the business processes of each sector are different, and these differences need to be identified in the modeling stage. The process starts with analyzing the situation and gathering requirements, so it is important to communicate with the stakeholders about the requirements while developing the data model.
Figure 1: The Systems Engineering Process - [7]
A data model is a conceptual representation of the data structures required for a database, and it is very powerful in expressing and communicating business requirements (Learn Data Modeling). It visually represents the nature of the data, the business rules that apply to the data, and how the data will be organized in the database. There are three main designs for the data model: conceptual design, logical design, and physical design (Itl Education Solutions Limited). A data model is used by both the functional team and the technical team in a project. The functional team consists of the business analysts and the end users, and the technical team consists of the developers and the programmers. Data modelers are responsible for designing a data model that meets the expectations of the functional team and provides requirements for the technical team (Chuck Ballard, Dirk Herreman, Don Schau, Rhonda Bell, 1998).
Levels of Data Models
History of Data Models
In the 1970s, Peter Chen introduced the entity-relationship
modeling technique. In the 1980s, object modeling techniques began to
be applied to representing the information requirements of an
organization. The Unified Modeling Language (UML) was later introduced
to replace the earlier object modeling methods. (Hay, Requirements
Analysis: From Business Views to Architecture)
Data Modeling Process
The data modeling process starts with analyzing the situation, which
is where analysts gather requirements. When designing a data model it
is important to communicate with the stakeholders about those
requirements. Data modeling is the act of exploring data-oriented
structures, which can be used for multiple purposes. Above all, a data
model is a communication tool among users and is considered the
blueprint of the database system. (Merson, Paulo F.)
Data Analysis
The techniques of data analysis can impact the type of data model
selected and its content. For example, if the intent is simply to
provide query and reporting capability, a data model that
structures the data in more of a normalized fashion would probably
provide the fastest and easiest access to the data. Query and reporting
capability primarily consists of selecting associated data elements,
perhaps summarizing them and grouping them by some category, and
presenting the results. Executing this type of capability
might lead to the use of more direct table scans. For this type of
capability, perhaps an ER model with a normalized and/or denormalized
data structure would be most appropriate.
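The query-and-reporting pattern described above (select associated data elements, summarize them, group them by a category) can be sketched against a small normalized schema. This is only an illustrative sketch using Python's built-in sqlite3 module; the customer and sale tables and their sample rows are hypothetical and do not come from the cited source.

```python
import sqlite3

# In-memory database with a small normalized schema (hypothetical tables).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        region      TEXT NOT NULL
    );
    CREATE TABLE sale (
        sale_id     INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
        amount      REAL NOT NULL
    );
    INSERT INTO customer VALUES (1, 'North'), (2, 'South'), (3, 'North');
    INSERT INTO sale VALUES (1, 1, 100.0), (2, 2, 40.0), (3, 3, 60.0), (4, 2, 10.0);
""")

# Select associated elements (join), summarize (COUNT, SUM), group by a category.
report = conn.execute("""
    SELECT c.region, COUNT(*) AS sales, SUM(s.amount) AS total
    FROM sale AS s
    JOIN customer AS c ON c.customer_id = s.customer_id
    GROUP BY c.region
    ORDER BY c.region
""").fetchall()

for region, sales, total in report:
    print(region, sales, total)
```

Because region is stored once per customer rather than repeated on every sale, the report needs a join; a denormalized layout would trade that join for duplicated data.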
Figure 3: Several methods of data analysis [5]
A data model consists of three parts (West):
Structural part – consists of a set of rules
Manipulating part – the types of operations allowed, such as updating, retrieving, and changing the database
Integrity part – validates the accuracy of the data
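As a rough illustration of how these three parts might appear in a relational database, the sketch below uses Python's built-in sqlite3 module. The account table, its columns, and the non-negative balance rule are assumptions made for the example and are not taken from West.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Structural part: rules for how the data is laid out.
# Integrity part: constraints (NOT NULL, CHECK) declared with the structure.
conn.execute("""
    CREATE TABLE account (
        account_id INTEGER PRIMARY KEY,
        owner      TEXT NOT NULL,
        balance    REAL NOT NULL CHECK (balance >= 0)
    )
""")

# Manipulating part: the operations allowed on the data.
conn.execute("INSERT INTO account (owner, balance) VALUES (?, ?)", ("Ada", 50.0))
conn.execute("UPDATE account SET balance = balance + 25 WHERE owner = ?", ("Ada",))
balance = conn.execute(
    "SELECT balance FROM account WHERE owner = ?", ("Ada",)
).fetchone()[0]

# The integrity rule rejects an update that would make the data invalid.
try:
    conn.execute("UPDATE account SET balance = -10 WHERE owner = ?", ("Ada",))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(balance, rejected)
```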
Figure 4: West & Fowler have identified many benefits of using a data model; the figure depicts these benefits. [25]
There are four types of data models identified:
Conceptual Data Models
According to Hoffer et al., a conceptual data model is a representation
of organizational data. The purpose of a conceptual data model is to
show as many rules about the meaning and interrelationships among data
as possible. Conceptual data modeling is typically done in parallel
with other requirements analysis and structuring steps during systems
analysis, and it continues throughout the systems development
process. It is useful in both the planning and analysis phases of the
systems development life cycle (Valacich). A conceptual data model
contains about 10-20 entities and relevant relationships, known as group
entities. Conceptual data modeling is the most crucial stage in the
database design process. Peter Chen describes the entity-relationship
model as a “pure representation of reality”.
Figure 5: Conceptual Data Modeling Process
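A conceptual model of entities and relationships can be written down before any database technology is chosen. The sketch below is one hypothetical way to do that in Python; the Entity, Relationship, and ConceptualModel classes and the Customer/Order/Product example are illustrative assumptions, not a notation used by the cited authors.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Entity:
    name: str


@dataclass(frozen=True)
class Relationship:
    name: str
    source: Entity
    target: Entity
    cardinality: str  # for example "1:N" or "M:N"


@dataclass
class ConceptualModel:
    entities: set = field(default_factory=set)
    relationships: list = field(default_factory=list)

    def relate(self, name, source, target, cardinality):
        """Record a relationship and register both participating entities."""
        self.entities.update({source, target})
        rel = Relationship(name, source, target, cardinality)
        self.relationships.append(rel)
        return rel

    def describe(self):
        """Return one readable sentence per relationship."""
        return [
            f"{r.source.name} {r.name} {r.target.name} ({r.cardinality})"
            for r in self.relationships
        ]


customer, order, product = Entity("Customer"), Entity("Order"), Entity("Product")

model = ConceptualModel()
model.relate("places", customer, order, "1:N")
model.relate("contains", order, product, "M:N")

for line in model.describe():
    print(line)
```

Nothing here mentions tables, keys, or data types; those decisions belong to the logical and physical models.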
Jarrar, Demey, and Meersman identify two main differences between conceptual data schemes and ontologies that should be taken into consideration when reusing conceptual data modeling techniques for building ontologies. The paper further discusses how successful conceptual data modeling approaches, such as ORM (Object Role Modeling) or EER (Enhanced Entity-Relationship modeling), became well known because of their methodological guidance in building conceptual models of information systems. (M Jarrar)
Enterprise Data Model (External Data Model)
An Enterprise Data Model is an integrated view of the data produced
and consumed across an entire organization. It incorporates an
appropriate industry perspective. An Enterprise Data Model (EDM)
represents a single integrated definition of data, unbiased of any
system or application. It is independent of “how” the data is
physically sourced, stored, processed or accessed. The model
unites, formalizes and represents the things important to an
organization, as well as the rules governing them.
(Ponniah) (Noreen Kendle)
Figure 6: Enterprise Data Modeling Structure [19]
Logical Data Model
The logical data model is an evolution of the conceptual data model
towards a data management technology such as relational databases.
The implementation of a conceptual model in such a technology is called
a logical data model. Implementing one conceptual data model may require
multiple logical data models. Data modeling defines the relationships
between data elements and structures.
Figure 7: Logical Data Model
Physical Data Model
A physical data model is a representation of a data design that takes into account the facilities and constraints of a given database management system. It represents how the model will be built in the database. A physical data model shows all table structures, including column names, column data types, column constraints, primary keys, foreign keys, and relationships between tables.
Figure 8: Physical Data Model
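The elements listed above (column names, data types, constraints, primary keys, foreign keys) can be seen in a small physical model for one specific database management system. The sketch below targets SQLite through Python's sqlite3 module; the department and employee tables are hypothetical examples, and another DBMS would use different types and options.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.executescript("""
    CREATE TABLE department (
        department_id INTEGER PRIMARY KEY,
        name          TEXT NOT NULL UNIQUE
    );
    CREATE TABLE employee (
        employee_id   INTEGER PRIMARY KEY,
        full_name     TEXT NOT NULL,
        department_id INTEGER NOT NULL,
        FOREIGN KEY (department_id) REFERENCES department(department_id)
    );
""")

# Inspect the physical structure: (cid, name, type, notnull, default, pk) per column.
columns = conn.execute("PRAGMA table_info(employee)").fetchall()
foreign_keys = conn.execute("PRAGMA foreign_key_list(employee)").fetchall()

# The relationship between tables is enforced by the DBMS itself.
try:
    conn.execute(
        "INSERT INTO employee (full_name, department_id) VALUES (?, ?)",
        ("Grace", 99),  # department 99 does not exist
    )
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

for cid, name, col_type, notnull, default, pk in columns:
    print(name, col_type, "NOT NULL" if notnull else "", "PK" if pk else "")
```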
Multidimensional data modeling
A multidimensional structure is defined as “a variation of the relational
model that uses multidimensional structures to organize data and
express the relationships between data”. According to Jensen et al.,
multidimensional models view a central data element for the given
domain, which is uniquely defined by a combination of dimension
values (Christian S. Jensen).
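The idea that a central data element is identified by a combination of dimension values can be sketched with a plain dictionary keyed by tuples. The dimensions (product, region, month), the sales figures, and the roll_up helper below are all hypothetical and only illustrate the principle; calling the central element a "fact" follows common data warehousing usage rather than the text above.

```python
from collections import defaultdict

# Each fact (here a sales amount) is identified by one value per dimension.
DIMENSIONS = ("product", "region", "month")
cube = {
    ("laptop", "North", "2024-01"): 1200.0,
    ("laptop", "South", "2024-01"): 800.0,
    ("phone", "North", "2024-01"): 500.0,
    ("phone", "North", "2024-02"): 700.0,
}


def roll_up(cube, keep):
    """Sum the measure over every dimension that is not listed in keep."""
    positions = [DIMENSIONS.index(dim) for dim in keep]
    totals = defaultdict(float)
    for key, measure in cube.items():
        totals[tuple(key[i] for i in positions)] += measure
    return dict(totals)


by_product = roll_up(cube, ["product"])
by_region_month = roll_up(cube, ["region", "month"])

print(by_product)
print(by_region_month)
```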
Newspeak – Tower of Babel Dilemma in Data Modeling
This is a fundamental design problem for information systems.
Creating one standard model for the whole company, despite the
different interpretations of data within the organization, is known as
the Newspeak solution. Allowing multiple, incompatible models to
coexist can lead to the Tower of Babel problem. Because of this
conflict, system designers can either create an enterprise-wide data
model or create multiple models to meet each requirement (Federico
Fonseca). Problems can arise from miscommunication and when the
information system does not work the way it was designed.
Agent based models
An agent-based model (ABM) (also sometimes related to the term
multi-agent system or multi-agent simulation) is a class of
computational models for simulating the actions and interactions of
autonomous agents (both individual and collective entities such as
organizations or groups) with a view to assessing their effects on the
system as a whole. It combines elements of game theory, complex
systems, emergence, computational sociology, multi-agent systems, and
evolutionary programming. Monte Carlo methods are used to introduce
randomness. ABMs are also called individual-based models. Nigel
Gilbert has defined agent-based modeling as a new analytical method for
the social sciences which is quickly becoming popular. Further,
agent-based modeling is a computational method that enables a researcher to
create, analyze, and experiment with models composed of agents that
interact within an environment.
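A minimal sketch of these ideas is shown below: autonomous agents follow a simple local rule, randomness is introduced through a seeded random number generator, and the effect is observed at the level of the whole system. The wealth-exchange rule, the Agent class, and the run function are hypothetical choices made for illustration and are not taken from Gilbert.

```python
import random


class Agent:
    """An autonomous agent holding some units of wealth."""

    def __init__(self, wealth):
        self.wealth = wealth

    def step(self, others, rng):
        # Local rule: if the agent has anything, give one unit to a random peer.
        if self.wealth > 0:
            other = rng.choice(others)
            other.wealth += 1
            self.wealth -= 1


def run(num_agents=50, steps=100, seed=0):
    """Simulate the agents and return the sorted wealth distribution."""
    rng = random.Random(seed)  # seeded so that a run can be repeated
    agents = [Agent(wealth=1) for _ in range(num_agents)]
    for _ in range(steps):
        # Activate agents in a random order each step.
        for agent in rng.sample(agents, len(agents)):
            others = [a for a in agents if a is not agent]
            agent.step(others, rng)
    return sorted(a.wealth for a in agents)


wealth = run()
print("total wealth:", sum(wealth))
print("poorest and richest agent:", wealth[0], wealth[-1])
```

No agent is programmed to create inequality, yet the distribution that emerges can be examined at the system level, which is the kind of question an agent-based model is meant to explore.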
Nine properties help when modeling an agent-based system:
preciseness, accessibility, expressiveness, modularity, complexity
management, executability, refinability, analyzability, and
openness (Gilbert)
Importance of Agent based modeling in systems analysis:
The paper by Osinga describes how an agent-based model has been used
as a modeling method to investigate the relationship between
system-level and agent-level behavior.
There are three business modeling types:
Agile Modeling and Analysis Techniques
Agile Modeling: Agile modeling is a collection of values,
principles, and practices for modeling software that can be applied on a
software development project in an effective manner. Agile modeling
includes creating several models, applying the right artifacts for the
situation, and continuing to move forward.
Best Practices of Agile Modeling
Figure 9: Best Practices of Agile Modeling
Agile Analysis
The purpose of analysis is to understand what will be developed and why it should be built, to estimate the cost, and to prioritize the development work. The main difference between requirements gathering and analysis is that requirements gathering focuses on understanding your users and their potential usage of the system, whereas analysis focuses on understanding the system itself and exploring the details of the problem domain. Another way to look at analysis is that it represents the middle ground between requirements and design: the process by which your mindset shifts from what needs to be built to how it will be built. According to Lehto, there are three major challenges related to roles and responsibilities: conflict between team structure and agile principles, applying the product owner role in a large and complex context, and lack of business theme priorities (Ilkka Lehto).
Figure 10: Agile product planning and development
KANBAN approach to Data Modeling and Analysis
KANBAN: Meaning "visible record" in Japanese, Kanban is a system of notification from one process to another in a manufacturing system. Kanban cards, which may be color-coded by priority, are stored in the bin or container that holds the items. They describe the parts, supplier, and quantity. When the bin is emptied, the Kanban card is used to order more. A two-card Kanban system uses "move" cards to relocate items from one workplace to another and "production" cards to replace the material when it is used or sold (Nelson-Smith). A simple analytics model is another use of data models: for data analysis purposes, we can monitor the behavior of a website's visitors. By tracking each step with analytics tools, the owner of the website is able to monitor the success or failure of the website or product.
Figure 11: A simple analytics model
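The Kanban replenishment rule described earlier (a card naming the part, supplier, and quantity travels with the bin and triggers an order when the bin is emptied) can be sketched as a small data structure. The KanbanCard and Bin classes, the part name, and the supplier name are hypothetical and only illustrate the single-card case.

```python
from dataclasses import dataclass, field


@dataclass
class KanbanCard:
    part: str
    supplier: str
    quantity: int  # units needed to refill the bin


@dataclass
class Bin:
    card: KanbanCard
    on_hand: int
    orders: list = field(default_factory=list)

    def take(self, units=1):
        """Remove items; an emptied bin sends its card out as an order."""
        if units > self.on_hand:
            raise ValueError("not enough items in the bin")
        self.on_hand -= units
        if self.on_hand == 0:
            self.orders.append(self.card)

    def receive(self):
        """Refill the bin from the oldest outstanding order."""
        card = self.orders.pop(0)
        self.on_hand += card.quantity


bolts = Bin(KanbanCard("M8 bolt", "Acme Fasteners", 3), on_hand=3)
bolts.take(2)
orders_before_empty = len(bolts.orders)  # bin still holds one item
bolts.take(1)
orders_when_empty = len(bolts.orders)    # bin is empty, card was sent
bolts.receive()

print(orders_before_empty, orders_when_empty, bolts.on_hand)
```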
The lean approach to requirements management is common on Kanban teams. There are several key differences between the agile approach to requirements/work management and the lean approach:
Figure 12: Lean Data Modeling
Data modeling is generally performed in the context of an
information systems project with relevant methodology and tools.
Data modeling is also useful in representing and documenting data. A data
model can be used as a map to go from start to finish. It is the modern
GPS that helps both IT and business professionals involved in a
project navigate on the correct path.
Improving the process by eliminating the non-value-added steps makes it lean.