Assessing Data Infrastructure Maturity

Bill asked Peter what he had learned so far. "Right now," Peter said, "ProcIndustries' data maturity is in its infancy." People access data from silos and spend large amounts of time generating reports that do not enable collaborative analysis. It is very difficult to identify process improvements; people cannot even identify variances in production metrics from one day to the next. (See the box "Industry 4.0.")


ProcIndustries' quest for digital transformation is aligned with the principles of Industry 4.0 (Kennedy 2018), commonly referred to as the fourth industrial revolution. Industry 4.0 fosters what has been called a "smart factory." Within modular, structured smart factories, cyber-physical systems monitor physical processes, create a virtual copy of the physical world, and make decentralized decisions. Over the Internet of things (IoT), cyber-physical systems communicate and cooperate with each other and with humans in real time, both internally and across the organizational services offered and used by participants in the value chain. Industry 4.0 rests on four design principles:

  • Interoperability. The ability of machines, devices, sensors, and people to connect and communicate with each other via the IoT or the Internet of people (IoP).
  • Information transparency. The ability of information systems to create a virtual copy of the physical world, also known as a digital twin, by enriching digital plant models through the aggregation of raw sensor data into higher value context information.
  • Decentralized decisions. The ability of cyber-physical systems to make decisions on their own and to perform their tasks as autonomously as possible. Only in the case of exceptions, interferences, or conflicting goals are tasks delegated to a higher level.
  • Technical assistance. First, the ability of assistance systems to support humans by aggregating and visualizing information comprehensively for making informed decisions and solving urgent problems on short notice. Second, the ability of cyber-physical systems to physically support humans by conducting a range of tasks that are unpleasant, exhausting, or unsafe for their human coworkers.
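The information-transparency principle above (enriching raw sensor data into higher-value context information for a digital twin) can be sketched in a few lines. This is an illustrative sketch only; the vibration tag, units, and alarm limit are hypothetical, not from the book.

```python
from statistics import mean

# Hypothetical raw vibration readings (mm/s RMS) for one pump,
# sampled once per minute; these values are invented for illustration.
raw_vibration_mm_s = [2.1, 2.3, 9.8, 2.2, 2.4]

def contextualize(readings, alarm_limit=7.0):
    """Aggregate raw samples into higher-value context information,
    as a digital-twin model might before exposing it to analytics."""
    return {
        "mean": round(mean(readings), 2),
        "peak": max(readings),
        "alarm_count": sum(1 for r in readings if r > alarm_limit),
        "status": "in trouble" if any(r > alarm_limit for r in readings)
                  else "running OK",
    }

print(contextualize(raw_vibration_mm_s))
```

The point is the shift in abstraction: downstream consumers see an operating status and summary statistics, not a stream of raw samples.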

Peter encapsulated the state of ProcIndustries by defining the aspects of a mature data infrastructure: current status, target state, and maturity status. Peter defined what he called a maturity state for data availability, accessibility, correlation, analytics, reporting, and the ability to act on new information received. He summarized the refinery's current status and assigned a maturity status to each area, according to five states:

  1. Initial/ad hoc
     • Baseline case.
     • Changes are made to solve a problem, but not always based on best practices.
     • The solution has not become a standard operating procedure.
  2. Repeatable
     • The solution has become standardized.
     • Everyone is aware of the solution and knows what to do.
     • Usually, this means automated event notifications and workflows when an abnormal situation is detected.
  3. Defined
     • Staff and systems work well together.
     • Analytics, event generation systems, system notifications, and workflows combine to automate the identification of abnormal situations.
     • People have access to the data and documentation they need.
  4. Managed
     • Continuous training is given, with an emphasis on continuous improvement.
  5. Optimized
     • The company monitors the utilization and effectiveness of these procedures and integrates them with process and safety management procedures.

Peter said the goals of the digital data infrastructure project were to address the current deficits, move ProcIndustries to a mature state to reduce the risk of failure, and improve the reliability and integrity of processes and data (Figure 1.3).
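The five maturity states and the current-versus-target gaps that Figure 1.3 visualizes can be modeled as a simple ordinal scale. A minimal sketch, assuming (as the figure caption suggests) that every area is currently at Initial/ad hoc with Optimized as the target:

```python
from enum import IntEnum

class Maturity(IntEnum):
    """The five maturity states, ordered so gaps can be computed."""
    INITIAL = 1      # Initial/ad hoc
    REPEATABLE = 2
    DEFINED = 3
    MANAGED = 4
    OPTIMIZED = 5

# The six assessment areas named in the text.
areas = ["availability", "accessibility", "correlation",
         "analytics", "reporting", "manage results"]

# Assumption for illustration: every area starts at Initial/ad hoc
# and targets Optimized, per the maturity matrix in Figure 1.3.
current = {a: Maturity.INITIAL for a in areas}
target = {a: Maturity.OPTIMIZED for a in areas}

# Each gap corresponds to a black arrow in Figure 1.3.
gaps = {a: int(target[a] - current[a]) for a in areas}
print(gaps["analytics"])  # → 4
```

Encoding the scale as an `IntEnum` keeps the state names readable while still allowing the arithmetic that a gap analysis needs.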

As Peter articulated his vision, he suggested assigning a team to learn more about the activities currently in progress at the South Texas plant and at ProcIndustries in general. It was clear from what he had seen so far that everyone had his or her own particular way of doing the job, with nonintegrated tools.

Peter observed that the ProcIndustries IT department required some attention. Managers agreed that while IT had done a superb job at the administrative offices, isolated sectors at the plant level would still require network connectivity.

Tom and Bill studied the information Peter presented. Together they created a table to define current status and target status for the refinery for each of the six areas (Table 1.2).


FIGURE 1.3 ProcIndustries data analytics and operational intelligence maturity matrix. Bottom bars are current states; top bars are optimum targets, with the black arrows indicating current deviation from targets.


TABLE 1.2
ProcIndustries Current and Target States

Data availability
  • Current status: Data is available but not aggregated. Data definitions are unique to each plant, with no hierarchical data model. Historical data is often lost and difficult to access once offline.
  • Target state: Consolidated data from all available sources is saved in a perpetual historical archive. Data definitions are common and shared across plants.
  • Maturity state: Initial/ad hoc

Data accessibility
  • Current status: Data is manually collected and often takes excessive time to gather.
  • Target state: Real-time and historical data is easily accessed by anyone in the refinery. Data is also accessible by other systems to automate workflow and business processes. Data is classified according to operating condition or status for proper validation and aggregation, and the operating condition provides a time context for the data (e.g., operating status could be running OK, in trouble, idle, down, in maintenance).
  • Maturity state: Initial/ad hoc

Data correlation
  • Current status: Data correlation for modeling requires individual or group heroics. Only "point-in-time" correlations can be made. Correlation across plant sites is nonexistent.
  • Target state: Data is correlated in standardized reports. Subject matter experts can develop and test new data model correlations. Root cause analysis is enhanced to eliminate problems. Cause-and-effect diagrams are available and used together with business intelligence analysis tools. Approved models can be reused and augmented via collaboration.
  • Maturity state: Initial/ad hoc

Analytics
  • Current status: Analytics require individual heroics. Without validated models, analytics are limited. The capabilities of subject matter experts are not fully utilized.
  • Target state: Analytics are easily performed on real-time data. Subject matter expertise is optimized. Data models capture valuable knowledge.
  • Maturity state: Initial/ad hoc

Reporting
  • Current status: Reporting is limited to averages and totals without aggregation at the desired level of detail. Reporting is done manually and covers a point in time. Sharing reports with other plants is not practical or easily achieved.
  • Target state: Real-time and historical trending is available. Contextualized data is aggregated and consumed at desired levels of detail for different roles within the enterprise. Consolidated reporting capability exists across the enterprise. Data-driven real-time alerts are delivered to staff as needed.
  • Maturity state: Initial/ad hoc

Ability to manage results
  • Current status: Subject matter experts spend time measuring rather than using the information to achieve results. Departure of personnel creates gaps in knowledge. Ad hoc procedures are implemented to reduce unplanned excursions (departures from the norm).
  • Target state: Subject matter experts are devoted to keeping the refinery in desired operating ranges. Role-based key performance indicators are established and standardized, with real-time values readily accessible. Major excursions are prevented or reduced in duration and impact. Performance is sustainable at all levels. New capital expenditures are justified by data.
  • Maturity state: Initial/ad hoc
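The accessibility target in Table 1.2 calls for classifying data by operating condition so that values are validated and aggregated in the right time context. A minimal sketch of that idea, using the status names from the table (the flow readings and schema are hypothetical):

```python
# Hypothetical hourly flow readings tagged with operating status.
readings = [
    {"hour": 0, "flow": 120.0, "status": "running OK"},
    {"hour": 1, "flow": 118.5, "status": "running OK"},
    {"hour": 2, "flow": 12.0,  "status": "down"},
    {"hour": 3, "flow": 121.2, "status": "running OK"},
]

def aggregate_by_status(rows):
    """Average flow per operating status. Mixing 'down' hours into a
    running average would distort any day-to-day variance check, so
    each status provides its own time context for aggregation."""
    buckets = {}
    for row in rows:
        buckets.setdefault(row["status"], []).append(row["flow"])
    return {status: round(sum(v) / len(v), 2) for status, v in buckets.items()}

print(aggregate_by_status(readings))
```

With this classification in place, a day-to-day variance report compares like with like: running hours against running hours, downtime against downtime.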
