Cases & Applications

Big Data Clustering for Process Control

Faced with rapid data growth, a large number of parameters to be measured, and a dynamically changing factory environment, the Italian company Whirlpool used a Big Data clustering method to detect real-time variations in a washing-machine manufacturing process. The approach extended traditional clustering algorithms (such as K-means), enabling a better understanding of the nature of the process and more efficient Big Data processing. Furthermore, the developed model can perform root-cause analysis, providing insights into process waste.

The primary goal was to define average values for three parameters (power, rotation speed, and total water inlet) and to develop a solution able to identify anomalies in functional tests. The method combined scalable K-means for initial interaction with K-medoids (another partitioning algorithm) enhanced by FAMES (FAst MEdoid Selection). The team developed a solution for comparing test series by type and on the basis of clustered tests. By performing a final comparison with standard samples, the model detects unusual data. The results confirmed the model's effectiveness in detecting anomalies and identifying specific problems based on their similarity to a cluster [9].
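As a rough illustration of the idea of clustering reference tests and flagging measurements that lie far from every cluster, the following Python sketch runs a basic alternating K-medoids procedure over normalized (power, rotation speed, total water inlet) triples. It is not the method from [9]: FAMES and the scalable K-means step are not reproduced, and the farthest-first initialization, the sample values, the function names, and the distance threshold are all assumptions made for this example.

```python
import math

def farthest_first(points, k):
    # Simple deterministic initialization (an assumption, not FAMES):
    # start from the first point, then repeatedly add the point that is
    # farthest from the medoids chosen so far.
    medoids = [points[0]]
    while len(medoids) < k:
        medoids.append(
            max(points, key=lambda p: min(math.dist(p, m) for m in medoids))
        )
    return medoids

def k_medoids(points, k, max_iters=20):
    # Basic alternating K-medoids: assign each point to its nearest medoid,
    # then move each medoid to the member with the lowest total distance
    # to the other members of its cluster.
    medoids = farthest_first(points, k)
    for _ in range(max_iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, medoids[i]))
            clusters[nearest].append(p)
        new_medoids = []
        for i, members in enumerate(clusters):
            if not members:
                new_medoids.append(medoids[i])  # keep the medoid of an empty cluster
                continue
            new_medoids.append(
                min(members, key=lambda c: sum(math.dist(c, m) for m in members))
            )
        if new_medoids == medoids:
            break  # assignments can no longer change
        medoids = new_medoids
    return medoids

def is_anomalous(sample, medoids, threshold):
    # A sample is flagged when it is far from every cluster representative.
    return min(math.dist(sample, m) for m in medoids) > threshold

# Hypothetical, already normalized reference tests:
# (power, rotation speed, total water inlet)
reference = [
    (0.20, 0.30, 0.50), (0.22, 0.31, 0.49), (0.19, 0.29, 0.52), (0.21, 0.30, 0.51),
    (0.80, 0.70, 0.40), (0.79, 0.72, 0.41), (0.81, 0.69, 0.39), (0.80, 0.71, 0.40),
]
medoids = k_medoids(reference, k=2)

print(is_anomalous((0.205, 0.30, 0.50), medoids, threshold=0.1))  # close to a cluster
print(is_anomalous((0.50, 0.95, 0.10), medoids, threshold=0.1))   # far from both
```

Medoids are actual observed tests rather than averaged points, so each cluster representative can be inspected as a real reference sample. In practice the threshold would have to be calibrated against the spread of each cluster; the fixed value here is only a placeholder.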

Cloud-Based Solution for Real-Time Process Analytics

Scenarios involving high complexity and highly distributed processes are well suited to cloud computing applications. Given a means of extracting and correlating process events, performance analytics can be provided in real time. In [34], a cloud-based architecture is proposed to support continuous improvement of enterprise processes by measuring the performance of cross-functional activities with very low latency. Using a set of Business Analytics Service Unit nodes and a Global Business Analytics Service component, both local (inter-departmental) and global (cross-organizational) complex processes can be monitored and analyzed. With these features, the solution collects data generated by distributed heterogeneous systems, stores massive amounts of process data, and infers knowledge from the acquired information.

The event repository uses a column-oriented NoSQL database (HBase) running on top of the Hadoop Distributed File System, which provides timely access to critical data thanks to its clustering capabilities and distributed in-memory cache. Furthermore, to ensure that the sequence of events is identified without delay, an event-based model and an event correlation algorithm are implemented. Although the approach was not extended to process mining or advanced analytics optimization techniques, the system performed well in real-time activity monitoring and was able to gather distributed event logs regardless of operating system technology and location.
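The source does not describe the event correlation algorithm of [34] in detail. As a minimal sketch of what event correlation can look like in general, the Python example below groups events emitted by different systems under a shared process-instance key and orders them by timestamp to recover the sequence of activities. The Event fields, the system names, and the cycle_time helper are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class Event:
    source: str       # system that emitted the event (hypothetical field)
    instance_id: str  # shared key used to correlate events (hypothetical field)
    activity: str     # name of the process activity
    timestamp: float  # seconds since some common reference time

def correlate(events):
    # Group events by process instance, then sort each group by time so
    # that the order of activities is recovered even if events arrive
    # out of order from different systems.
    traces = defaultdict(list)
    for event in events:
        traces[event.instance_id].append(event)
    for trace in traces.values():
        trace.sort(key=lambda e: e.timestamp)
    return dict(traces)

def cycle_time(trace):
    # Elapsed time between the first and last event of one instance.
    return trace[-1].timestamp - trace[0].timestamp

# Hypothetical events arriving out of order from three systems.
events = [
    Event("erp", "order-1", "approve", 130.0),
    Event("crm", "order-1", "create", 100.0),
    Event("wms", "order-2", "ship", 260.0),
    Event("crm", "order-2", "create", 200.0),
    Event("wms", "order-1", "ship", 190.0),
]
traces = correlate(events)

for instance_id, trace in traces.items():
    print(instance_id, [e.activity for e in trace], cycle_time(trace))
```

A real deployment would also have to deal with clock differences between systems and with events that carry no common key; both are outside the scope of this sketch.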
