Complexities in Data Management
Effective data management calls for a clear view of the types of data involved. Various RDBMS processes already exist to manage such data, but the complexity of the data being generated has increased in recent years. Consider an educational institution that maintains digital records of its daily activities: heterogeneous data in various forms are generated in large volumes and stored either on a local server or in the cloud. The volume of data generated from an LMS (collaborative learning) is particularly high, as streams of data are produced round the clock, owing in part to the adoption of cloud resources by existing online learning tools (e.g. Cloudera). Three types of data are found to be generated in the educational sector: structured data, unstructured data, and semi-structured data.
- Structured Data: Structured data are data that can be stored in the rows and columns of a table and queried using SQL. They are characterized by relational keys and map readily onto pre-designed fields. It should be noted that structured data account for only about 5-10% of all informatics data.
- Unstructured Data: Unstructured data consist mainly of multimedia and text-related information and constitute about 80% of all informatics data. The majority of educational data are unstructured, as they consist of audio, video, images, word-processor documents, email messages, PowerPoint presentations, etc. Although such data possess an internal structure of their own, they are not called structured because they do not fit neatly into an existing database.
- Semi-Structured Data: Semi-structured data possess interesting characteristics. Although such data cannot readily reside in a relational database management system, they have organizational features that make them well suited to data analysis. Examples are text in formats such as XML and JSON, which has no formal structure but carries tags at specific levels to assist in data management. Other good examples of semi-structured data are weblog data, healthcare data, supply chain data, and sensor data.
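The distinction between the three types can be made concrete with a small sketch. The example below is only illustrative and assumes a hypothetical educational setting: the table name `enrolment`, the JSON fields, and the sample message are invented for the example and do not come from any particular LMS.

```python
import json
import sqlite3

# Structured: fixed columns, stored in a relational table and queried with SQL.
# The "enrolment" table and its columns are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE enrolment (student_id INTEGER PRIMARY KEY, course TEXT, grade REAL)"
)
conn.executemany(
    "INSERT INTO enrolment VALUES (?, ?, ?)",
    [(1, "Algebra", 8.5), (2, "Physics", 7.0)],
)
structured_rows = conn.execute(
    "SELECT student_id, course, grade FROM enrolment ORDER BY student_id"
).fetchall()

# Semi-structured: every value is tagged with a field name, but records
# need not share the same fields or nesting (hypothetical LMS events).
raw_events = [
    '{"student_id": 1, "event": "login", "device": {"os": "Android"}}',
    '{"student_id": 2, "event": "quiz_submit", "score": 7}',
]
semi_structured = [json.loads(line) for line in raw_events]
field_sets = [sorted(record.keys()) for record in semi_structured]

# Unstructured: free text with no fields at all; any structure has to be
# inferred by downstream analysis.
unstructured = "Dear class, the lecture video for week 3 is now available."

print(structured_rows)
print(field_sets)
print(len(unstructured.split()), "words of free text")
```

The structured rows all share one schema, whereas the two JSON records expose different sets of fields, which is what prevents them from mapping directly onto pre-designed table columns.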
Both unstructured and semi-structured data can be generated by machines or by humans. Apart from the educational system, machine-generated unstructured data include satellite images, surveillance feeds, scientific data, and sonar/radar data. Similarly, human-generated unstructured data include data from social networking sites, website content, mobile data, and internal communication systems (corporate emails, logs, etc.). Hence, the complexity in existing educational data management arises from the generation of unstructured and semi-structured data, for which organizations spend heavily on storage but place little emphasis on utility. At present, Hadoop and MapReduce are considered standards for managing such unstructured and semi-structured data. Hadoop and MapReduce have been found compliant with the standard known as the Unstructured Information Management Architecture.
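To illustrate the programming model that MapReduce applies to such data, the following is a minimal single-process sketch in plain Python. It does not use the Hadoop API; the function names (`map_phase`, `shuffle`, `reduce_phase`) and the sample forum posts are assumptions made for illustration only. A real deployment would distribute the same three phases across a cluster.

```python
from collections import defaultdict


def map_phase(record):
    """Map: emit a (word, 1) pair for every word in one unstructured record."""
    for word in record.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse the grouped values for one key into a single count."""
    return key, sum(values)


def word_count(records):
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


# Hypothetical free-text posts from a course discussion forum.
forum_posts = [
    "quiz deadline extended",
    "quiz results posted",
    "deadline reminder",
]
counts = word_count(forum_posts)
print(counts)
```

Because each record is mapped independently and each key is reduced independently, both phases can run in parallel, which is what makes the model suitable for large volumes of unstructured text.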