Data mining is a process of analyzing large data sets (i.e., “big data”) to discover previously unknown patterns or relationships. Data mining tools are typically applied to structured data, that is, highly formatted data held in relational databases such as those underlying an ERP system, and can therefore be very helpful for accountants.
As illustrated in Figure 1.2, data mining intersects with but is not considered a subfield of AI. As computer scientist Xavier Amatriain notes, “machine learning can be used for data [mining]. However, data mining can use other techniques on top of machine learning” (Amatriain, 2016, para. 5).
Like machine learning, data mining relies on reading large amounts of data to discover patterns and arrive at conclusions. However, technology expert Bernard Marr identified notable differences between data mining and machine learning (ML). First, data mining looks for patterns that already exist in historical data, whereas ML attempts to predict future outcomes based on the given data. Second, the rules or patterns are unknown at the beginning of the data mining process, whereas with ML the rules for understanding the data are programmed into the computer. Third, data mining relies on human intervention throughout the process, whereas much of the learning with ML is automatic. Finally, data mining uses an existing data set, such as a data warehouse, whereas ML learns from a training data set and then makes predictions using new data sets (Marr, n.d.).
By using data mining techniques on general ledger transactions, accountants can potentially unlock valuable insights and improve decision-making. For example, auditors can use data mining to help detect fraudulent purchases or outlier transactions that require further investigation. Management accountants can use data mining to assess the financial risk of a business entity, such as “trading partners, corporate affiliates, investment partners, and takeover targets” (Calderon et al., 2003, p. 7).
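As a rough illustration of the outlier idea above, the following Python sketch flags unusually large or small transaction amounts using a modified z-score based on the median absolute deviation. The technique, the 3.5 cutoff, the function name `flag_outliers`, and the sample amounts are all assumptions made for illustration; they are not drawn from the sources cited in this chapter, and a real audit procedure would involve far more context than amounts alone.

```python
from statistics import median


def flag_outliers(amounts, threshold=3.5):
    """Return the positions of amounts whose modified z-score exceeds threshold.

    The modified z-score uses the median and the median absolute
    deviation (MAD), so a single extreme value does not distort the
    baseline. The 3.5 cutoff is a common rule of thumb and is an
    assumption here, not an auditing standard.
    """
    med = median(amounts)
    mad = median(abs(a - med) for a in amounts)
    if mad == 0:
        # At least half of the amounts equal the median; this sketch flags nothing.
        return []
    return [
        i for i, a in enumerate(amounts)
        if 0.6745 * abs(a - med) / mad > threshold
    ]


# Hypothetical purchase amounts from a general ledger extract.
purchases = [120.00, 135.50, 110.25, 128.75, 9800.00, 131.40, 117.90]
suspicious = flag_outliers(purchases)
print(suspicious)  # positions of transactions to investigate further
```

A flagged position only marks a transaction for follow-up; it says nothing about whether the transaction is actually fraudulent.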
[Figure 1.2: diagram relating data and big data mining to AI. Recoverable labels: Data and Big Data Mining (structured data; text mining of unstructured textual data; other unstructured data such as videos, photos, and audio files); expert systems and other rules-based or fuzzy-based reasoning systems; deep learning; Data and Big Data Mining with AI.]
Figure 1.2 Relationship between (big) data mining and AI.
Text mining is a process of extracting information from various text sources (such as Word documents, PDF files, social media posts, emails, websites, articles, XML files, and others) to discover patterns, trends, and themes. The text found in these documents is typically unstructured, that is, it is not in a predefined format that can be analyzed through data analytics software such as IDEA or ACL. Text mining is performed in two steps: 1) imposing structure on the text data sources and then 2) using data mining techniques to extract relevant information (Sharda et al., 2014).
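The two steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method described by Sharda et al.: step 1 imposes structure by turning each document into a table of term counts (a "bag of words"), and step 2 applies a very simple mining technique, counting how many documents mention each term. The stop-word list, the function names, and the sample documents are hypothetical.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list.
STOP_WORDS = {"the", "a", "of", "and", "to", "is", "for", "in", "on"}


def to_term_counts(text):
    """Step 1: impose structure by converting free text to term counts."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(t for t in tokens if t not in STOP_WORDS)


def document_frequency(structured_docs):
    """Step 2: count how many documents contain each term."""
    df = Counter()
    for counts in structured_docs:
        df.update(counts.keys())  # each document contributes once per term
    return df


# Hypothetical snippets standing in for contracts or emails.
raw_docs = [
    "Payment is due on the lease.",
    "The lease term is five years.",
    "Invoice payment received.",
]
structured = [to_term_counts(d) for d in raw_docs]
themes = document_frequency(structured)
print(themes.most_common(2))  # the terms shared by the most documents
```

Once the text is in this tabular form, it can be handed to the same kinds of tools used for structured data.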
Text mining is useful in fields that have a large amount of textual data, as is the case with accounting. For example, auditors review contracts, invoices, legal letters, SEC filings, earnings conference calls, news articles, and much more.
Professor Aldhizer (2017) of Wake Forest University suggests that forensic and audit practices consider using text analytics for high-risk engagements. Aldhizer notes that text analytics could be used for concept extraction to identify incriminating words from social media posts and emails. Text mining could be used to ensure proper reporting under lease and revenue recognition (Iowa State University, n.d.).
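A heavily simplified version of this idea is plain keyword matching against a watch list, sketched below in Python. Real concept extraction goes well beyond literal matching, so this is only a starting point; the watch terms, the function name `flag_messages`, and the sample messages are hypothetical and are not taken from Aldhizer (2017).

```python
import re

# Hypothetical watch list; an engagement team would define its own.
WATCH_TERMS = ("off the books", "backdate", "cover up")


def flag_messages(messages, terms=WATCH_TERMS):
    """Return {message_id: [matched terms]} for messages containing a watch term.

    Matching is case-insensitive and on whole words or phrases only.
    """
    patterns = {
        term: re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        for term in terms
    }
    flagged = {}
    for msg_id, text in messages.items():
        hits = [term for term, pattern in patterns.items() if pattern.search(text)]
        if hits:
            flagged[msg_id] = hits
    return flagged


# Hypothetical emails keyed by message id.
emails = {
    "m1": "Please backdate the invoice to March.",
    "m2": "Lunch at noon?",
    "m3": "Keep this OFF THE BOOKS and cover up the gap.",
}
flagged_emails = flag_messages(emails)
print(flagged_emails)
```

Literal matching misses variants such as "backdated" and cannot judge context, which is one reason dedicated text analytics tools are used on high-risk engagements.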
Robotic Process Automation (RPA) and AI
Although some experts do not consider RPA a form of AI (CPA Canada & AICPA, 2019), this technology enables machines to perform functions usually carried out by humans. Firms are using RPA, coupled with AI capabilities, to increase productivity and performance. Thus, accountants need to understand what RPA is and how it can be combined with AI technologies.
Robotic process automation (RPA) is a software application (a robot, or bot) that automates a business process by replicating the actions of humans performing tasks within digital systems, such as manipulating or transferring data. In many cases, the automation involves minimal coding and is typically activated by using a smart screen recording. Keep in mind that these robots are not the physical steel machines found in old science-fiction movies or television shows. RPA lends itself well to high-volume, repeatable tasks traditionally performed by humans. For example, RPA can be used to process bulk transactions, such as vendor invoices for payment.
By using RPA, businesses can achieve increased productivity and accuracy at a lower cost. The exponential growth in RPA adoption has been facilitated by the availability of software providers such as UiPath, Automation Anywhere, and Blue Prism. Many RPA providers offer individuals a limited version of their software so that they can begin building bots. Examples of RPA in accounting include performing bank reconciliations, automating cash applications, and tracking accounts payable.
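To give a feel for the kind of rule-based, repeatable work a bot takes over, the Python sketch below matches ledger entries to bank statement lines for a reconciliation. It is only an assumption-laden illustration of the matching logic: commercial RPA tools typically work by driving application screens rather than by running a script like this, and the field names (`ref`, `amount_cents`), the function name `reconcile`, and the sample records are hypothetical.

```python
from collections import defaultdict


def reconcile(ledger, bank):
    """Pair ledger and bank entries that share a reference and an amount.

    Returns (matched pairs, unmatched ledger entries, unmatched bank entries).
    Amounts are integer cents to avoid floating-point comparison issues.
    """
    bank_by_key = defaultdict(list)
    for entry in bank:
        bank_by_key[(entry["ref"], entry["amount_cents"])].append(entry)

    matched, unmatched_ledger = [], []
    for entry in ledger:
        candidates = bank_by_key[(entry["ref"], entry["amount_cents"])]
        if candidates:
            # Consume one bank line per ledger entry so duplicates pair one-to-one.
            matched.append((entry, candidates.pop()))
        else:
            unmatched_ledger.append(entry)

    unmatched_bank = [e for entries in bank_by_key.values() for e in entries]
    return matched, unmatched_ledger, unmatched_bank


# Hypothetical records for one period.
ledger_entries = [
    {"ref": "INV-1", "amount_cents": 10000},
    {"ref": "INV-2", "amount_cents": 25050},
]
bank_lines = [
    {"ref": "INV-1", "amount_cents": 10000},
    {"ref": "FEE", "amount_cents": 1500},
]
matched, open_ledger, open_bank = reconcile(ledger_entries, bank_lines)
print(len(matched), open_ledger, open_bank)
```

The unmatched items on each side are the exceptions a human accountant would still need to review.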