RELATIVE SIZE FACTOR TEST
The purpose of the relative size factor (RSF) test is to identify anomalies where the largest amount for subsets in a given key is outside the norm for those subsets. This test compares the top two amounts for each subset and calculates the RSF for each. The RSF test is based on Chapter 11 of Mark J. Nigrini's book, Forensic Analytics: Methods and Techniques for Forensic Accounting Investigations.6
In order to identify potential fraudulent activities in invoice payment data, one utilizes the largest and the second-largest amounts to calculate a ratio based on purchases that are grouped by vendors; this is often suggested in fraud-examination literature, such as Principles of Fraud Examination written by Joseph T. Wells.7
FIGURE 5.19 Results Displaying Z-Score Sequenced from Largest to Smallest
In Chapter 11 of Dr. Mark Nigrini's Forensic Analytics book, he introduces the RSF test, which expands on this concept and states, "In this chapter we compare large amounts to a benchmark to see how large they are relative to some norm, hence the name the RSF test. The RSF test is a powerful test for detecting errors. The test identifies subsets where the largest amount is out of line with the other amounts for that subset. This difference could be because the largest record either (a) actually belongs to another subset, or (b) belongs to the subset in question, but the numeric amount is incorrectly recorded. The RSF test is an important error-detecting test."
Subsets in a data file are identified as keys in IDEA. An example would be vendors in an accounts payable file. The test identifies records that are outliers relative to the rest of the amounts within each subset. Outliers may not be the largest amounts in the entire data set, but they are large relative to the other members of the subset. Large differences might be attributed to errors, such as a record that belongs to another subset or an amount that was posted incorrectly (e.g., a shifted decimal point). Large differences may also be an indication of fraudulent activity, such as occupational accounts payable fraud, falsified invoices for HST or VAT input tax credits, offset money-laundering revenue, or product sales to related companies (offshore transfer pricing). In his book, Dr. Mark Nigrini discusses the forensic results regarding investigations of sales numbers, insurance claim payments, inventory numbers, and healthcare claims using the RSF test.
To enhance the information provided by the RSF test, the step-by-step instructions on how to perform the calculations include a field that shows the average amount for each subset when the largest amount is disregarded. This gives the auditor a better feel for the relationships within the data.
The Average_X_Largest field is the average of all positive or negative amounts (as determined by the user as the first step to the RSF calculations) excluding the largest amount. This field provides an indication of the typical amount within the subset members.
The step-by-step example uses a sample payment data set. The subset, or key, is the supplier number field (SUPPNO), and the RSF calculations are performed on the AMOUNT field.
Step 1. In reviewing the field statistics (Figure 5.20) for the AMOUNT field of the "Payment" file, we note that there are three negative values and one zero item. To run the RSF test on positive values, extract the records with an AMOUNT greater than 0. Name the new file "Extract RSF-1," as displayed in Figure 5.21.
FIGURE 5.20 Field Statistics of Payment Amounts
Step 2. After extracting all of the positive amounts, we need to create or append a field called RECNO using the @Precno( ) function as shown in Figure 5.22. This will track the physical record number in the data set. The record number will be needed for a future join of files.
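Steps 1 and 2 can be sketched outside IDEA as follows. This is only an illustration in plain Python, not IDEA's own mechanism: the sample rows are hypothetical (they are not the book's "Payment" file), and the 1-based record numbering is an assumption about how the physical record number behaves.

```python
# Hypothetical sample rows as (SUPPNO, AMOUNT); not the book's data set.
payments = [
    ("S001", 1200.00),
    ("S001", -50.00),
    ("S002", 0.00),
    ("S002", 310.25),
    ("S001", 95.10),
]

# Step 1: keep only the records with AMOUNT greater than 0.
positive = [row for row in payments if row[1] > 0]

# Step 2: attach a record number to each extracted record
# (assumed here to start at 1), so the record can be matched later.
extract_rsf_1 = [
    {"RECNO": recno, "SUPPNO": suppno, "AMOUNT": amount}
    for recno, (suppno, amount) in enumerate(positive, start=1)
]
```

The negative and zero amounts are dropped before the record numbers are assigned, so RECNO identifies a record's position in the extracted file rather than in the original file.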
Step 3. Using the "Extract RSF-1" file, sort the database by the SUPPNO field in ascending order and the AMOUNT field in descending order, as in Figure 5.23. This will place the largest amount as the first record by supplier number. Name the new file "Sorted RSF-3."
FIGURE 5.21 Preparing the File Extraction for Positive Amounts
FIGURE 5.22 Adding a Field for Physical Record Numbers
FIGURE 5.23 Sorting Largest Amounts by Supplier
Step 4. Summarize the "Sorted RSF-3" file to give the total number of transactions along with the top amounts per SUPPNO. When performing a summarization, IDEA provides the option to use fields from the first occurrence or from the last occurrence for any additional fields. With the default "Use fields from first occurrence" selection, the highest or top amount from step 3 for each supplier number will appear in the new file. Summarize by the field SUPPNO and select AMOUNT and RECNO as the fields to include, as shown in Figure 5.24. Name the file "Summarization RSF-4."
FIGURE 5.24 Obtaining the Total Number of Transactions by Supplier and the Largest Amounts
Step 5. Perform a join using the "Sorted RSF-3" file as the primary database and the "Summarization RSF-4" file as the secondary database. Use Match Key Fields with RECNO as the common key in both files. Select the records with no secondary match join option. The secondary file of "Summarization RSF-4" contains the largest amounts for each supplier. Because the join keeps only the records with no secondary match, the new file will not contain the largest or top-most amounts, as displayed in Figure 5.25. This will allow us to extract the next highest amounts, which are in fact the second largest amounts. Name the new file "Join RSF-5."
Step 6. This is the exact same procedure as in step 3 to obtain the top-most amounts, but now with the actual highest amount removed from the data set. Sort the "Join RSF-5" file by the SUPPNO field in ascending order and the AMOUNT field in descending order as shown in Figure 5.26. We need to do the sort again because the join changes the order of the records. This will allow us to later obtain the second largest amounts. Name the file "Sorted RSF-6."
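The logic of steps 3 through 5 (sort, keep the first occurrence per supplier, then drop those records by matching on RECNO) can be sketched in plain Python. The records below are hypothetical, and the dictionary layout is only an assumed stand-in for the IDEA files named in the steps.

```python
# Hypothetical extracted records (positive amounts with record numbers).
records = [
    {"RECNO": 1, "SUPPNO": "S001", "AMOUNT": 1200.00},
    {"RECNO": 2, "SUPPNO": "S002", "AMOUNT": 310.25},
    {"RECNO": 3, "SUPPNO": "S001", "AMOUNT": 95.10},
    {"RECNO": 4, "SUPPNO": "S002", "AMOUNT": 150.00},
    {"RECNO": 5, "SUPPNO": "S001", "AMOUNT": 400.00},
]

# Step 3: SUPPNO ascending, AMOUNT descending.
sorted_rsf_3 = sorted(records, key=lambda r: (r["SUPPNO"], -r["AMOUNT"]))

# Step 4: count the records per supplier and keep AMOUNT and RECNO
# from the first occurrence, which is the largest amount after the sort.
summarization_rsf_4 = {}
for r in sorted_rsf_3:
    entry = summarization_rsf_4.setdefault(
        r["SUPPNO"],
        {"NO_OF_RECS": 0, "AMOUNT": r["AMOUNT"], "RECNO": r["RECNO"]},
    )
    entry["NO_OF_RECS"] += 1

# Step 5: keep only the records whose RECNO has no match among the
# top records, i.e. everything except each supplier's largest amount.
top_recnos = {e["RECNO"] for e in summarization_rsf_4.values()}
join_rsf_5 = [r for r in sorted_rsf_3 if r["RECNO"] not in top_recnos]
```

Matching on RECNO rather than on AMOUNT removes exactly one record per supplier. If a supplier has two records tied for the largest amount, one of them remains and becomes the second largest amount.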
FIGURE 5.25 Preparing to Obtain the Second Largest Amounts
FIGURE 5.26 Sorting the Largest Amount by Supplier with the Top-Most Amount Excluded to Obtain the Second Largest Number
Step 7. Summarize the "Sorted RSF-6" file to obtain the average for the AMOUNT field. Summarize by the field SUPPNO with the AMOUNT field as the numeric field to total. Select average as the statistic to include. IDEA does not seem to allow you to calculate a statistic on a field and also add that field as an additional field. We therefore have to summarize twice: once here for the average, and a second time in the next step for the top amount in the file. Name this file "Summarization RSF-7," as in Figure 5.27.
FIGURE 5.27 Calculating the Averages by Supplier
Step 8. Summarize the "Sorted RSF-6" file on the SUPPNO field and select AMOUNT as the field to include, as displayed in Figure 5.28. Note the selection of the default "Use fields from first occurrence." This is identical to step 4 with the exception that RECNO does not have to be selected as a field to include. This will give us the largest amounts by supplier in this file, which are actually the second highest amounts in the original data. Name the file "Summarization RSF-8."
FIGURE 5.28 Calculating the Second Largest Number by Supplier
Step 9. Join "Summarization RSF-8" as the primary file to "Summarization RSF-7" as the secondary file. Use SUPPNO for the Match Key Fields as in Figure 5.29. This will put the second largest amounts and the averages together in one file. Include the SUPPNO and AMOUNT fields from the primary file. Include only the AMOUNT_AVERAGE field from the secondary file. Use the "Matches only" option for the join. Call the new file "Join RSF-9."
FIGURE 5.29 Putting Together the Second Largest Amount with the Average Amount
Step 10. Rename the AMOUNT and AMOUNT_AVERAGE fields to SECOND_LARGEST_AMT and AVERAGE_X_LARGEST respectively, as shown in Figure 5.30.
FIGURE 5.30 Renaming Fields to Identify the Second Largest Amount and the Average Excluding the Largest Amount
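The result of steps 6 through 10 is one row per supplier holding the second largest amount and the average excluding the largest amount. A minimal plain-Python sketch of that outcome follows. The input rows are hypothetical, and the code uses a single grouping pass with max() in place of IDEA's sort, two summarizations, and join.

```python
from collections import defaultdict

# Hypothetical remaining records after each supplier's largest amount
# has been removed.
join_rsf_5 = [
    {"SUPPNO": "S001", "AMOUNT": 100.00},
    {"SUPPNO": "S002", "AMOUNT": 150.00},
    {"SUPPNO": "S001", "AMOUNT": 400.00},
    {"SUPPNO": "S001", "AMOUNT": 100.00},
]

# Group the remaining amounts by supplier.
amounts_by_supplier = defaultdict(list)
for r in join_rsf_5:
    amounts_by_supplier[r["SUPPNO"]].append(r["AMOUNT"])

# The largest remaining amount is the second largest overall; the mean
# of the remaining amounts is the average excluding the largest.
join_rsf_9 = {
    suppno: {
        "SECOND_LARGEST_AMT": max(amounts),
        "AVERAGE_X_LARGEST": sum(amounts) / len(amounts),
    }
    for suppno, amounts in amounts_by_supplier.items()
}
```

For the hypothetical supplier S001, the second largest amount is 400.00 while the average excluding the largest is 200.00, which shows how the average conveys the typical amount for the subset.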
Step 11. Join the "Summarization RSF-4" database as the primary file with the "Join RSF-9" database as the secondary file. "Summarization RSF-4" contains the largest or top amounts. Use the SUPPNO fields as Match Key Fields as shown in Figure 5.31. Select the "Matches only" join option. For the primary file include the SUPPNO, NO_OF_RECS, and AMOUNT fields. For the secondary file include the SECOND_LARGEST_AMT and AVERAGE_X_LARGEST fields. Name this file "Join RSF-11."
FIGURE 5.31 Putting the Largest Amount, Second Largest Amount, and Average Amount Excluding the Largest Amount Fields Together
Step 12. Rename the NO_OF_RECS field to COUNT and the AMOUNT field to LARGEST_AMT as in Figure 5.32.
FIGURE 5.32 Rename Fields to Identify the Largest Amount and Count
Step 13. Append a virtual numeric field with two decimal places called RELATIVE_SIZE_FACTOR using the equation of LARGEST_AMT/SECOND_LARGEST_AMT as shown in Figure 5.33.
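Steps 11 through 13 amount to a matches-only join on SUPPNO followed by a division. The sketch below illustrates this in plain Python with hypothetical values; the dictionaries are assumed stand-ins for "Summarization RSF-4" and "Join RSF-9" after their fields have been renamed.

```python
# Hypothetical largest amounts and counts per supplier.
largest = {
    "S001": {"COUNT": 4, "LARGEST_AMT": 1200.00},
    "S002": {"COUNT": 2, "LARGEST_AMT": 310.00},
    "S003": {"COUNT": 1, "LARGEST_AMT": 75.00},
}
# Hypothetical second largest amounts and averages per supplier.
second = {
    "S001": {"SECOND_LARGEST_AMT": 400.00, "AVERAGE_X_LARGEST": 200.00},
    "S002": {"SECOND_LARGEST_AMT": 155.00, "AVERAGE_X_LARGEST": 155.00},
}

join_rsf_11 = []
for suppno, top in largest.items():
    if suppno not in second:  # "Matches only": skip unmatched suppliers
        continue
    row = {"SUPPNO": suppno, **top, **second[suppno]}
    # RSF = largest amount / second largest amount, to two decimals.
    row["RELATIVE_SIZE_FACTOR"] = round(
        row["LARGEST_AMT"] / row["SECOND_LARGEST_AMT"], 2
    )
    join_rsf_11.append(row)

# Order the suppliers from the highest RSF to the lowest.
join_rsf_11.sort(key=lambda r: r["RELATIVE_SIZE_FACTOR"], reverse=True)
```

The hypothetical supplier S003 has only one record, so it has no second largest amount and drops out of the matches-only join. Because step 1 kept only amounts greater than 0, the divisor is never zero.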
The final file, indexed by RELATIVE_SIZE_FACTOR in descending order, is shown in Figure 5.34. The auditor needs to decide above what RSF ratio further investigation is required. Knowing the average amounts without the largest amount included in the average calculation is useful for formulating the auditor's decision. It is suggested that RSF ratios above 2.50 be reviewed. Of particular note are cases where the RSF ratio is equal to 10. This would likely be a data entry error in which the decimal point was shifted one place to the right. An example might be where monthly rental is $3,000 per month and the largest amount is $30,000, resulting in an RSF of 10.
FIGURE 5.33 Calculating the Relative Size Factor
FIGURE 5.34 Resulting File Displaying the Relative Size Factor for Each Supplier
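The review rules described here can be expressed as a small classifier. This is a hedged sketch only: the function name classify_rsf, the labels it returns, and the tolerance around 10 are all assumptions made for illustration, while the 2.50 threshold is the value suggested in the text.

```python
import math

REVIEW_THRESHOLD = 2.50  # suggested cutoff from the text


def classify_rsf(rsf, tolerance=0.01):
    """Label an RSF ratio for follow-up (hypothetical helper)."""
    # An RSF of 10 suggests the largest amount is ten times a typical
    # amount, as when the decimal point is keyed in the wrong place.
    if math.isclose(rsf, 10.0, abs_tol=tolerance):
        return "possible decimal shift"
    if rsf > REVIEW_THRESHOLD:
        return "review"
    return "ok"


# The rental example: $30,000 recorded against a usual $3,000.
rental_label = classify_rsf(30000 / 3000)
```

The threshold remains the auditor's judgment call, so REVIEW_THRESHOLD is kept as a named constant that can be adjusted for a given engagement.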
The RSF test is a test for reasonableness within specific groupings of a data set. It identifies outliers within a group even where the amount is too small to be considered an anomaly when the data set is taken as a whole.