Cyber insurance policy issuance and fraud detection

We now present models surrounding cyber insurance fraud. As a first step, we develop a model to help an insurer decide whether or not to issue a cyber insurance policy, given the possibility that a customer might commit fraud. We describe this model in Section 4.5.3. Then, we develop a model to assist the insurer in determining whether or not to classify a cyber insurance claim by a customer as fraudulent. We describe this model in Section 4.5.5.[1]

There is a high level of insurance fraud in other areas, such as health insurance (Ekin, 2020), but the relatively low implementation of cyber insurance policies has moderated this risk in the cybersecurity field so far. However, the growth of cyber insurance and the rise in the number of cyber attacks increase the likelihood of fraud, e.g. companies may be tempted to take advantage of major global cyber attack campaigns to file fraudulent claims. A new paradigm is therefore needed for fraud detection, which must efficiently detect fraudulent actions from multiple sources using large and diverse amounts of information within a relatively rapid time frame. As an example, we consider the use case of an insured company that exploits a widespread ransomware attack campaign to commit insurance fraud.

4.5.1 Use case

In order to insure against the risk of a cyber attack, a professional services company decides to take out a cyber insurance policy. The company provides advice on topics including legal and regulatory issues and business strategy. The potential impact of a cyber attack on the company includes data loss, brand damage, loss of clients, and regulatory fines. The assets that need to be protected include customer data and business intelligence. The company does not have a high level of cybersecurity readiness. Their safeguards consist of a commercial antivirus product, a firewall for the company’s internet gateway, and a data backup solution deployed internally.

Later, a widespread ransomware attack campaign occurs that infects a number of companies around the world. The professional services company is not hit by the attack, but the company CEO decides to take advantage of this in order to commit fraud. He instructs a senior IT employee to secretly make a full backup of the company data, and then to intentionally infect the company’s servers with the ransomware. The company files an insurance claim for the loss of critical business data, although it actually has a copy of this data in a secret location.

4.5.2 Current approaches

There are a number of big data analytics techniques and machine learning algorithms for detecting fraudulent claim filings, and research continues to progress rapidly in these areas. Big data analytics techniques link and process large amounts of data from a variety of sources (e.g. customer activities and behaviour, social networks, public databases, and claims history) to detect unusual patterns associated with fraud (Verma and Marchette, 2019). Machine learning algorithms employ training and validation data sets to build predictive models capable of adapting to emerging fraud behaviour, making them useful for detection as well as prevention. Robust fraud detection mechanisms integrate a combination of both big data analytics techniques and machine learning algorithms to evaluate and flag business operations susceptible to fraud using decision rules (e.g. statistical outliers, dissimilarity/suspicion scores, and risk scores). They are used to assist and complement manual checks in internal and external audits for fraud control.
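As an illustration of the statistical-outlier style of decision rule mentioned above, the following sketch flags claims whose amounts deviate strongly from the portfolio median. The data, field names, and threshold are hypothetical placeholders, not part of the models in this chapter:

```python
import statistics

def robust_z_scores(amounts):
    """Median/MAD z-scores: a simple statistical-outlier decision rule."""
    med = statistics.median(amounts)
    mad = statistics.median([abs(a - med) for a in amounts])
    scale = 1.4826 * mad or 1.0  # fall back to 1.0 if MAD is zero
    return [(a - med) / scale for a in amounts]

def flag_suspicious(claims, threshold=3.0):
    """Return the claims whose amount is an outlier relative to the portfolio."""
    zs = robust_z_scores([c["amount"] for c in claims])
    return [c for c, z in zip(claims, zs) if abs(z) > threshold]

claims = [
    {"id": "c1", "amount": 12_000},
    {"id": "c2", "amount": 9_500},
    {"id": "c3", "amount": 11_200},
    {"id": "c4", "amount": 250_000},  # far outside the usual range
    {"id": "c5", "amount": 10_400},
]
print([c["id"] for c in flag_suspicious(claims)])  # → ['c4']
```

In practice, a flag such as this would only route the claim to a manual audit, complementing rather than replacing human review.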

These tools can be used to detect fraudulent claims involving cyber insurance as well, although the absence of sufficient cyber claims data makes their use more limited at present. In the meantime, some of the strategies that insurers can use to detect cyber claims fraud include investing in alternative quality controls and audits and accounting for fraud-related losses when pricing cyber risks.

4.5.3 Cyber insurance policy issuance: Model formulation

We now present the model that we have developed to assist an insurer in deciding whether to issue a cyber insurance policy to a potential customer, taking into account the possibility that the customer might commit fraud. We describe the insurer’s decision-making process as a Bi-Agent Influence Diagram shown in Figure 4.4.

There are two agents: the Insurer (designated I) and the prospective Customer (designated J). Nodes involving just the Insurer are white and nodes involving only the Customer are gray; striped nodes are relevant to both agents. The Insurer’s decision as to whether or not to grant an insurance policy (designated i) is modelled as a decision node. The Customer has an organisation profile and features (designated f), modelled as a deterministic node. The threats (designated t) that the Customer faces are an aggregate of the different threats described in Section 4.3. These threats (and their impacts) determine the likelihood and the size of a claim, as discussed in earlier sections, and are thus modelled as an uncertainty node. They are therefore a key component of a claim (designated cl), which is also modelled as an uncertainty node.

Whether or not the Customer decides to commit fraud also has a major impact on claims; the Customer’s fraud decision (designated fr) is modelled as a decision node. Should the Customer put in a claim, the Insurer (or their cybersecurity auditor) typically performs a forensic investigation regarding the claim and issues an audit report (designated d), modelled as an uncertainty node. If the investigation does not find any indication of fraud, then the Insurer reimburses the Customer’s claim; reimbursement (designated r) is modelled as an uncertainty node. Both the Insurer and the Customer aim to maximise their preferences, or expected utilities. Insurer utility (designated u_I) and Customer utility (designated u_J) are both modelled as value nodes.

Figure 4.4: BAID for cyber insurance policy issuance

4.5.4 Model solution

The Insurer’s decision is a standard ARA problem, with the twist that she must forecast whether the Customer will commit fraud, which is a strategic uncertainty. To model the Insurer’s decision, we first consider the Customer. As in the Attacker problem in Section 4.3.4, we model the Customer’s decision as an uncertainty and use random utilities U_J and probabilities P_J to build the Customer’s expected utility model and find their (random) optimal fraud decision
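The displayed equation did not survive extraction. A plausible reconstruction, following the standard ARA template with the node notation of Figure 4.4 (threats t, claim cl, audit report d, reimbursement r; the exact conditioning structure in the original may differ), is:

```latex
FR^{*} \;=\; \operatorname*{arg\,max}_{fr}\;
\sum_{t,\,cl,\,d,\,r} U_J(fr, r)\,
P_J(r \mid d)\, P_J(d \mid cl, fr)\, P_J(cl \mid t, fr)\, P_J(t).
```

Since U_J and P_J are random, FR* is itself a random optimal decision.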

We use it to assess the Customer’s probability of committing fraud, from the perspective of the Insurer, as
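The missing expression is plausibly, writing FR* for the Customer’s random optimal fraud decision:

```latex
p_I(fr) \;=\; \Pr\left( FR^{*} = fr \right),
```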

which is estimated using a Monte Carlo simulation.
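The Monte Carlo step can be sketched as follows. All utility ranges, the detection-probability prior, and function names are hypothetical placeholders for the random utilities U_J and probabilities P_J, chosen only to make the sketch runnable:

```python
import random

def sample_fraud_decision(rng):
    """One Monte Carlo draw of the Customer's (random) optimal fraud decision.

    The utilities and detection probability below are hypothetical stand-ins
    for draws from the random utilities U_J and probabilities P_J.
    """
    u_fraud_success = rng.uniform(0.5, 1.0)    # claim reimbursed, fraud undetected
    u_fraud_caught = rng.uniform(-1.0, -0.5)   # fraud detected by the audit
    u_honest = rng.uniform(0.0, 0.3)           # no fraud committed
    p_detect = rng.betavariate(2.0, 2.0)       # belief that the audit detects fraud
    eu_fraud = p_detect * u_fraud_caught + (1.0 - p_detect) * u_fraud_success
    return eu_fraud > u_honest                 # True = commit fraud is optimal

def estimate_fraud_probability(n_draws=10_000, seed=0):
    """Monte Carlo estimate of the Insurer's probability that fraud occurs."""
    rng = random.Random(seed)
    return sum(sample_fraud_decision(rng) for _ in range(n_draws)) / n_draws

print(f"Estimated p_I(fraud) = {estimate_fraud_probability():.3f}")
```

The estimate is simply the fraction of simulated Customers for whom fraud maximises their drawn expected utility.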

This, in turn, feeds into the Insurer’s preference, or expected utility, regarding whether to issue the insurance policy to the Customer
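The missing display plausibly has the following form, writing ψ_I(i) for the Insurer’s expected utility of issuance decision i and reusing the node notation of Figure 4.4 (the exact arguments of u_I in the original may differ):

```latex
\psi_I(i) \;=\; \sum_{fr} p_I(fr)
\sum_{t,\,cl,\,d,\,r} u_I(i, r)\,
p_I(r \mid d)\, p_I(d \mid cl, fr)\, p_I(cl \mid t, fr)\, p_I(t).
```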

Finally, we maximise the Insurer’s expected utility to decide what, if any, insurance product from a catalogue L should be offered to the Customer
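Writing ψ_I(i) for the Insurer’s expected utility of offering product i, the final maximisation can be reconstructed as:

```latex
i^{*} \;=\; \operatorname*{arg\,max}_{i \,\in\, L}\; \psi_I(i),
```

where the catalogue L may also include the option of offering no policy at all.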

As with the other models presented here, this model serves as a template that can be extended or modified. For example, the model could be expanded to incorporate Customers’ organisational behaviours affecting cybersecurity effectiveness (e.g. adherence to security policies or implementation of security controls). This would be useful for the Insurer when deciding whether or not to grant an insurance policy. In addition, more customer nodes could be added and the claims node could be bifurcated according to different types of claims. This would allow the Insurer to take more complexity into account. Furthermore, an adversarial threat node could be used in place of or in addition to the threat node in order to be able to assess the potential impact of a specific cyber attack on claims. This could be particularly relevant during a major cyber attack campaign like WannaCry or NotPetya.

4.5.5 Cyber insurance fraud detection: Model formulation

We now introduce the model that we have developed to assist an insurance company in determining whether it should classify a particular claim by a customer as fraudulent.[5] We model the insurer’s decision-making as a Bi-Agent Influence Diagram in Figure 4.5.

There are two agents: the Insurer (designated I) and the Customer (designated J). Nodes involving just the Insurer are white and nodes involving only the Customer are gray; striped nodes are relevant to both agents. The type of claim (designated y) refers to whether a claim is fraudulent (designated +) or legitimate (designated -) and is modelled as an uncertainty node. In turn, the type of claim y affects the claim features (designated x), which also include other characteristics such as the organisation profile and features; it is modelled as an uncertainty node as well.

Given that if the Customer commits fraud he does so by modifying the claim features x, the Customer’s decision as to whether or not to commit fraud is referred to as the Customer modification (designated m) and is modelled as a decision node. The modified claim features (designated x') are the modified claim information that the Insurer receives, also modelled as an uncertainty node. We consider only integrity violations, i.e. cases in which the Customer’s modifications are for the purposes of committing fraud. That is, we do not take accidental or erroneous modifications by the Customer into account. Based on this information, the Insurer must decide whether or not to classify the claim filed by the Customer as fraudulent; the Insurer decision (designated y_c) is modelled as a decision node. Both the Customer and the Insurer seek to maximise their expected utilities. Customer utility (designated u_J) and Insurer utility (designated u_I) are both modelled as value nodes.

4.5.6 Model solution

This may be seen as an ARA adversarial classification problem (Naveiro et al., 2019).

The Insurer’s problem

The key elements needed to model the Insurer’s decision regarding whether or not to classify a claim as fraudulent are:

  • p_I(y), which describes the Insurer’s beliefs about the distribution of the type of claim y (that is, her beliefs about the relative proportion of fraudulent and legitimate claims);
Figure 4.5: BAID for cyber insurance fraud detection

[5] The company will typically use a classifier algorithm which examines the Customer's claim for particular characteristics and flags it if it deems there may be a chance of fraud.

  • p_I(x|y), modelling the Insurer’s beliefs about the distribution of the claim features x given the type of claim y, when the Customer is not taken into account, thus needing p_I(x|+) and p_I(x|-) (that is, her beliefs about the kind of features that fraudulent and legitimate claims typically display);

  • p_I(x'|m, x), which models the Insurer’s beliefs about the modified claim features x' given the Customer modification m and the claim features x (i.e. her beliefs about whether the claim has been modified and turned into a fraudulent claim). If we consider only deterministic transformations, it will actually be the case that p_I(x'|m, x) = χ(x' = m(x)), where χ is the indicator function;
  • u_I(y_c, y), describing the Insurer’s utility when she classifies a claim as y_c and the type of claim is y (that is, her utility when she classifies a claim correctly or incorrectly, including classifying it as fraudulent when it is legitimate and classifying it as legitimate when it is fraudulent); and
  • p_I(m|x, y), portraying the Insurer’s beliefs about the Customer’s modification m, given the claim features x and the type of claim y (that is, whether the Insurer believes the Customer has filed a fraudulent claim or not).

In addition, we assume that the Insurer can approximate the set A(x) of possible fraud attempts for given claim features x. When the Insurer observes x', she can compute the set X' = {x : m(x) = x' for some m ∈ A(x)} of instances potentially leading to x'. She should then aim to classify the claim as either fraudulent or legitimate (y_c) in such a way that maximises her expected utility u_I, taking into consideration that the possibility of the Customer committing fraud modifies the probabilities p_I(x'|y).

Therefore, she must consider the Customer’s potential modification m of the claim features x according to the probabilities p_I(x', x, m | y). In our context, this means that she must find the classification c(x') such that
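In the ARA adversarial classification formulation (Naveiro et al., 2019), the displayed problem takes the form of maximising the Insurer’s posterior expected utility; a reconstruction in the notation above is:

```latex
c(x') \;=\; \operatorname*{arg\,max}_{y_c}\; \sum_{y \in \{+,-\}} u_I(y_c, y)\, p_I(y \mid x').
```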

Furthermore, expanding the last expression and computing, we have
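A reconstruction of the expanded expression, using Bayes' rule p_I(y | x') ∝ p_I(x' | y) p_I(y) and summing over the instances x ∈ X' that could have produced x' (with m_{x→x'} denoting the modification turning x into x', singled out by the deterministic-transformation assumption p_I(x'|m, x) = χ(x' = m(x))):

```latex
c(x') \;=\; \operatorname*{arg\,max}_{y_c}\;
\sum_{y \in \{+,-\}} u_I(y_c, y)\, p_I(y)
\sum_{x \in X'} p_I\!\left(m_{x \to x'} \mid x, y\right) p_I(x \mid y).
```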

Given that we only consider modifications that are fraudulent, we have p_I(m|x, -) = χ(m = id), where id stands for the identity attack that leaves the claim features x unchanged. Then, simple computations lead to
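The resulting expression can be reconstructed as follows, with m_{x→x'} the modification turning x into x' and X' the set of instances potentially leading to x':

```latex
c(x') \;=\; \operatorname*{arg\,max}_{y_c}\;
\Big[\, u_I(y_c, +)\, p_I(+) \sum_{x \in X'} p_I\!\left(m_{x \to x'} \mid x, +\right) p_I(x \mid +)
\;+\; u_I(y_c, -)\, p_I(-)\, p_I(x' \mid -) \,\Big],
```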

where p_I(m_{x→x'} | x, +) designates the probability that the Customer will try to commit fraud by transforming x into x', when (x, y = +).

The above assessments are standard (Clemen and Reilly, 2013), except for p_I(m_{x→x'} | x, y), which requires the Insurer to take into account elements of the Customer’s strategic thinking.

The Customer’s strategic thinking

We now consider the Customer’s decision-making process. We assume that the Customer aims to modify the claim features x to maximise his expected utility u_J, which is attained by making the Insurer classify fraudulent claims as legitimate.

The Customer needs to consider the Insurer’s decision about whether or not to classify the claim as fraudulent (y_c) as an uncertainty. Suppose, for now, that we have the following information available about the Customer:

  • p_J(x'|m, x), describing the Customer’s beliefs about how skillfully his modification m has turned the claim features x into the modified claim features x' that the Insurer evaluates (that is, his beliefs about whether he has convincingly made the fraudulent claim seem legitimate). As for the Insurer, we take p_J(x'|m, x) = χ(x' = m(x));
  • u_J(y_c, y, m), which describes the utility of the Customer when the Insurer classifies the claim as y_c, the type of claim is y, and the modification is m (that is, when the Insurer classifies a claim correctly or incorrectly, including classifying it as fraudulent when it is legitimate or classifying it as legitimate when it is fraudulent). There are some implementation costs which are reflected here as well; and
  • p_J(c(x') | x'), which models the Customer’s beliefs about how the Insurer will classify the claim when she observes x' (that is, his beliefs about whether the Insurer will classify the claim as legitimate when she examines the claim he has filed).

We designate by p = p_J(c(m(x)) = + | m(x)) the probability, from the Customer’s perspective, that the Insurer classifies the claim as fraudulent when she observes the modified claim x' = m(x). Since the Customer is uncertain about this probability, we denote its density by f_J(p | m(x)), with expectation p^e_{m(x)}.

Among the different possible modifications m, the Customer will choose the one maximising his expected utility
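A sketch of the missing display, writing ψ_J(m | x, y) for the Customer’s expected utility of modification m in situation (x, y):

```latex
m^{*}(x, y) \;=\; \operatorname*{arg\,max}_{m}\; \psi_J(m \mid x, y).
```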

As we assume that the Customer does not modify the claim when it is legitimate, we only consider the case in which y = +. Then, the Customer’s expected utility when he engages in modification m and the situation is (x, y = +) will be
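Reconstructing the missing display, with p^e_{m(x)} the expectation of the density f_J(p | m(x)) describing the Customer’s uncertainty about being classified as fraudulent:

```latex
\psi_J(m \mid x, +) \;=\; p^{e}_{m(x)}\, u_J(+, +, m) \;+\; \left(1 - p^{e}_{m(x)}\right) u_J(-, +, m).
```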

Maximising the Insurer’s expected utility given the Customer’s strategic thinking

However, the Insurer does not know the Customer’s utilities u_J and expectations p^e_{m(x)}. We model her uncertainty through a random utility function U_J and a random expectation P^e_{m(x)}. We can then solve for the random optimal modification, optimising the random expected utility
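The random expected utility being optimised can be reconstructed as, with U_J the random utility function and P^e_{m(x)} the random expectation just introduced:

```latex
M^{*}(x, +) \;=\; \operatorname*{arg\,max}_{m}\;
\Big[\, P^{e}_{m(x)}\, U_J(+, +, m) \;+\; \left(1 - P^{e}_{m(x)}\right) U_J(-, +, m) \,\Big],
```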

and take p_I(m_{x→x'} | x, +) = Pr(M*(x, +) = m_{x→x'}), assuming that the set of modifications is discrete. We use a Monte Carlo simulation to estimate these probabilities. They then feed into problem (4.4), which can now be solved.

An operationalisation of the above framework in the context of spam detection may be seen in Naveiro et al. (2019).

  • [1] Note that none of the information in either of these sections is intended to imply that customer claims are fraudulent by nature.
 