Abstract
Financial institutions invest billions of US dollars to detect money laundering. When customers are found to have laundered money through a financial institution, the institution is subjected to large penalties. Moreover, its reputation suffers greatly through public exposure. In response, financial institutions have been exploring opportunities to use graph machine learning algorithms. This paper describes one of those algorithms, called Anti-TrustRank, and demonstrates how it can be used to identify money launderers. In contrast to many other algorithms, Anti-TrustRank calls for selecting a very small set of customers to be confirmed by human experts (e.g., compliance officers or analysts) as money launderers. Once this set has been identified, Anti-TrustRank seeks out customers linked (either directly or indirectly) to those money launderers.
Keywords
Introduction
Money laundering is a process that transforms the proceeds of crime into clean legitimate assets. It is often linked to terrorism, drug and arms trafficking, and the exploitation of humans. It involves taking criminally obtained proceeds (dirty money) and disguising their origins, so they will appear to be from a legitimate source. Dirty money is typically “cleaned” thanks to transfers that involve banks and other financial institutions.
The estimated amount of money laundering per year is 2–5% of global GDP (gross domestic product), or $800 billion to $2 trillion [1].
Criminal charges. Big fines. Negative publicity about compliance lapses and penalties. They all damage reputation and deflate public perception. These are just some of the reasons why anti-money laundering (AML) matters for financial institutions.
Anti-money laundering
Beyond the moral imperative to fight money laundering, financial institutions also use AML solutions for compliance with regulations that require them to monitor transactions of their customers, and to report suspicious activities.
However, a manual review of all the transaction data is impractical, both because of the large number of transactions and because human judgment is inconsistent – what one reviewer perceives as normal transactional behavior for a customer, another might consider suspicious. Therefore, in a typical scenario, transaction monitoring systems will automatically detect suspicious transactions, raise alerts, and route suspicious transactions to compliance teams who will confirm whether these transactions are money laundering or legitimate.
Transaction monitoring systems are an example of AML solutions. Generally, AML solutions fall into three categories: rules-based systems, machine learning systems, and their combination (hybrid approach).
Rules-based systems
Rules-based systems are not new; they have been around for decades. They work by using rules: sets of mathematical conditions that determine what action will be taken. An example of a rule is “If an account shows more than $10,000 in transactions in the past 14 days, then raise an alert”. Compliance teams will act based on the outputs produced by rules – they will decide which alerts to investigate and which to report to regulators.
A rules-based system relies on humans (programmers) who directly write a set of rules that allow a computer to process input data into useful outputs (see Fig. 12). Such a system suffices only for solving simple AML tasks, such as detecting a transaction whose amount exceeds $10,000.
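The example rule quoted above can be sketched in a few lines. This is a minimal illustration, not a production rule engine: the function name `raises_alert` and the representation of transactions as (date, amount) pairs are assumptions made for the sketch.

```python
from datetime import date, timedelta

THRESHOLD_USD = 10_000  # threshold from the example rule
WINDOW_DAYS = 14        # look-back window from the example rule

def raises_alert(transactions, today):
    """Return True if transactions in the past 14 days total more than $10,000.

    `transactions` is a list of (date, amount) pairs (an assumed layout).
    """
    window_start = today - timedelta(days=WINDOW_DAYS)
    recent_total = sum(amount for day, amount in transactions
                       if window_start < day <= today)
    return recent_total > THRESHOLD_USD

history = [
    (date(2023, 3, 1), 4_000),
    (date(2023, 3, 5), 3_500),
    (date(2023, 3, 9), 3_000),
]
print(raises_alert(history, date(2023, 3, 10)))  # True: 10,500 within the window
print(raises_alert(history, date(2023, 3, 20)))  # False: only 3,000 within the window
```

The rule is a single yes/no condition on one aggregate, which is what makes it easy to write and equally easy to circumvent.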
Machine learning systems
Artificial intelligence is a technology that enables a computer (machine) to simulate human behavior. A subset of artificial intelligence, machine learning, refers to a computer’s ability to use input data to automatically learn how to perform a task. This ability is especially important when there is a need to solve complex AML tasks where rules typically fail.
For example, with rules, things like basic aggregated information on the number of transactions for an account are commonplace. By contrast, with machine learning, it becomes easier to derive information like whether an account transacted with other suspicious accounts (see Fig. 13).
A machine learning system is not written by programmers directly. Instead, a machine learning algorithm uses training data to create a machine learning model. This data can be, e.g., a set of transactions, some labeled as money laundering whereas others as legitimate transactions. Attributes in the data are known as features. They can include attributes that are found in the data itself (e.g., transaction time, location, amount, etc.) as well as computed attributes (e.g., the average transaction amount for an account, the total number of transactions in the past 14 days, etc.).
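To make the notion of computed features concrete, the sketch below derives the two examples mentioned above (average transaction amount and number of transactions in the past 14 days) for one account. The function name `account_features`, the dictionary keys, and the (date, amount) layout are hypothetical choices for illustration.

```python
from datetime import date, timedelta
from statistics import mean

def account_features(transactions, today, window_days=14):
    """Compute two illustrative features from a non-empty list of (date, amount) pairs."""
    window_start = today - timedelta(days=window_days)
    amounts = [amount for _, amount in transactions]
    return {
        # Average over the account's whole history
        "avg_amount": mean(amounts),
        # Number of transactions inside the look-back window
        "count_in_window": sum(1 for day, _ in transactions
                               if window_start < day <= today),
    }

transactions = [
    (date(2023, 1, 1), 200),
    (date(2023, 3, 1), 100),
    (date(2023, 3, 9), 300),
]
features = account_features(transactions, date(2023, 3, 10))
print(features)
```

In practice a training dataset would hold one such feature row per account or per transaction, alongside its label.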
After its creation, the model can take in new input data and process it into useful output. In that way, we can understand the algorithm as a program that writes another program (i.e., the model).
The machine learning process consists of two main phases: training (learning) and evaluation [2]. These phases can be repeated as often as needed to incrementally improve the model accuracy.
In the training phase, a computer is fed with past input data and the expected output. By leveraging a machine learning algorithm, the computer uses this historical information to create a machine learning model, which represents patterns that it has detected in the training data (see Fig. 14).
In the evaluation phase, the computer uses the model to predict output for new input data (see Fig. 14). In that way, the system can produce a probability that the new input data matches one of the labels from the training data (e.g., whether a transaction is money laundering or legitimate).
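The two phases can be illustrated with a deliberately tiny model. The sketch below assumes logistic regression on a single feature (the transaction amount) fitted by gradient descent; the text does not prescribe any particular algorithm, and the data, the function names `train` and `predict`, and the hyperparameters are all invented for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(amounts, labels, learning_rate=0.5, epochs=2000):
    """Training phase: fit a weight and a bias by gradient descent on log-loss."""
    weight, bias = 0.0, 0.0
    n = len(amounts)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(amounts, labels):
            error = sigmoid(weight * x + bias) - y
            grad_w += error * x
            grad_b += error
        weight -= learning_rate * grad_w / n
        bias -= learning_rate * grad_b / n
    return weight, bias  # the "model" is just these two numbers

def predict(model, amount):
    """Evaluation phase: probability that a new transaction is money laundering."""
    weight, bias = model
    return sigmoid(weight * amount + bias)

# Toy training data: amounts in units of $10,000; 1 = money laundering, 0 = legitimate
amounts = [0.1, 0.2, 0.3, 0.4, 1.5, 1.8, 2.0, 2.5]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

model = train(amounts, labels)
print(predict(model, 2.2))   # above 0.5: resembles the laundering examples
print(predict(model, 0.15))  # below 0.5: resembles the legitimate examples
```

Note that `train` is the program that writes another program: its output, the pair of numbers in `model`, is what processes new input.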
Hybrid approach
The emergence of machine learning does not mean that financial institutions will get rid of rules. In fact, none of them is ready to abandon rules-based systems and fully replace them with machine learning systems. Some rules are so valuable (e.g., those with more than 90% accuracy) that it would be foolish not to use them for flagging suspicious activities. Therefore, financial institutions are adopting a hybrid approach: use rules where they do the job, and use machine learning models where rules would fall short. For example, machine learning models can uncover hidden links and complex relationships, unearthing new patterns of suspicious activities and behaviors that were previously undetectable by rules.
To help understand why using rules and machine learning models together can be effective, let’s discuss the challenges of the two systems. A weakness of one system is usually a strength of the other, and vice versa.
Challenges of rules-based systems
A significant drawback of rules-based systems is the rules themselves. This drawback worsens when there are thousands of rules, and hundreds of money laundering patterns and schemes that have evolved over the years. Not only does such a system become slow and expensive, but money launderers have also learned ways to circumvent rules to avoid discovery. So as not to miss any suspicious activity, financial institutions prefer to tolerate false positives. But this comes at the cost of too many unnecessary alerts.
False positives
A false positive is a case where a system generates an alert pointing out that a suspicious activity has happened which, after a manual review, results in a non-suspicious case.
The price that financial institutions pay for false positives is, first of all, wasted resources such as the time of compliance officers and analysts. Financial institutions estimate that manual reviews account for almost 25% of the total cost of money laundering [2]. This is because money laundering cases are relatively rare. Typically, they are just 0.1–3% of the total number of suspicious transactions [3]. On the other hand, false positives account for up to 99% of the alerts produced by rules-based systems. That is, the system can be wrong 99% of the time – for every 1,000 alerts, 990 are false positives. This is a huge number. It translates into huge costs in terms of investigation time. False positives also affect customer experience: no customers want to be treated as if they were money launderers when they are not.
There are three main reasons for false positives. One is outdated rules. Rules need to be updated promptly to capture ever-changing customer behavior. For example, soon after the COVID-19 pandemic outbreak, it became evident that many customers who had not previously made any purchases online started to do so simply because shops were closed. In this case, a rules-based system will continue to generate unnecessary alerts until its rules are updated. By contrast, a machine learning system will not increase the false positive rate, as its model will re-learn from new data.
Another reason for false positives is that rules are narrow-viewed. Rules are usually based on data such as an account type, customer demographic and geographic information. To make a decision, rules will take into consideration a maximum of four to five features [2]. This narrow-viewed approach leads to more false positives, more workload for compliance teams and higher cost to financial institutions. Even if we can add more features to rules, it will be done at the expense of their performance – such rules will become slower. By contrast, machine learning models will leverage tens to hundreds of features [2], thereby significantly reducing the number of false positives (see Fig. 1).
Fig. 1. Machine learning models reduce the number of both false positives and false negatives [9].
Yet another reason for false positives is that rules are too strict. Each rule has a threshold. For example, “If a deposit exceeds $10,000, raise a flag”. The problem is that the ideal value for this threshold can change over time, e.g., due to inflation. Machine learning models can re-learn a threshold value from new data and adjust it to those changes. Rules cannot. In 2020, the $10,000 threshold celebrated its 50th anniversary. Since the purchasing power of the US dollar has steadily fallen over the last 50 years, far more transactions now hit the $10,000 threshold than was initially intended, resulting in the rapid growth of false positives [4].
As can be seen, not every alert generated by rules indicates money laundering. In many cases, the difference between a transaction related to money laundering and a legitimate one is subtle. A threshold is usually unable to detect subtleties in the transaction data, leading to missed suspicious cases and unnecessary alerts for non-suspicious ones. Just changing the threshold will relieve one problem at the expense of the other. By contrast, a machine learning system will analyze the transaction data against its model. Since lots of historical information, including customer behavior, has been used to build the model, the system will be able to detect subtleties in the transaction data much better than rules ever could.
Suppose that a customer travels abroad for a summer vacation. Despite the fact that she is in a foreign country, she is staying at a hotel from a hotel chain that she favors; she continues to have lunch around noon and her average daily spending is only 5% above her typical daily average. A machine learning system will pick up all this information when deciding that the transaction in question fits the customer’s “normal” behavior. By contrast, a rules-based system can flag the same transaction as suspicious, taking into consideration the transaction data and threshold only [2].
The biggest advantage of a machine learning system is a significant reduction in false positives while increasing the detection of suspicious activity. For example, one bank had a rules-based system that alerted about 1,000 transactions per day – false positives that the system flagged as suspicious but were not actually illegal or out-of-limits. Just months after turning to a machine learning system, that number was down to about 100 transactions per day – a 90% reduction in false positives [5].
Financial institutions estimate that only 0.1–3% of alerts produced by rules-based systems result in a SAR (Suspicious Activity Report) filing [16]. However, it is increasingly common to see compliance teams burdened with reviewing alert queues in which 97–99% of alerts are false positives [6]. Depending on the size of a financial institution, compliance teams must investigate around 20–30 false positives per day [7]. Unless financial institutions have sufficient resources, such as compliance officers and analysts, to review every alert produced by rules-based systems, they will fail to keep up with this workload. For example, some banks have backlogs of alerts they simply cannot cope with [6]. Substantial fines – not to mention the social media attention that goes with them – have been levied on those banks for failing to devote sufficient resources to review true positives.
Not being able to scale would also lead to customer frustration and disrupt the customer experience. Holding up or delaying legitimate transactions unnecessarily due to insufficient resources is a negative customer experience, which can cause financial institutions to lose customers. With a limited number of compliance officers and analysts, lowering the number of false positives is a goal. The fewer false positives compliance teams get, the more time they can spend on true positives.
Machine learning systems help to identify and deactivate the 98–99% of cases that are false positives [10]. This will allow compliance teams to allocate more resources for the remaining 1–2% of cases that are more likely to be suspected of money laundering. A 98–99% reduction in the false positive cases also translates into shorter investigation time. For example, one bank achieved a 30% reduction in the investigation time [24]. In another bank, the investigation time was reduced from several weeks to a few seconds for unnecessary alerts [8].
Alerts prioritization
The problem with false positives (i.e., over-flagging) gets even worse if we consider that rules-based systems typically do not have a fine-grained scale that can be used to evaluate which transactions or customers are potentially more suspicious than others – they are either suspicious or not.
Each rule is either a “yes” or “no” decision. As a result, rules-based systems overload compliance teams with lots of alerts that have not been prioritized. This causes compliance teams to spend as much time investigating “weak” alerts as they do “strong” alerts.
By contrast, machine learning systems produce a wide range of ranking levels that enables alerts prioritization. This helps compliance teams focus on the “strongest” alerts first, which in turn translates into a 50% reduction in the investigation time [10].
One approach to alerts prioritization is to build a machine learning model on top of an existing transaction monitoring system to score alerts by their level of suspiciousness. Historical alerts associated with SAR filings – known as productive alerts – can be labeled as money laundering whereas unnecessary alerts can be labeled as legitimate transactions, and a model can be trained on this data to become more intelligent in scoring new alerts. For example, due to this approach, one bank reduced the number of false positives by 33% [9]. Another bank achieved an 80% reduction in false positives with over a 90% detection of true positives [25].
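The labeling and ranking steps described above can be sketched as follows. The field names `sar_filed` and `score`, the helper names, and the score values are hypothetical; the scores stand in for the output of whatever model has been trained on the labeled historical alerts.

```python
def label_historical_alert(alert):
    """Productive alerts (those that led to a SAR filing) get label 1, the rest 0."""
    return 1 if alert["sar_filed"] else 0

def prioritize(alerts):
    """Order new alerts from most to least suspicious by model score."""
    return sorted(alerts, key=lambda alert: alert["score"], reverse=True)

# Historical alerts become labeled training examples
historical = [{"id": "h1", "sar_filed": True}, {"id": "h2", "sar_filed": False}]
training_labels = [label_historical_alert(a) for a in historical]

# New alerts arrive with model scores (made-up values) and are queued by score
new_alerts = [
    {"id": "a1", "score": 0.12},
    {"id": "a2", "score": 0.91},
    {"id": "a3", "score": 0.47},
]
queue = [alert["id"] for alert in prioritize(new_alerts)]
print(training_labels)  # [1, 0]
print(queue)            # ['a2', 'a3', 'a1']
```

A compliance team working through `queue` from the front reaches the “strongest” alerts first, instead of treating every alert as equally urgent.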
False negatives
A false negative refers to an actual suspicious case that a system was not able to detect.
Although rules-based systems are quite common nowadays, they have proven flawed because money launderers do not play by the rules. This becomes evident when looking at the huge fines imposed on financial institutions, e.g., $10.6 billion in 2020 alone [10].
There are three main reasons for false negatives. One is, again, outdated rules. Rules help financial institutions combat only known money laundering patterns and schemes, leaving them open to new (unknown) ones. But money laundering is a dynamic crime that continuously adapts to escape detection. New patterns and schemes emerge all the time. For example, cash was a primary means of money laundering for a long time. But this is no longer the case. Cryptocurrencies such as Bitcoin, Ethereum and Dogecoin have recently become more prevalent – in 2021 alone, $8.6 billion of cryptocurrency was used for illegal activities, up by 30% from the previous year [12]. The reason is that cryptocurrency offers a combination of anonymity or pseudonymity, ease of use, and the ability to circumvent regulations.
Financial institutions are trying to keep up with new patterns and schemes. But it is an unfair playing field where money launderers are usually one step ahead. Money launderers can change their behaviors every day. But it takes a minimum of three to six months for financial institutions to update their rules [2]. This is because new patterns and schemes can be rather difficult to capture through human inspections. Even when they are captured, it can be rather difficult to confirm them due to long manual reviews, potentially involving many compliance officers and analysts. The reason is that manual reviews are likely to result in different outcomes among different compliance officers and analysts, e.g., some would conclude that a transaction is money laundering, while others would not identify any suspicious activity. This inconsistency often requires a whole compliance team to explore a plethora of subjective judgments, which is time-consuming. As a result, rules start to detect new patterns and schemes very late, after they have been in use for months or even years.
By contrast, machine learning models can continuously re-learn new patterns and schemes from new data. This helps financial institutions keep pace with the ever-changing behaviors of money launderers and thus, decrease the chances of getting fined.
Another reason for false negatives is that rules are too general. But money laundering has many faces. Not only are there many money laundering schemes, but there are also many variants of them. For example, in the round tripping scheme, money is commonly deposited in a controlled foreign corporation offshore (preferably in a “tax haven” where minimal records are kept) and then shipped back as a foreign direct investment, exempt from taxation. A variant of this is to transfer money to a law firm as funds on account of fees, then to cancel the retainer and, when the money is remitted, represent the sums received from the lawyers as a legacy under a will or proceeds of litigation [13].
Since there are many unique money laundering cases and their number continues to grow, it is not practical to write rules for all those cases, even when they are known. The result is false negatives. Even if rules can cover unique cases, it will be done at the expense of their accuracy – such rules will produce many unnecessary alerts. By contrast, machine learning models can cover unique cases by learning from data.
Yet another reason for false negatives is that rules are so simple that they can be easily cheated. For example, money launderers can guess the $10,000 threshold and make their transactions run “just under the line”, by splitting up a large amount of money into smaller chunks to avoid their transactions appearing suspicious (the so-called smurfing scheme).
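The smurfing pattern described above is easy to state in code, which also shows why a single-transaction threshold misses it. The sketch below is one simple heuristic, not an established detection method: it looks for deposits that are each below the threshold but together exceed it within a short window. The function name `looks_like_smurfing` and the 3-day window are assumptions.

```python
from datetime import date, timedelta

def looks_like_smurfing(deposits, threshold=10_000, window_days=3):
    """Flag sub-threshold deposits whose total within `window_days` exceeds the threshold.

    `deposits` is a list of (date, amount) pairs (an assumed layout).
    """
    # Deposits at or above the threshold are left to the plain threshold rule
    small = sorted((day, amount) for day, amount in deposits if amount < threshold)
    for i, (start_day, _) in enumerate(small):
        window_end = start_day + timedelta(days=window_days)
        total = sum(amount for day, amount in small[i:] if day < window_end)
        if total > threshold:
            return True
    return False

split_deposits = [
    (date(2023, 3, 1), 4_900),
    (date(2023, 3, 2), 4_800),
    (date(2023, 3, 3), 4_700),
]
spread_deposits = [
    (date(2023, 3, 1), 4_900),
    (date(2023, 3, 10), 4_800),
    (date(2023, 3, 20), 4_700),
]
print(looks_like_smurfing(split_deposits))   # True: 14,400 within three days
print(looks_like_smurfing(spread_deposits))  # False: no window holds more than one deposit
```

None of the deposits in `split_deposits` would trigger the rule “If a deposit exceeds $10,000, raise a flag”, even though together they move $14,400. A launderer who learns the window length can of course stretch the deposits out, which is the same cat-and-mouse problem at one remove.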
The problem with rules is that they can cover only simple money laundering cases, e.g., whether a given transaction is above $10,000. By leveraging algorithms, machine learning models can cover more complex ones, e.g., they can look for transactions for a given customer where the amount is significantly above the average for that customer. This enables financial institutions to identify a new set of transactions or customers that would otherwise go undetected by rules. For example, while testing the validity of applying machine learning to detect false negatives within its database, one bank identified 416 customers suspected of money laundering, 89 of whom were previously missed – a 21.4% reduction in false negatives, with no additional false positives [9]. Some other banks had lifts in the detection of false negatives in the range of 30–40% after implementing machine learning systems [14].
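One simple way to express “significantly above the average for that customer” is a z-score against the customer's own history. This is an illustrative stand-in for what a learned model might capture, not the method used by the banks cited above; the function name `is_unusually_large` and the cutoff of 3 standard deviations are assumptions.

```python
from statistics import mean, stdev

def is_unusually_large(history, amount, z_cutoff=3.0):
    """Compare a new amount with the customer's own past amounts (at least two)."""
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # All past amounts identical: any larger amount stands out
        return amount > mu
    return (amount - mu) / sigma > z_cutoff

past_amounts = [100, 120, 90, 110, 105]
print(is_unusually_large(past_amounts, 5_000))  # True: far above this customer's norm
print(is_unusually_large(past_amounts, 115))    # False: within normal variation
```

The $5,000 transaction is well under a fixed $10,000 threshold, so a plain rule would not flag it, yet it is far outside this customer's usual behavior.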
Challenges of machine learning systems
When implementing any machine learning system, one of the most crucial components is data that will be fed into its model. This data is the backbone of the system because it will determine the model’s ability to learn and predict. For that reason, many challenges of a machine learning system are related to data.
Data volume
A machine learning model is reliant on complete training data. However, most financial institutions’ business operations were set up before extensive digitalization occurred, so the information needed to train machine learning models can be recorded only partially or not at all. Therefore, typically only a small amount of data, about 5–10% of the total, constitutes a training dataset [15].
Rules will work from the first transaction or customer, from the first feature. By contrast, machine learning models need lots of data to detect money laundering well. If they have fewer than 500,000 transactions or customers, and fewer than 200 features, they are just not going to work because there is not enough training data [16]. Data volume is important because a machine learning model that is given lots of data is likely to detect more patterns and produce more accurate results.
What training data does the model need? Any data that compliance teams go through to investigate suspicious cases.
Data diversity
Data diversity refers to the concept of having as many diverse features as possible with as many diverse values as possible from as many diverse data sources as possible.
Imagine what compliance teams would do to decide if a customer had a suspicious activity. They would look for clues in the available data. First, they would look for the customer’s demographic data such as her age and nationality. But this data alone is not enough to make a decision. Then they would search for negative news about the customer on the internet. Still not enough clues though. They would keep going – they would look through the customer’s transactions, starting from the latest ones. This task gets time-consuming and labor-intensive. In fact, the task of collecting data takes up over half of a compliance team’s time [10]. Now envision not only having all this data collected and analyzed but also converted into features that will help compliance teams decide if the customer’s activity is suspicious.
Rules will work even if transactions or customers have just a single feature, whose values are taken just from a single data source (usually an internal one). By contrast, machine learning models require that training data should be diverse. Otherwise, the produced results will be as inaccurate as for rules.
It is essential for a machine learning model to have multiple features to learn from, and these feature values must come from multiple data sources, including external ones. Indeed, apart from internal data sources, there are numerous external ones to consider, e.g., news articles, social networks, public archives, social media, etc. By using the model, the aim is to combine all these diverse data sources and uncover any hidden patterns that could otherwise remain undetected by rules.
On the other hand, data protection laws and regulations such as GDPR (General Data Protection Regulation) can limit the use of external data sources in a machine learning model.
Data quality
The data quality standards for machine learning models are very high. By contrast, rules will operate even on low-quality data, albeit at the expense of their accuracy.
As an example of low-quality data, consider missing information. Imagine two customers with the same transaction data, both depositing $10,000. Suppose that one is labeled as a money launderer, whereas another is not. If we feed the model with only this information, the model will not be able to differentiate between the two. That is, apart from the transaction data itself, the model needs other information. What if we feed the model also with the data related to the customer’s employment? Suppose that one of them has income through her job, while the other one does not. Feeding the model with this information would allow the model to differentiate between the two [10].
One of the biggest challenges when implementing a machine learning system is the quality of training data for its model. If humans cannot provide good training data to learn from, machines cannot learn either – “garbage in, garbage out”. This means that a model is only as good as its inputs [3].
Since current machine learning algorithms do not have a strategy to identify and overcome low-quality data, financial institutions must provide good inputs for machine learning models themselves. This is often a very time-consuming and labor-intensive task. For example, it took almost one year for one bank to collect training data from its core banking platform and other internal data sources, and to cleanse that data [16]. Another bank spent 80% of its time on data collection and only 20% on the actual modelling effort [16].
Data balance
Rules are not sensitive to imbalance in data. But machine learning models are.
For example, imbalance is common in money laundering detection, where a vast majority of transactions will be legitimate, and a very small minority will be money laundering. In other words, there will be a lot more non-suspicious cases than suspicious ones. This means that there will be a very big difference between the two in the dataset (see Fig. 15). As a counterexample, consider men and women. We can say that this dataset is in balance because the number of men will be approximately the same as the number of women (see Fig. 15).
Why is imbalanced data a problem for a machine learning model? To create machine learning models, most machine learning algorithms need to have at least 500,000 transactions [3]. About a half of them must be money laundering. However, current algorithms can hardly meet this requirement because money laundering cases are very rare – there will be a very small number of transactions labeled as money laundering. The result will be that the algorithms will inevitably misjudge suspicious cases.
Suppose that there are 5 cases labeled as money laundering and 95 cases labeled as legitimate transactions. A machine learning model that has been trained on such a dataset could now predict “legitimate transactions” for all new cases and still gain a 95% accuracy. That is, the imbalanced data will bias the model’s outputs towards the most common label, which is legitimate transactions. For that reason, it is critical to provide a machine learning model with as much balanced data as possible [23].
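The arithmetic of this example can be checked directly. The sketch below builds the 5/95 dataset from the text and evaluates a "model" that always predicts the majority label; the recall figure is added here to show what the accuracy number hides.

```python
# The dataset from the example: 5 money laundering cases, 95 legitimate ones
labels = ["laundering"] * 5 + ["legitimate"] * 95

# A degenerate model that always predicts the most common label
predictions = ["legitimate"] * len(labels)

correct = sum(p == y for p, y in zip(predictions, labels))
accuracy = correct / len(labels)

# Recall: share of actual laundering cases that the model caught
caught = sum(p == y == "laundering" for p, y in zip(predictions, labels))
recall = caught / labels.count("laundering")

print(accuracy)  # 0.95
print(recall)    # 0.0
```

An accuracy of 95% coexists with a recall of 0%: every money laundering case is a false negative. This is why accuracy alone is a misleading measure on imbalanced data.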
Data bias
Rules are free of data bias. But machine learning models are not. After all, they learn from data, which is collected by humans.
Data bias is a type of error in which certain entities of a dataset are more heavily weighted or represented than others [17]. Suppose that our training data consists of a set of customers in which all men are money launderers, and all women are not. This does not mean that women cannot be money launderers, and men cannot be legitimate customers. However, as far as our machine learning model is concerned, female money launderers and male legitimate customers just do not exist.
Data bias may occur when a system is skewed for or against certain groups or categories of entities in a dataset. Typically, this stems from imbalanced, erroneous, incomplete, or unrepresentative data being fed into a model. This affects not just the model’s accuracy, but can also stretch to issues of ethics, fairness, trust, and reputation. Therefore, financial institutions must be vigilant to ensure that biased data is not used when training a model. Otherwise, the model will produce biased results, influencing what actions compliance teams will take with the model’s outputs.
Model explainability and interpretability
Traditionally, rules are viewed as white boxes since humans (programmers) write the code for them. In addition to being transparent, rules are simple. This makes them easy to understand and explain, although this simplicity has a downside – rules can be easily cheated.
By contrast, understanding and explaining machine learning models, which are usually more complex than rules, can be incredibly difficult. A model is created by the underlying algorithm, without programmers’ involvement. This means that the model will make decisions based on patterns, which were identified by the algorithm from the data, but those patterns may be unknown by humans.
Imagine giving a demo of a machine learning system to a compliance team. This demo can go well until some compliance officer or analyst asks: “What exactly happens inside the system? Why does it flag this particular transaction as suspicious?” All we can say is: “Nobody knows. It is a black box”. Soon after that, other compliance officers and analysts will start to worry: “How can we trust the system, if we do not know what it does, how it makes a decision and why?” It is a valid concern because for a long time machine learning models were viewed as black boxes [10] (see Fig. 2). But now, we have model explainability.
Fig. 2. A machine learning model viewed as a black box.
Model explainability means that we can explain to humans what happens to the data in a machine learning model from its inputs to outputs [18]. It has three important aspects: transparency, ability to question, and ease of understanding. Model explainability solves the black-box problem, e.g., when a compliance team has no idea why produced results are what they are.
Model explainability is also critical for regulators. According to data protection laws and regulations such as GDPR, when a financial institution uses automated decision-making tools, it must provide meaningful information about the logic involved, as well as the significance and the envisaged consequences of such processing for the data subject.
Model explainability is often used interchangeably with model interpretability. Both have the same goal: understand a machine learning model. Model interpretability is defined as the degree to which a human can understand the cause of a decision made by the model or the degree to which a human can consistently predict the model’s outputs [18].
For example, a machine learning model might be interpretable – we can see what the model is doing: it flags the transactions that it considers to be money laundering as suspicious. But the model is not explainable yet. It will be explainable once we explore the patterns in the data. Understanding what features contribute to the model’s decision (or prediction) and what are the relationships between those features is what model explainability is all about.
Machine learning enables a computer program to learn from data rather than through explicit programming. This program works by taking example (training) data, finding in it patterns that might be too complex for a human to intuitively see, then applying the findings to new data. When this learning capability is coupled with graph technologies, financial institutions have a recipe for a system that can uncover more complex relationships.
Graph machine learning is a branch of machine learning that models and views data as a graph. For example, a graph could be used to describe the flow of money between a group of accounts or customers in a given time period. Using nodes to represent the entities involved, two nodes could be connected if any money flowed from one to the other. The net amount of money that changed hands could provide a weight for the edges of such a graph, and the direction of the connection could point towards the node that saw a net gain from the associated transactions.
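The graph construction described above can be sketched with plain dictionaries. The function name `build_net_flow_graph` and the (sender, receiver, amount) layout are assumptions; the sketch also assumes that a pair of accounts whose transfers cancel out exactly gets no edge, since the direction of a zero net flow is undefined.

```python
from collections import defaultdict

def build_net_flow_graph(transfers):
    """Return {(payer, payee): net_amount}, each edge pointing at the net gainer.

    `transfers` is a list of (sender, receiver, amount) triples between
    distinct accounts within the chosen time period.
    """
    net = defaultdict(float)
    for sender, receiver, amount in transfers:
        a, b = sorted((sender, receiver))
        # Positive balance means a net flow from a to b
        net[(a, b)] += amount if sender == a else -amount

    edges = {}
    for (a, b), flow in net.items():
        if flow > 0:
            edges[(a, b)] = flow
        elif flow < 0:
            edges[(b, a)] = -flow
        # flow == 0: transfers cancel out, no edge (assumption of this sketch)
    return edges

transfers = [
    ("A", "B", 500),
    ("B", "A", 200),
    ("B", "C", 300),
    ("C", "B", 300),
]
graph = build_net_flow_graph(transfers)
print(graph)  # {('A', 'B'): 300.0}
```

Account B is the net gainer of 300 from A, so the single edge points from A to B with weight 300.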
Graph machine learning systems are game-changers in the fight against money laundering: they help financial institutions identify patterns based on complex relationships, which would otherwise be invisible to the human eye. For example, graph machine learning systems are used in AML to uncover ultimate beneficial owners, to identify relatives and close associates of politically exposed persons, and to detect synthetic identities. They are also used for entity resolution, that is, working out whether multiple records within the bank’s database refer to the same customer. In addition, graph machine learning systems help to detect money laundering schemes such as smurfing, round tripping, and online gambling [24].
Currently, only 10% of innovative projects use graph machine learning, but this share is forecast to rise to 80% in 2025 [21]. If this forecast holds, most banks and other financial institutions will become users of graph machine learning systems in the coming years. Financial institutions have therefore identified the need to adopt graph machine learning algorithms. One example of these algorithms is Anti-TrustRank.
Anti-TrustRank
Anti-TrustRank is a variant of TrustRank, which in turn is a variant of PageRank.
PageRank [19] is a graph machine learning algorithm used by Google in its search engine to determine the order in which pages will be shown to users in response to their search queries. Pages that appear near the top of search results are considered more worth visiting, whereas pages that are shown at the bottom are likely to be spam pages.
The algorithm views the web as a graph, where pages are nodes and links between pages are edges. Regardless of the number of links from a page
The algorithm computes a score for each page. These scores indicate the “goodness” of pages: the higher a page’s score, the more likely a user is to visit that page. The user starts from a random page on the web and follows a sequence of steps. In each step, the user may take one of two possible actions: (1) select one of the links on the current page and follow that link; or (2) go (“teleport”) to a randomly selected page on the web. Therefore, the score of a page is the probability that the user will end up at that page. Actually, there is a third possibility where the user just leaves the web. However, the algorithm assumes there will always be some number of users on the web.
The basic idea of PageRank is that a page is good if it is linked by many other good pages. Initially, each page is assigned the same non-zero score. After computation, good pages will get high scores, whereas spam pages will get low scores.
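The random-surfer process can be sketched as a simple iteration. The code below is a minimal illustration rather than Google's implementation; the function name `pagerank`, the damping factor `beta = 0.85` and the four-page example are assumptions made for this sketch. A page without outgoing links simply drops its share, which corresponds to the user leaving the web.

```python
def pagerank(links, beta=0.85, iterations=100):
    """Iteratively compute random-surfer scores.

    links: {page: [pages it links to]}; every page must appear as a key.
    beta:  probability of following a link (assumed damping factor); with
           probability 1 - beta the surfer teleports to a random page.
    """
    pages = list(links)
    n = len(pages)
    scores = {p: 1.0 / n for p in pages}  # same non-zero initial score
    for _ in range(iterations):
        # Every page receives an equal teleport share.
        new = {p: (1.0 - beta) / n for p in pages}
        for page, targets in links.items():
            if not targets:
                continue  # dead end: this share leaves the web
            share = beta * scores[page] / len(targets)
            for target in targets:
                new[target] += share
        scores = new
    return scores


# Hypothetical web of four pages: "c" is linked by three pages, "d" by none.
links = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
scores = pagerank(links)
best = max(scores, key=scores.get)    # "c", the most linked-to page
worst = min(scores, key=scores.get)   # "d", which only receives the teleport share
```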
The formula for computation of a PageRank score for a page is:
where:
Suppose that a damping factor
Since PageRank computes page scores based upon the link structure of the web, one popular technique for increasing the PageRank scores of spam pages is through a spam farm, a set of spam pages organized in a special way. Typically, a spam farm consists of a target page and very many supporting pages. The target page links to all the supporting pages, and the supporting pages link only to the target page. In addition, it is essential that some links from outside the spam farm be created. For example, links to the target page can be introduced by writing comments in blogs, discussion groups or newspapers.
TrustRank [19] seeks to combat spam farms. It is a variant of PageRank in which a user can start only from a page in a teleport set and can teleport only to a page in that set.
The basic idea of TrustRank is that good pages seldom link to spam pages. If we trust good pages, we can also trust pages referenced from these good pages. Before computation, a set of pages believed to be trusted is created as a teleport set. Each of the pages in the teleport set is assigned the same non-zero initial score, whereas other pages are initialized with zeroes. After computation, good pages will get high scores, whereas spam pages will get low scores.
The formula for computation of the TrustRank score for a page is:
where:
Anti-TrustRank [20] is the inverse of TrustRank. The formula for computation of the Anti-TrustRank score of a page is the same as that for TrustRank. The only difference is that the teleport set consists of pages that are already known to be spam. After computation, spam pages will get high scores, whereas good pages will get low scores.
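The three algorithms can be summarized by a single update rule. The notation below is an assumption: it follows a common matrix formulation of PageRank with teleport sets, where v is the score vector, M is the column-stochastic transition matrix, β is the damping factor, S is the teleport set, and e_S is the vector that has 1 in the positions of the pages in S and 0 elsewhere.

```latex
v' = \beta M v + (1 - \beta)\,\frac{e_S}{|S|}
```

Under this formulation, taking S to be the set of all pages gives PageRank, taking S to be a set of trusted pages gives TrustRank, and taking S to be a set of known spam pages gives Anti-TrustRank as described above.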
Traditionally, Anti-TrustRank is used to find spam pages. However, applications of the algorithm are not limited to the web model. In this paper, we demonstrate how the algorithm can be used to identify money launderers, based on their relationships with other money launderers (see Table 1).
Mapping
Given the postulate “Customers related to money launderers are more likely to be themselves money launderers”, we look for customers that appear near a set of money launderers (the teleport set). In this context, money laundering detection is supervised, since customers in the teleport set are labeled as money launderers. After running the algorithm, the customers we want to find are those who are not yet in the teleport set but receive non-zero scores. The higher a customer’s score, the more strongly that customer is suspected of money laundering.
Our approach to identifying money launderers goes through the following steps:
1. Create a graph of customers.
2. Create a transition matrix for this graph.
3. Place customers known to be money launderers into a teleport set.
4. Create an eigenvector for the teleport set.
5. Run the algorithm to compute scores for the customers.
6. Find customers with non-zero scores who are not yet in the teleport set.
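The steps above can be sketched end to end in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the four hypothetical customers `c1` to `c4`, the damping factor `beta = 0.8`, the number of iterations, and the update rule v' = βMv + (1 − β)e_S/|S| are all chosen for this sketch.

```python
def transition_matrix(nodes, edges):
    """Step 2: M[i][j] = 1 / out_degree(j) if there is an edge j -> i, else 0."""
    index = {name: k for k, name in enumerate(nodes)}
    out_degree = {name: 0 for name in nodes}
    for src, _ in edges:
        out_degree[src] += 1
    m = [[0.0] * len(nodes) for _ in nodes]
    for src, dst in edges:
        m[index[dst]][index[src]] += 1.0 / out_degree[src]
    return m


def anti_trustrank(nodes, edges, teleport_set, beta=0.8, iterations=100):
    """Steps 2-5: scores biased towards a teleport set of known money launderers."""
    if not teleport_set:
        raise ValueError("the teleport set must contain at least one customer")
    m = transition_matrix(nodes, edges)
    n = len(nodes)
    # Step 4: the vector the text calls the eigenvector for the teleport set.
    e = [1.0 if name in teleport_set else 0.0 for name in nodes]
    size = sum(e)
    v = [x / size for x in e]  # only teleport-set customers start with a score
    for _ in range(iterations):  # step 5
        v = [
            beta * sum(m[i][j] * v[j] for j in range(n)) + (1.0 - beta) * e[i] / size
            for i in range(n)
        ]
    return dict(zip(nodes, v))


# Step 1: hypothetical graph; an edge points from a customer to a related person.
nodes = ["c1", "c2", "c3", "c4"]
edges = [("c1", "c2"), ("c2", "c3"), ("c3", "c1"), ("c4", "c1")]
teleport_set = {"c1"}  # step 3: c1 is a confirmed money launderer

scores = anti_trustrank(nodes, edges, teleport_set)

# Step 6: customers outside the teleport set with non-zero scores, most suspicious first.
suspects = sorted(
    (c for c in nodes if c not in teleport_set and scores[c] > 0),
    key=scores.get,
    reverse=True,
)
# suspects == ["c2", "c3"]; c4 keeps a score of 0 because no edge leads to it.
```

In this sketch, scores flow along outgoing edges only, so `c4`, which points to the money launderer but is not pointed to by anyone, is not flagged.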
Next, we illustrate our approach with examples.
Suppose that there are four customers:
Graph [19].

Figure 3 shows a graph, where these customers are represented by nodes, and there are outgoing edges from the customers to their related persons. Examples of the related persons are relatives and close associates of politically exposed persons, business partners, shareholders, beneficial owners, legal representatives, and authorized signatories.
Figure 4 shows a transition matrix
Suppose that the customers
Figure 5 shows an eigenvector
Eigenvector 
Figure 6 shows estimates of a score vector
We computed the score vector

As can be seen, at the end of the computation, the customers
In our computation, we used a damping factor
Rule-based systems usually have a maximum of four to five alert levels, e.g., “low risk”, “medium risk”, “high risk”, “very-high risk”, and “non-acceptable risk”. Even at the highest level, this can correspond to hundreds if not thousands of alerts per day. An advantage of our approach is that it allows customers to be viewed in order of their probability of being suspected of money laundering. This way, compliance teams can focus on the most suspicious cases, saving investigation time and working more efficiently.
Figure 7 shows a graph, where customers from the teleport set of money launderers are shown in black, whereas customers identified by the algorithm as suspicious are shown in grey and non-suspicious in white.
As can be seen, the probability of being suspected of money laundering decreases as the distance between a customer and the teleport set increases. This means that the algorithm can not only identify customers who are likely to be money launderers, but also prioritize those customers by their suspiciousness. The higher a customer’s score, the more suspicious (“grey”) that customer is.
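A small worked case illustrates this decay. It assumes the update rule v' = βMv + (1 − β)e_S/|S| with a damping factor 0 < β < 1, and a hypothetical chain of customers c_0 → c_1 → … → c_k in which only c_0 is in the teleport set and no edge points into c_0. At the fixed point of the iteration:

```latex
\begin{aligned}
v_{c_0} &= 1 - \beta, \\
v_{c_k} &= \beta\, v_{c_{k-1}} = \beta^{k}\,(1 - \beta), \qquad k \ge 1 .
\end{aligned}
```

With β = 0.8 (an illustrative value), the scores along the chain are 0.2, 0.16, 0.128, and so on: each additional step away from the money launderer lowers the score by a factor of β.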
Graph with Anti-TrustRank scores of customers.
By contrast, rule-based systems will produce the same alert for all the customers who are related to money launderers, regardless of whether those relationships were direct or indirect. This causes compliance teams to spend as much time investigating “light grey” customers as they do “dark grey” ones.
Suppose that there are two customers:
Figure 8 shows a graph, where these customers are represented by nodes, and the only outgoing edge is from
Graph with dangling node 
Graph with chain of dangling nodes.
Also suppose that there is a rule: “If a customer has a related person who is a money launderer, then raise an alert”. This rule will be triggered for the customer A if the customer
By contrast, in the second case, the algorithm identifies the customer
On the other hand, in the first case, the dangling node will cause
In Fig. 8, a distance between the two customers is very short:
Even if there were another rule like “If a customer is a related person of a money launderer, then raise an alert”, this rule would not resolve the second case, because at most four to ten customers in the chain of dangling nodes would be taken into consideration. Otherwise, the rule would become very slow. By contrast, the algorithm’s performance does not degrade as more dangling nodes are added to the chain.
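The contrast between a hop-limited rule and score propagation can be sketched as follows. Everything here is hypothetical and chosen only for illustration: the function names, the chain of 30 customers, the limit of ten hops, and the damping factor `beta = 0.8`. The propagation makes one pass over the edge list per iteration, so its cost per iteration grows with the number of edges rather than with the length of the paths explored.

```python
def rule_alerts(edges, launderers, max_hops):
    """Hypothetical rule: alert on customers within max_hops of a known launderer."""
    reached, frontier = set(), set(launderers)
    for _ in range(max_hops):
        frontier = {dst for src, dst in edges if src in frontier}
        frontier -= reached | set(launderers)
        reached |= frontier
    return reached


def propagate_scores(nodes, edges, launderers, beta=0.8, iterations=200):
    """Score propagation over an edge list, biased towards known launderers."""
    out_degree = {n: 0 for n in nodes}
    for src, _ in edges:
        out_degree[src] += 1
    teleport = {
        n: (1.0 - beta) / len(launderers) if n in launderers else 0.0 for n in nodes
    }
    scores = dict(teleport)
    for _ in range(iterations):
        new = dict(teleport)
        for src, dst in edges:
            new[dst] += beta * scores[src] / out_degree[src]
        scores = new
    return scores


chain = [f"c{k}" for k in range(30)]   # c0 is the known money launderer
edges = list(zip(chain, chain[1:]))    # c0 -> c1 -> ... -> c29

alerts = rule_alerts(edges, {"c0"}, max_hops=10)
scores = propagate_scores(chain, edges, {"c0"})

missed_by_rule = [c for c in chain[1:] if c not in alerts and scores[c] > 0]
# The rule stops at c10; propagation still assigns small, decreasing scores to c11..c29.
```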
Dangling nodes can also be organized in a tree, headed by a money launderer, as shown in Fig. 10.
Graph with tree of dangling nodes.
Graph with loop of nodes 
Suppose that there are four customers:
Figure 11 shows a graph, where these customers are represented by nodes, all having the only outgoing edge. In this graph,
Once we enter the loop of nodes
Example 5: Comparison of rules with Anti-TrustRank
How rule-based system works [2].
Simple vs. complex tasks [21].
How machine learning (ML) system works [2].
Balanced vs. imbalanced datasets [23].
How Anti-TrustRank succeeds where rule fails.
So far, our discussion of the algorithm has concerned only one type of entity: customers. But entities can also be transactions. Next, we demonstrate how the algorithm can be applied to transactions as well, while taking advantage of their “linked” structure.
Financial institutions are obligated to report to regulators all transactions above $10,000. Thus, depositing a large amount of money is not an option for money launderers. However, by enlisting the help of their friends, relatives, neighbors or associates, money launderers can have those individuals deposit smaller amounts of money into their accounts and then have that money wired to the money launderers’ own accounts (a so-called smurfing scheme).
Imagine a rule “If an account shows more than $10,000 in transactions in the past 14 days, then raise an alert”. This rule will treat a transaction under the $10,000 threshold as non-suspicious. By contrast, the algorithm can help financial institutions identify that this same transaction is related (“linked”) to another transaction that has been flagged as suspicious in the past. What if the customer is prudent enough to launder money in a way that stays within the rule, by depositing the money with an intermediary party instead of depositing it directly into her own account? Detecting such a complex pattern within billions of transactions is more challenging than simply checking whether a transaction on the customer’s account reaches the $10,000 threshold. By leveraging the algorithm, a machine learning model can handle this complexity and flag this same transaction as suspicious.
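The difference between the threshold rule and transaction linking can be sketched as follows. This is only an illustration of the linking step, not the scoring itself. How transactions are linked is an assumption of this sketch (two transactions are linked when they share an account), and all names, amounts and dates are made up.

```python
from datetime import date, timedelta

THRESHOLD = 10_000           # threshold used by the rule in the text
WINDOW = timedelta(days=14)  # look-back period used by the rule in the text

# (id, sender, receiver, amount, day): hypothetical smurfing pattern plus one unrelated payment
transactions = [
    ("t1", "smurf_1", "intermediary", 4_000, date(2024, 1, 2)),
    ("t2", "smurf_2", "intermediary", 4_500, date(2024, 1, 20)),
    ("t3", "intermediary", "launderer", 5_000, date(2024, 1, 25)),
    ("t4", "shop", "supplier", 3_000, date(2024, 1, 25)),
]


def rule_alerts(txs, today):
    """Accounts whose total transaction volume in the last 14 days exceeds the threshold."""
    totals = {}
    for _, sender, receiver, amount, day in txs:
        if today - WINDOW <= day <= today:
            for account in (sender, receiver):
                totals[account] = totals.get(account, 0) + amount
    return {account for account, total in totals.items() if total > THRESHOLD}


def linked_to_flagged(txs, flagged):
    """Transactions connected, directly or indirectly, to a previously flagged one."""
    accounts = {tid: {sender, receiver} for tid, sender, receiver, _, _ in txs}
    found, frontier = set(), set(flagged)
    while frontier:
        nxt = {
            tid
            for tid in accounts
            if tid not in found
            and tid not in flagged
            and any(accounts[tid] & accounts[f] for f in frontier)
        }
        found |= nxt
        frontier = nxt
    return found


alerts = rule_alerts(transactions, today=date(2024, 1, 25))  # empty: no account exceeds $10,000
linked = linked_to_flagged(transactions, flagged={"t1"})     # {"t2", "t3"}
```

Once transactions are linked in this way, previously flagged transactions could play the role of the teleport set, and the linked transactions could be scored by the same procedure as customers.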
Figure 16 shows that the previous rule produced many false positives and also many false negatives, that is, money laundering cases that it missed. At the same time, we can see that the algorithm enabled a significant reduction in the number of false positives, while greatly improving the model’s ability to detect the cases that the rule missed.
The work closest to ours uses PageRank for fraud detection [22].
This algorithm views accounts as nodes, and transactions transferring money from one account (source) to another (destination) as edges connecting nodes. By doing so, the algorithm builds a community of trusted relationships among account holders (customers). This community is then used by the algorithm to identify fraudulent transactions. More specifically, customers perform legitimate transactions inside their community. Any transactions performed outside the community are considered to be suspicious. The basic idea behind using the algorithm is that accounts with higher scores are more trustworthy.
Trust is a transitive property, which is distributed through the link structure of accounts. For example, if an account
PageRank helps to reduce the number of false positives by up to 44%, with no additional false negatives.
Conclusion
“Show me your friends and I will tell you who you are”. As this Greek saying states, our friends are responsible for a big portion of who we have become and who we will be.
Similarly, in the context of money laundering detection, we can think of customers as a product of related persons they are surrounded with, where related persons include (but are not limited to) relatives and close associates of politically exposed persons, business partners, shareholders, beneficial owners, legal representatives, and authorized signatories. The “nearer” customers are to money launderers, the more likely these customers are to be money launderers themselves.
Since relationships are key to understanding what a suspicious activity is and who is committing the crime with whom, we decided early on to use a graph machine learning algorithm called Anti-TrustRank for money laundering detection [11]. This algorithm helps to address the main challenge of machine learning systems: the need for massive amounts of diverse, balanced, complete, high-quality input data. By contrast, the algorithm requires only a teleport set, consisting of very few money laundering cases confirmed by compliance teams. If financial institutions already have rule-based systems in place, they can extract money launderers from these systems and thus provide enough input data for the teleport set.
The algorithm comes with one more benefit: maintenance within minutes or even seconds. For example, to cover new money laundering cases, all that financial institutions need to do is add new money launderers to the teleport set.
Future work
In the future, we will assign weights to edges (relationships). In such a weighted graph, the share of a money launderer’s score that is passed on to each related person can be made proportional to the type of their relationship. For example, business partners, shareholders and beneficial owners can be weighted more heavily than legal representatives and authorized signatories. However, this may not be accurate for shareholders and beneficial owners, who hold shares. Therefore, we can also weight edges based on the share amount. Our hypothesis is that a high share amount will indicate a high level of distrust.
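One way such weights could enter the computation is through the transition probabilities. The sketch below is purely illustrative: the numeric weights in `RELATION_WEIGHTS` are hypothetical (the text only suggests their relative order), as are the function name and the example relationships.

```python
# Hypothetical weights per relationship type; only their relative order follows the text.
RELATION_WEIGHTS = {
    "business_partner": 3.0,
    "shareholder": 3.0,
    "beneficial_owner": 3.0,
    "legal_representative": 1.0,
    "authorized_signatory": 1.0,
}


def weighted_transitions(edges):
    """edges: (customer, related_person, relation_type) triples.

    Returns {customer: {related_person: probability}}, where each customer's
    outgoing probabilities are proportional to the relation weights and sum to 1.
    """
    totals = {}
    for src, dst, relation in edges:
        targets = totals.setdefault(src, {})
        targets[dst] = targets.get(dst, 0.0) + RELATION_WEIGHTS[relation]
    return {
        src: {dst: weight / sum(targets.values()) for dst, weight in targets.items()}
        for src, targets in totals.items()
    }


edges = [
    ("launderer", "partner", "business_partner"),
    ("launderer", "signatory", "authorized_signatory"),
]
probabilities = weighted_transitions(edges)
# probabilities["launderer"] == {"partner": 0.75, "signatory": 0.25}
```

These probabilities would replace the uniform 1/out-degree entries of the transition matrix; weighting by share amount could be handled the same way by using the share as the edge weight.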
