Fraud Invaders - Christopher Hillman

Document created on Apr 29, 2015. Last modified on May 12, 2015.


Insurance Industry



Christopher Hillman




About the Insights

This analysis is a rapid way of detecting probable fraudulent insurance claims. The suspect claims appear as alien-like invaders on a planet, or bacteria-like bugs on an otherwise pure cell.

Fraudsters often leave tiny data traces in the claims detail and call center notes, such as a common address, phone number, email, bank account, registration detail, doctor or lawyer. This data visualization shows the connections between the claims presumed to be good and the claims already identified as fraudulent. Each dot (or node) in the image represents an individual insurance claim, so the whole circle represents every claim. The large nodes are claims that have been investigated and found to be fraudulent. The smaller nodes are claims that are presumed good but have not yet been investigated, so they could be either fraudulent or genuine. The lines (or edges) between the nodes show where there is a connection between two claims, for example re-use of the same phone number, address, bank account details, email address or registration details. The thicker and redder a line is, the more connections the two claims share, i.e. more than one common email, address, phone number, etc.

From the analysis we can easily pick out clusters of potential fraud claims, such as the alien, bug-like invader shape at the 7 o'clock position on the circle, where presumed-good claims share connection points with fraudulent claims. We can quickly isolate all the un-investigated claims that are highly connected to confirmed fraudulent claims. The end output is a list of claims, together with their connection points to prior fraud claims, that is sent to the fraud department for investigation. These referrals result in very high success rates.
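The isolation step described above can be sketched in a few lines of Python. This is only an illustration, not the Aster workflow used for the visualization: the function name `rank_suspect_claims`, the `(claim_a, claim_b, weight)` edge format, the `min_weight` threshold and the sample claim IDs are all assumptions made for the example.

```python
from collections import defaultdict


def rank_suspect_claims(edges, confirmed_fraud, min_weight=2):
    """Rank un-investigated claims by how strongly they link to confirmed fraud.

    edges: iterable of (claim_a, claim_b, weight), where weight is the number
           of shared attributes (phone, email, address, ...) between two claims.
    confirmed_fraud: set of claim IDs already found to be fraudulent.
    min_weight: hypothetical cut-off for the total connection weight.
    """
    scores = defaultdict(int)   # claim -> total weight of links to fraud claims
    links = defaultdict(list)   # claim -> [(fraud_claim, weight), ...]
    for a, b, weight in edges:
        # Edges are undirected, so check both orientations.
        for claim, other in ((a, b), (b, a)):
            if claim not in confirmed_fraud and other in confirmed_fraud:
                scores[claim] += weight
                links[claim].append((other, weight))
    ranked = [
        (claim, score, sorted(links[claim]))
        for claim, score in scores.items()
        if score >= min_weight
    ]
    # Highest total weight first; ties broken by claim ID for stable output.
    ranked.sort(key=lambda row: (-row[1], row[0]))
    return ranked


# Made-up sample data: C1 is a confirmed fraudulent claim.
edges = [("C1", "C2", 2), ("C1", "C3", 1), ("C2", "C3", 1), ("C4", "C5", 1)]
confirmed_fraud = {"C1"}
suspects = rank_suspect_claims(edges, confirmed_fraud)
print(suspects)
```

Each returned row carries the claim, its total connection weight and the fraud claims it touches, which mirrors the list of claims and connection points handed to the fraud department.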


About the Analytics

This visualization was created using Teradata Aster for the analysis and Aster Lens for the visualization. It uses detailed claims data, which typically runs from hundreds of gigabytes up to terabytes, along with text notes from the call center agents dealing with the claims. The data was loaded into a Teradata Aster database for analysis.

Policy numbers enable us to link the call center agent text data to the claim data. Finding the common or repeated connections is harder, because they exist mostly in free text. Most of the detailed connections data was acquired by text mining the claims forms and call center notes using native Aster text mining functions such as the Named Entity Recognition algorithm. The output was used to identify any data repeated between two claims, and those matches were used to create an underlying node-edge table. This table was visualized as a graph using Aster Lens and the ForceAtlas2 layout algorithm.
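To make the node-edge table concrete, here is a minimal Python sketch of how one could be derived once entities have been extracted per claim. It is not the Aster implementation: the function name `build_edge_table`, the `(entity_type, value)` representation and the sample claims are hypothetical, and the entity extraction itself (done with Named Entity Recognition in the original work) is assumed to have already happened.

```python
from collections import defaultdict
from itertools import combinations


def build_edge_table(claim_entities):
    """Build (claim_a, claim_b, weight) rows from per-claim extracted entities.

    claim_entities: dict mapping claim ID -> set of (entity_type, value) pairs,
                    e.g. ("phone", "555-0100"). Weight is the number of
                    entities the two claims have in common.
    """
    # Invert the mapping: entity -> set of claims that mention it.
    claims_by_entity = defaultdict(set)
    for claim_id, entities in claim_entities.items():
        for entity in entities:
            claims_by_entity[entity].add(claim_id)

    # Every pair of claims sharing an entity gains one unit of edge weight.
    weights = defaultdict(int)
    for claims in claims_by_entity.values():
        for a, b in combinations(sorted(claims), 2):
            weights[(a, b)] += 1

    return [(a, b, w) for (a, b), w in sorted(weights.items())]


# Made-up sample data for illustration only.
claim_entities = {
    "C1": {("phone", "555-0100"), ("email", "a@example.com")},
    "C2": {("phone", "555-0100"), ("email", "a@example.com")},
    "C3": {("phone", "555-0100")},
    "C4": {("email", "z@example.com")},
}
edge_table = build_edge_table(claim_entities)
for row in edge_table:
    print(row)
```

Inverting the mapping first avoids comparing every claim with every other claim directly, although a very common entity (a shared insurer phone number, say) would still generate many pairs and would probably need filtering at real data volumes.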
