How Mastercard’s NuData uses the power of the cloud and machine learning to improve fraud detection

Fraud-detection service harnesses AWS cloud services to help businesses defend against account scams fueled by bots and human farms.

Cyberattacks continue to dominate the headlines. Attempts at digital fraud shot up during the first four months of 2021, especially in the financial services industry, where they ballooned 109% in the U.S. and 149% globally compared to 2020’s final four months. But thanks to behavioral analytics, machine learning, and the performance and scale of the cloud, the good guys are fighting back.

Case in point: NuData Security, a Mastercard company, continues to build out powerful fraud-detection capabilities using a variety of AWS cloud services. Its NuDetect service analyzes and correlates petabytes of data each day, helping banks, insurance companies, e-commerce sites, and other businesses thwart unauthorized access and account-takeover attempts.

Built in 2008 in a private data center, NuDetect migrated gradually to the public cloud as AWS grew its infrastructure and rounded out its portfolio of managed services and tools, says Justine Fox, NuData’s director of software engineering.

Today, NuDetect runs exclusively on AWS, using a combination of machine learning and sophisticated rules engines to filter out malicious behaviors with minimal impact to legitimate users. “Data is in an Amazon S3 data lake and everything just flows from one place to another,” says Fox, “freeing us to focus on feature additions, cost optimizations, and other value-added activities.”

The effort appears to be paying off: NuData claims over 99% accuracy while maintaining a sub-0.1% false-positive rate. This amounts to millions of attacks mitigated daily, while protecting over 100 million accounts every month.

As its customer base grows and the volume of behavior data to be analyzed and correlated increases, NuData can quickly scale by spinning up AWS workloads on the fly, rather than having to continually procure, test, and deploy data center equipment. Since deploying Amazon SageMaker, AWS’s comprehensive machine-learning service, in 2016, NuData “has shaved years off some of our projects,” says Fox. The company also reports a “60% to 70% velocity increase” in determining fraudulent activity.

Bot vs. human

NuData focuses on detecting two primary types of malicious activity. One is automated application attacks, which use bots to try to exploit vulnerabilities in web applications. The other is human farms—large pools of outsourced labor hired by criminals to create bogus accounts and fool captcha programs and other bot challenges. Bot challenges attempt to distinguish between human and machine input, often by requiring a user to type in a code shown or identify photos containing a common element.

With each login attempt to a NuData customer’s monitored service, that service calls out to NuDetect “to verify that the one accessing the service is human and that it’s the same human that accessed the account last time,” Fox explains.

Protecting against both types of attacks is important because a successful account hack does more than open the door to unauthorized charges. It also often exposes other pieces of information that bad actors can use to create other accounts in the legitimate account holder’s name.

In the interest of privacy, says Fox, the NuDetect system minimizes the use of data that reveals personally identifiable information (PII) and focuses instead on the “unique indicators of fraud.”

Examples of such indicators might include somebody traversing a website or mobile app faster than is humanly possible, which could signal the use of a bot. In some instances, though, whether bots are “good” or “bad” might depend on the situation. There might be less risk in letting bots through in one instance, such as gathering data for a report, than in others, such as access to a bank account, where there’s a lower tolerance for automated access, Fox notes.
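
As a rough illustration of the "faster than humanly possible" indicator, the Python sketch below flags a session when most gaps between consecutive events are shorter than a minimum interval. It is not NuData's method: the function name, the 150-millisecond threshold, and the 80% ratio are hypothetical values chosen only for the example.

```python
from datetime import datetime, timedelta

# Hypothetical threshold; the article does not give any real numbers.
MIN_HUMAN_INTERVAL = timedelta(milliseconds=150)


def looks_automated(event_times, min_interval=MIN_HUMAN_INTERVAL, max_fast_ratio=0.8):
    """Return True when most gaps between consecutive events are implausibly short."""
    if len(event_times) < 2:
        return False  # not enough data to judge
    ordered = sorted(event_times)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    fast = sum(1 for gap in gaps if gap < min_interval)
    return fast / len(gaps) >= max_fast_ratio


if __name__ == "__main__":
    start = datetime(2021, 1, 1)
    bot_clicks = [start + timedelta(milliseconds=20 * i) for i in range(10)]
    human_clicks = [start + timedelta(seconds=2 * i) for i in range(10)]
    print(looks_automated(bot_clicks), looks_automated(human_clicks))
```

A production system would presumably weigh such a signal against context, since, as Fox notes, some automated access is tolerable.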

Behavioral analytics based on passive biometrics is where NuDetect “really shines,” says Fox. Strength in this area drives the company’s high level of accuracy in validating authorized access or flagging suspicious access attempts.

Passive biometrics compares data that it determines is unique to the account holder, looking for discrepancies. Examples are as intricate as the angle at which the account holder typically grips a cell phone, typing pattern on a keyboard, screen-swipe speed, or how that individual moves a mouse.

The biometric data is gathered whenever the account holder logs in, and if it should change, NuDetect flags that attempt as anomalous.

Under the hood: Trust Consortium

To process and analyze the vast quantities of data required for highly accurate detection and machine learning, NuData designed and built a data lake feedback loop on Amazon S3 called Trust Consortium. S3’s 11-nines durability design makes it a strong foundational choice for NuDetect, which needs to be available 24/7, says Fox.

The company creates an S3 data lake for each microservice it builds, with Amazon EC2 and AWS Lambda delivering compute services. Amazon Kinesis Data Firehose aggregates anonymized and encrypted data events (such as login attempts, new enrollments, multiple simultaneous account sessions, and any other account activity) across customers and applications in real time and feeds a selection of them to the consortium. “All our insights are derived from S3 datasets,” explains Fox.

When a new event such as a login occurs, NuDetect scores it based on the information from the event and the history of the account. In 300 milliseconds or less, data from the event moves to the consortium for more context, such as whether its IP address or device ID is linked to past fraud in another environment.

“As end-user requests come in, users’ log data is admitted and then later analyzed to generate a dataset of reputation-style intelligence,” says Fox. “For example, it might say that this IP address was bad last week, so we probably don't want to trust it today. Then over time, restrictions for that particular IP address or other sanitized data point loosen up.”
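
The "loosening" Fox describes can be pictured as a risk score that decays with time since the last observed fraud. The Python sketch below assumes exponential decay with a seven-day half-life and a fixed challenge threshold. These choices, and both function names, are illustrative assumptions rather than details of the Trust Consortium.

```python
def reputation_risk(days_since_fraud, initial_risk=1.0, half_life_days=7.0):
    """Risk attached to an IP address or device ID, halving every half_life_days.

    Exponential decay is an assumption made for this sketch; the article only
    says that restrictions loosen over time.
    """
    if days_since_fraud < 0:
        raise ValueError("days_since_fraud must be non-negative")
    return initial_risk * 0.5 ** (days_since_fraud / half_life_days)


def should_challenge(days_since_fraud, threshold=0.25):
    """Decide whether the data point is still risky enough to distrust."""
    return reputation_risk(days_since_fraud) >= threshold


if __name__ == "__main__":
    for days in (0, 7, 30):
        print(days, round(reputation_risk(days), 3), should_challenge(days))
```

With these hypothetical parameters, an address seen in fraud a week ago is still challenged, while one last flagged a month ago is not.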

NuData customers often implement NuDetect as part of a broader fraud detection strategy, Fox notes. For example, a preconfigured set of NuDetect APIs and SDKs can be integrated with one or more customer applications or platforms, such as websites and e-commerce systems, for the NuData service to monitor. Every request against the protected application runs through NuDetect’s S3-based rules engine and Trust Consortium feedback loop.

The consortium gets smarter as it identifies and processes more fraud scenarios, helping the system—which uses both Amazon SageMaker and home-grown machine-learning algorithms, in addition to human expertise—to continually identify and thwart new types of unusual behaviors.

The human element remains critical to the success of the service. “Humans are a great part of pattern recognition and figuring out what’s normal for one customer versus another,” says Fox.

The consortium is accessible via managed AWS offerings such as the Amazon Athena serverless interactive query service, Amazon QuickSight business intelligence service, Amazon Redshift data warehouse, and Amazon EMR big data provisioning and management service. Datasets are transformed into a general-purpose format and used to create localized lookup tables in Amazon DynamoDB, allowing all NuDetect services access to the feedback loop insights.

Microservice approach

The NuData platform currently consists of about 26 microservices, each created around customer use cases for detecting a particular type of fraud attempt. For example, web browsing activities all tend to follow a certain format, says Fox. “If that format is broken, it’s indicative of a bot.”

Another microservice identifies session uniqueness. “If one single session is using a cell phone, laptop, and another device, that session is no longer unique and could indicate multiple users sharing an account, a bot, or something else.”
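
A session-uniqueness check of this kind can be sketched in a few lines of Python. The example assumes events arrive as (session ID, device ID) pairs, which is a hypothetical shape chosen for illustration and not NuDetect's actual data model.

```python
from collections import defaultdict


def sessions_with_multiple_devices(events):
    """Map each session ID seen on more than one device to its set of devices.

    `events` is an iterable of (session_id, device_id) pairs, a hypothetical
    input format used only for this sketch.
    """
    devices_by_session = defaultdict(set)
    for session_id, device_id in events:
        devices_by_session[session_id].add(device_id)
    return {
        session_id: devices
        for session_id, devices in devices_by_session.items()
        if len(devices) > 1
    }


if __name__ == "__main__":
    sample = [("s1", "phone"), ("s1", "laptop"), ("s2", "phone"), ("s2", "phone")]
    print(sessions_with_multiple_devices(sample))
```

Sessions returned by the function are no longer unique in Fox's sense and would merit a closer look.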

When customers come to NuData with a particular problem or situation, NuData will often create a customer-specific version of NuDetect that could evolve into a microservice that’s part of the NuDetect full stack if it becomes generally applicable to any customer. One new microservice product being honed for possible general availability this year is called Trusted Device.

This microservice gathers data about the account user’s device based on the behavior patterns of “good” logins and spots any deviations during access or login attempts.

“For example, if you’ve been a Mac user for 10 years, how likely are you to move over to a Windows PC on a random Tuesday? If that occurs, and other behavioral parameters are also suspicious, maybe we should throw up a captcha or send a [multifactor authentication] prompt to increase the security of your account,” says Fox. “Ultimately, the client decides what step-up they want to automate, but NuDetect can automate it for them, so that depending on the type of risk, there will be one kind of step-up or another.”

Similarly, it's not unusual for a given device to have updated its operating system or web browser software by a version or two between logins. “But if that device is suddenly running software versions that are three years older than those running during the last login, that’s concerning.”
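
The version check described here amounts to comparing the software version seen at the last login with the one seen now. The Python sketch below flags any downgrade. It assumes purely numeric dotted version strings such as "14.4", and its function names are hypothetical; how Trusted Device actually weighs version changes is not described in the article.

```python
def parse_version(text):
    """Turn a dotted numeric version such as '14.4.1' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def is_downgrade(previous, current):
    """Return True when the current version is older than the previous one."""
    return parse_version(current) < parse_version(previous)


if __name__ == "__main__":
    print(is_downgrade("14.4", "15.0"))  # a routine upgrade
    print(is_downgrade("14.4", "11.2"))  # a much older version than last login
```

A downgrade alone would likely be only one signal, combined with other behavioral parameters before any step-up is triggered.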

Measuring success

Advances in AWS infrastructure and managed services have been pivotal to honing the NuDetect system’s breadth and accuracy. Fox describes the cloud impact as “immense and empowering.”

Reducing investments in data center equipment and operations has also helped drive innovation at NuData. “Over time, we’ve offloaded any undifferentiated heavy IT lifting to an AWS managed service,” Fox says. That allows the company to “reinvest staffing time and resources and really double down on what makes our business unique.”

Copyright © 2021 IDG Communications, Inc.