Credit Card Fraud Detection Using Machine Learning

Dexter Barahona
7 min read · Jun 12, 2020

Imagine you’re working your job, getting paid, and living paycheck to paycheck. Then your credit card is charged $200 at the grocery store nearby, but you haven’t bought anything, and your hard-earned money is just… gone! A charge like this is a telltale sign of credit card fraud, and honestly, it happens to the best of us!

Credit card fraud occurs when a malicious person, such as a fraudster or thief, uses your credit card to make unauthorized transactions.

This type of fraud is not at all rare: in 2018, credit card fraud accounted for $24.26 billion in losses. It can happen to anyone! That shouldn’t scare you, though, because there are many ways to prevent it and protect yourself.

How Does Credit Card Fraud Happen?

  • Someone looking for your information can simply go through your trash. Receipts or credit card statements that show your account number can be enough for an experienced thief to make you a victim of credit card fraud.
  • This one is something you’ve probably never thought about, and it’s a bit hard to avoid: a malicious waiter who grabs your credit card and snaps a picture of it when you’re not looking!
  • Many fraudsters create fake websites to trick you into entering your credit card number. These can include insurance websites, deal websites, and even websites that mimic legitimate ones.
  • You may even be caught in a credit card scam while at a gas pump or ATM. How? A credit card skimmer is a small device that thieves can install anywhere you swipe your card, and it steals your card information when you do. Skimming has proved to be an effective way for thieves to steal credit card information.
  • Credit card information can even be exposed through no fault of your own. Data breaches happen all the time, and one may even hit one of your favorite places to shop.
  • Stolen credit card numbers are even sold on the dark web. This is one of the easiest ways for someone to obtain stolen card numbers, especially because these transactions are hard to track.
  • Even people you allow into your home, such as guests and service workers, have direct access to your credit card information. Stay cautious and wary!

When it comes to legal action, are you legally liable?

The law protects its citizens well when it comes to credit card fraud. If you report your credit card lost or stolen before any unauthorized transactions are made, you are not responsible for those charges. If a fraudster steals only your credit card number, and not the physical card, don’t worry: the law has you covered.

Ways to Protect Yourself

  • Regularly review your transactions. This could be the easiest way to find out whether you have been a victim of credit card fraud. Once you see a transaction that doesn’t make sense, contact your bank; they will help you figure out the next steps for getting your money back.
  • Protect your information. Anyone can get access to your credit card information, especially if you leave it lying around for anyone to see.
  • Do not keep old information. When you’re done with your bank statements, make sure you shred them. I would personally invest in a shredder to make these statements unreadable.
  • Carry only the cards you need. This reduces the risk of becoming a victim of this kind of fraud.
  • Watch out for phishing scams. You might receive emails from what looks like your cable TV provider, internet service provider, or bank asking you to provide your credit card information, often to avoid losing your service. Always contact your service provider to make sure these claims are real.
  • Watch out for phone scams. Thieves don’t rely solely on the internet to steal your credit card information. They may also call you, which makes things seem more legitimate. Phone scams are very dangerous because it is much harder to tell whether the call is real. Always make sure you contact your provider directly, using the number you find online.
  • Online banking. Signing up for online banking and online statements can keep thieves from getting your information, and it’s a quicker and easier way to get your statements in the first place.
  • Report lost cards or suspected fraud quickly. Call your bank and let them know as soon as you’re suspicious.

How Machine Learning Can Help

Fraudulent transactions make up only about 0.17% of all transactions, so you may be wondering why they are such an issue when they happen so rarely. As mentioned earlier, billions of dollars are lost to fraud, so improving detection by even 0.1% can save millions of dollars.
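To see why such a rare class is tricky for a model, here is a small sketch with made-up numbers (the 100,000 transactions are an assumption for illustration, not a figure from the dataset). It suggests that a model which labels everything as valid can look very accurate while catching no fraud at all, which is why metrics such as recall, F1, and MCC matter later on.

```python
# Hypothetical example: 100,000 transactions, of which 0.17% are fraudulent.
total = 100_000
fraud = round(total * 0.0017)   # number of fraudulent transactions
valid = total - fraud           # number of valid transactions

# A naive "model" that labels every transaction as valid:
# every valid transaction is correct, every fraud case is missed.
correct = valid
caught_fraud = 0

accuracy = correct / total
recall = caught_fraud / fraud

print("Fraud cases: {}".format(fraud))
print("Accuracy of the naive model: {:.4f}".format(accuracy))
print("Recall of the naive model: {:.4f}".format(recall))
```

Accuracy alone rewards the naive model, while recall exposes that it never flags a single fraudulent transaction.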

Let’s Code:

Import your dependencies:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from matplotlib import gridspec

Once you have a CSV file of your data, import it:

data = pd.read_csv('/content/creditcard.csv')

Understanding your data:

Here you can take a look at the first few rows of your data to understand what your file looks like:

data.head()
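Before going further, it can also help to check for missing values and to count how many rows belong to each class. The sketch below uses a small synthetic table as a stand-in for creditcard.csv (the values are made up; only the column names Time, V1, Amount, and Class are taken from the dataset), so it runs without the file.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for creditcard.csv (hypothetical values).
rng = np.random.default_rng(0)
sample = pd.DataFrame({
    "Time": np.arange(1000),
    "V1": rng.normal(size=1000),
    "Amount": rng.exponential(scale=80.0, size=1000),
    "Class": np.r_[np.zeros(995, dtype=int), np.ones(5, dtype=int)],
})

# Count missing values per column; a clean file should show all zeros.
missing = sample.isnull().sum()
print(missing)

# Count rows per class to see how imbalanced the labels are.
class_counts = sample["Class"].value_counts()
print(class_counts)
print("Fraud share: {:.2%}".format(class_counts[1] / len(sample)))
```

On the real data, the same two calls can be run on `data` instead of `sample`.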

Taking a look at the description:

We are going to be looking at the Time, V1, Amount, and Class columns. We can do this with the following code snippet:

print(data.shape)
print(data.describe())

You will receive this data:

Next, we want to visualize this data:

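features = data.iloc[:, 0:28].columns
plt.figure(figsize=(12, 28 * 4))
gs = gridspec.GridSpec(28, 1)
# Draw one subplot per feature, overlaying the fraud and valid distributions
for i, c in enumerate(data[features]):
    ax = plt.subplot(gs[i])
    # Note: sns.distplot is deprecated in newer seaborn releases; sns.histplot is the suggested replacement
    sns.distplot(data[c][data.Class == 1], bins=50)
    sns.distplot(data[c][data.Class == 0], bins=50)
    ax.set_xlabel("")
    ax.set_title("histogram of feature: " + str(c))
plt.show()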

Google Colab will output:

After we’ve visualized, we can separate fraudulent cases from the real ones.

Fraud = data[data["Class"] == 1]
Valid = data[data["Class"] == 0]
outlier_fraction = len(Fraud)/float(len(Valid))
print(outlier_fraction)
print("Fraud Cases: {}".format(len(data[data["Class"] == 1])))
print("Valid Transactions: {}".format(len(data[data["Class"] == 0])))

You will then receive this output:

Now, we can print details about fraudulent transactions

print("Fraudulent Transaction")
Fraud.Amount.describe()

and once again about valid transactions

print("Valid transaction")
Valid.Amount.describe()

Our output will look like:

From this, we can tell that the average amount of a fraudulent transaction is higher than that of a valid one.
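Since a few very large transactions can pull an average up, it may be worth comparing the median per class as well as the mean. Here is a minimal sketch on a tiny made-up table (the amounts are hypothetical; only the Amount and Class column names come from the dataset).

```python
import pandas as pd

# Tiny made-up sample: five valid transactions and two fraudulent ones.
sample = pd.DataFrame({
    "Amount": [10.0, 25.0, 40.0, 5.0, 300.0, 1.0, 500.0],
    "Class": [0, 0, 0, 0, 0, 1, 1],
})

# Mean, median, and row count of Amount per class (0 = valid, 1 = fraud).
summary = sample.groupby("Class")["Amount"].agg(["mean", "median", "count"])
print(summary)
```

In this toy table the single 300.0 transaction lifts the valid mean well above the valid median, which is the kind of skew to look for on the real data with `data.groupby("Class")["Amount"]`.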

Next, we will look into the correlation matrix with this piece of code:

corrmat = data.corr()
fig = plt.figure(figsize = (12, 9))
sns.heatmap(corrmat, vmax = .8, square = True)
plt.show()

This output shows us the heat map of the correlations between the columns in our data. We can tell that there are noticeable differences between features such as V1 and V5.
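A heat map can be hard to read for around thirty columns, so one option is to pull out only the correlations with Class and rank them. The sketch below does this on synthetic data (hypothetical values in which V1 is deliberately shifted for fraud rows and V2 is unrelated noise), so it is an illustration of the idea and not a result from the real dataset.

```python
import numpy as np
import pandas as pd

# Synthetic data: roughly 5% of rows are labelled as fraud (hypothetical values).
rng = np.random.default_rng(42)
n = 2000
labels = (rng.random(n) < 0.05).astype(int)
sample = pd.DataFrame({
    "V1": rng.normal(size=n) - 3.0 * labels,        # shifted down for fraud rows
    "V2": rng.normal(size=n),                       # unrelated to the label
    "Amount": rng.exponential(scale=80.0, size=n),  # unrelated to the label
    "Class": labels,
})

# Correlation of every feature with Class, ranked by absolute strength.
corr_with_class = sample.corr()["Class"].drop("Class")
order = corr_with_class.abs().sort_values(ascending=False).index
ranked = corr_with_class.reindex(order)
print(ranked)
```

On the real data, replacing `sample` with `data` gives a ranked list that is easier to scan than the full matrix.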

Using scikit-learn to split the data into training and testing sets:

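from sklearn.model_selection import train_test_split
from sklearn.ensemble import IsolationForest

# Assumption: the original snippet never defines X_data and Y_data;
# here they are taken to be all feature columns and the Class label.
X_data = data.drop(["Class"], axis=1).values
Y_data = data["Class"].values

# Hold out 20% of the rows for testing
X_train, X_test, Y_train, Y_test = train_test_split(X_data, Y_data, test_size = 0.2, random_state = 42)

# Fit the Isolation Forest on the training features only (the labels are not used)
ifc = IsolationForest(max_samples=len(X_train), contamination=outlier_fraction, random_state=1)
ifc.fit(X_train)
scores_pred = ifc.decision_function(X_train)

# predict() returns 1 for inliers and -1 for outliers; remap to 0 = valid, 1 = fraud
y_pred = ifc.predict(X_test)
y_pred[y_pred == 1] = 0
y_pred[y_pred == -1] = 1

n_errors = (y_pred != Y_test).sum()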
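To make the Isolation Forest step easier to follow, here is a minimal self-contained sketch on synthetic two-dimensional data (the points and the 2% outlier share are made up for illustration). It also computes the contamination value as the share of outliers in the whole dataset, which is how scikit-learn describes that parameter; this is a slightly different ratio from fraud divided by valid, although the two are almost equal when fraud is very rare.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Synthetic data: 980 "normal" points near the origin and 20 scattered far away.
rng = np.random.default_rng(0)
normal = rng.normal(loc=0.0, scale=1.0, size=(980, 2))
outliers = rng.uniform(low=6.0, high=10.0, size=(20, 2))
X = np.vstack([normal, outliers])
y = np.r_[np.zeros(980, dtype=int), np.ones(20, dtype=int)]

# Share of outliers in the whole dataset: 20 / 1000 = 0.02
contamination = float(y.sum()) / len(y)

model = IsolationForest(contamination=contamination, random_state=1)
model.fit(X)  # unsupervised: the labels in y are not used for fitting

# predict() returns +1 for inliers and -1 for outliers; remap to 0 / 1
raw = model.predict(X)
y_pred = np.where(raw == -1, 1, 0)

print("Flagged as outliers: {}".format(int(y_pred.sum())))
print("True outliers flagged: {} of {}".format(int(y_pred[y == 1].sum()), int(y.sum())))
```

Using `np.where` builds a new label array in one step, which avoids any confusion about the order of the two in-place assignments.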

Next, we want to print the evaluation metrics for the Isolation Forest model:

from sklearn.metrics import classification_report, accuracy_score, precision_score, recall_score, f1_score, matthews_corrcoef
from sklearn.metrics import confusion_matrix

n_outliers = len(Fraud)
print("the Model used is {}".format("Isolation Forest"))

# Share of all test transactions that were labelled correctly
acc = accuracy_score(Y_test, y_pred)
print("The accuracy is  {}".format(acc))

# Share of flagged transactions that really were fraud
prec = precision_score(Y_test, y_pred)
print("The precision is {}".format(prec))

# Share of real fraud cases that were flagged
rec = recall_score(Y_test, y_pred)
print("The recall is {}".format(rec))

f1 = f1_score(Y_test, y_pred)
print("The F1-Score is {}".format(f1))

MCC = matthews_corrcoef(Y_test, y_pred)
print("The Matthews correlation coefficient is {}".format(MCC))

This will be our output:

This is the expected information and outcome that we would receive from this model.
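The snippet above imports confusion_matrix without using it. As a small sketch of what it could add, the example below runs it on ten hand-written labels (hypothetical values, not model output) and derives precision and recall from the four counts.

```python
from sklearn.metrics import confusion_matrix

# Ten hand-written labels for illustration: 0 = valid, 1 = fraud.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 0]
y_hat = [0, 0, 0, 1, 0, 0, 1, 0, 1, 0]

# With labels=[0, 1], ravel() returns the counts in the order tn, fp, fn, tp.
tn, fp, fn, tp = confusion_matrix(y_true, y_hat, labels=[0, 1]).ravel()
print("TN={} FP={} FN={} TP={}".format(tn, fp, fn, tp))

# Precision and recall follow directly from the counts.
precision = tp / (tp + fp)
recall = tp / (tp + fn)
print("precision={:.3f} recall={:.3f}".format(precision, recall))
```

On the real model, calling `confusion_matrix(Y_test, y_pred)` shows how many fraud cases were missed and how many valid transactions were flagged by mistake.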

This is the start of a bigger picture for solving this huge problem. Many people around the world lose money to fraud, and the way to stop it is simple: fight fire with fire. Using technology to solve real-world problems is the future.

It is important to realize that, in most cases, the model exceeds the previously reported results, with an MCC of 0.8629.

Contact Me:

Hey, my name is Dexter Barahona. I’m an 18-year-old Innovator at The Knowledge Society. I’ve been mainly focusing on machine learning and AI. Recently, I have been pursuing my dream of becoming a full-stack developer, and I will be writing articles on things I find interesting! :)

Email: dexteralxbarahona@gmail.com

LinkedIn: https://www.linkedin.com/in/dexter-barahona-723314194

Twitter: https://twitter.com/dexbarahona

Instagram: Dexterbarahona

Check out my personal website!
