Carnegie Mellon University

Fraud Detection in Financial Data

Our goal is to characterize user behavior (and possibly detect fraud) in financial events, specifically credit card usage. After feature extraction, we plan to use our so-called 'KLV' method to spot rare feature combinations. For example, one feature, 'F1', could be the dollar amount spent on groceries; another feature, 'F2', could be the dollar amount spent on jewelry. 'KLV' computes the histogram (PDF) of the full population; in this toy example, the population profile would be 80% for 'F1' and 20% for 'F2' (that is, most of our customers spend 80% on groceries and 20% on jewelry). A customer whose spending mix deviates sharply from that profile would stand out as a rare combination. Another set of features we plan to investigate is the inter-arrival times between events, which we plan to bucketize logarithmically. If we see a customer 'Smith' with many events all within 1-2 hours of each other, while every other customer has one event every several days, then 'Smith' may be a fraudster (or the card may have been stolen).
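The inter-arrival idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the 'KLV' method itself, whose details are not given here. The base-2 bucket boundaries, the customer names and timestamps, and the use of a Kullback-Leibler divergence as the "how unusual is this customer" score are all assumptions made for the example.

```python
import math
from collections import Counter

def log_bucket(gap_hours):
    # Bucket k covers gaps in [2**k, 2**(k+1)) hours (base 2 is an assumption).
    return math.floor(math.log2(gap_hours))

def interarrival_gaps(timestamps_hours):
    # Differences between consecutive event times; zero-length gaps are skipped.
    ts = sorted(timestamps_hours)
    return [b - a for a, b in zip(ts, ts[1:]) if b > a]

def histogram(gaps):
    # Normalized histogram (empirical PDF) over logarithmic buckets.
    counts = Counter(log_bucket(g) for g in gaps)
    total = sum(counts.values())
    return {k: c / total for k, c in counts.items()}

def kl_divergence(p, q, eps=1e-6):
    # Stand-in anomaly score (an assumption): KL(p || q), with q floored at eps.
    return sum(pv * math.log(pv / max(q.get(k, 0.0), eps)) for k, pv in p.items())

# Hypothetical event times, in hours since the start of observation.
events = {
    "alice": [0, 72, 150, 240],
    "bob": [10, 100, 200, 290],
    "carol": [0, 96, 200],
    "smith": [5, 6, 7.5, 8.5, 10],  # many events within a few hours
}

gaps = {user: interarrival_gaps(ts) for user, ts in events.items()}

# Population histogram: pool the gaps of all customers.
population = histogram([g for user_gaps in gaps.values() for g in user_gaps])

# Score each customer by how far their histogram is from the population's.
scores = {user: kl_divergence(histogram(g), population) for user, g in gaps.items()}

for user, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{user}: {score:.3f}")
```

With this toy data, 'smith' receives the highest score, because all of that customer's gaps fall in the 1-2 hour bucket while everyone else's fall in the multi-day bucket. Logarithmic buckets keep the histogram small even though gaps can range from minutes to months.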

Christos Faloutsos


Project Lead