Data Science and the Australian Banking Royal Commission

The Royal Commission Into Misconduct in the Banking, Superannuation and Financial Services Industry released its final report earlier this month (Feb 2019).

The implications for lending in Australia are not great. In fact, the lending industry was belted by the report, which argued that there was a culture of greed. I don’t know if this is fair. I think a few things probably contributed to the problem, and I am going to outline them here.

I think it all started with the most hilariously vague document written in the history of the universe: “Regulatory Guide 209: Credit Licensing: Responsible Lending Conduct” (RG209). This document would be funny if it wasn’t the basis that lenders used for lending and that the Australian Securities and Investments Commission (ASIC) used to police lending in Australia.

I sympathise with lenders; I feel they were set up for failure. It’s the equivalent of blindfolding someone, laying down a bunch of bear traps around them and telling them they probably shouldn’t step on the bear traps, but not telling them where the bear traps are. Best of luck matey!

The most vague document in the history of the universe

Here are some of the brilliant quotes in RG209. I really couldn’t make this up if I tried. Lending requires yes or no decisions, and for automated lending systems in particular anything other than a binary decision won’t work at scale. So we have a vague document with vague requirements for lending, but computers can’t handle vague and lending isn’t vague - the applicant either gets the loan or they don’t. So immediately we have problems.

1) Lenders should “Conduct an assessment that the credit contract or consumer lease is not unsuitable for the consumer”.
I mean what the hell is that supposed to mean?

“Ok, Bob, here is your home loan. We think it is ummm… errr… pretty well not unsuitable for you?”

2) A lender should make “Reasonable inquiries about consumer’s requirements and objectives and financial situation”:

“Hi Bob, this is Jim calling you from ABC Lending Company, umm.. How many dependants do you have?”

3) A lender should make “Reasonable steps to verify their (the borrower’s) financial situation”.
I agree with this, but the guidance note doesn’t say what they are, or how to do it or anything at all. For short term loans it does say to get some bank statements and have a look. But I mean, how do you make sense of someone’s bank statements? Do you look at the statements line by line and see if there is something odd about them? It kind of feels like something a computer may be better at doing.

“Hi Bob, this is Jim again. I’ve just been going over the 27 pages of your bank statements and I saw that on the 29th of October, rather than spending $12 on your usual Friday lunch, you actually spent $24. This represents an anomaly.

Now, did you get a pay raise, or did you shout someone lunch? You see I just need to verify your financial situation.”

But never fear because RG209 says we can scale these inquiries:

4) “The obligation to make reasonable inquiries, and to take reasonable steps to verify information, is scalable—that is, what you need to do to meet these obligations will vary depending on the circumstances”

Hahahaha! Oh, wait - hang on, I am writing something and I need to stay focussed... Bahahaha! So, this is a guidance note that doesn’t actually provide guidance!

Jim: “Hi Bob, how many credit cards do you have?”

Bob: “Two.”

Jim: “Really?”

Bob: “Yes”

Jim: “Really, truly?”

Bob: “Yup”

Jim: “Sorry about that, I just had to make further inquiries”

Verification of Expenses and the HEM

So, parsing bank statements and finding out what someone is actually spending has historically been in the “too hard” basket for most lenders. Instead they have relied on the Household Expenditure Measure (HEM) which is essentially a lookup table for expenses.

So you plug in your:

  • Location: where you live (city and state), although it is really the postcode that drives this

  • Household type: single, couple, family

  • The number of dependants

And then out pops your expenses like magic!

Except it is saying that someone in the same postcode as me with a couple of kids has the same expenses as me. That’s totally crazy! Nevertheless, that’s the system used to determine borrowing capacity. The Australian Bureau of Statistics surveys about 10,000 households in Australia, and the HEM is derived from that survey data. Somehow this measure has found itself as the baseline for household expenses for lending purposes.
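To show how little is going on inside a lookup like this, here is a minimal Python sketch. The table keys and every dollar figure are invented for illustration and are not real HEM values; `HEM_TABLE` and `hem_expenses` are hypothetical names of my own.

```python
# Hypothetical HEM-style lookup: every key and dollar figure below is made up
# for illustration and is NOT a real HEM value.
HEM_TABLE = {
    # (postcode, household_type, dependants): monthly expenses in AUD
    ("2000", "single", 0): 1500,
    ("2000", "couple", 0): 2400,
    ("2000", "family", 2): 3600,
    ("4000", "family", 2): 3300,
}


def hem_expenses(postcode: str, household_type: str, dependants: int) -> int:
    """Return the benchmark monthly expenses for a household profile."""
    return HEM_TABLE[(postcode, household_type, dependants)]


# Two very different families in the same postcode get the same answer,
# because nothing about their actual spending goes into the lookup.
frugal_family = hem_expenses("2000", "family", 2)
big_spenders = hem_expenses("2000", "family", 2)
print(frugal_family, big_spenders)
```

Nothing the household actually spends ever enters the function, which is the whole problem.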

Interestingly the Royal Commission didn’t disparage the HEM like I thought they would have. I imagine they couldn’t think of anything to use in its place.

Bank Statement data is a better option.

Some lenders have used tax information or payslips to verify income. Expenses they generally haven’t touched, besides using the HEM.

The problem with using past data is that it isn’t verification. It just shows you what someone’s income was at some time in the past; it doesn’t show what their income is today.

Borrowers have an incentive to request a loan if they know they are going to be short of cash, in other words if they have had an event like a job loss, reduced shifts or a car breakdown. By looking at past income you can’t actually be sure what their income is right now. To do that you need up-to-date bank statements.

With the HEM - well, anything is better than that - but why not use the transactional bank statement data that is already in the building as a basis for lending? More banks are doing so, but in the past they hardly ever did. Instead, what they used to do was rely on the stated expenses in the application form, compare them to the HEM and take whichever was greater. Here’s the key point: nobody knows how much they spend on groceries or anything else, so that information is inaccurate - and the HEM is inaccurate. So if you compare the two and take one of them, you still have an inaccurate figure.
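The “whichever is greater” rule is easy to write down, which is probably part of its appeal. A rough sketch, with invented dollar figures and a hypothetical function name `assessed_expenses`:

```python
def assessed_expenses(stated: float, hem: float) -> float:
    """Old-style rule described above: use whichever figure is greater."""
    return max(stated, hem)


# Hypothetical monthly figures in AUD, invented for illustration.
stated = 2800.0  # what the applicant guessed on the application form
hem = 3600.0     # benchmark from the lookup table
actual = 4700.0  # what the bank statements would have shown

used = assessed_expenses(stated, hem)
shortfall = actual - used  # spending the assessment never saw
print(used, shortfall)
```

Taking the larger of two guesses is still a guess. In this made-up case the assessment misses over a thousand dollars a month that the statements would have revealed.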

More to the point, if you have actual data from the bank statements, these features will dominate any lending model you have over credit bureau or application form information. If you can capture someone’s discretionary spending from their bank statements, that is likely to be the strongest predictor of default.

Short term lenders and 90 days of bank statements.

Short term lenders have a requirement to use 90 days of bank statement data when making a loan assessment. The savvy ones have automated this process and incorporated bank statements into their credit risk models. With wider data and more accurate models they are able to build successful businesses by cherry-picking the customers the banks reject due to deficiencies in their lending models. Specifically, those deficiencies are relying on stated rather than actual data, using the HEM, and eyeballing payslips rather than looking at the bank statement.

The kinds of checks that can be automated include: looking for a regular pulse of income every 2 weeks and raising a flag when this regular income doesn’t appear at the end of the bank statement; detecting various income streams and taking a haircut when they appear to be volatile; looking for breaks in the expected sequence of income; and raising a flag for high discretionary spending as a percentage of income.
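Here is a rough Python sketch of those four checks. All of it is my own assumption rather than any lender’s actual rules: the function names, the 3-day tolerance, the coefficient-of-variation test for volatility, the 20% haircut and the 30% discretionary limit are placeholders that you would tune against your own data.

```python
from datetime import date
from statistics import mean, pstdev


def missing_recent_pay(pay_dates, statement_end, cycle_days=14, tolerance_days=3):
    """Flag when the last pay landed more than one cycle before the statement ends."""
    return (statement_end - max(pay_dates)).days > cycle_days + tolerance_days


def sequence_breaks(pay_dates, cycle_days=14, tolerance_days=3):
    """Return consecutive pay-date pairs separated by more than one cycle."""
    ordered = sorted(pay_dates)
    return [
        (earlier, later)
        for earlier, later in zip(ordered, ordered[1:])
        if (later - earlier).days > cycle_days + tolerance_days
    ]


def usable_income(amounts, cv_limit=0.25, haircut=0.20):
    """Average an income stream, discounting it when it looks volatile."""
    average = mean(amounts)
    volatility = pstdev(amounts) / average  # coefficient of variation
    return average * (1 - haircut) if volatility > cv_limit else average


def high_discretionary(discretionary_spend, income, limit=0.30):
    """Flag discretionary spending above a set share of income."""
    return discretionary_spend / income > limit


# Invented example: fortnightly pay with a skipped cycle, then nothing in December.
pays = [date(2018, 10, 5), date(2018, 10, 19), date(2018, 11, 2), date(2018, 11, 30)]
print(missing_recent_pay(pays, statement_end=date(2018, 12, 31)))
print(sequence_breaks(pays))
print(usable_income([1000, 1000, 1000, 1000]))  # steady stream, no haircut
print(usable_income([400, 1600, 400, 1600]))    # volatile stream, haircut applied
print(high_discretionary(900, 2000))
```

Each check returns a plain flag or figure, so the results can feed straight into policy rules or go into a credit risk model as features.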

Why haven’t the banks been using bank statements?

Some have been using bank statements and some haven’t. Mostly, though, the software they plug in simply rolls up the bank statement info and spits back a figure for categories like “rent” and “groceries”. It doesn’t flag oddities or hygiene-check the bank statement.

Some of these systems may have been trained on overseas data, may not provide mechanisms for customization, or may not be as accurate as they ought to be. So, many of them are far from perfect. I think, though, that the best models start with a company’s own internal data assets.

Building a Natural Language Processing model in house is hard, but you would own the IP and then be able to customize the model to suit your needs. You may find a bunch of new online gambling sites that you want to add to your “gambling expenses”, or you may want to change your policy rules and have that reflected in your bank statement checks. Having the ability to respond to changes quickly is critical.
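To illustrate why owning the rules matters, here is a deliberately simple sketch. It is plain keyword matching, not an NLP model, and the keywords, the transactions and the site name “LUCKYSPINZ” are all invented for the example.

```python
from collections import defaultdict

# Hypothetical keyword rules; a real system would need far richer matching.
CATEGORY_KEYWORDS = {
    "rent": ["real estate", "rental"],
    "groceries": ["supermarket", "grocer"],
    "gambling": ["casino", "betting"],
}


def categorise(description, rules=CATEGORY_KEYWORDS):
    """Return the first category whose keyword appears in the description."""
    text = description.lower()
    for category, keywords in rules.items():
        if any(keyword in text for keyword in keywords):
            return category
    return "other"


def roll_up(transactions, rules=CATEGORY_KEYWORDS):
    """Sum spending per category from (description, amount) pairs."""
    totals = defaultdict(float)
    for description, amount in transactions:
        totals[categorise(description, rules)] += amount
    return dict(totals)


# Invented transactions; "LUCKYSPINZ" is a made-up gambling site name.
statement = [
    ("ACME REAL ESTATE PTY LTD", 1800.0),
    ("CITY SUPERMARKET 0421", 230.5),
    ("LUCKYSPINZ ONLINE", 400.0),
]
print(roll_up(statement))  # the unknown site lands in "other"

# Responding to a newly spotted site is a one-line rule change.
CATEGORY_KEYWORDS["gambling"].append("luckyspinz")
print(roll_up(statement))  # now counted as gambling
```

With a black-box vendor product that one-line change might be a support ticket and a long wait, while in house it is a deployment.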

This exercise requires a different skill set - the hybrid developer/data scientist with some lending domain expertise. It stretches the analytics department beyond that mainstay of banking and finance which is logistic regression. Typically with larger institutions the Model Validations Team sets the policy for how to build models and the analytics department follows these guidelines. Even this setup will serve to stifle innovation. What is needed is a holistic approach where Product, Dev and Data Science teams work together to build awesome stuff.

What Are the Implications of Collecting Better Data?

Well, I would argue that if you are collecting better data from bank statements then you are obliged to use it.

So, for instance we may say “We will give you this Personal Loan as long as you promise to reduce your discretionary spending by 20%”

In a year’s time the applicant may return and apply for a credit card. We may draw more recent bank statements and find that they didn’t reduce their discretionary spending. In fact, we may find their discretionary spending has gone through the roof, which is why they are after a credit card. What do we do with this information? I’d imagine we should incorporate the past bank statement data into some kind of customer score and encourage the applicant to control their discretionary spending before we give them a loan. If we are being truthful, I think we would find this situation occurring a lot. Many people would find it difficult to trim their expenses and adjust their spending.
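Checking whether a promise like that was kept is simple arithmetic once you hold both sets of statements. A small sketch, with invented figures and hypothetical function names:

```python
def met_reduction_promise(previous_spend, current_spend, promised_cut=0.20):
    """True when spending fell by at least the promised fraction."""
    return current_spend <= previous_spend * (1 - promised_cut)


def spending_change(previous_spend, current_spend):
    """Fractional change in spending; positive means it went up."""
    return (current_spend - previous_spend) / previous_spend


# Invented monthly discretionary spend (AUD) at each application.
at_personal_loan = 1500.0
at_credit_card = 2100.0

print(met_reduction_promise(at_personal_loan, at_credit_card))
print(f"{spending_change(at_personal_loan, at_credit_card):+.0%}")
```

A change figure like this is one obvious input into the customer score, alongside whatever the lender’s policy says about broken commitments.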

The Responsible Lending and Borrowing Summit

I may have come across as being a bit harsh on lenders and the regulator. However, it is great to see people coming together to discuss responsible lending.

The regulator as well as the lenders were totally drilled by the Royal Commission, but responsible lending has never had the focus it has now. This would have meant that the regulator did not have the budget or the bodies to really oversee everything. Likewise, the risk departments of the lenders would sometimes have been seen as just another cost centre. So the whole situation is really unfortunate. I think it is too simple, and unfair, for the media to tar the regulator as inept and portray the lenders as greedy and careless.

What’s Next for Lending in Australia

The path of the Royal Commission can be seen when you step back a little.

It started with short term lending, then it went to the credit unions and retail banks offering loans to consumers. Next I’d imagine the regulator will look into lending to small businesses. Typically these businesses haven’t been given the same protection as consumers, although in the case of a carpenter or cafe owner it is pretty hard to delineate the dealings of the consumer from the dealings of the business. These people run their businesses, are not experts in lending or finance and typically don’t have the resources to hire experts. So I’d suggest they are vulnerable.

When you look into commercial lending for these companies, you sometimes see that it is largely judgement based. Even so-called Fintech companies can simply throw up a WordPress website and have someone making a judgement call. I have seen lending standards that I felt were less than ideal in these kinds of environments. In fact, a particular lender I spoke to was unable to articulate their lending process at all. It had much of the same vagueness as RG209. There were absolutely no ratios, no red flags, no disqualifying criteria at all. It was more like they could always find some reason to give a loan rather than deny an applicant. While this is great in terms of access to credit for businesses, it is really irresponsible to the business, its employees, stakeholders, directors and future clients.

So, I think this is a problem in commercial lending too. But if we collect data as data in a database, implement the right checks and have the right people, we can use the tech skills we have to take a lot of this judgement and vagueness out of commercial lending. One of the problems to solve with SME lending is delineating the expenses of the person from those of the business - to do that I’d turn to my old friend the bank statement.