Monetary value of the transactions/purchases a customer makes with a business over their entire lifetime (the period during which a customer keeps purchasing from that particular business before moving to competitors).
Data is divided into multiple datasets for better understanding and organization; referring to the data schema when working with it is helpful. The following statistical concepts come up in the analysis:
- Probability Density Function
- Poisson Distribution
- Gamma Distribution
- p-value
- Beta Distribution
Here:
- Average Sales = Total Sales / Total Number of Orders
- Purchase Frequency = Total Number of Orders / Total Unique Customers
- Retention Rate = Number of Customers with More Than One Order / Total Unique Customers
- Churn = 1 - Retention Rate
- Profit Margin = based on business context (assume some reasonable percentage, say x%) (a quick sketch of these calculations follows below)
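To make these definitions concrete, here is a minimal pandas sketch that computes them from a toy orders table; the column names (customer_id, sales_value), the toy numbers and the 5% margin are placeholders, and the last line shows one common way these pieces are combined into an aggregate CLV figure.

```python
import pandas as pd

# toy orders table: one row per order (column names are placeholders)
orders = pd.DataFrame({
    "customer_id": ["A", "A", "B", "C", "C", "C"],
    "sales_value": [100.0, 80.0, 50.0, 30.0, 40.0, 60.0],
})

total_sales = orders["sales_value"].sum()
total_orders = len(orders)
total_customers = orders["customer_id"].nunique()

average_sales = total_sales / total_orders           # Total Sales / Total Number of Orders
purchase_frequency = total_orders / total_customers  # Total Orders / Total Unique Customers

orders_per_customer = orders.groupby("customer_id").size()
repeat_customers = (orders_per_customer > 1).sum()   # customers with more than one order
retention_rate = repeat_customers / total_customers
churn = 1 - retention_rate

profit_margin = 0.05  # assumed value; depends on business context

# one common way to combine these into a single aggregate CLV figure
clv = (average_sales * purchase_frequency / churn) * profit_margin
print(average_sales, purchase_frequency, retention_rate, churn, clv)
```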
What I am trying to do:
- Predict future sales and monetary value of customers by using the business's historical transaction data
Some frameworks/models to handle such complexity with ease:

Historical Approach:
1.1. Aggregate Model: calculates CLV using the average revenue per customer based on past transactions. This method gives us a single CLV value for the whole customer base.
1.2. Cohort Model: groups customers into different cohorts based on transaction date, etc., and calculates the average revenue per cohort. This method gives a CLV value for each cohort.

Predictive Approach:
2.1. Machine Learning Model: uses regression techniques fitted on past data to predict CLV.
2.2. Probabilistic Model: tries to fit a probability distribution to the data and estimates the future count of transactions and the monetary value of each transaction.
- Aggregate Model => considers all customers as one group
- Cohort Model => splits customers into different groups (a small pandas sketch of this follows below)
  - main assumption => customers within a cohort spend similarly, or in general behave similarly
  - the most common way to group customers into cohorts is by a customer's start date, typically by month
  - the best choice will depend on the customer acquisition rate, the seasonality of the business, and whether additional customer information can be used
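A rough pandas sketch of the cohort idea, using a toy orders table whose column names (customer_id, order_date, sales_value) are hypothetical: customers are bucketed by the month of their first purchase, and the average revenue per customer is computed for each cohort.

```python
import pandas as pd

# toy orders table with an order date per row (column names are hypothetical)
orders = pd.DataFrame({
    "customer_id": ["A", "A", "B", "C", "C"],
    "order_date": pd.to_datetime(
        ["2017-01-05", "2017-03-10", "2017-02-20", "2017-02-01", "2017-04-15"]),
    "sales_value": [100.0, 80.0, 50.0, 30.0, 40.0],
})

# cohort = month of a customer's first purchase
first_purchase = orders.groupby("customer_id")["order_date"].min()
orders["cohort"] = orders["customer_id"].map(first_purchase).dt.to_period("M")

# average revenue per customer within each cohort -> one CLV estimate per cohort
cohort_clv = orders.groupby("cohort").agg(
    total_revenue=("sales_value", "sum"),
    n_customers=("customer_id", "nunique"),
)
cohort_clv["avg_revenue_per_customer"] = (
    cohort_clv["total_revenue"] / cohort_clv["n_customers"])
print(cohort_clv)
```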
- Probabilistic Model
  - This class of models tries to fit a probability distribution to the data and then uses that information to estimate the other parameters of the CLV equation (such as the number of future transactions, future monetary value, etc.)
  - There are various probabilistic models out there that can be used to predict future CLV
  - Not all variables in the CLV equation can be predicted using a single model; usually, transaction variables (purchase frequency & churn) and monetary variables (average order value) are modeled separately
- Predicting Future Transactions and Churn (Purchase Frequency & Churn)
  - Pareto/NBD Model: one of the most used methods in CLV calculations
  - BG/NBD Model => Beta Geometric/Negative Binomial Distribution: one of the most commonly used probabilistic models for predicting CLV, and an alternative to the Pareto/NBD model
  - MBG/NBD Model
  - BG/BB Model
- Predicting Future Monetary Values (Avg Order Values)
  - Gamma-Gamma Model => adds the monetary aspect of customer transactions
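For reference, each of the models above has a corresponding fitter class exposed by the lifetimes package:

```python
# each of the models above has a corresponding fitter class in the lifetimes package
from lifetimes import (
    ParetoNBDFitter,         # Pareto/NBD
    BetaGeoFitter,           # BG/NBD
    ModifiedBetaGeoFitter,   # MBG/NBD
    BetaGeoBetaBinomFitter,  # BG/BB
    GammaGammaFitter,        # Gamma-Gamma (monetary values)
)
```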
- When a user is active, the number of transactions in a time period t is described by a Poisson distribution with rate lambda
- Heterogeneity in transaction rate across users (the difference in purchasing behavior across users) has a Gamma distribution with shape parameter r and scale parameter a
- Users may become inactive after any transaction with probability p, and their dropout point is distributed between purchases with a Geometric distribution
- Heterogeneity in dropout probability has a Beta distribution with the two shape parameters alpha and beta
- Transaction rate and dropout probability vary independently across users (a rough simulation of these assumptions is sketched below)
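A rough numpy sketch of this generative story, just to make the assumptions concrete; the hyper-parameter values (r, a, alpha, beta) and the 52-period observation window are made up for illustration, not fitted to any data.

```python
import numpy as np

rng = np.random.default_rng(42)

# made-up hyper-parameters for illustration
r, a = 0.25, 4.0        # heterogeneity of transaction rate: lambda ~ Gamma(shape=r, scale=a)
alpha, beta = 1.0, 2.5  # heterogeneity of dropout probability: p ~ Beta(alpha, beta)

def simulate_customer(T=52.0):
    """Simulate one customer's repeat purchases over an observation window of length T."""
    lam = rng.gamma(shape=r, scale=a)  # this customer's transaction rate
    p = rng.beta(alpha, beta)          # this customer's dropout probability
    t, purchases = 0.0, 0
    while True:
        t += rng.exponential(1.0 / lam)  # Poisson process => exponential waiting times
        if t > T:
            break
        purchases += 1
        if rng.random() < p:             # customer may become inactive after any transaction
            break
    return purchases

counts = [simulate_customer() for _ in range(1_000)]
print("average simulated transactions per customer:", np.mean(counts))
```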
Lifetimes Package: this package is built primarily to aid customer lifetime value calculations, churn prediction, etc. It has all the major models and utility functions needed for CLV calculations.
import lifetimes #importing lifetime package
First, we need to create a summary table from the transaction data (built in code below).
* the summary table is nothing but an RFM table
* (RFM: Recency, Frequency and Monetary value)
It helps in finding:
- frequency: the number of repeat purchases (purchases beyond the first one)
- recency: the time between a customer's first and last transaction
- T: the time between the first purchase and the end of the transaction period (the last date of the time frame considered for the analysis)
- monetary_value: the mean of a given customer's sales values
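A minimal sketch of building that summary with lifetimes.utils.summary_data_from_transaction_data; the toy transactions table and its column names (customer_unique_id, order_purchase_timestamp, payment_value) stand in for the joined Olist order data, and the observation-period end date is only an example.

```python
import pandas as pd
from lifetimes.utils import summary_data_from_transaction_data

# toy stand-in for the joined Olist order data (column names are assumptions here)
transactions = pd.DataFrame({
    "customer_unique_id": ["A", "A", "B", "C", "C"],
    "order_purchase_timestamp": pd.to_datetime(
        ["2017-01-05", "2017-03-10", "2017-02-20", "2017-02-01", "2017-04-15"]),
    "payment_value": [100.0, 80.0, 50.0, 30.0, 40.0],
})

rfm = summary_data_from_transaction_data(
    transactions,
    customer_id_col="customer_unique_id",
    datetime_col="order_purchase_timestamp",
    monetary_value_col="payment_value",
    observation_period_end="2018-08-31",  # last date of the analysis time frame (example value)
    freq="D",                             # recency and T measured in days
)
print(rfm)  # columns: frequency, recency, T, monetary_value
```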
CLV Overview
- This is often referred to as Customer Lifetime Value (CLV), Recency, Frequency, Monetary Value (RFM) or Customer Probability Models etc
- These models focus exclusively on how customers make repeat purchases over their own lifetime relationship with the company
- Cameron Davidson-Pilon turned this CLV work into an easily implementable Python package called lifetimes
I am applying this analysis to the company Olist, carefully examining registered customer orders; the dataset has information on 100k orders made at multiple marketplaces in Brazil between 2016 and 2018.
- Olist is a Brazilian online e-commerce department store
- Olist connects small businesses from all over Brazil to channels without hassle and with a single contract
- These merchants are able to sell their products through Olist Store and ship them directly to customers using Olist logistics partners
Steps to take:
- From the dataset, I extract Olist's best customers by forecasting their future purchases
- Then I analyze their historical purchasing paths, infer their probability of leaving, and generalize Olist customers via RF matrices
- Finally, I take the models presented one step further and perform a bottom-up financial valuation of Olist to determine how much it is worth today
How will the modelling be done?
The first step in customer analysis is to find a good mathematical model to describe customer repeat purchases.
This doesn't have to get complicated; Fader and Hardie consider only timing as the primary factor. Who are Fader and Hardie?
A few well-established models that have been proposed:
- Pareto/NBD Model, proposed by Schmittlein et al. in 1984
- BG/NBD Model (Beta Geometric/Negative Binomial Distribution), later proposed by Fader and Hardie in 2004 => an alternative and easier-to-implement model than the Pareto/NBD Model
- BG/BB Model in 2009 => a more recent model
I will be introducing and actively using the BG/NBD model in this analysis; both Pareto/NBD and BG/NBD are supported within lifetimes.
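As a preview, here is a minimal sketch of fitting the BG/NBD model with lifetimes on the rfm summary table built earlier (the penalizer value and the 30-day prediction horizon are arbitrary choices for illustration):

```python
from lifetimes import BetaGeoFitter

# fit the BG/NBD model on the RFM summary table built earlier
bgf = BetaGeoFitter(penalizer_coef=0.001)  # small penalizer for numerical stability (arbitrary value)
bgf.fit(rfm["frequency"], rfm["recency"], rfm["T"])
print(bgf.summary)  # fitted r, alpha, a, b parameters

# expected number of repeat purchases per customer over the next 30 days
rfm["predicted_purchases_30d"] = bgf.conditional_expected_number_of_purchases_up_to_time(
    30, rfm["frequency"], rfm["recency"], rfm["T"]
)
```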
- Customers will come and buy at intervals that are randomly distributed within a reasonable time range
- After each purchase they have a certain probability of dying, i.e. becoming inactive (never returning to buy again)
- Each customer is different, with varying purchase intervals and varying probability of going inactive
- Customers buy stochastically according to a Poisson distribution with purchasing rate λ (lambda)
- After each purchase a customer has a p% chance of becoming inactive, so the time at which a customer becomes inactive follows a shifted geometric distribution
- The customer base is heterogeneous across those two parameters, such that we can assume a Gamma (γ) and a Beta (β) distribution for them respectively
- Lastly, we assume that λ and p are jointly independent across customers; this makes some of the math later on much easier
- This class of models tries to quantify customer behavior in a non-contractual setting, where we don't know when customers become inactive but instead assign a percentage confidence that we believe they are dead
- In contrast, another class of models is designed for contractual settings and has been successfully applied to businesses such as the telecommunications industry, where a customer has to tell you that they are ending the relationship
  - Example: a large number of subscriber-based firms like Netflix often report and describe their churn rate or customer turnover. CLV is much simpler to quantify at that scale, but for a normal product-selling firm this is much more difficult, as we are not sure when a customer has decided to terminate the relationship
- Having a model that can describe customer deaths and purchases over time is much more effective at inferring future purchases, and aggregating them into expected total sales, than a naive "oh, I expect sales to grow by maybe 1% next year". I will show this in detail later on with some simple plots.