The methods documented here are to be used together with
predict (transactions) to obtain
the expected number of transactions of an average, yet-to-be acquired customer and
with predict (spending) to obtain
the expected spending of an average, yet-to-be acquired customer.
See the Method subsection in Details for more explanations.
The methods described here produce the data required as input to
predict(newdata=) to make this new customer prediction.
This is mostly covariate data for static and dynamic covariate models.
See Details for the required format.
newcustomer(), newcustomer.static(), newcustomer.dynamic():
To predict the number of transactions a single, fictional, average, yet-to-be acquired
customer is expected to make in the first num.periods periods.
newcustomer.spending(): To estimate how much a single, fictional, average,
yet-to-be acquired customer is expected to spend on average per transaction.
Note that the spending model should be fit with remove.first.transaction=FALSE
because the spending predictions are also used for the first orders.
newcustomer(num.periods)
newcustomer.static(num.periods, data.cov.life, data.cov.trans)
newcustomer.dynamic(
num.periods,
data.cov.life,
data.cov.trans,
first.transaction
)
newcustomer.spending()
num.periods: A positive, numeric scalar indicating the number of periods to predict from the initial transaction.
data.cov.life: Numeric-only covariate data for the lifetime process for a single customer, data.table or data.frame. See details.
data.cov.trans: Numeric-only covariate data for the transaction process for a single customer, data.table or data.frame. See details.
first.transaction: For dynamic covariate models only: The time point of the first transaction of the customer ("coming alive"). Has to be within the time range of the covariate data.
newcustomer(): An object of class clv.newcustomer.no.cov
newcustomer.static(): An object of class clv.newcustomer.static.cov
newcustomer.dynamic(): An object of class clv.newcustomer.dynamic.cov
newcustomer.spending(): An object of class clv.newcustomer.spending
The covariate data has to contain one column for every covariate parameter in the fitted model. Only numeric values are allowed, no factors or characters. No customer Id is required because the data on which the model was fit is not used for this prediction.
For newcustomer.static(): One column for every covariate parameter in the estimated model.
No column Id. Exactly 1 row of numeric covariate data.
For example: data.frame(Gender=1, Age=30, Channel=0).
For newcustomer.dynamic(): One column for every covariate parameter in the estimated model.
No column Id. A column Cov.Date with time points that mark the start of the period defined by time.unit.
For every Cov.Date, exactly 1 row of numeric covariate data.
For example, for weekly covariates: data.frame(Cov.Date=c("2000-01-03", "2000-01-10"), Gender=c(1, 1), Channel=c(1, 1), High.Season=c(0, 1))
If Cov.Date is of type character, the date.format given when creating the clv.data object is used to parse it.
The data has to cover the time from the customer's first transaction (first.transaction)
to the end of the prediction period given by num.periods. It does not have to cover the same time range as when fitting the model.
See examples.
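The format requirements for dynamic covariates can be illustrated with a small base R sketch. The dates, covariate values, and the coverage check are illustrative assumptions and not part of the package; in particular, it assumes weekly periods (time.unit = "w") that start on the dates shown and last 7 days.

```r
# Sketch only: dates, covariate values, and the checks below are
# illustrative assumptions, not functionality provided by the package.
first.transaction <- as.Date("2000-01-05")
num.periods <- 2.12  # in weeks, assuming time.unit = "w"

# One row per week; each Cov.Date marks the assumed start of a weekly period
data.cov <- data.frame(
  Cov.Date    = seq(from = as.Date("2000-01-03"), by = "week", length.out = 3),
  Gender      = c(1, 1, 1),
  Channel     = c(0, 0, 0),
  High.Season = c(0, 1, 0)
)

# Format checks: numeric-only covariates and exactly one row per Cov.Date
covariate.cols <- setdiff(names(data.cov), "Cov.Date")
stopifnot(
  all(vapply(data.cov[covariate.cols], is.numeric, logical(1))),
  anyDuplicated(data.cov$Cov.Date) == 0
)

# Coverage check: the first period starts on or before first.transaction and
# the last period (assumed 7 days long) ends after the prediction period
prediction.end <- first.transaction + num.periods * 7
covers.prediction <- min(data.cov$Cov.Date) <= first.transaction &&
  max(data.cov$Cov.Date) + 7 > prediction.end
covers.prediction
```

A data.frame built this way could then be passed as data.cov.life or data.cov.trans, provided its columns match the covariates of the fitted model.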
For models with dynamic covariates, the time point of the first purchase (first.transaction) is
additionally required because the exact covariates that are active during the prediction period have
to be known.
These predictions are for average, prospective customers: Yet-to-be acquired
customers which still have to place their first order.
Therefore, the predicted number of expected orders also includes the initial purchase (1+).
The subsequent orders in the first t periods are then predicted using the unconditional expectation.
In case of the Pareto/NBD this is
$$1 + E[X(t)]= 1 + \frac{r \beta}{\alpha (s-1)} \left[ 1- \left (\frac{\beta}{\beta+t} \right)^{s-1} \right].$$
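As a rough illustration, the expression above can be evaluated directly in base R. The function name expected_orders_new_customer and the parameter values are hypothetical; they are not taken from a fitted model, and the sketch ignores covariates. It also assumes s is not equal to 1, because the closed form divides by s - 1.

```r
# Sketch of 1 + E[X(t)] for the Pareto/NBD without covariates.
# Hypothetical helper and parameter values; assumes s != 1.
expected_orders_new_customer <- function(t, r, alpha, s, beta) {
  stopifnot(s != 1, t >= 0)
  # Initial purchase (1) plus the unconditional expectation E[X(t)]
  1 + (r * beta) / (alpha * (s - 1)) * (1 - (beta / (beta + t))^(s - 1))
}

# Evaluate for several prediction horizons (in periods)
t.grid <- c(0, 1, 3.68, 10)
expected.orders <- expected_orders_new_customer(
  t.grid, r = 0.8, alpha = 6, s = 0.6, beta = 12
)
expected.orders
```

At t = 0 the bracket vanishes and only the initial purchase remains, so the result is 1; for longer horizons the expected number of orders increases.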
predict (transactions) to use the output of the methods described here.
predict (spending) to use the output of the methods described here.
# \donttest{
data("apparelTrans")
data("apparelStaticCov")
data("apparelDynCov")
clv.data.apparel <- clvdata(apparelTrans, date.format = "ymd",
time.unit = "w", estimation.split = 52)
clv.data.static.cov <-
SetStaticCovariates(clv.data.apparel,
data.cov.life = apparelStaticCov,
names.cov.life = "Gender",
data.cov.trans = apparelStaticCov,
names.cov.trans = c("Gender", "Channel"))
clv.data.dyn.cov <-
SetDynamicCovariates(clv.data = clv.data.apparel,
data.cov.life = apparelDynCov,
data.cov.trans = apparelDynCov,
names.cov.life = c("High.Season", "Gender"),
names.cov.trans = c("High.Season", "Gender"),
name.date = "Cov.Date")
# No covariate model
p.apparel <- pnbd(clv.data.apparel)
#> Starting estimation...
#> Estimation finished!
# Predict the number of transactions an average new
# customer is expected to make in the first 3.68 weeks
predict(
p.apparel,
newdata=newcustomer(num.periods=3.68)
)
#> [1] 1.102611
# Spending model
# Note: remove.first.transaction=FALSE as the predicted spending will be multiplied
# by the total number of orders, which also includes the initial purchase
gg.apparel <- gg(clv.data.apparel, remove.first.transaction=FALSE)
#> Starting estimation...
#> Estimation finished!
predict(gg.apparel, newdata = newcustomer.spending())
#> [1] 40.194
# Static covariate model
p.apparel.static <- pnbd(clv.data.static.cov)
#> Starting estimation...
#> Estimation finished!
# Predict the number of transactions an average new
# customer who is female (Gender=1) and who was not acquired
# online (Channel=0) is expected to make in the first 3.68 weeks
predict(
p.apparel.static,
newdata=newcustomer.static(
num.periods=3.68,
# For the lifetime process, only Gender was used when fitting
data.cov.life=data.frame(Gender=1),
data.cov.trans=data.frame(Gender=1, Channel=0)
)
)
#> [1] 1.102862
if (FALSE) { # \dontrun{
# Dynamic covariate model
p.apparel.dyn <- pnbd(clv.data.dyn.cov)
# Predict the number of transactions an average new
# customer who is male (Gender=0), who did not purchase during
# high season (High.Season=0), and who was
# acquired on "2051-02-16" (first.transaction) is expected
# to make in the first 2.12 weeks.
# Note that the time range is very different from the one used
# when fitting the model. Cov.Date still has to match the
# beginning of the week.
predict(
p.apparel.dyn,
newdata=newcustomer.dynamic(
num.periods=2.12,
data.cov.life=data.frame(
Cov.Date=c("2051-02-12", "2051-02-19", "2051-02-26"),
Gender=c(0, 0, 0),
High.Season=c(0, 0, 0)),
data.cov.trans=data.frame(
Cov.Date=c("2051-02-12", "2051-02-19", "2051-02-26"),
Gender=c(0, 0, 0),
High.Season=c(0, 0, 0)),
first.transaction = "2051-02-16"
)
)
} # }
# }