Title: | Semi-Automated Marketing Mix Modeling (MMM) from Meta Marketing Science |
---|---|
Description: | Semi-Automated Marketing Mix Modeling (MMM) that aims to reduce human bias through ridge regression and evolutionary algorithms, enables actionable decision making by providing a budget allocation and diminishing-returns curves, and allows ground-truth calibration to account for causation. |
Authors: | Gufeng Zhou [aut], Bernardo Lares [cre, aut], Leonel Sentana [aut], Igor Skokan [aut], Meta Platforms, Inc. [cph, fnd] |
Maintainer: | Bernardo Lares <[email protected]> |
License: | MIT + file LICENSE |
Version: | 3.11.1.9004 |
Built: | 2024-11-20 05:52:14 UTC |
Source: | https://github.com/facebookexperimental/robyn |
adstock_geometric()
for Geometric Adstocking is the classic one-parametric
adstock function.
adstock_weibull()
for Weibull Adstocking is a two-parametric adstock
function that allows the decay rate to change over time, unlike the fixed
decay rate of Geometric adstock. It has two options: the cumulative
distribution function "CDF" or the probability density function "PDF".
adstock_geometric(x, theta) adstock_weibull(x, shape, scale, windlen = length(x), type = "cdf") transform_adstock( x, adstock, theta = NULL, shape = NULL, scale = NULL, windlen = length(x) ) plot_adstock(plot = TRUE)
x |
A numeric vector. |
theta |
Numeric. Theta is the only parameter of Geometric Adstocking and represents the fixed decay rate. Assuming TV spend on day 1 is 100€ and theta = 0.7, then day 2 has 100 x 0.7 = 70€ worth of effect carried over from day 1, day 3 has 70 x 0.7 = 49€ from day 2, etc. Rule of thumb for common media genres: TV c(0.3, 0.8), OOH/Print/Radio c(0.1, 0.4), digital c(0, 0.3). |
shape , scale |
Numeric. Check "Details" section for more details. |
windlen |
Integer. Length of modelling window. By default, same length as x. |
type |
Character. Accepts "CDF" or "PDF". CDF, the cumulative distribution function of the Weibull, allows the decay rate to change over time in both C and S shapes, while the peak value always stays in the first period, meaning no lagged effect. PDF, the probability density function, enables the peak value to occur after the first period when shape >= 1, allowing a lagged effect. |
adstock |
Character. One of: "geometric", "weibull_cdf", "weibull_pdf". |
plot |
Boolean. Do you wish to return the plot? |
Weibull CDF adstock has two parameters, shape & scale, and has a flexible decay rate, compared to Geometric adstock's fixed decay rate. The shape parameter controls the shape of the decay curve. The recommended bound is c(0.0001, 2). The larger the shape, the more S-shaped the curve; the smaller, the more L-shaped. Scale controls the inflexion point of the decay curve. We recommend a very conservative bound of c(0, 0.1), because scale increases the adstock half-life greatly.
Weibull PDF adstock also has shape & scale as parameters and also has a flexible decay rate, like Weibull CDF. The difference is that Weibull PDF offers a lagged effect. When shape > 2, the curve peaks after x = 0 and has zero slope at x = 0, enabling a lagged effect and a sharper increase and decrease of adstock, while the scale parameter indicates the limit of the relative position of the peak on the x-axis; when 1 < shape < 2, the curve peaks after x = 0 and has an infinite positive slope at x = 0, enabling a lagged effect and a slower increase and decrease of adstock, while scale has the same effect as above; when shape = 1, the curve peaks at x = 0 and reduces to exponential decay, while scale controls the inflexion point; when 0 < shape < 1, the curve peaks at x = 0 and has increasing decay, while scale controls the inflexion point. When all possible shapes are relevant, we recommend c(0.0001, 10) as bounds for shape; when only a strong lagged effect is of interest, we recommend c(2.0001, 10) as the bound for shape. In all cases, we recommend a conservative bound of c(0, 0.1) for scale. Due to the great flexibility of Weibull PDF, meaning more freedom in the hyperparameter space for Nevergrad to explore, it also requires more iterations to converge.
Run plot_adstock()
to see the difference visually.
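To make the geometric carry-over arithmetic concrete, here is a minimal standalone sketch (not Robyn's internal implementation) that reproduces the 100 -> 70 -> 49 example for theta = 0.7:
geometric_adstock_sketch <- function(x, theta) {
  # Each period keeps theta times the previous period's adstocked value.
  out <- numeric(length(x))
  for (i in seq_along(x)) {
    out[i] <- x[i] + if (i > 1) theta * out[i - 1] else 0
  }
  out
}
geometric_adstock_sketch(c(100, 0, 0), theta = 0.7) # 100 70 49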
Numeric values. Transformed values.
Other Transformations: saturation_hill(), transformations
adstock_geometric(rep(100, 5), theta = 0.5) adstock_weibull(rep(100, 5), shape = 0.5, scale = 0.5, type = "CDF") adstock_weibull(rep(100, 5), shape = 0.5, scale = 0.5, type = "PDF") # Wrapped function for either adstock transform_adstock(rep(100, 10), "weibull_pdf", shape = 1, scale = 0.5)
Contains prophet's "new" default holidays by country.
When using your own holidays, please keep the header
c("ds", "holiday", "country", "year").
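A minimal sketch of a custom holidays table that keeps the required header; the dates, names and country code below are purely illustrative:
my_holidays <- data.frame(
  ds = as.Date(c("2021-12-24", "2021-12-25")), # Date
  holiday = c("Christmas Eve", "Christmas Day"), # Name of celebrated holiday
  country = "DE", # Code for the country (Alpha-2)
  year = 2021 # Year of ds
)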
data(dt_prophet_holidays)
An object of class "data.frame" with the following columns:
ds: Date
holiday: Name of celebrated holiday
country: Code for the country (Alpha-2)
year: Year of ds
Dataframe. Contains prophet's default holidays by country.
Other Dataset:
dt_simulated_weekly
data(dt_prophet_holidays) head(dt_prophet_holidays)
Simulated MMM data. Input time series should be daily, weekly or monthly.
data(dt_simulated_weekly)
An object of class "data.frame" with columns including:
DATE: Date
revenue: Daily total revenue
tv_S: Television spend
ooh_S: Out-of-home spend
...
Dataframe. Contains a simulated dummy dataset to test and run the demo.
Other Dataset:
dt_prophet_holidays
data(dt_simulated_weekly) head(dt_simulated_weekly)
This function is called in robyn_engineering(). It uses
the Michaelis-Menten function to fit the nonlinear model. The fallback
model is the simple linear model lm() in case the nonlinear
model fits worse. A bad fit here might result in unreasonable
model results. Two options are recommended: either split the
channel into sub-channels to achieve a better fit, or just use
spend as paid_media_vars.
fit_spend_exposure(dt_spendModInput, mediaCostFactor, paid_media_var)
dt_spendModInput |
data.frame. Contains channel spend and exposure data. |
mediaCostFactor |
Numeric vector. The ratio between raw media exposure and spend metrics. |
paid_media_var |
Character. Paid media variable. |
List. Contains all spend-exposure model results.
Reference data.frame that shows the upper and lower bounds valid for each hyperparameter.
hyper_limits()
Dataframe. Contains upper and lower bounds for each hyperparameter.
hyper_limits()
Outputs all hyperparameter names and helps specify the list of
hyperparameters to be inserted into robyn_inputs(hyperparameters = ...)
hyper_names(adstock, all_media, all_vars = NULL)
adstock |
Character. Default to |
all_media |
Character vector. Default to |
all_vars |
Used to check the penalties inputs, especially for refreshing models. |
Character vector. Names of hyper-parameters that should be defined.
Get correct hyperparameter names:
All variables in paid_media_vars or organic_vars require hyperparameters
and will be transformed by adstock & saturation. The difference between
paid_media_vars and organic_vars is that paid_media_vars has spend that
needs to be specified separately in paid_media_spends. Run hyper_names()
to get the correct hyperparameter names. All names in hyperparameters must
equal names from hyper_names(), case sensitive.
Get guidance for setting hyperparameter bounds: For geometric adstock, use theta, alpha & gamma. For both weibull adstock options, use shape, scale, alpha, gamma.
Theta: In geometric adstock, theta is the decay rate. Guideline for usual media genres: TV c(0.3, 0.8), OOH/Print/Radio c(0.1, 0.4), digital c(0, 0.3)
Shape: In weibull adstock, shape controls the decay shape. Recommended bound c(0.0001, 2). The larger, the more S-shape; the smaller, the more L-shape. Channel-type-specific values still to be investigated
Scale: In weibull adstock, scale controls the decay inflexion point. Very conservative recommended bound c(0, 0.1), because scale can increase the adstock half-life greatly. Channel-type-specific values still to be investigated
Alpha: In the s-curve transformation with the Hill function, alpha controls the shape of the saturation curve: the larger the alpha, the more S-shape; the smaller, the more C-shape (see saturation_hill(); the examples below use bounds of c(0.5, 3))
Gamma: In the s-curve transformation with the Hill function, gamma controls the inflexion point. Recommended bound c(0.3, 1). The larger the gamma, the later the inflection point in the response curve
Set each hyperparameter's bounds. They either contain two values, e.g. c(0, 0.5), or only one value, in which case you have "fixed" that hyperparameter (see the sketch after the examples below)
Get an adstock transformation example plot, helping you understand the geometric/theta and weibull/shape/scale transformations
Get a saturation curve transformation example plot, helping you understand the hill/alpha/gamma transformation
media <- c("facebook_S", "print_S", "tv_S") hyper_names(adstock = "geometric", all_media = media) hyperparameters <- list( facebook_S_alphas = c(0.5, 3), # example bounds for alpha facebook_S_gammas = c(0.3, 1), # example bounds for gamma facebook_S_thetas = c(0, 0.3), # example bounds for theta print_S_alphas = c(0.5, 3), print_S_gammas = c(0.3, 1), print_S_thetas = c(0.1, 0.4), tv_S_alphas = c(0.5, 3), tv_S_gammas = c(0.3, 1), tv_S_thetas = c(0.3, 0.8) ) # Define hyper_names for weibull adstock hyper_names(adstock = "weibull", all_media = media) hyperparameters <- list( facebook_S_alphas = c(0.5, 3), # example bounds for alpha facebook_S_gammas = c(0.3, 1), # example bounds for gamma facebook_S_shapes = c(0.0001, 2), # example bounds for shape facebook_S_scales = c(0, 0.1), # example bounds for scale print_S_alphas = c(0.5, 3), print_S_gammas = c(0.3, 1), print_S_shapes = c(0.0001, 2), print_S_scales = c(0, 0.1), tv_S_alphas = c(0.5, 3), tv_S_gammas = c(0.3, 1), tv_S_shapes = c(0.0001, 2), tv_S_scales = c(0, 0.1) )
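As noted above, passing a single value instead of a two-value bound "fixes" that hyperparameter. A minimal sketch, reusing the channel names from the example above:
hyperparameters_fixed <- list(
  facebook_S_alphas = c(0.5, 3), # explored within bounds by Nevergrad
  facebook_S_gammas = c(0.3, 1),
  facebook_S_thetas = 0.2 # fixed: a single value is not explored
)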
When prophet_vars
in robyn_inputs()
is specified, this
function decomposes trend, season, holiday and weekday from the
dependent variable.
prophet_decomp( dt_transform, dt_holidays, prophet_country, prophet_vars, prophet_signs, factor_vars, context_vars, organic_vars, paid_media_spends, intervalType, dayInterval, custom_params )
dt_transform |
A data.frame with all model features.
Must contain |
dt_holidays |
data.frame. Raw input holiday data. Load standard
Prophet holidays using |
context_vars , paid_media_spends , intervalType , dayInterval , prophet_country , prophet_vars , prophet_signs , factor_vars |
As included in |
organic_vars |
Character vector. Typically newsletter sendings,
push-notifications, social media posts etc. Compared to |
custom_params |
List. Custom parameters passed to |
A list containing all prophet decomposition output.
Robyn is an automated Marketing Mix Modeling (MMM) package. It aims to reduce human bias through ridge regression and evolutionary algorithms, enables actionable decision making by providing a budget allocator and diminishing-returns curves, and allows ground-truth calibration to account for causation.
Gufeng Zhou ([email protected])
Leonel Sentana ([email protected])
Igor Skokan ([email protected])
Bernardo Lares ([email protected])
Useful links:
Report bugs at https://github.com/facebookexperimental/Robyn/issues
The robyn_allocator() function returns a new split of media
variable spend that maximizes the total media response.
robyn_allocator( robyn_object = NULL, select_build = 0, InputCollect = NULL, OutputCollect = NULL, select_model = NULL, json_file = NULL, scenario = "max_response", total_budget = NULL, target_value = NULL, date_range = "all", channel_constr_low = NULL, channel_constr_up = NULL, channel_constr_multiplier = 3, optim_algo = "SLSQP_AUGLAG", maxeval = 1e+05, constr_mode = "eq", keep_zero_coefs = FALSE, plots = TRUE, plot_folder = NULL, plot_folder_sub = NULL, export = TRUE, quiet = FALSE, ui = FALSE, ... ) ## S3 method for class 'robyn_allocator' print(x, ...) ## S3 method for class 'robyn_allocator' plot(x, ...)
robyn_object |
Character or List. Path of the |
select_build |
Integer. Default to the latest model build. |
InputCollect |
List. Contains all input parameters for the model.
Required when |
OutputCollect |
List. Contains all model results.
Required when |
select_model |
Character. A model |
json_file |
Character. JSON file to import previously exported inputs or
recreate a model. To generate this file, use |
scenario |
Character. Accepted options are: |
total_budget |
Numeric. Total marketing budget for all paid channels for the
period in |
target_value |
Numeric. When using the scenario |
date_range |
Character. Date(s) to apply adstocked transformations and pick mean spends
per channel. Set one of: "all", "last", or "last_n" (where
n is the last N dates available), date (i.e. "2022-03-27"), or date range
(i.e. |
channel_constr_low , channel_constr_up |
Numeric vectors. The lower and upper bounds for each paid media variable when maximizing total media response. For example, |
channel_constr_multiplier |
Numeric. Default to 3. For example, if
|
optim_algo |
Character. Default to |
maxeval |
Integer. The maximum iteration of the global optimization algorithm. Defaults to 100000. |
constr_mode |
Character. Options are |
keep_zero_coefs |
Boolean. By default, zero-coefficient (beta) channels are removed to avoid spending budget where there is no impact. |
plots |
Boolean. Generate plots? |
plot_folder |
Character. Path for saving plots and files. Default
to |
plot_folder_sub |
Character. Sub path for saving plots. Will overwrite the default path with timestamp or, for refresh and allocator, simply overwrite files. |
export |
Boolean. Export outcomes into local files? |
quiet |
Boolean. Keep messages off? |
ui |
Boolean. Save additional outputs for UI usage. List outcome. |
... |
Additional parameters passed to |
x |
|
A list object containing allocator result.
List. Contains optimized allocation results and plots.
## Not run: # Having InputCollect and OutputCollect results AllocatorCollect <- robyn_allocator( InputCollect = InputCollect, OutputCollect = OutputCollect, select_model = "1_2_3", scenario = "max_response", channel_constr_low = 0.7, channel_constr_up = c(1.2, 1.5, 1.5, 1.5, 1.5), channel_constr_multiplier = 4, date_range = "last_26", export = FALSE ) # Print a summary print(AllocatorCollect) # Plot the allocator one-pager plot(AllocatorCollect) ## End(Not run)
robyn_clusters() uses output from robyn_run()
to reduce the number of models, create bootstrapped confidence
intervals, and help the user pick the best (lowest combined error)
of the most different kinds (clusters) of models.
robyn_clusters( input, dep_var_type, cluster_by = "hyperparameters", all_media = NULL, k = "auto", wss_var = 0.06, max_clusters = 10, limit = 1, weights = rep(1, 3), dim_red = "PCA", quiet = FALSE, export = FALSE, seed = 123, ... )
input |
|
dep_var_type |
Character. For dep_var_type 'revenue', ROI is used for clustering. For 'conversion', CPA is used for clustering. |
cluster_by |
Character. Any of: "performance" or "hyperparameters". |
all_media |
Character vector. Default to |
k |
Integer. Number of clusters. |
wss_var |
Numeric. Used to pick automatic |
max_clusters |
Integer. Maximum number of clusters. |
limit |
Integer. Top N results per cluster. (When k = "auto", k is selected as the cluster count at which the WSS variance drops below 5%.) |
weights |
Vector, size 3. How much should each error weigh? Order: nrmse, decomp.rssd, mape. The higher the value, the closer it will be scaled to the origin. Each value will be normalized so they all sum to 1. |
dim_red |
Character. Select dimensionality reduction technique.
Pass any of: |
quiet |
Boolean. Keep quiet? If not, print messages. |
export |
Export plots into local files? |
seed |
Numeric. Seed for reproducibility. |
... |
Additional parameters passed to |
List. Clustering results as labeled data.frames and plots.
Bernardo Lares ([email protected])
## Not run: # Having InputCollect and OutputCollect results cls <- robyn_clusters( input = OutputCollect, all_media = InputCollect$all_media, k = 3, limit = 2, weights = c(1, 1, 1.5) ) ## End(Not run)
robyn_converge() consumes robyn_run() outputs,
calculates the convergence status, and builds convergence plots.
Convergence is calculated by default using the following criteria
(keeping the default parameters sd_qtref = 3 and med_lowb = 2):
Criterion 1: the last quantile's standard deviation < the mean standard deviation of the first 3 quantiles.
Criterion 2: the last quantile's absolute median < the first quantile's absolute median - 2 * the mean standard deviation of the first 3 quantiles.
Both criteria have to be satisfied to consider MOO convergence.
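An illustrative sketch (not Robyn's internal code) of how the two criteria could be checked on a vector of errors ordered by iteration, assuming the defaults n_cuts = 20, sd_qtref = 3 and med_lowb = 2:
set.seed(1)
errors <- abs(rnorm(2000, mean = seq(2, 0.5, length.out = 2000), sd = 0.3))
cuts <- split(errors, cut(seq_along(errors), 20)) # 20 cuts of 5% each
sd_first3 <- mean(sapply(cuts[1:3], sd)) # first 3 quantiles' mean sd
crit1 <- sd(cuts[[20]]) < sd_first3 # criterion 1
crit2 <- median(abs(cuts[[20]])) < median(abs(cuts[[1]])) - 2 * sd_first3 # criterion 2
crit1 && crit2 # both must hold to consider MOO convergence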
robyn_converge( OutputModels, n_cuts = 20, sd_qtref = 3, med_lowb = 2, nrmse_win = c(0, 0.998), ... )
OutputModels |
List. Output from |
n_cuts |
Integer. Default to 20 (5% cuts each). |
sd_qtref |
Integer. Reference quantile of the error convergence rule for standard deviation (Criterion #1). Defaults to 3. |
med_lowb |
Integer. Lower bound distance of the error convergence rule for the median (Criterion #2). Defaults to 2. |
nrmse_win |
Numeric vector. Lower and upper quantile thresholds to winsorize NRMSE. Set values within [0, 1]; the default c(0, 0.998) winsorizes the top 0.2% (1/500). |
... |
Additional parameters |
List. Plots and MOO convergence results.
## Not run: # Having OutputModels results MOO <- robyn_converge( OutputModels, n_cuts = 10, sd_qtref = 3, med_lowb = 3 ) ## End(Not run)
robyn_inputs() is the function to input all model parameters and
check input correctness for the initial model build. It includes the
engineering process that conducts trend, season,
holiday & weekday decomposition using Facebook's time-series forecasting
library prophet, and fits a nonlinear model to spend and exposure
metrics in case exposure metrics are used in paid_media_vars.
robyn_inputs( dt_input = NULL, dep_var = NULL, dep_var_type = NULL, date_var = "auto", paid_media_spends = NULL, paid_media_vars = NULL, paid_media_signs = NULL, organic_vars = NULL, organic_signs = NULL, context_vars = NULL, context_signs = NULL, factor_vars = NULL, dt_holidays = Robyn::dt_prophet_holidays, prophet_vars = NULL, prophet_signs = NULL, prophet_country = NULL, adstock = NULL, hyperparameters = NULL, window_start = NULL, window_end = NULL, calibration_input = NULL, json_file = NULL, InputCollect = NULL, ... ) ## S3 method for class 'robyn_inputs' print(x, ...)
dt_input |
data.frame. Raw input data. Load simulated
dataset using |
dep_var |
Character. Name of dependent variable. Only one allowed. |
dep_var_type |
Character. Type of dependent variable as "revenue" or "conversion". Will be used to calculate ROI or CPI, respectively. Only one allowed and case sensitive. |
date_var |
Character. Name of date variable. Daily, weekly
and monthly data supported.
|
paid_media_spends |
Character vector. Names of the paid media variables.
The values on each of these variables must be numeric. Also,
|
paid_media_vars |
Character vector. Names of the paid media variables'
exposure-level metrics (impressions, clicks, GRP, etc.) other than spend.
The values of each of these variables must be numeric. These variables are
not used to train the model but to check the spend-exposure relationship
and to recommend splitting media channels into sub-channels (e.g.
fb_retargeting, fb_prospecting, etc.) to gain more variance. |
paid_media_signs |
Character vector. Choose any of
|
organic_vars |
Character vector. Typically newsletter sendings,
push-notifications, social media posts etc. Compared to |
organic_signs |
Character vector. Choose any of
"default", "positive", "negative". Control
the signs of coefficients for |
context_vars |
Character vector. Typically competitors, price & promotion, temperature, unemployment rate, etc. |
context_signs |
Character vector. Choose any of
|
factor_vars |
Character vector. Specify which of the provided variables in organic_vars or context_vars should be forced as a factor. |
dt_holidays |
data.frame. Raw input holiday data. Load standard
Prophet holidays using |
prophet_vars |
Character vector. Include any of "trend", "season", "weekday", "monthly", "holiday" or NULL. Highly recommended to use all for daily data and "trend", "season", "holiday" for weekly and above cadence. Set to NULL to skip prophet's functionality. |
prophet_signs |
Character vector. Choose any of
"default", "positive", "negative". Control
the signs of coefficients for |
prophet_country |
Character. Only one country allowed.
Includes national holidays for all countries; their list can
be found by loading |
adstock |
Character. Choose any of "geometric", "weibull_cdf",
"weibull_pdf". Weibull adstock is a two-parametric function and thus more
flexible, but takes longer than the traditional geometric one-parametric
function. CDF, the cumulative distribution function of the Weibull, allows
the decay rate to change over time in both C and S shapes, while the peak
value always stays in the first period, meaning no lagged effect. PDF, the
probability density function, enables the peak value to occur after the first
period when shape >= 1, allowing a lagged effect. Run |
hyperparameters |
List. Contains hyperparameter lower and upper bounds.
Names of elements in list must be identical to output of |
window_start , window_end |
Character. Set start and end dates of the modelling
period. It is recommended not to start at the first date in the dataset,
in order to gain adstock effect from previous periods. Also, the
rows-to-columns ratio in the input data should be >= 10:1, or in other
words, at least 10 observations per independent variable.
This window determines the date range of the data period within your dataset
that will be used to regress the effects of media, organic and
context variables on your dependent variable. We recommend using a full
|
calibration_input |
data.frame. Optional. Provide experimental results to calibrate. Your input should include the following values for each experiment: channel, liftStartDate, liftEndDate, liftAbs, spend, confidence, metric. You can calibrate any spend or organic variable with a well designed experiment. You can also use experimental results from multiple channels; to do so, provide concatenated channel value, i.e. "channel_A+channel_B". Check "Guide for calibration source" section. |
json_file |
Character. JSON file to import previously exported inputs or
recreate a model. To generate this file, use |
InputCollect |
Default to NULL. |
... |
Additional parameters passed to |
x |
|
List. Contains all input parameters and modified results
using Robyn:::robyn_engineering(). This list is ready to be
used in other functions like robyn_run() and print().
Class: robyn_inputs.
We strongly recommend using experimental and causal results that are considered ground truth to calibrate MMM. Usual experiment types are people-based (e.g. Facebook conversion lift) and geo-based (e.g. Facebook GeoLift).
Currently, Robyn only accepts point estimates as calibration input. For example, if 10k$ spend is tested against a hold-out for channel A, then input the incremental return as a point estimate, as in the example below.
The point estimate always has to match the spend in the variable. For example, if channel A usually has 100k$ weekly spend and the experimental HO is 70
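A hedged sketch of a calibration_input data.frame with the columns listed above (all values are illustrative only); it would then be passed to robyn_inputs(calibration_input = calibration_input):
calibration_input <- data.frame(
  channel = "facebook_S", # spend or organic variable that was tested
  liftStartDate = as.Date("2018-05-01"),
  liftEndDate = as.Date("2018-06-10"),
  liftAbs = 400000, # point-estimate incremental return of the experiment
  spend = 421000, # spend during the experiment window
  confidence = 0.85, # confidence of the experiment result
  metric = "revenue" # matches dep_var
)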
# Using dummy simulated data InputCollect <- robyn_inputs( dt_input = Robyn::dt_simulated_weekly, dt_holidays = Robyn::dt_prophet_holidays, date_var = "DATE", dep_var = "revenue", dep_var_type = "revenue", prophet_vars = c("trend", "season", "holiday"), prophet_country = "DE", context_vars = c("competitor_sales_B", "events"), paid_media_spends = c("tv_S", "ooh_S", "print_S", "facebook_S", "search_S"), paid_media_vars = c("tv_S", "ooh_S", "print_S", "facebook_I", "search_clicks_P"), organic_vars = "newsletter", factor_vars = "events", window_start = "2016-11-23", window_end = "2018-08-22", adstock = "geometric", # To be defined separately hyperparameters = NULL, calibration_input = NULL ) print(InputCollect)
The robyn_mmm() function activates Nevergrad to generate samples of
hyperparameters, conducts media transformation within each loop, fits the
ridge regression, optionally calibrates the model, decomposes responses
and collects the results. It's an inner function within robyn_run().
robyn_mmm( InputCollect, hyper_collect, iterations, cores, nevergrad_algo, intercept = TRUE, intercept_sign, ts_validation = TRUE, add_penalty_factor = FALSE, objective_weights = NULL, dt_hyper_fixed = NULL, rssd_zero_penalty = TRUE, refresh = FALSE, trial = 1L, seed = 123L, quiet = FALSE, ... ) model_decomp(inputs = list())
InputCollect |
List. Contains all input parameters for the model.
Required when |
hyper_collect |
List. Contains hyperparameter bounds. Defaults to
|
iterations |
Integer. Number of iterations to run. |
cores |
Integer. Default to |
nevergrad_algo |
Character. Default to "TwoPointsDE". Options are
|
intercept |
Boolean. Should intercept(s) be fitted (default=TRUE) or set to zero (FALSE). |
intercept_sign |
Character. Choose one of "non_negative" (default) or
"unconstrained". By default, if intercept is negative, Robyn will drop intercept
and refit the model. Consider changing intercept_sign to "unconstrained" when
there are |
ts_validation |
Boolean. When set to |
add_penalty_factor |
Boolean. Add penalty factor hyperparameters to glmnet's penalty.factor to be optimized by nevergrad. Use with caution, because this feature might add too much hyperparameter space and probably requires more iterations to converge. |
objective_weights |
Numeric vector. Default to NULL to give equal weights
to all objective functions. Order: NRMSE, DECOMP.RSSD, MAPE (when calibration
data is provided). When you are not calibrating, only the first 2 values for
|
dt_hyper_fixed |
data.frame or named list. Only provide when loading
old model results. It consumes hyperparameters from saved csv
|
rssd_zero_penalty |
Boolean. When TRUE, the objective function DECOMP.RSSD additionally penalizes models with more zero-coefficient media effects. In other words, given the same DECOMP.RSSD score, a model with 50% zero-coef variables gets penalized by DECOMP.RSSD * 1.5 (larger error), while another model with no zero-coef variables remains un-penalized with DECOMP.RSSD * 1. |
refresh |
Boolean. Set to |
trial |
Integer. Which trial are we running? Used to ID each model. |
seed |
Integer. For reproducible results when running nevergrad and clustering. Each trial will increase the seed by 1 unit (i.e. 10 trials with seed 1 will share 9 results with 10 trials with seed 2). |
quiet |
Boolean. Keep messages off? |
... |
Additional parameters passed to |
inputs |
List. Elements to pass to sub-functions. |
List. MMM results with hyperparameters values.
robyn_outputs() packs robyn_plots(), robyn_csv(), and robyn_clusters()
outcomes on robyn_run() results. When ui=TRUE, it enriches
OutputModels results with additional plots and objects.
ts_validation() creates a plot to visualize the convergence for each of the
datasets when running robyn_run(), especially useful when using ts_validation.
As a reference, the closer the test and validation convergence points are,
the better, given the time series wasn't overfitted.
robyn_outputs( InputCollect, OutputModels, pareto_fronts = "auto", calibration_constraint = 0.1, plot_folder = NULL, plot_folder_sub = NULL, plot_pareto = TRUE, csv_out = "pareto", clusters = TRUE, select_model = "clusters", ui = FALSE, export = TRUE, all_sol_json = FALSE, quiet = FALSE, refresh = FALSE, ... ) ## S3 method for class 'robyn_outputs' print(x, ...) robyn_csv( InputCollect, OutputCollect, csv_out = NULL, export = TRUE, calibrated = FALSE ) pareto_front(xi, yi, pareto_fronts = 1, ...) robyn_immcarr( InputCollect, OutputCollect, solID = NULL, start_date = NULL, end_date = NULL, ... ) robyn_plots( InputCollect, OutputCollect, export = TRUE, plot_folder = OutputCollect$plot_folder, ... ) robyn_onepagers( InputCollect, OutputCollect, select_model = NULL, quiet = FALSE, export = TRUE, plot_folder = OutputCollect$plot_folder, baseline_level = 0, ... ) ts_validation(OutputModels, quiet = FALSE, ...) decomp_plot( InputCollect, OutputCollect, solID = NULL, exclude = NULL, baseline_level = 0 )
InputCollect , OutputModels |
|
pareto_fronts |
Integer. Number of Pareto fronts for the output.
|
calibration_constraint |
Numeric. Default to 0.1 and allows 0.01-0.1. When
calibrating, 0.1 means top 10
selection. Lower |
plot_folder |
Character. Path for saving plots and files. Default
to |
plot_folder_sub |
Character. Sub path for saving plots. Will overwrite the default path with timestamp or, for refresh and allocator, simply overwrite files. |
plot_pareto |
Boolean. Set to |
csv_out |
Character. Accepts "pareto" or "all". Default to "pareto". Set to "all" will output all iterations as csv. Set NULL to skip exports into CSVs. |
clusters |
Boolean. Apply |
select_model |
Character vector. Which models (by |
ui |
Boolean. Save additional outputs for UI usage. List outcome. |
export |
Boolean. Export outcomes into local files? |
all_sol_json |
Logical. Add all pareto solutions to json export? |
quiet |
Boolean. Keep messages off? |
refresh |
Boolean. Refresh mode |
... |
Additional parameters passed to |
x |
|
OutputCollect |
|
calibrated |
Logical |
xi , yi |
Numeric. Coordinate values per observation. |
solID |
Character vector. Model IDs to plot. |
start_date , end_date |
Character/Date. Dates to consider when calculating immediate and carryover values per channel. |
baseline_level |
Integer, from 0 to 5. Aggregate baseline variables, depending on the level of aggregation you need. Default is 0 for no aggregation; 1 for intercept only; 2 adds trend; 3 adds all prophet decomposition variables; 4 adds contextual variables; 5 adds organic variables. Results will be reflected on the waterfall chart. |
exclude |
Character vector. Manually exclude variables from plot. |
(Invisible) list. Class: robyn_outputs. Contains processed results based on robyn_run() results.
Invisible NULL.
Invisible list with ggplot plots.
Invisible list with patchwork plot(s).
Invisible list with ggplot plots.
robyn_refresh()
builds updated models based on
the previously built models saved in the Robyn.RDS
object specified
in robyn_object
. For example, when updating the initial build with 4
weeks of new data, robyn_refresh()
consumes the selected model of
the initial build, sets lower and upper bounds of hyperparameters for the
new build around the selected hyperparameters of the previous build,
stabilizes the effect of baseline variables across old and new builds, and
regulates the new effect share of media variables towards the latest
spend level. It returns the aggregated results with all previous builds for
reporting purposes and produces reporting plots.
You must run robyn_save()
to select and save an initial model first,
before refreshing.
When should robyn_refresh()
NOT be used:
The robyn_refresh() function is suitable for
updating within "reasonable periods". In two situations it is better
to rebuild the model instead of refreshing:
1. Most data is new: e.g. if the initial model was trained with 100 weeks' worth of data and we add 50 weeks of new data.
2. New variables are added: if the initial model had fewer variables than the ones we want to start using in the new refresh model.
robyn_refresh( json_file = NULL, robyn_object = NULL, dt_input = NULL, dt_holidays = Robyn::dt_prophet_holidays, refresh_steps = 4, refresh_mode = "manual", refresh_iters = 1000, refresh_trials = 3, bounds_freedom = NULL, plot_folder = NULL, plot_pareto = TRUE, version_prompt = FALSE, export = TRUE, calibration_input = NULL, objective_weights = NULL, ... ) ## S3 method for class 'robyn_refresh' print(x, ...) ## S3 method for class 'robyn_refresh' plot(x, ...)
json_file |
Character. JSON file to import previously exported inputs or
recreate a model. To generate this file, use |
robyn_object |
Character or List. Path of the |
dt_input |
data.frame. Should include all previous data and newly added data for the refresh. |
dt_holidays |
data.frame. Raw input holiday data. Load standard
Prophet holidays using |
refresh_steps |
Integer. Controls how many time units the refresh
model build moves forward. For example, |
refresh_mode |
Character. Options are "auto" and "manual". In auto mode,
the |
refresh_iters |
Integer. Iterations per refresh. The rule of thumb is: the more new data added, the more iterations needed. A more reliable recommendation still needs to be investigated. |
refresh_trials |
Integer. Trials per refresh. Defaults to 3 trials. A more reliable recommendation still needs to be investigated. |
bounds_freedom |
Numeric. Percentage of freedom we'd like to allow for the new hyperparameters values compared with the model to be refreshed. If set to NULL (default) the value will be calculated as refresh_steps / rollingWindowLength. Applies to all hyperparameters. |
plot_folder |
Character. Path for saving plots and files. Default
to |
plot_pareto |
Boolean. Set to |
version_prompt |
Logical. If FALSE, the model refresh version will be
selected based on the smallest combined error of normalized NRMSE, DECOMP.RSSD, MAPE.
If |
export |
Boolean. Export outcomes into local files? |
calibration_input |
data.frame. Optional. Provide experimental results to calibrate. Your input should include the following values for each experiment: channel, liftStartDate, liftEndDate, liftAbs, spend, confidence, metric. You can calibrate any spend or organic variable with a well designed experiment. You can also use experimental results from multiple channels; to do so, provide concatenated channel value, i.e. "channel_A+channel_B". Check "Guide for calibration source" section. |
objective_weights |
Numeric vector. Default to NULL to give equal weights
to all objective functions. Order: NRMSE, DECOMP.RSSD, MAPE (when calibration
data is provided). When you are not calibrating, only the first 2 values for
|
... |
Additional parameters to overwrite original custom parameters passed into initial model. |
x |
|
List. The Robyn object, class robyn_refresh
.
List. Same as robyn_run()
but with refreshed models.
## Not run: # Loading dummy data data("dt_simulated_weekly") data("dt_prophet_holidays") # Set the (pre-trained and exported) Robyn model JSON file json_file <- "~/Robyn_202208081444_init/RobynModel-2_55_4.json" # Run \code{robyn_refresh()} with 13 weeks cadence in auto mode Robyn <- robyn_refresh( json_file = json_file, dt_input = dt_simulated_weekly, dt_holidays = Robyn::dt_prophet_holidays, refresh_steps = 13, refresh_mode = "auto", refresh_iters = 200, refresh_trials = 5 ) # Run \code{robyn_refresh()} with 4 weeks cadence in manual mode json_file2 <- "~/Robyn_202208081444_init/Robyn_202208090847_rf/RobynModel-1_2_3.json" Robyn <- robyn_refresh( json_file = json_file2, dt_input = dt_simulated_weekly, dt_holidays = Robyn::dt_prophet_holidays, refresh_steps = 4, refresh_mode = "manual", refresh_iters = 200, refresh_trials = 5 ) ## End(Not run)
robyn_response() returns the response for a given
spend level of a given paid_media_vars variable, from a selected model
result and selected model build (initial model, refresh model, etc.).
robyn_response( InputCollect = NULL, OutputCollect = NULL, json_file = NULL, robyn_object = NULL, select_build = NULL, select_model = NULL, metric_name = NULL, metric_value = NULL, date_range = NULL, dt_hyppar = NULL, dt_coef = NULL, quiet = FALSE, ... )
InputCollect |
List. Contains all input parameters for the model.
Required when |
OutputCollect |
List. Contains all model results.
Required when |
json_file |
Character. JSON file to import previously exported inputs or
recreate a model. To generate this file, use |
robyn_object |
Character or List. Path of the |
select_build |
Integer. Default to the latest model build. |
select_model |
Character. A model |
metric_name |
A character. Selected media variable for the response. Must be one value from paid_media_spends, paid_media_vars or organic_vars |
metric_value |
Numeric. Desired metric value to return a response for. |
date_range |
Character. Date(s) to apply adstocked transformations and pick mean spends
per channel. Set one of: "all", "last", or "last_n" (where
n is the last N dates available), date (i.e. "2022-03-27"), or date range
(i.e. |
dt_hyppar |
A data.frame. When |
dt_coef |
A data.frame. When |
quiet |
Boolean. Keep messages off? |
... |
Additional parameters passed to |
List. Response value and plot. Class: robyn_response
.
## Not run: # Having InputCollect and OutputCollect objects ## Recreate original saturation curve Response <- robyn_response( InputCollect = InputCollect, OutputCollect = OutputCollect, select_model = select_model, metric_name = "facebook_S" ) Response$plot ## Or you can call a JSON file directly (a bit slower) # Response <- robyn_response( # json_file = "your_json_path.json", # dt_input = dt_simulated_weekly, # dt_holidays = dt_prophet_holidays, # metric_name = "facebook_S" # ) ## Get the "next 100 dollar" marginal response on Spend1 Spend1 <- 20000 Response1 <- robyn_response( InputCollect = InputCollect, OutputCollect = OutputCollect, select_model = select_model, metric_name = "facebook_S", metric_value = Spend1, # total budget for date_range date_range = "last_1" # last period ) Response1$plot Spend2 <- Spend1 + 100 Response2 <- robyn_response( InputCollect = InputCollect, OutputCollect = OutputCollect, select_model = select_model, metric_name = "facebook_S", metric_value = Spend2, date_range = "last_1" ) # ROAS for the 100$ from Spend1 level (Response2$response_total - Response1$response_total) / (Spend2 - Spend1) ## Get response for a given budget and date_range Spend3 <- 100000 Response3 <- robyn_response( InputCollect = InputCollect, OutputCollect = OutputCollect, select_model = select_model, metric_name = "facebook_S", metric_value = Spend3, # total budget for date_range date_range = "last_5" # last 5 periods ) Response3$plot ## Example of getting paid media exposure response curves imps <- 10000000 response_imps <- robyn_response( InputCollect = InputCollect, OutputCollect = OutputCollect, select_model = select_model, metric_name = "facebook_I", metric_value = imps ) response_imps$response_total / imps * 1000 response_imps$plot ## Example of getting organic media exposure response curves sendings <- 30000 response_sending <- robyn_response( InputCollect = InputCollect, OutputCollect = OutputCollect, select_model = select_model, metric_name = "newsletter", metric_value = sendings ) # response per 1000 sendings response_sending$response_total / sendings * 1000 response_sending$plot ## End(Not run)
robyn_run() consumes robyn_inputs() outputs,
runs robyn_mmm(), and collects all modeling results.
robyn_run( InputCollect = NULL, dt_hyper_fixed = NULL, json_file = NULL, ts_validation = FALSE, add_penalty_factor = FALSE, refresh = FALSE, seed = 123L, quiet = FALSE, cores = NULL, trials = 5, iterations = 2000, rssd_zero_penalty = TRUE, objective_weights = NULL, nevergrad_algo = "TwoPointsDE", intercept = TRUE, intercept_sign = "non_negative", lambda_control = NULL, outputs = FALSE, ... ) ## S3 method for class 'robyn_models' print(x, ...)
InputCollect |
List. Contains all input parameters for the model.
Required when |
dt_hyper_fixed |
data.frame or named list. Only provide when loading
old model results. It consumes hyperparameters from saved csv
|
json_file |
Character. JSON file to import previously exported inputs or
recreate a model. To generate this file, use |
ts_validation |
Boolean. When set to |
add_penalty_factor |
Boolean. Add penalty factor hyperparameters to glmnet's penalty.factor to be optimized by nevergrad. Use with caution, because this feature might add too much hyperparameter space and probably requires more iterations to converge. |
refresh |
Boolean. Set to |
seed |
Integer. For reproducible results when running nevergrad and clustering. Each trial will increase the seed by 1 unit (i.e. 10 trials with seed 1 will share 9 results with 10 trials with seed 2). |
quiet |
Boolean. Keep messages off? |
cores |
Integer. Default to |
trials |
Integer. Recommended 5 for default
|
iterations |
Integer. Recommended 2000 for default when using
|
rssd_zero_penalty |
Boolean. When TRUE, the objective function DECOMP.RSSD additionally penalizes models with more zero-coefficient media effects. In other words, given the same DECOMP.RSSD score, a model with 50% zero-coef variables gets penalized by DECOMP.RSSD * 1.5 (larger error), while another model with no zero-coef variables remains un-penalized with DECOMP.RSSD * 1. |
objective_weights |
Numeric vector. Default to NULL to give equal weights
to all objective functions. Order: NRMSE, DECOMP.RSSD, MAPE (when calibration
data is provided). When you are not calibrating, only the first 2 values for
|
nevergrad_algo |
Character. Default to "TwoPointsDE". Options are
|
intercept |
Boolean. Should intercept(s) be fitted (default=TRUE) or set to zero (FALSE). |
intercept_sign |
Character. Choose one of "non_negative" (default) or
"unconstrained". By default, if intercept is negative, Robyn will drop intercept
and refit the model. Consider changing intercept_sign to "unconstrained" when
there are |
lambda_control |
Deprecated in v3.6.0. |
outputs |
Boolean. If set to TRUE, will run |
... |
Additional parameters passed to |
x |
|
List. Class: robyn_models
. Contains the results of all trials
and iterations modeled.
List. Contains all trained models. Class: robyn_models
.
## Not run: # Having InputCollect results OutputModels <- robyn_run( InputCollect = InputCollect, cores = 2, iterations = 200, trials = 1 ) ## End(Not run)
Use robyn_save() to select the initial model and save it as a .RDS file.
robyn_save( InputCollect, OutputCollect, robyn_object = NULL, select_model = NULL, dir = OutputCollect$plot_folder, quiet = FALSE, ... ) ## S3 method for class 'robyn_save' print(x, ...) ## S3 method for class 'robyn_save' plot(x, ...) robyn_load(robyn_object, select_build = NULL, quiet = FALSE)
InputCollect |
List. Contains all input parameters for the model.
Required when |
OutputCollect |
List. Contains all model results.
Required when |
robyn_object |
Character or List. Path of the |
select_model |
Character. A model |
dir |
Character. Existing directory to export JSON file to. |
quiet |
Boolean. Keep messages off? |
... |
Additional parameters passed to |
x |
|
select_build |
Integer. Default to the latest model build. |
(Invisible) list with filename and summary. Class: robyn_save
.
(Invisible) list with imported results.
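A hedged sketch, assuming InputCollect/OutputCollect exist and "1_2_3" is one of the trained model IDs (the file path is illustrative):
## Not run: 
robyn_object <- "~/Desktop/MyRobyn.RDS"
robyn_save(
  InputCollect = InputCollect,
  OutputCollect = OutputCollect,
  robyn_object = robyn_object,
  select_model = "1_2_3"
)
Robyn <- robyn_load(robyn_object) # returns the imported results invisibly
## End(Not run)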
robyn_train() consumes output from robyn_inputs()
and runs robyn_mmm() on each trial.
robyn_train( InputCollect, hyper_collect, cores, iterations, trials, intercept_sign, intercept, nevergrad_algo, dt_hyper_fixed = NULL, ts_validation = TRUE, add_penalty_factor = FALSE, objective_weights = NULL, rssd_zero_penalty = TRUE, refresh = FALSE, seed = 123, quiet = FALSE )
InputCollect |
List. Contains all input parameters for the model.
Required when |
hyper_collect |
List. Contains hyperparameter bounds. Defaults to
|
cores |
Integer. Default to |
iterations |
Integer. Recommended 2000 for default when using
|
trials |
Integer. Recommended 5 for default
|
intercept_sign |
Character. Choose one of "non_negative" (default) or
"unconstrained". By default, if intercept is negative, Robyn will drop intercept
and refit the model. Consider changing intercept_sign to "unconstrained" when
there are |
intercept |
Boolean. Should intercept(s) be fitted (default=TRUE) or set to zero (FALSE). |
nevergrad_algo |
Character. Default to "TwoPointsDE". Options are
|
dt_hyper_fixed |
data.frame or named list. Only provide when loading
old model results. It consumes hyperparameters from saved csv
|
ts_validation |
Boolean. When set to |
add_penalty_factor |
Boolean. Add penalty factor hyperparameters to glmnet's penalty.factor to be optimized by nevergrad. Use with caution, because this feature might add too much hyperparameter space and probably requires more iterations to converge. |
objective_weights |
Numeric vector. Default to NULL to give equal weights
to all objective functions. Order: NRMSE, DECOMP.RSSD, MAPE (when calibration
data is provided). When you are not calibrating, only the first 2 values for
|
rssd_zero_penalty |
Boolean. When TRUE, the objective function DECOMP.RSSD additionally penalizes models with more zero-coefficient media effects. In other words, given the same DECOMP.RSSD score, a model with 50% zero-coef variables gets penalized by DECOMP.RSSD * 1.5 (larger error), while another model with no zero-coef variables remains un-penalized with DECOMP.RSSD * 1. |
refresh |
Boolean. Set to |
seed |
Integer. For reproducible results when running nevergrad and clustering. Each trial will increase the seed by 1 unit (i.e. 10 trials with seed 1 will share 9 results with 10 trials with seed 2). |
quiet |
Boolean. Keep messages off? |
List. Iteration results to include in robyn_run()
results.
Updates the Robyn version from the GitHub repository for the latest "dev" version, or from CRAN for the latest "stable" version.
robyn_update(dev = TRUE, ...)
dev |
Boolean. Dev version? If not, CRAN version. |
... |
Parameters to pass to |
Invisible NULL
.
robyn_write() generates light JSON files with all the information
required to replicate Robyn models. Depending on user inputs, there are
3 use cases: only the input data; input data plus modeling results data;
and input data plus modeling results plus the specifics of a single selected
model. To replicate a model, you must provide InputCollect, OutputCollect
and, if OutputCollect contains more than one model, select_model.
robyn_write( InputCollect, OutputCollect = NULL, select_model = NULL, dir = OutputCollect$plot_folder, add_data = TRUE, export = TRUE, quiet = FALSE, pareto_df = NULL, ... ) ## S3 method for class 'robyn_write' print(x, ...) robyn_read(json_file = NULL, step = 1, quiet = FALSE, ...) ## S3 method for class 'robyn_read' print(x, ...) robyn_recreate(json_file, quiet = FALSE, ...)
InputCollect |
|
OutputCollect |
|
select_model |
Character. Which model ID do you want to export into the JSON file? |
dir |
Character. Existing directory to export JSON file to. |
add_data |
Boolean. Include raw dataset. Useful to recreate models with a single file containing all the required information (no need of CSV). |
export |
Boolean. Export outcomes into local files? |
quiet |
Boolean. Keep messages off? |
pareto_df |
Dataframe. Save all pareto solutions to json file. |
... |
Additional parameters to export into a custom Extras element. |
x |
|
json_file |
Character. JSON file name to read and import. |
step |
Integer. 1 for import only and 2 for import and output. |
(Invisible) list. Contains all inputs and outputs of the exported model. Class: robyn_write.
## Not run: InputCollectJSON <- robyn_inputs( dt_input = Robyn::dt_simulated_weekly, json_file = "~/Desktop/RobynModel-1_29_12.json" ) print(InputCollectJSON) ## End(Not run)
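A hedged sketch of robyn_recreate(), which rebuilds a single exported model from its JSON file (the path is illustrative; dt_input and dt_holidays are passed through ...):
## Not run: 
RobynRecreated <- robyn_recreate(
  json_file = "~/Desktop/RobynModel-1_29_12.json",
  dt_input = Robyn::dt_simulated_weekly,
  dt_holidays = Robyn::dt_prophet_holidays,
  quiet = FALSE
)
## End(Not run)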
saturation_hill() is a two-parametric version of the Hill
function that allows the saturation curve to flip between S and C shape.
plot_saturation() produces example plots for the Hill saturation curve.
saturation_hill(x, alpha, gamma, x_marginal = NULL) plot_saturation(plot = TRUE)
x |
Numeric vector. |
alpha |
Numeric. Alpha controls the shape of the saturation curve. The larger the alpha, the more S-shape. The smaller, the more C-shape. |
gamma |
Numeric. Gamma controls the inflexion point of the saturation curve. The larger the gamma, the later the inflexion point occurs. |
x_marginal |
Numeric. When provided, the function returns the Hill-transformed value of the x_marginal input. |
plot |
Boolean. Do you wish to return the plot? |
Numeric values. Transformed values.
Other Transformations: adstock_geometric(), transformations
saturation_hill(c(100, 150, 170, 190, 200), alpha = 3, gamma = 0.5)
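A short sketch of the x_marginal argument described above: it returns the Hill-transformed value of a single hypothetical spend level (180 here is illustrative) against the same input vector:
saturation_hill(c(100, 150, 170, 190, 200),
  alpha = 3, gamma = 0.5,
  x_marginal = 180 # hypothetical marginal spend level
)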
Robyn only accepts daily, weekly and monthly data. This function
is only called in robyn_engineering()
.
set_holidays(dt_transform, dt_holidays, intervalType)
dt_transform |
A data.frame. Transformed input data. |
dt_holidays |
A data.frame. Raw input holiday data. |
intervalType |
A character. Accepts one of the values:
|
List. Contains all spend-exposure model results.
The Michaelis-Menten mic_men() function is used to fit the spend-exposure
relationship for paid media variables when exposure metrics like
impressions, clicks or GRPs are provided in paid_media_vars instead
of spend metrics.
mic_men(x, Vmax, Km, reverse = FALSE) run_transformations(InputCollect, hyperparameters, ...)
x |
Numeric value or vector. Input media spend when
|
Vmax |
Numeric. Indicates the maximum rate achieved by the system. |
Km |
Numeric. The Michaelis constant. |
reverse |
Boolean. Input media spend when |
InputCollect |
Default to NULL. |
hyperparameters |
List. Contains hyperparameter lower and upper bounds.
Names of elements in list must be identical to output of |
... |
Additional parameters passed to |
Numeric values. Transformed values.
Other Transformations:
adstock_geometric()
,
saturation_hill()
mic_men(x = 5:10, Vmax = 5, Km = 0.5)
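A sketch of the reverse argument: assuming reverse = TRUE maps an exposure metric back to spend (the inverse of the default direction), a round trip should approximately recover the original inputs:
exposure <- mic_men(x = 5:10, Vmax = 5, Km = 0.5) # spend -> exposure
mic_men(x = exposure, Vmax = 5, Km = 0.5, reverse = TRUE) # exposure -> ~5:10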