GitHub - Simulation-Decomposition/simdec-matlab: Sensitivity analysis using simulation decomposition

Warning This library is under active development and things can change at anytime! Suggestions and help are greatly appreciated.

Simulation decomposition or SimDec is an uncertainty and sensitivity analysis method, which is based on Monte Carlo simulation. SimDec consists of two major parts (and two corresponding funciton):

Computing sensitivity indices,
Visualizing the effects by creating multi-variable scenarios, mapping the output values to them, and color-coding the corresponding segments of the output distribution.

SimDec reveals the nature of causalities and interaction effects in the model.

Starting to work

Simply download the matlab functions and add them to your path when working with matlab.

Use sensitivity_indices.m for computing the indices and simdec_visualization.m for building the graphics.

Documentation

function [SI, FOE, SOE] = sensitivity_indices (output, inputs)

calculates how much variability of the output is explained by inputs.

INPUTS

output - target variable (Y), size [N_runs, 1].
inputs - input variables (Xs), size [N_runs, N_factors].

OUTPUTS

FOE - first-order effects (also called 'main' or 'individual'), size [N_factors, 1].
SOE - second-order effects (also called 'interaction'), size [N_factors, N_factors].
SI - sensitivity indices, combined first- and second-order effect of each input, size [N_factors, 1].

function [scenarios, scen_legend, boundaries_out] = simdec_visualization (output, inputs, SI, varargin)

builds SimDec visualization using data decomposition.

REQUIRED INPUTS

output - target variable (Y), size [N_runs, 1].
inputs - input variables (Xs), size [N_runs, N_factors].
SI - sensitivity indices.

OPTIONAL ARGUMENTS

DecompositionLimit - threshold of cumulative importance (sum(SI)) for selection of input variables for decomposition. Default value 0.8*sum(SI).
OrderOfVariables - for custom decomposition specify the order of variables for decomposition, use zero to exclude. For example, if 4 input variables, third and second are desired for decomposition, then specify OrderOfVariables as [0 2 1 0]. Default value [].
NumberOfStates - the number of states for each input variable, i.e. [0 3 2 0]. Default value [].
BoundaryType - defines how the numerical boundaries between the states of inputs are defined:
- 'precentile-based' for same amount of observations in each state (default value),
- 'interval-based' for equaly-spaced ranges of states.
StateBoundaries - maximums (numeric boundaries) of every state, leave the rest as NaN, e.g. [NaN 3 -1 NaN; NaN 5 0 NaN; NaN 7 NaN NaN]. Default value [].
OutputName - name of the output variable. Default value 'Y'.
InputNames - names of the input variables in the order of their appearance in the original dataset. Default value {X1, X2, X3...}.
MainColors - a cell array with HEX numbers of the main colors for decomposition (should correspond to the number of states of the first for decomposition input variable).
GraphType - 'stacked_histogram' as a default option and 'boxplot' as an alternative.

OUTPUTS

scenarios - an array of the same size as Y with scenario indices for every simulation run.
scen_legend - a scenario table that shows which states of which variables compose different sceanrios.
boundaries_out - numeric boundaries of states of input variables.

Example

The following procedure is saved in the main.m and uses example_data.xlsx.

1. Load data

First the simulated inputs and the output need to be specified. They can result from a Monte Carlo simulation arranged directly in matlab, or conducted elsewhere and then loaded through a file, like in this example.

    Matrix = xlsread ("example_data.xlsx");
    
    output = Matrix(:,1);
    inputs = Matrix(:,2:end);

2. Compute sensitivity indices

Function sensitivity_indices.m computes first-order effects FOE (main individual effect of every input variable), second-order effects SOE (interaction effects between pairs of variables) and combined sensitivity indices SI.

    [SI, FOE, SOE]  = sensitivity_indices (output, inputs)
    sum(SI) % shows what portion of the variance of the output is explained 
            % by the combined first- and second-order effects of the inputs

Here is the result it returns:

SI =

FOE =

SOE =

     0    0.0034    0.0015    0.0035
     0         0   -0.0605    0.1059
     0         0         0    0.0363
     0         0         0         0

sum(SI) =

1.0024

Each value shows what portion of the variance of the output is explained (negative SOE values indicate correlation). In this example, SI shows that the most important inputs are X2 (52%) and X4 (35%). SOE points out that there is interaction between X2 and X3 (11%) and correlation between X2 and X3 (-6%). In total, 100% of the output variance is explained (a few percentage points can be attributed to noise).

3. Visualize

Function simdec_visualization.m

chooses the most important input variables,
breaks them down into states,
forms scenarios out of all combinations of those states,
maps the scenarios onto the output values, and
visualizes these scenarios by color-coding the distribution of the output.

[scenarios, scen_legend, boundaries] = simdec_visualization (output, inputs, SI);

The SimDec graph and the corresponding legend is generated automatically when running this function.

That's it, your SimDec analysis is completed!

But you can customize it furhter.

And feel free to go an extra step in your reporting, - name the states (i.e., low, medium, high) and merge the cells of the legend with the same state. The help to make those automatic in would be greatly approeciated!

4. Customize

The simdec_visualization.m function has numerious optional arguments that can be used to polish the outlook of the results, tune and play with the decomposition set-up.

4.1. Boxplot

Add 'GraphType','boxplot' to the function's input to display the results in the form of boxplot instead of stacked histogram.

    [scenarios, scen_legend, boundaries] = simdec_visualization (output, inputs, SI,'GraphType','boxplot');

The boxplot visualization presents exactly the same decomposition and contains the same scenarios, color-coded in the same way as in the stacked histogram.

The boxplots are handy when some scenarios have little data and are poorly visible on the histogram.

4.2. Polishing: colors, names of variables

Use 'OutputName', 'InputNames' to add names of the variables. Specify 'MainColors' as a cell array of HEX codes of the desired main colors (the shades for subdivisions are produced automatically from the main colors).

    output_name = 'Output';
    input_names = {'Input1','Input2','Input3','Input4'};
    colors = {'#3F45D0','#DC267F','26DCD1'}; % HEX codes for the main colors
    
    [scenarios, scen_legend, boundaries] = simdec_visualization (output, inputs,...
        SI,'OutputName',output_name,'InputNames',input_names,'MainColors',colors);

4.3. Tuning decomposition: more variables, different state formation

Deafult number of variables for decomposition is defined beased on the threshold 0.8*sum(SI). The threshold can be changed by using 'DecompositionLimit' argument.

The 'BoundaryType' argument defines how the numeric range of input variables is broken down into states. The deafult value 'percentile-based' forms the states ensuring the same amount of observations in each state. The alternative 'interval-based' approach divides input ranges into equaly-spaced intervals. Changing this argument would not make any difference for independent uniformly distributed inputs.

    dec_limit = 0.9; % minimum overall importance [sum(SI)] of chosen for decomposition input variables
    boundary_type = 'interval-based'; 
                                   
    [scenarios, scen_legend, boundaries] = simdec_visualization (output, inputs,...
        SI,'DecompositionLimit',dec_limit,'BoundaryType',boundary_type);

4.4. Tuning decomposition: more variables, different state formation

The decomposition can be further fully customized by altering the amount and the order of inputs for decomposition ('OrderOfVariables'), defining custom amount of states ('NumberOfStates') and their boundaries ('StateBoundaries').

    manual_vars = [0 2 1 0]; % specify the order of variables for decomposition, 
        % use 0 to exclude, size (1, N_inputs). In this example we set that the third
        % input variable is used first, and then the second one.  
    manual_states = [0 3 2 0]; % specify the number of states for each variable, 
        % size (1, N_inputs), the position corresponds to the original order of inputs.
        % Three states for the second input variable and two states for the third. 
    manual_boundaries =  [ NaN    min(inputs(:,2))     min(inputs(:,3))    NaN
                           NaN         100                  657.5          NaN
                           NaN         650             max(inputs(:,3))    NaN      
                           NaN    max(inputs(:,2))          NaN            NaN]; 
        % specify numeric boundaries for every state, size(max(manual_states)+1, N_inputs).


    [scenarios, scen_legend, boundaries] = simdec_visualization (output, inputs,...
        SI,'OrderOfVariables',manual_vars,'NumberOfStates',manual_states,'StateBoundaries',manual_boundaries);

The optional attributes can be used in any combination.

Links

See how to read SimDec on wikipedia
Meet people behind SimDec
Stay in touch via our Sensitivity Analysis discord community

Citations

The algorithms and visualizations used in this package came primarily out of research at LUT University, Lappeenranta, Finland, and Stanford University, California, U.S., supported with grants from Business Finland, Wihuri Foundation, Foundation for Economic Education, and Natural Sciences and Engineering Research Council. If you use SimDec in your research we would appreciate a citation to the following publications:

Kozlova, M., & Yeomans, J. S. (2022). Monte Carlo Enhancement via Simulation Decomposition: A “Must-Have” Inclusion for Many Disciplines. INFORMS Transactions on Education, 22(3), 147-159. Available here.
Kozlova, M., Moss, R. J., Yeomans, J. S., & Caers, J. (2024). Uncovering Heterogeneous Effects in Computational Models for Sustainable Decision-making. Environmental Modelling & Software, 171, 105898. https://doi.org/10.1016/j.envsoft.2023.105898
Kozlova, M., Moss, R. J., Roy, P., Alam, A., & Yeomans, J. S. (forthcoming). SimDec algorithm and guidelines for its usage and interpretation. In M. Kozlova & J. S. Yeomans (Eds.), Sensitivity Analysis for Business, Technology, and Policymaking. Made Easy with Simulation Decomposition. Routledge.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

functions

functions

LICENSE

LICENSE

README.md

README.md

example_data.xlsx

example_data.xlsx

main.m

main.m

Repository files navigation

Starting to work

Documentation

function [SI, FOE, SOE] = sensitivity_indices (output, inputs)

function [scenarios, scen_legend, boundaries_out] = simdec_visualization (output, inputs, SI, varargin)

Example

1. Load data

2. Compute sensitivity indices

3. Visualize

4. Customize

4.1. Boxplot

4.2. Polishing: colors, names of variables

4.3. Tuning decomposition: more variables, different state formation

4.4. Tuning decomposition: more variables, different state formation

Links

Citations

About

Releases

Packages

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 40 Commits
functions		functions
LICENSE		LICENSE
README.md		README.md
example_data.xlsx		example_data.xlsx
main.m		main.m

License

Simulation-Decomposition/simdec-matlab

Folders and files

Latest commit

History

Repository files navigation

Starting to work

Documentation

function [SI, FOE, SOE] = sensitivity_indices (output, inputs)

function [scenarios, scen_legend, boundaries_out] = simdec_visualization (output, inputs, SI, varargin)

Example

1. Load data

2. Compute sensitivity indices

3. Visualize

4. Customize

4.1. Boxplot

4.2. Polishing: colors, names of variables

4.3. Tuning decomposition: more variables, different state formation

4.4. Tuning decomposition: more variables, different state formation

Links

Citations

About

Topics

Resources

License

Stars

Watchers

Forks

Languages