from IPython.display import IFrame
'https://app.powerbi.com/view?r=eyJrIjoiNDFlMTA2ZGItYmFkOC00Mjg3LTljODYtYzRkYzY2YmFmM2E4IiwidCI6ImZjNWRmZDc1LTQ1NzktNGQ3MC05YTE3LTk0MmQ0OWMwMTc2ZiIsImMiOjl9', width=933, height=700) IFrame(
Exporting Simulation Results
Exporting Simulation Results
Exporting Scenario Data For Further Analysis In High-End Business Intelligence Tools
Jupyter Notebooks are a perfect environment to create System Dynamics and Agent-based Models and analyse them in-depth – at least if you are a data scientist or computational modeler.
But what if you want to present your results to people who are not quite so tech savvy or don’t have all the necessary tools installed?
We face such situations quite often with our clients and at one point we asked ourselves:
- Why not use a high-end business intelligence tool such as Microsofts Power BI Desktop to create a polished dasboards?
- Why not share the reports using the Power BI service? After all, Power BI was created to create such data intelligence apps!
- Why not use Jupyter notebooks to create sophisticated simulation models (which is what the Jupyter+Python Ecosystem is good at) and then use Power BI for the fancy UI (which is what Power BI is good at)
To achieve this, all we really need BPTK-Py to do is to export the data generated by the simulation for those scenarios that are relevant for the report.
We can then import that data into Power BI and build the report using Power BI’s WYSIWIG tools.
To achieve this, we’ve added a new method called export_scenarios
to BPTK-Py which writes the data for a set of scenarios and interactive settings to an Excel file.
Here is what the method looks like for the customer acquisition model, which is one of the models we’ve provided with the tutorial:
bptk.export_scenarios(="smCustomerAcquisition",
scenario_manager=["customers","profit"],
equations='/path/to/exported/data/customer_aquisition.xlsx',
filename="interactiveScenario",
interactive_scenario=["customers","profit"],
interactive_equations= {
interactive_settings"advertisingSuccessPct":(0,0.2,0.01),
"referralFreeMonths":(0,40,10),
"referralProgramAdoptionPct":(0,12,1),
"referrals":(0,12,1)
} )
And here is a Power BI report we’ve created from the data, you can access it directly on PowerBI:
A Closer Look At BPTK-Py’s Export Function
This section takes a closer look at how the export function is implemented, just in case you would like to add some features or export the data in some other format.
## Load the BPTK Package
from BPTK_Py.bptk import bptk
= bptk() bptk
The first thing to remember is how scenario data is stored in a pandas
dataframe:
=bptk.plot_scenarios(
df=["smCustomerAcquisition"],
scenario_managers=["referSomeonePlease"],
scenarios=["customers","profit"],
equations=True
return_df )
1:10] df[
Notice that each indicator (aka equation, customers
and profit
in this example) has its own column and that the time dimension forms the index of the dataframe. Also notice that the name of the scenario referSomeonePlease
is not stored in the dataframe itself.
Now when it comes to displaying the data in an interactive report (like the one above) we would like to be able to switch between scenarios. So one thing we need to do is to add a column containing a name of the scenario to the dataframe. We also would like one large dataframe containing the data from all the scenarios.
# get all the scenarios from the smCustomerAquisition scenario manager
="smCustomerAcquisition"
scenario_manager= bptk.get_scenario_names([scenario_manager]) scenarios
scenarios
# List of equations we want to generate data for
=["customers","profit"] equations
# create a new dataframe with a column for each equation, indexed by time and scenario
= []
scenario_dfs for scenario in scenarios:
# first create a dataframe for each scenario
= bptk.plot_scenarios(
df =[scenario_manager],
scenario_managers=[scenario],
scenarios=equations,
equations=True)
return_df# add a colum which will contain the name of the scenario
"scenario"] = [scenario] * len(df.index)
df[# create a new column which will contain the time step (which won't be a unique index anymore, as we are concatenating many scenarios)
"time"] = df.index
df[+= [df] scenario_dfs
0:2] scenario_dfs[
import pandas as pd
# concatenate the data into one table using pandas concat function
= pd.concat(scenario_dfs, ignore_index=True, sort=False)
scenarios_tab ="id" scenarios_tab.index.name
scenarios_tab
We now have all the data for all the scenarios in one large dataframe. Each row is indexed by the scenario it belongs to. The timestamp is only unique within a given scenario.
Generating The Data For Scenario Comparison
The data we have generated so far is a table with a column for each indicator, indexed by scenario.
This is fine if you want to look at data scenario by scenario or plot two different indicators for the same scenario.
But what if you want to compare the same indicator for different scenarios?
In such a case, your data needs to be structured a little differently - essential we then want a table with a column for each scenario, indexed by the indicator.
To achieve this, we need to loop through the scenarios again:
# create a new dataframe with a column for each scenario, indexed by time and indicator
= []
indicator_dfs for scenario_no, scenario in enumerate(scenarios):
=[]
scenario_dfs# loop through the equations
for equation in equations:
# add a column which will contain the name of the indicator
= bptk.plot_scenarios(
df =[scenario_manager],
scenario_managers=[scenario],
scenarios=[equation],
equations=True)
return_df={equation:scenario},inplace=True)
df.rename(columnsif scenario_no is len(scenarios)-1:
"indicator"] = [equation] * len(df.index)
df["time"] = df.index
df[+=[df]
scenario_dfs
# conacate the indicators for the scenario (i.e. along axis 0)
= pd.concat(scenario_dfs, axis=0,ignore_index=True,sort=False)
indicators_scenario_tab
# create a new column which will contain the time step (which won't be a unique index anymore, as we are concatenating many scenarios)
+= [indicators_scenario_tab]
indicator_dfs
#concatenate all the scenario columns (i.e. along axis 1)
=pd.concat(indicator_dfs,axis=1,sort=False)
indicators_tab="id" indicators_tab.index.name
0:10] indicators_tab[
Generating The Data For Interactive Dashboards
In most cases creating an interactive report that just compares predefined scenarios is quite enough. But sometimes you would like to add a little dashboard to allow users to test different settings themselves, like the “Forecast” page in the example above.
The easiest way to achive this in Power BI is to use so called “What If” parameters to select a scenario from a set of pre-computed scenarios. We need to pre-compute them because currently Power BI doesn’t allow you to query data live with different parameters.
In most cases there will be thousands of “interactive” scenarios you need to pre-compute, so it is not feasible to enumerate them as a list. Instead the idea is to start with a base “interactive” scenario and then vary a set of parameters within a given range, much like in a Monte Carlo simulation.
# these are the scenarios initially defined
"smCustomerAcquisition"]) bptk.list_scenarios([
# a new scenario which will act as a base for generating the data needed for interactive dashboards
="smCustomerAcquisition",scenarios={
bptk.register_scenarios(scenario_manager"interactiveScenario":{
"constants":{
"referrals":0,
"advertisingSuccessPct":0.1,
"referralFreeMonths":3,
"referralProgamAdoptionPct":10
}
}
}
)
Another call to list_scenarios
shows that the interactive scenario has now been added:
"smCustomerAcquisition"]) bptk.list_scenarios([
Now that we have a scenario, we need to define the “What if” parameters:
# for each parameter we define the range of settings - e.g. referrals ranges from 0 to twelve, with a step of one: 1,2, 3... 12.
# advertisingSuccessPct ranges from 0 to 0.2, with a steop of 0.01: 0, 0.01, 0.02,....
= {
interactive_settings"advertisingSuccessPct":(0,0.2,0.01),
"referralFreeMonths":(0,40,10),
"referralProgramAdoptionPct":(0,12,1),
"referrals":(0,12,1)
}
Now we need to pre-compute all possible combinations, which is quite a few for the ranges defined above:
import numpy as np # will use the arange function to create ranges with fractional steps
len(np.arange(0,0.2,0.01))*len(np.arange(0,40,10))*len(np.arange(0,12,1))*len(np.arange(0,12,1))
import itertools # will use the product function to iterate through all possible combinations
# generate all combinations of the settings
= [interactive_settings[key] for key in interactive_settings]
dimensions #now generate all possible settings
= list(itertools.product(*tuple(itertools.starmap(np.arange, dimensions)))) settings
The last line of code uses some advanved functional programming to generates all possible combinations of the interactive parameters … let’s take a look at what it does using just two interactive parameters:
list(itertools.product(*tuple(itertools.starmap(np.arange,[(1,4,1),(4,7,1)]))))
len(settings)
# IMPORTANT: this code takes quite some time to run (ca. 4min on my machine)
# a variable to store the dataframes
= []
interactive_dfs ="interactiveScenario"
interactive_scenario=["customers","profit"]
interactive_equations= bptk.get_scenario(scenario_manager, interactive_scenario)
scenario # now apply the settings to the scenario
for setting in settings:
for setting_index, key in enumerate(interactive_settings):
scenario.set_property_value(key, setting[setting_index])
bptk.reset_scenario_cache(=scenario_manager,
scenario_manager=interactive_scenario
scenario
)= bptk.plot_scenarios(
df =[scenario_manager],
scenario_managers=[interactive_scenario],
scenarios=interactive_equations,
equations=True
return_df
)# add columns for the settings
for setting_index, key in enumerate(interactive_settings):
= [setting[setting_index]] * len(df.index)
df[key] # explicitly set a time column
"time"] = df.index
df[+= [df]
interactive_dfs # concatenate the interactive scenarios
= pd.concat(interactive_dfs, ignore_index=True, sort=False) interactive_tab
len(interactive_tab)
11520*61 # number of scenarios * number of timesteps
Writing The Dataframes To An Excel File
Now that we have the dataframes containg the data, we can write them to an Excel file using Pandas ExcelWriter
function. Note that this relies on the xlsxwriter
package:
import xlsxwriter
="./data/customer_acquisition.xlsx"
filenamewith pd.ExcelWriter(filename) as writer:
="scenarios")
scenarios_tab.to_excel(writer, sheet_name="indicators")
indicators_tab.to_excel(writer, sheet_name="interactive") interactive_tab.to_excel(writer, sheet_name
Calling The Export Function Directly
Here is how you would call the export_scenarios
function directly – it you don’t pass a filename it returns a dictionary containing the dataframes for both the scenarios and the interactive dashboard.
Important: With the given parameters the export function generates over 11.000 interactive scenarios amounting to around 30MB of data. On my machine (a Macbook Pro with 16MB of RAM) the function takes just under three minutes to complete.
%%time
## save the file in the current working directory
import os
= os.path.join(os.getcwd(),"data","customer_acquisition.xlsx")
filename
## Load the BPTK Package
bptk.export_scenarios(="smCustomerAcquisition",
scenario_manager=["customers","profit"],
equations=filename,
filename="interactiveScenario",
interactive_scenario=["customers","profit"],
interactive_equations= {
interactive_settings"advertisingSuccessPct":(0,0.2,0.01),
"referralFreeMonths":(0,40,10),
"referralProgramAdoptionPct":(0,12,1),
"referrals":(0,12,1)
} )