Batch Analysis for hawk/dove simulation with multiple risk attitudes, no adjustment¶

  • What % of the time do agents play Hawk, by risk attitude? In particular, how often do risk-inclined (R=1) and risk-avoidant (R=8) agents play Hawk?
  • Cumulative wealth analysis by risk attitude

To track how often agents play Hawk, we need to collect data for every round.

This data was generated by running:

simulatingrisk/hawkdovemulti/batch_run.py --params no_adjustment --agent-data --collect-data every_round

Each row in the data file represents one round for each agent.

In [172]:
import polars as pl


# df = pd.read_csv("../../data/hawkdovemulti/2025-07-22T170057_747737_agent.csv")
# df = pd.read_csv("../../data/hawkdovemulti/2025-07-23T135510_859267_agent.csv")
# which batch run data to use; use variable to ensure we use matching agent and model data
batch_run_date = "2025-07-24T120337_924060"

# load agent data, drop unneeded columns, and add numeric value 1 for played hawk, 0 for played dove

df = (
    pl.read_csv(f"../../data/hawkdovemulti/{batch_run_date}_agent.csv")
        .drop("risk_level_changed")  # drop risk_level_changed; not relevant here (no adjustment = no changes)
        .rename({'risk_level': 'risk_attitude'})  # code still uses risk_level internally; relabel as risk attitude
        .with_columns(
            # add a numeric field: 1 if the agent played hawk, 0 if dove, for aggregation
            played_hawk=pl.when(pl.col("choice").eq("hawk")).then(1).otherwise(0)
        )
)
df.head()
Out[172]:
shape: (5, 8)
RunId  iteration  Step  AgentID  risk_attitude  choice  points  played_hawk
i64    i64        i64   i64      i64            str     i64     i32
1      1          1     0        3              "dove"  12      0
1      1          1     1        8              "dove"  12      0
1      1          1     2        8              "hawk"  12      1
1      1          1     3        4              "hawk"  18      1
1      1          1     4        8              "hawk"  15      1
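
As a quick sanity check on that granularity (a minimal sketch, assuming the columns shown above), we can confirm that each (RunId, Step, AgentID) combination appears exactly once:

# each row should be a single agent's play in a single round of one run
n_rows = df.height
n_unique = df.select("RunId", "Step", "AgentID").unique().height
assert n_rows == n_unique, "expected one row per agent per round per run"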

Percent of the time agents play Hawk, by risk attitude¶

What % of the time do risk-inclined (R=1) and risk-avoidant (R=8) agents play Hawk?

  • Guess from observation is >90% for R=1 and <10% for R=8, but we want to have statistics for this: across X trials, in how many does R=1 play Hawk more than 90% of the time? (a per-run version of this count is sketched below)
  • Also useful to have statistics e.g. R=2 played Hawk between 80-90% of the time, or whatever the result is.
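
The aggregation below reports the overall percentage across all plays. The per-run count asked for in the first bullet could look roughly like this (a minimal sketch using the dataframe loaded above, with the 90% threshold from the note):

# per-run Hawk rate by risk attitude, then count runs where R=1 plays Hawk more than 90% of the time
hawk_rate_per_run = (
    df.group_by("RunId", "risk_attitude")
      .agg(pct_hawk=pl.col("played_hawk").mean().mul(100))
)
n_runs_r1_over_90 = (
    hawk_rate_per_run
      .filter((pl.col("risk_attitude") == 1) & (pl.col("pct_hawk") > 90))
      .height
)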
In [208]:
# each row in the data frame is a play by an agent on the grid
# group by risk level, then:
# - count the number of rows (= total number of plays)
# - sum the played_hawk field (= number of times played hawk)
# - calculate percent of turns played hawk

hawk_by_risk_attitude = (
    df.group_by("risk_attitude")
        .agg(n_plays=pl.col("played_hawk").count(), n_plays_hawk=pl.col("played_hawk").sum())
    .with_columns(
        pct_plays_hawk=pl.col("n_plays_hawk").truediv(pl.col("n_plays")).mul(100)
    )
)
In [209]:
# output core fields as nicely styled table
(hawk_by_risk_attitude
    .select("risk_attitude", "pct_plays_hawk")
    .sort("risk_attitude")
    .rename({"risk_attitude": "Risk Attitude", "pct_plays_hawk": "% plays Hawk"})
    .style.tab_header(title="% of time agents play Hawk by Risk Attitude")
    .fmt_number("% plays Hawk", decimals=1)
)
Out[209]:
% of time agents play Hawk by Risk Attitude
Risk Attitude % plays Hawk
0 98.6
1 93.0
2 86.8
3 70.4
4 60.2
5 39.6
6 29.6
7 13.2
8 7.0
9 1.4
In [175]:
import altair as alt

alt.Chart(hawk_by_risk_attitude).mark_bar(width=10).encode(
    x=alt.X("risk_attitude", title="Risk Attitude").scale(domain=[0, 9]),                                                          
    y=alt.Y("pct_plays_hawk", title="% time plays Hawk")
).properties(title="% of time agents play Hawk")
Out[175]:

Context: how many runs are these numbers drawn from?

In [176]:
total_unique_runs = len(df["RunId"].unique())
total_iterations = len(df["iteration"].unique())
n_combinations = int(total_unique_runs / total_iterations)

# Step is the round count indicator
longest_run = df["Step"].max()    # highest across all runs
# average of max value for each run. Group by run, get max Step, average —> returns a dataframe, get first Step value
average_run = df.group_by("RunId").agg(pl.col("Step").max()).mean()["Step"].first()

print(f"""{total_unique_runs:,} total unique runs; {total_iterations} iterations of {n_combinations} different parameter combinations.

Longest run: {longest_run} steps
Average run: {average_run:.1f} steps
""")
13,500 total unique runs; 100 iterations of 135 different parameter combinations.

Longest run: 109 steps
Average run: 44.9 steps

Analysis filtered by simulation parameters¶

In [177]:
# identify the last round of each run
# for both wealth analysis and model parameters, we want to look at the last round (Step) of each run
last_round_df = df.group_by("RunId").agg(pl.col("Step").max())
In [178]:
# load model data and filter to last round for each run
full_model_df = pl.read_csv(f"../../data/hawkdovemulti/{batch_run_date}_model.csv")
model_df = last_round_df.join(full_model_df, on=['RunId', 'Step'], how="left")
# limit to only those fields that are needed for our analysis
model_df = model_df.select("RunId", "iteration", "risk_distribution", "play_neighborhood", "observed_neighborhood", "grid_size")
model_df.head()
Out[178]:
shape: (5, 6)
RunId  iteration  risk_distribution  play_neighborhood  observed_neighborhood  grid_size
i64    i64        str                i64                i64                    i64
9054   54         "skewed right"     24                 8                      5
11427  27         "bimodal"          8                  4                      5
7887   87         "skewed left"      4                  4                      5
2361   61         "uniform"          4                  24                     25
1313   13         "uniform"          24                 24                     10
In [179]:
print(f"""Simulation parameters:

Grid size: {', '.join(str(n) for n in sorted(model_df["grid_size"].unique()))}
Initial risk distribution: {', '.join(str(n) for n in sorted(model_df["risk_distribution"].unique()))}
Play neighborhood size: {', '.join(str(n) for n in sorted(model_df["play_neighborhood"].unique()))}
Observed neighborhood size: {', '.join(str(n) for n in sorted(model_df["observed_neighborhood"].unique()))}

""")
Simulation parameters:

Grid size: 5, 10, 25
Initial risk distribution: bimodal, normal, skewed left, skewed right, uniform
Play neighborhood size: 4, 8, 24
Observed neighborhood size: 4, 8, 24
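
These values account for the 135 parameter combinations noted above: 3 grid sizes × 5 initial risk distributions × 3 play neighborhood sizes × 3 observed neighborhood sizes = 135.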


In [180]:
# join agent data with model data so we can filter by starting parameters
agent_df_params = df.join(model_df, on=["RunId", "iteration"], how="left")

Grid size¶

In [181]:
hawk_by_gridsize_riskattitude = (
    agent_df_params.group_by("grid_size", "risk_attitude")
         .agg(n_plays=pl.col("played_hawk").count(), n_plays_hawk=pl.col("played_hawk").sum())
    .with_columns(
        pct_plays_hawk=pl.col("n_plays_hawk").truediv(pl.col("n_plays")).mul(100)
    )
)

alt.Chart(hawk_by_gridsize_riskattitude).mark_bar(width=10).encode(
    x=alt.X("risk_attitude", title="Risk Attitude").scale(domain=[0, 9]),                                                          
    y=alt.Y("pct_plays_hawk", title="% time plays Hawk")
).facet(column=alt.Column("grid_size", title="Grid Size")).properties(title="% of time agents play Hawk")
Out[181]:

Play neighborhood¶

In [182]:
hawk_by_playnhood_riskattitude = (
    agent_df_params.group_by("play_neighborhood", "risk_attitude")
         .agg(n_plays=pl.col("played_hawk").count(), n_plays_hawk=pl.col("played_hawk").sum())
    .with_columns(
        pct_plays_hawk=pl.col("n_plays_hawk").truediv(pl.col("n_plays")).mul(100)
    )
)

alt.Chart(hawk_by_playnhood_riskattitude).mark_bar(width=10).encode(
    x=alt.X("risk_attitude", title="Risk Attitude").scale(domain=[0, 9]),                                                          
    y=alt.Y("pct_plays_hawk", title="% time plays Hawk")
).facet(column=alt.Column("play_neighborhood", title="Play Neighborhood")).properties(title="% of time agents play Hawk")
Out[182]:

Observed neighborhood¶

In [183]:
hawk_by_obsnhood_riskattitude = (
    agent_df_params.group_by("observed_neighborhood", "risk_attitude")
         .agg(n_plays=pl.col("played_hawk").count(), n_plays_hawk=pl.col("played_hawk").sum())
    .with_columns(
        pct_plays_hawk=pl.col("n_plays_hawk").truediv(pl.col("n_plays")).mul(100)
    )
)

alt.Chart(hawk_by_obsnhood_riskattitude).mark_bar(width=10).encode(
    x=alt.X("risk_attitude", title="Risk Attitude").scale(domain=[0, 9]),                                                          
    y=alt.Y("pct_plays_hawk", title="% time plays Hawk")
).facet(column=alt.Column("observed_neighborhood", title="Observed Neighborhood")).properties(title="% of time agents play Hawk")
Out[183]:

Initial risk distribution¶

In [184]:
hawk_by_initialdist_riskattitude = (
    agent_df_params.group_by("risk_distribution", "risk_attitude")
         .agg(n_plays=pl.col("played_hawk").count(), n_plays_hawk=pl.col("played_hawk").sum())
    .with_columns(
        pct_plays_hawk=pl.col("n_plays_hawk").truediv(pl.col("n_plays")).mul(100)
    )
)

alt.Chart(hawk_by_initialdist_riskattitude).mark_bar(width=10).encode(
    x=alt.X("risk_attitude", title="Risk Attitude").scale(domain=[0, 9]),                                                          
    y=alt.Y("pct_plays_hawk", title="% time plays Hawk")
).facet(facet=alt.Facet("risk_distribution", title="Initial Risk Distribution"), columns=3
).properties(title="% of time agents play Hawk")
Out[184]:

Cumulative Wealth analysis¶

  • mean and quartiles for wealth by risk attitude (R)
  • does the mean vary between risk attitudes, or is it roughly the same?
    • quartiles look different, but we need a statistic; especially compare lower-R quartiles to higher-R quartiles (one possible test is sketched below, after the quartile table)
      • expect/hope that the lower quartile is higher for R=1 than R=8, and the higher quartile is higher for R=8 than R=1
      • what's going on in the middle?

Because our simulations run for different lengths before stopping, and because they include a range of play neighborhood sizes (which affect total payoff), we scale points (wealth) by play neighborhood and simulation run length so wealth can be compared across simulations. The scaled value is multiplied by 100 to make the point ranges more intelligible and comparable to totals from a single simulation run.

$ scaled\_points = ((points / play\_neighborhood) / simulation\_runlength) * 100 $
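
For example, with purely illustrative numbers: an agent that ends a 50-step run with 600 points in a play neighborhood of 8 would have scaled points of ((600 / 8) / 50) × 100 = 150.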

In [210]:
# combine the last round dataframe and full agent dataframe to get just the last round of each run
agents_last_round_df = (
    last_round_df.join(df, on=['RunId', 'Step'], how="left")
        # join on model parameters, for filtering and scaling by play neighborhood
        .join(model_df, on=["RunId", "iteration"])
        .with_columns(
            # calculate a scaled points value so we can compare across runs with different length and play neighborhood
            scaled_points=pl.col("points").truediv(pl.col("play_neighborhood")).truediv(pl.col("Step")).mul(100)
        )
    
)

Now plot scaled wealth distribution by risk attitude.

In [186]:
wealthchart_title = alt.TitleParams(
    "Cumulative Wealth by Risk Attitude",
    subtitle=["Wealth scaled by simulation length and play neighborhood"]
)

cum_wealth_boxplot = alt.Chart(agents_last_round_df, title=wealthchart_title).mark_boxplot(extent='min-max').encode(
    x=alt.X("risk_attitude", title="Risk Attitude").scale(domain=[0, 9]),
    y=alt.X("scaled_points", title="Wealth "),
) 
cum_wealth_boxplot
Out[186]:

To check our scaling, we compared it with cumulative wealth at round 31 across all simulations, scaled only by play neighborhood; the distribution was similar to the scaled wealth plot.
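
That comparison was roughly along these lines (a sketch, not the exact code used; only runs that reach round 31 are included):

# cumulative wealth at round 31, scaled only by play neighborhood
round31_df = (
    df.filter(pl.col("Step") == 31)
      .join(model_df, on=["RunId", "iteration"], how="left")
      .with_columns(nhood_scaled_points=pl.col("points").truediv(pl.col("play_neighborhood")))
)
alt.Chart(round31_df).mark_boxplot(extent='min-max').encode(
    x=alt.X("risk_attitude", title="Risk Attitude").scale(domain=[0, 9]),
    y=alt.Y("nhood_scaled_points", title="Wealth at round 31 (scaled by play neighborhood)"),
)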

Calculate quartiles and other statistics and output as a table, for reference.

In [187]:
# Altair boxplot is calculating quartiles for us, but we can calculate them directly as well
wealth_by_risk_attitude = agents_last_round_df.group_by("risk_attitude").agg(
    min=pl.col("scaled_points").min(), 
    max=pl.col("scaled_points").max(),
    median=pl.col("scaled_points").median(), 
    mean=pl.col("scaled_points").mean(), 
    Q1=pl.col("scaled_points").quantile(0.25),
    Q2=pl.col("scaled_points").quantile(0.5),
    Q3=pl.col("scaled_points").quantile(0.75),
).sort("risk_attitude")
In [213]:
# define a custom box plot method using layered plots,
# so that we can quickly generate plots from statistics generated by polars

def custom_boxplot(df):        
    # calculate risk group labels, for color
    plot_df = df.with_columns(
        risk_type=pl.when(pl.col("risk_attitude") < 3).then(pl.lit("Risk Inclined"))
            .when(pl.col("risk_attitude") < 7).then(pl.lit("Risk Moderate"))
            .otherwise(pl.lit("Risk Avoidant"))
    )
    # create base chart to use across layers
    base_chart = alt.Chart(plot_df)
    
    # area chart for Q1 to Q3
    area_chart = base_chart.mark_rect(width=15).encode(
        y=alt.Y('Q1').axis(offset=12),  # add offset so axis does not crowd rectangle
        y2='Q3', 
        x=alt.X('risk_attitude:Q', title='Risk Attitude'), 
        tooltip=['min', 'max', 'mean', 'Q1', 'median', 'Q3'],
        color=alt.Color('risk_type', title='Risk Attitude')
    )
    # line chart for min-max spread
    # specifying a stroke for point on the line only adds the min point
    minmax_line_chart = base_chart.mark_line(
        point=alt.OverlayMarkDef(filled=False, shape='stroke', color='black', strokeWidth=2)
    ).encode(
        alt.Y('min'),
        alt.Y2('max'),
        x='risk_attitude:Q'
    )
    # add a black stroke for the max
    max_marks = base_chart.mark_point(shape='stroke', size=55, color='black').encode(
        y='max',
        x=alt.X('risk_attitude:Q'),
    )
    # add a white stroke for the median
    median_marks = base_chart.mark_point(shape='stroke', size=100, strokeWidth=1, color='white').encode(
        y='median',
        x='risk_attitude:Q')

    # line showing the mean, drawn behind the box plot layers
    mean_line_chart = base_chart.mark_line(interpolate="monotone", color="black", opacity=0.5).encode(
        x=alt.X("risk_attitude:Q"),
        y=alt.Y("mean", title="Wealth (mean)").scale(zero=False),
    )
    
    return (mean_line_chart + minmax_line_chart + area_chart + median_marks + max_marks).resolve_axis('shared')

This is a customized box plot, which includes all the same information as the simpler version above. This one is based on statistics pre-calculated with polars, which is more efficient than having Altair compute them from the full dataframe.

This version adds strokes for the min and max values (whiskers), colors the Q1-Q3 box portion of the plot based on the risk attitude groups, and adds a line graph of the mean behind the box plot.

Mouse over the box to view statistics for each risk attitude (min, max, mean, Q1-Q3).

In [214]:
custom_boxplot(wealth_by_risk_attitude).properties(title=wealthchart_title)
Out[214]:
In [190]:
# output wealth by risk attitude as nicely styled table

(wealth_by_risk_attitude
    .rename({"risk_attitude": "Risk Attitude"})
    .style.tab_header(title="Cumulative Wealth by Risk Attitude", subtitle="Wealth scaled by simulation length and play neighborhood")
    .fmt_number(decimals=1)
    .fmt_number("Risk Attitude", decimals=0)
)
Out[190]:
Cumulative Wealth by Risk Attitude
Wealth scaled by simulation length and play neighborhood
Risk Attitude min max median mean Q1 Q2 Q3
0 0.0 300.0 149.7 148.7 114.9 149.7 174.6
1 0.0 300.0 147.6 144.0 113.2 147.6 167.8
2 0.0 300.0 147.6 145.4 112.5 147.6 170.6
3 0.0 300.0 137.5 143.4 110.8 137.5 166.0
4 0.0 300.0 134.7 141.7 109.9 134.7 157.3
5 1.7 300.0 137.5 139.4 114.3 137.5 150.8
6 43.2 300.0 143.0 143.3 125.4 143.0 154.3
7 64.7 300.0 149.3 147.0 137.0 149.3 158.1
8 87.1 300.0 150.4 151.0 140.3 150.4 161.7
9 96.8 203.2 150.0 150.3 141.4 150.0 160.9
In [191]:
# define a custom method to plot mean and quartiles for a specified simulation parameter

def plot_mean_quartiles(df, field, field_label):
    # takes a dataframe, the field in the dataframe for the simulation parameter, and displayable label for the field
    
    # create a selection filter bound to the legend
    selection = alt.selection_point(fields=[field], bind='legend')
    base_chart = alt.Chart(df)
    # curved line for the mean wealth by risk attitude
    wealth_mean_chart = base_chart.mark_line(interpolate="monotone").encode(
        x=alt.X("risk_attitude", title="Risk Attitude").scale(domain=[0, 9]),
        y=alt.Y("mean", title="Wealth (mean)").scale(zero=False),
        color=alt.Color(f"{field}:N", title=field_label),
        opacity=alt.when(selection).then(alt.value(1.0)).otherwise(alt.value(0.4))
    ).add_params(
        selection
    )
    # curved area chart for wealth quartile spread
    wealth_spread = base_chart.mark_area(interpolate="monotone").encode(
        x=alt.X("risk_attitude", title="Risk Attitude").scale(domain=[0, 9]),
        y=alt.Y("Q3").scale(zero=False),
        y2="Q1",
        color=alt.Color(f"{field}:N", title=field_label),
        opacity=alt.when(selection).then(alt.value(0.3)).otherwise(alt.value(0.1))
    ).add_params(
        selection
    )
    
    # combine the charts for multiple ways to view
    chart_wealth_title = wealthchart_title.copy()
    chart_wealth_title['text'] = f"{wealthchart_title['text']} and {field_label} — Mean and Quartiles"

    # display mean curve chart, wealth area chart, and mean layered with the wealth
    return (wealth_mean_chart | wealth_spread | (wealth_mean_chart + wealth_spread)).resolve_legend(color="shared").properties(title=chart_wealth_title)

Analysis filtered by simulation parameters¶

How does the wealth distribution vary based on other starting parameters?

Grid size¶

In [192]:
# calculate mean & quartiles for risk attitude by grid size
wealth_by_risk_grid = agents_last_round_df.group_by("risk_attitude", "grid_size").agg(
    min=pl.col("scaled_points").min(), 
    max=pl.col("scaled_points").max(),
    median=pl.col("scaled_points").median(),
    mean=pl.col("scaled_points").mean(), 
    Q1=pl.col("scaled_points").quantile(0.25),
    Q2=pl.col("scaled_points").quantile(0.5),
    Q3=pl.col("scaled_points").quantile(0.75),
).sort("grid_size", "risk_attitude")
In [215]:
custom_boxplot(wealth_by_risk_grid).facet(column=alt.Column("grid_size", title="Grid Size")).properties(title=wealthchart_title)
Out[215]:
In [194]:
plot_mean_quartiles(wealth_by_risk_grid, 'grid_size', 'Grid Size')
Out[194]:

Click on items in the Grid Size legend to change the chart opacity to focus on a particular group.

In [195]:
# turn grid size into label for context in the table
(wealth_by_risk_grid.with_columns(grid_size=pl.lit("Grid Size: ").add(pl.col("grid_size").cast(pl.datatypes.String)))
    .rename({"risk_attitude": "Risk Attitude"})
    .style.tab_header(title="Cumulative Wealth by Risk Attitude and Grid Size", subtitle="Wealth scaled by simulation length and play neighborhood")
     .tab_stub(rowname_col="Risk Attitude", groupname_col="grid_size")
    .fmt_number(decimals=1)
    .fmt_number("Risk Attitude", decimals=0)
)
Out[195]:
Cumulative Wealth by Risk Attitude and Grid Size
Wealth scaled by simulation length and play neighborhood
min max median mean Q1 Q2 Q3
Grid Size: 5
0 0.0 300.0 150.0 148.2 125.0 150.0 162.0
1 0.0 300.0 148.6 142.6 113.2 148.6 161.6
2 0.0 300.0 147.6 143.5 109.4 147.6 167.9
3 1.4 300.0 137.0 141.0 104.4 137.0 162.1
4 2.4 300.0 133.0 139.7 103.2 133.0 156.9
5 4.2 300.0 137.2 137.4 104.8 137.2 150.4
6 64.0 300.0 141.4 140.4 119.8 141.4 152.4
7 79.9 300.0 148.3 144.9 135.0 148.3 156.0
8 88.8 300.0 150.0 149.3 139.1 150.0 160.0
9 96.8 203.2 149.5 148.5 142.2 149.5 155.6
Grid Size: 10
0 0.0 300.0 150.0 149.6 117.7 150.0 174.8
1 0.0 300.0 147.6 144.0 113.1 147.6 167.6
2 0.0 300.0 147.5 144.9 111.7 147.5 170.2
3 0.0 300.0 136.7 142.6 107.8 136.7 164.9
4 1.3 300.0 133.5 141.2 106.8 133.5 156.9
5 2.5 300.0 137.2 138.8 111.3 137.2 150.8
6 50.0 300.0 142.7 142.7 125.0 142.7 154.3
7 68.6 300.0 149.3 146.7 136.5 149.3 158.2
8 90.3 300.0 150.4 150.9 139.7 150.4 161.7
9 96.8 203.2 150.0 150.2 141.5 150.0 160.5
Grid Size: 25
0 0.0 300.0 149.6 148.6 114.9 149.6 174.6
1 0.0 300.0 147.6 144.1 113.2 147.6 168.0
2 0.0 300.0 147.6 145.6 112.6 147.6 170.6
3 0.0 300.0 137.6 143.7 111.3 137.6 166.3
4 0.0 300.0 134.7 141.9 110.9 134.7 157.3
5 1.7 300.0 137.5 139.6 115.1 137.5 150.8
6 43.2 300.0 143.1 143.6 125.5 143.1 154.4
7 64.7 300.0 149.4 147.2 137.1 149.4 158.1
8 87.1 300.0 150.4 151.1 140.3 150.4 161.7
9 96.8 203.2 150.1 150.4 141.3 150.1 160.9

Play neighborhood¶

In [196]:
# calculate mean & quartiles for risk attitude by play neighborhood
wealth_by_risk_playnhood = agents_last_round_df.group_by("risk_attitude", "play_neighborhood").agg(
    min=pl.col("scaled_points").min(), 
    max=pl.col("scaled_points").max(),
    median=pl.col("scaled_points").median(),
    mean=pl.col("scaled_points").mean(), 
    Q1=pl.col("scaled_points").quantile(0.25),
    Q2=pl.col("scaled_points").quantile(0.5),
    Q3=pl.col("scaled_points").quantile(0.75),
).sort("play_neighborhood", "risk_attitude")
In [216]:
custom_boxplot(wealth_by_risk_playnhood).facet(column=alt.Column("play_neighborhood", title="Play Neighborhood")).properties(title=wealthchart_title)
Out[216]:
In [198]:
plot_mean_quartiles(wealth_by_risk_playnhood, 'play_neighborhood', 'Play Neighborhood')
Out[198]:

Click on items in the Play Neighborhood legend to change the chart opacity to focus on a particular group.

In [199]:
(wealth_by_risk_playnhood.with_columns(play_neighborhood=pl.lit("Play Neighborhood: ").add(pl.col("play_neighborhood").cast(pl.datatypes.String)))
    .rename({"risk_attitude": "Risk Attitude"})
    .style.tab_header(title="Cumulative Wealth by Risk Attitude and Play Neighborhood", subtitle="Wealth scaled by simulation length and play neighborhood")
     .tab_stub(rowname_col="Risk Attitude", groupname_col="play_neighborhood")
    .fmt_number(decimals=1)
    .fmt_number("Risk Attitude", decimals=0)
)
Out[199]:
Cumulative Wealth by Risk Attitude and Play Neighborhood
Wealth scaled by simulation length and play neighborhood
min max median mean Q1 Q2 Q3
Play Neighborhood: 4
0 0.0 300.0 150.0 151.7 79.8 150.0 220.2
1 0.0 300.0 149.5 148.6 88.7 149.5 191.1
2 0.0 300.0 149.2 150.6 99.2 149.2 212.1
3 0.0 300.0 129.5 147.7 101.8 129.5 186.6
4 0.0 300.0 125.5 145.4 102.1 125.5 175.0
5 1.7 300.0 126.6 140.1 104.7 126.6 151.6
6 43.2 300.0 137.3 144.4 113.7 137.3 161.3
7 64.7 300.0 148.7 146.4 125.8 148.7 162.3
8 87.7 300.0 150.0 150.6 128.2 150.0 169.4
9 96.8 203.2 150.0 149.4 127.4 150.0 171.8
Play Neighborhood: 8
0 1.2 298.8 149.6 148.3 113.7 149.6 185.5
1 2.3 299.4 148.5 143.3 113.1 148.5 169.3
2 1.2 298.8 148.7 144.4 112.5 148.7 171.2
3 3.4 299.4 136.7 142.2 111.3 136.7 168.2
4 4.0 298.8 132.7 140.1 110.5 132.7 156.6
5 40.7 299.3 137.3 138.3 116.1 137.3 150.2
6 69.4 299.2 141.5 141.9 125.6 141.5 151.8
7 82.8 298.8 149.2 146.4 137.2 149.2 156.9
8 87.1 296.5 150.2 150.6 138.7 150.2 162.1
9 98.0 202.0 150.0 150.3 138.3 150.0 162.0
Play Neighborhood: 24
0 46.0 249.2 148.8 146.0 126.2 148.8 162.2
1 34.2 247.2 137.9 140.2 120.3 137.9 157.3
2 63.5 248.9 138.3 141.3 119.3 138.3 161.7
3 63.9 254.4 138.5 140.4 117.3 138.5 157.8
4 66.4 253.8 138.0 139.6 118.8 138.0 155.2
5 82.6 266.5 141.8 139.9 125.5 141.8 151.3
6 82.7 257.0 146.0 143.8 135.0 146.0 154.2
7 90.6 244.8 150.0 148.3 142.1 150.0 157.0
8 95.1 247.2 153.2 151.9 146.0 153.2 158.6
9 113.3 189.9 150.8 151.2 145.7 150.8 157.8

Observed neighborhood¶

In [200]:
# calculate mean & quartiles for risk attitude by play neighborhood
wealth_by_risk_obsvnhood = agents_last_round_df.group_by("risk_attitude", "observed_neighborhood").agg(
    min=pl.col("scaled_points").min(), 
    max=pl.col("scaled_points").max(),
    median=pl.col("scaled_points").median(),
    mean=pl.col("scaled_points").mean(), 
    Q1=pl.col("scaled_points").quantile(0.25),
    Q2=pl.col("scaled_points").quantile(0.5),
    Q3=pl.col("scaled_points").quantile(0.75),
).sort("observed_neighborhood", "risk_attitude")
In [217]:
custom_boxplot(wealth_by_risk_obsvnhood).facet(column=alt.Column("observed_neighborhood", title="Observed Neighborhood")).properties(title=wealthchart_title)
Out[217]:
In [202]:
plot_mean_quartiles(wealth_by_risk_obsvnhood, 'observed_neighborhood', 'Observed Neighborhood')
Out[202]:

Click on items in the Observed Neighborhood legend to change the chart opacity to focus on a particular group.

In [203]:
(wealth_by_risk_obsvnhood.with_columns(observed_neighborhood=pl.lit("Observed Neighborhood: ").add(pl.col("observed_neighborhood").cast(pl.datatypes.String)))
    .rename({"risk_attitude": "Risk Attitude"})
    .style.tab_header(title="Cumulative Wealth by Risk Attitude and Observed Neighborhood", subtitle="Wealth scaled by simulation length and play neighborhood")
     .tab_stub(rowname_col="Risk Attitude", groupname_col="observed_neighborhood")
    .fmt_number(decimals=1)
    .fmt_number("Risk Attitude", decimals=0)
)
Out[203]:
Cumulative Wealth by Risk Attitude and Observed Neighborhood
Wealth scaled by simulation length and play neighborhood
min max median mean Q1 Q2 Q3
Observed Neighborhood: 4
0 0.0 300.0 150.0 153.1 125.5 150.0 183.9
1 38.7 300.0 148.6 151.0 118.2 148.6 174.3
2 38.7 300.0 149.6 154.9 124.6 149.6 183.1
3 67.7 300.0 145.3 152.8 124.2 145.3 168.5
4 64.4 300.0 147.7 154.6 124.9 147.7 169.8
5 71.8 300.0 144.0 148.2 127.3 144.0 154.4
6 69.4 300.0 145.8 150.5 129.7 145.8 157.7
7 82.8 300.0 149.4 148.8 137.7 149.4 156.5
8 87.1 300.0 150.0 150.8 138.3 150.0 158.6
9 96.8 203.2 150.0 148.9 139.1 150.0 158.1
Observed Neighborhood: 8
0 0.0 300.0 149.6 147.9 114.9 149.6 174.2
1 0.0 300.0 147.6 143.2 113.3 147.6 167.3
2 0.0 300.0 147.6 145.0 112.8 147.6 170.6
3 0.0 300.0 138.6 143.9 112.5 138.6 168.4
4 1.3 300.0 132.7 140.2 112.9 132.7 154.2
5 60.5 300.0 136.8 139.0 117.9 136.8 150.4
6 72.6 300.0 143.1 142.5 126.2 143.1 152.6
7 84.7 300.0 149.5 147.3 137.5 149.5 157.9
8 95.1 296.7 150.4 150.8 141.8 150.4 161.3
9 96.8 203.2 150.0 150.5 141.7 150.0 161.0
Observed Neighborhood: 24
0 0.0 300.0 149.5 145.1 113.7 149.5 173.9
1 0.0 300.0 137.9 137.9 111.3 137.9 161.8
2 0.0 300.0 131.5 136.5 104.1 131.5 162.1
3 0.0 300.0 118.9 133.5 103.4 118.9 153.2
4 0.0 300.0 117.8 130.4 103.0 117.8 149.6
5 1.7 300.0 125.8 131.2 104.3 125.8 149.5
6 43.2 298.5 138.2 137.0 111.8 138.2 152.4
7 64.7 291.1 149.2 145.0 131.5 149.2 160.3
8 87.7 247.0 151.3 151.6 141.9 151.3 162.5
9 96.8 203.2 150.4 151.5 141.9 150.4 162.1

Initial risk distribution¶

In [204]:
# calculate mean & quartiles for risk attitude by play neighborhood
wealth_by_risk_dist = agents_last_round_df.group_by("risk_attitude", "risk_distribution").agg(
    min=pl.col("scaled_points").min(), 
    max=pl.col("scaled_points").max(),
    median=pl.col("scaled_points").median(), 
    mean=pl.col("scaled_points").mean(), 
    Q1=pl.col("scaled_points").quantile(0.25),
    Q2=pl.col("scaled_points").quantile(0.5),
    Q3=pl.col("scaled_points").quantile(0.75),
).sort("risk_distribution", "risk_attitude")
In [218]:
custom_boxplot(wealth_by_risk_dist).facet(facet=alt.Facet("risk_distribution", title="Initial Risk Distribution"), columns=3
).properties(title=wealthchart_title)
Out[218]:
In [206]:
plot_mean_quartiles(wealth_by_risk_dist, 'risk_distribution', 'Initial Risk Distribution')
Out[206]:

Click on items in the Initial Risk Distribution legend to change the chart opacity to focus on a particular group.

In [207]:
# output values as a nice table so we can reference them if needed
(wealth_by_risk_dist.with_columns(risk_distribution=pl.lit("Initial Risk Distribution: ").add(pl.col("risk_distribution")))
    .rename({"risk_attitude": "Risk Attitude"})
    .style.tab_header(title="Cumulative Wealth by Risk Attitude and Initial Risk Distribution", subtitle="Wealth scaled by simulation length and play neighborhood")
     .tab_stub(rowname_col="Risk Attitude", groupname_col="risk_distribution")
    .fmt_number(decimals=1)
    .fmt_number("Risk Attitude", decimals=0)
)
Out[207]:
Cumulative Wealth by Risk Attitude and Initial Risk Distribution
Wealth scaled by simulation length and play neighborhood
min max median mean Q1 Q2 Q3
Initial Risk Distribution: bimodal
0 0.0 300.0 150.0 152.1 124.8 150.0 185.5
1 0.0 300.0 150.0 153.1 125.2 150.0 185.5
2 0.0 300.0 150.0 153.3 125.4 150.0 185.5
3 0.0 300.0 150.0 156.0 126.2 150.0 185.1
4 1.4 300.0 150.0 158.7 137.1 150.0 184.7
5 8.9 300.0 150.0 157.1 137.9 150.0 167.7
6 74.2 300.0 150.0 153.9 138.0 150.0 161.7
7 96.8 300.0 150.0 150.4 138.3 150.0 158.9
8 96.8 300.0 150.0 150.1 138.3 150.0 158.9
9 96.8 203.2 150.0 149.0 138.3 150.0 158.5
Initial Risk Distribution: normal
0 1.2 300.0 152.4 160.1 148.3 152.4 167.8
1 3.2 300.0 149.2 148.0 113.6 149.2 166.9
2 2.4 300.0 144.0 143.4 108.6 144.0 163.4
3 1.3 300.0 125.8 135.3 105.5 125.8 151.2
4 2.4 300.0 121.8 131.0 104.0 121.8 148.1
5 52.1 300.0 123.0 126.8 104.2 123.0 143.7
6 67.7 300.0 126.6 129.2 105.9 126.6 146.4
7 82.8 300.0 137.9 132.1 109.4 137.9 149.2
8 87.1 300.0 143.5 136.2 116.1 143.5 149.9
9 100.0 200.8 149.2 146.8 144.0 149.2 151.1
Initial Risk Distribution: skewed left
0 0.0 300.0 126.6 128.3 111.8 126.6 149.1
1 0.0 300.0 123.9 126.1 102.4 123.9 148.6
2 0.0 300.0 117.9 123.6 99.6 117.9 146.8
3 0.0 300.0 115.3 121.4 101.0 115.3 137.0
4 0.0 300.0 123.0 123.2 103.7 123.0 137.0
5 51.8 300.0 131.1 128.9 114.9 131.1 139.7
6 60.4 300.0 135.5 133.0 125.0 135.5 141.9
7 79.9 300.0 137.3 135.4 126.2 137.3 143.5
8 90.3 300.0 137.5 135.9 126.6 137.5 143.6
9 96.8 200.4 137.5 135.3 126.6 137.5 143.4
Initial Risk Distribution: skewed right
0 3.2 300.0 187.5 193.3 169.4 187.5 220.7
1 1.3 300.0 187.1 191.3 168.5 187.1 220.2
2 2.4 300.0 187.1 190.8 168.3 187.1 220.2
3 1.2 300.0 186.2 185.0 152.4 186.2 212.1
4 0.0 300.0 176.5 178.1 148.4 176.5 205.5
5 1.7 300.0 156.0 163.3 138.1 156.0 180.3
6 43.2 300.0 154.7 158.3 138.6 154.7 169.7
7 64.7 300.0 155.6 154.8 144.5 155.6 163.1
8 87.7 300.0 156.6 156.6 149.8 156.6 163.6
9 99.0 203.2 157.7 157.3 150.4 157.7 163.2
Initial Risk Distribution: uniform
0 0.0 300.0 150.8 158.1 137.9 150.8 185.1
1 0.0 300.0 150.8 157.7 137.6 150.8 184.4
2 0.0 300.0 150.7 157.6 137.5 150.7 184.7
3 0.0 300.0 150.0 155.8 131.9 150.0 179.8
4 0.0 300.0 149.6 154.5 131.9 149.6 172.8
5 4.0 300.0 148.4 150.6 137.0 148.4 156.5
6 56.5 300.0 148.8 149.4 137.9 148.8 154.8
7 84.7 300.0 149.2 147.8 138.3 149.2 154.0
8 93.1 300.0 149.2 147.8 138.4 149.2 154.2
9 96.8 203.2 149.3 147.4 138.7 149.3 154.2