Visualization of data and simulations

In this notebook, we illustrate the visualization functions of the petab library to visualize measurements or simulations.

Some basic visualizations can be generated from a PEtab problem directly, without the need for a visualization specification file. This is illustrated in the first part of this notebook. For more advanced visualizations, a visualization specification file is required. This is illustrated in the second part of this notebook.

For the following demonstrations, we will use two example problems obtained from the Benchmark collection, Fujita_SciSignal2010 and Isensee_JCB2018. Their specifics don’t matter for the purpose of this notebook—we just need some PEtab problems to work with.

from pathlib import Path

import matplotlib.pyplot as plt

import petab
from petab import Problem
from petab.visualize import plot_problem

example_dir_fujita = Path("example_Fujita")
petab_yaml_fujita = example_dir_fujita / "Fujita.yaml"
example_dir_isensee = Path("example_Isensee")
petab_yaml_isensee = example_dir_isensee / "Isensee_no_vis.yaml"
petab_yaml_isensee_vis = example_dir_isensee / "Isensee.yaml"

# we change some settings to make the plots better readable
petab.visualize.plotting.DEFAULT_FIGSIZE[:] = (10, 8)
plt.rcParams["figure.figsize"] = petab.visualize.plotting.DEFAULT_FIGSIZE
plt.rcParams["font.size"] = 12
plt.rcParams["figure.dpi"] = 150
plt.rcParams["legend.fontsize"] = 10

Visualization without visualization specification file

Plotting measurements

For the most basic visualization, we can use the plot_problem() function.

# load PEtab problem
petab_problem = Problem.from_yaml(petab_yaml_fujita)

# plot measurements
petab.visualize.plot_problem(petab_problem);

../_images/239041a3e2e242aab53405ca83e7b64d60b6803f7dfe9248ddf77be4403498c4.png

As nothing was specified regarding what should be plotted, the defaults were used. Namely, it was assumed that measurements are time series data, and they were grouped by observables.

Subplots / subsetting the data

Measurements or simulations can be grouped by observables, simulation conditions, or datasetIds with the plot_problem() function. This can be specified by setting the value of group_by parameter to 'observable' (default), 'simulation', or 'dataset' and by providing corresponding ids in grouping_list, which is a list of lists. Each sublist specifies a separate plot and its elements are either simulation condition IDs or observable IDs or the dataset IDs.

By observable

We can specify how many subplots there should be and what should be plotted on each of them. It can easily be done by providing grouping_list, which by default specifies, which observables should be plotted on a particular plot. The value of grouping_list should be a list of lists, each sublist corresponds to a separate plot.

petab.visualize.plot_problem(
    petab_problem,
    grouping_list=[["pEGFR_tot"], ["pAkt_tot", "pS6_tot"]],
    group_by="observable",
)
plt.gcf().set_size_inches(10, 4)

../_images/bd73d4052f95b885ec934b3d0b573340bb4bc655e7221ed8a49d300cda03220c.png

By simulation condition

Another option is to specify which simulation conditions should be plotted:

petab.visualize.plot_problem(
    petab_problem,
    grouping_list=[
        ["model1_data1"],
        ["model1_data2", "model1_data3"],
        ["model1_data4", "model1_data5", "model1_data6"],
    ],
    group_by="simulation",
);

../_images/53a34596ada13a47160a701a5842225811468d786a9658cd4a97b5bad4f5eaa2.png

By datasetId

Finally, measurements can be grouped by datasetIds as specified in the measurements table, by passing lists of datasetIds. Each sublist corresponds to a subplot:

petab.visualize.plot_problem(
    petab_problem,
    grouping_list=[
        [
            "model1_data1_pEGFR_tot",
            "model1_data2_pEGFR_tot",
            "model1_data3_pEGFR_tot",
            "model1_data4_pEGFR_tot",
            "model1_data5_pEGFR_tot",
            "model1_data6_pEGFR_tot",
        ],
        [
            "model1_data1_pAkt_tot",
            "model1_data2_pAkt_tot",
            "model1_data3_pAkt_tot",
            "model1_data4_pAkt_tot",
            "model1_data5_pAkt_tot",
            "model1_data6_pAkt_tot",
        ],
        [
            "model1_data1_pS6_tot",
            "model1_data2_pS6_tot",
            "model1_data3_pS6_tot",
            "model1_data4_pS6_tot",
            "model1_data5_pS6_tot",
            "model1_data6_pS6_tot",
        ],
    ],
    group_by="dataset",
);

Plotting simulations

We can also plot simulations together with the measurements, for example, to judge the model fit. For this, we need to provide a simulation file as simulations_df. A simulation file has the same format as the measurement file, but instead of the measurement column, it contains simulation outputs in the simulation column. The simulations are plotted as solid lines, while the measurements are plotted as dashed lines:

simu_file_Fujita = example_dir_fujita / "Fujita_simulatedData.tsv"

sim_cond_id_list = [
    ["model1_data1"],
    ["model1_data6"],
]
petab_problem = Problem.from_yaml(petab_yaml_fujita)
plot_problem(
    petab_problem,
    simulations_df=simu_file_Fujita,
    grouping_list=sim_cond_id_list,
    group_by="simulation",
    plotted_noise="provided",
)
plt.gcf().set_size_inches(10, 4)

../_images/7edf25431df53dda4eb6c2bd95d8cf48ed362842cf65c380d3fb7a651fdcb496.png

It is also possible to plot only the simulations without the measurements by setting petab_problem.measurement_df = None.

Visualization with a visualization specification file

As described in the PEtab documentation, the visualization specification file is a tab-separated value file specifying which data to plot in which way. In the following, we will build up a visualization specification file step by step.

Without a visualization file, the independent variable defaults to time, and each observable is plotted in a separate subplot:

petab_problem = Problem.from_yaml(petab_yaml_fujita)
petab.visualize.plot_problem(petab_problem);

First, let us create a visualization specification file with only mandatory columns. In fact, there is only one mandatory column: plotId. The most basic visualization file looks like this:

petab_problem.visualization_df = petab.get_visualization_df(
    example_dir_fujita / "visuSpecs" / "Fujita_visuSpec_mandatory.tsv"
)
petab_problem.visualization_df

	plotId
0	plot1

This way, all data will be shown in a single plot, taking time as independent variable. This is not very appealing yet, but we will improve it step by step.

petab.visualize.plot_problem(petab_problem)
plt.gcf().set_size_inches(10, 4)

../_images/5f027d12ddff07e9a481ba28aa58e084811da5ca1816b371b4df8550bbac4f07.png

Logarithmic scale and offset

Let’s change some settings. For example, we can change the scale of the y-axis to logarithmic and apply an offset for the independent variable:

petab_problem.visualization_df = petab.get_visualization_df(
    example_dir_fujita / "visuSpecs" / "Fujita_visuSpec_1.tsv"
)
petab_problem.visualization_df

	plotId	xOffset	yScale
0	plot1	100	log

petab.visualize.plot_problem(petab_problem)
plt.gcf().set_size_inches(10, 4)

../_images/eb65cfa5797eaec0dce468571c49e11614cbceb596f4063e7fcf752694aba817.png

Subplots by observable

Next, to make the plot less crowded, we group the measurements by observables by adding two subplots and specifying which observables to plot on each via the yValues column:

petab_problem.visualization_df = petab.get_visualization_df(
    example_dir_fujita / "visuSpecs" / "Fujita_visuSpec_2.tsv"
)
petab_problem.visualization_df

	plotId	yValues	yOffset	yScale	plotName
0	plot1	pEGFR_tot	0	log	pEGFR total
1	plot2	pAkt_tot	300	lin	pAkt total and pS6 total
2	plot2	pS6_tot	305	lin	pAkt total and pS6 total

petab.visualize.plot_problem(petab_problem)
plt.gcf().set_size_inches(10, 4)

../_images/0c37d39e28f7875f8173f837212df977393bb3998519d28a8e4882162f3c498b.png

Subplots by dataset

We can also plot different datasets (as specified by the optional datasetId column in the measurement table) in separate subplots:

petab_problem.visualization_df = petab.get_visualization_df(
    example_dir_fujita
    / "visuSpecs"
    / "Fujita_visuSpec_individual_datasets.tsv"
)
petab_problem.visualization_df

	plotId	plotTypeData	datasetId	xValues
0	plot1	provided	model1_data1_pEGFR_tot	time
1	plot2	provided	model1_data2_pEGFR_tot	time
2	plot2	provided	model1_data3_pEGFR_tot	time
3	plot3	provided	model1_data4_pEGFR_tot	time
4	plot3	provided	model1_data5_pEGFR_tot	time
5	plot3	provided	model1_data6_pEGFR_tot	time

petab.visualize.plot_problem(petab_problem)
plt.gcf().set_size_inches(10, 4)

../_images/46b1e49176a42b597bcf647d273844f97480373425e979569d577f4cffb68e41.png

Legend entries

So far, the legend entries don’t look very nice. We can change them by specifying the desired labels in the legendEntries column:

petab_problem.visualization_df = petab.get_visualization_df(
    example_dir_fujita / "visuSpecs" / "Fujita_visuSpec_datasetIds.tsv"
)
petab_problem.visualization_df

	plotId	plotTypeData	datasetId	xValues	legendEntry
0	plot1	provided	model1_data1_pEGFR_tot	time	Data 1, pEGFR total
1	plot1	provided	model1_data2_pEGFR_tot	time	Data 2, pEGFR total
2	plot1	provided	model1_data3_pEGFR_tot	time	Data 3, pEGFR total
3	plot1	provided	model1_data4_pEGFR_tot	time	Data 4, pEGFR total
4	plot1	provided	model1_data5_pEGFR_tot	time	Data 5, pEGFR total
5	plot1	provided	model1_data6_pEGFR_tot	time	Data 6, pEGFR total
6	plot2	provided	model1_data1_pAkt_tot	time	Data 1, pAkt total
7	plot2	provided	model1_data2_pAkt_tot	time	Data 2, pAkt total
8	plot2	provided	model1_data3_pAkt_tot	time	Data 3, pAkt total
9	plot2	provided	model1_data4_pAkt_tot	time	Data 4, pAkt total
10	plot2	provided	model1_data5_pAkt_tot	time	Data 5, pAkt total
11	plot2	provided	model1_data6_pAkt_tot	time	Data 6, pAkt total
12	plot3	provided	model1_data1_pS6_tot	time	Data 1, pS6 total
13	plot3	provided	model1_data2_pS6_tot	time	Data 2, pS6 total
14	plot3	provided	model1_data3_pS6_tot	time	Data 3, pS6 total
15	plot3	provided	model1_data4_pS6_tot	time	Data 4, pS6 total
16	plot3	provided	model1_data5_pS6_tot	time	Data 5, pS6 total
17	plot3	provided	model1_data6_pS6_tot	time	Data 6, pS6 total

petab.visualize.plot_problem(petab_problem)
plt.gcf().set_size_inches(10, 6)

../_images/6957a05052d49af3e1b1164893d5c66aa53ebf8e26ecc5357c07a54f1ead1805.png

Plotting individual replicates

If the measurement file contains replicates, the replicates can also be visualized individually by setting the value for plotTypeData to replicate. Below, you can see the same measurement data plotted as mean and standard deviations (left) and as replicates (right):

petab_problem = Problem.from_yaml(petab_yaml_isensee)
petab_problem.visualization_df = petab.get_visualization_df(
    example_dir_isensee / "Isensee_visualizationSpecification_replicates.tsv"
)
plot_problem(
    petab_problem,
    simulations_df=example_dir_isensee / "Isensee_simulationData.tsv",
)
plt.gcf().set_size_inches(16, 9)

../_images/5ab48cb0500203aec28f9cb568a6963eb8efd62f551dd732987810064b2ac8c0.png

Scatter plots

If both measurements and simulated data are available, they can be visualized as scatter plot by setting plotTypeSimulation to ScatterPlot:

petab_problem = Problem.from_yaml(petab_yaml_isensee)
petab_problem.visualization_df = petab.get_visualization_df(
    example_dir_isensee / "Isensee_visualizationSpecification_scatterplot.tsv"
)
plot_problem(
    petab_problem,
    simulations_df=example_dir_isensee / "Isensee_simulationData.tsv",
)
plt.gcf().set_size_inches(10, 4)

../_images/c8bb3766138adc491fc3626f6e4b252ae090bc3ff1b4735cff1005ee137ab269.png

Further examples

Here are some further visualization examples, including barplots (by setting plotTypeSimulation to BarPlot):

petab_problem = petab.Problem.from_yaml(petab_yaml_isensee_vis)
plot_problem(
    petab_problem,
    simulations_df=example_dir_isensee / "Isensee_simulationData.tsv",
)
plt.gcf().set_size_inches(20, 12)

../_images/8af61e3e62a18373873c7c2402e6545a9b371c843da8f22a0020e796b0860e09.png

Also with a visualization file, there is the option to plot only simulations, only measurements, or both, as was illustrated above in the examples without a visualization file.

Refer to the PEtab documentation for descriptions of all possible settings. If you have any questions or encounter some problems, please create a GitHub issue. We will be happy to help!