All content in this area was uploaded by Martin Bicher on Feb 15, 2019

Proceedings of the 2018 Winter Simulation Conference

M. Rabe, A. A. Juan, N. Mustafee, A. Skoogh, S. Jain, and B. Johansson, eds.

GEPOC ABM: A GENERIC AGENT-BASED POPULATION MODEL FOR AUSTRIA

Martin Bicher

Institute for Analysis and Scientiﬁc Computing

TU Wien

Wiedner Hauptstraße 8-10

1040 Vienna, AUSTRIA

Christoph Urach

Niki Popper

dwh Simulation Services

dwh GmbH

Neustiftgasse 57-59

1070 Vienna, AUSTRIA

ABSTRACT

Since 2015, researchers in the Austrian health-care research project DEXHELPP (Decision Support for Health

Policy and Planning) have benefited from access to a validated generic agent-based population model

(GEPOC ABM) of Austria’s population. This simulation model delivers a valid virtual image of Austria’s

population and is also able to make feasible prognoses. In recent years the model has been extended,

remodeled and applied to several use cases. We were able to add aspects such as vaccination strategies,

treatment pathways and the spread of infectious diseases, which underlines the flexibility of the implementation.

Yet a number of challenges have been identified, which form the basis of our contribution to the general discussion of

population models. We discuss challenges regarding performance and present a newly

implemented time-update approach. Thereafter we discuss different parametrization concepts for

adding a disease model. Finally, we present how we integrated GIS information based on Delaunay

triangulation.

1 INTRODUCTION

With about 8.7 million inhabitants, 190 thousand emigrations and deaths and 260 thousand immigrants and

births, Austria’s total population ﬂuctuated by about 2.2 percent in the course of 2016 (Statistik Austria

2016). This percentage is neither statistically high nor low in comparison with other years or other countries,

but it gives an idea about the total volume of population ﬂuctuation and its potential impact on deducible

numbers. It makes clear that any decision-support for policy making and planning can only be valid if it

considers a model accounting for the underlying population dynamics.

The Austrian research project DEXHELPP (Decision Support for Health Policy and Planning) provides a

platform for collaboration between health-care stakeholders, medical experts, modeling and simulation experts,

statisticians, data scientists and visualization experts. By combining their skills they perform innovative,

joint and data-based research on all levels of the health system. With a wide range of integrated technologies

they provide interactive tools for prognosis and decision support for policy making. In order to create a

valid common foundation for these decision-support tools, research on population modeling and simulation

is one of the most important research areas of this project:

GEPOC, short for Generic Population Concept, has been a vital research part of DEXHELPP since 2014.

It is founded on the idea that a number of related valid population models can be used as a basis for

many different applied decision-support models. In the first stage of the project, two structurally different

population models were developed and validated: GEPOC SD and GEPOC ABM. The first one

was developed using the method of system dynamics (SD) and is (mathematically speaking) an ordinary

differential equation model with several hundred coupled equations. The second model is a stochastic

agent-based model (ABM). Both models have been validated using data from the Austrian Bureau of

Statistics (for details, see Bicher et al. 2015). Finally, in fall 2016, a third population model was

added to the collection in the form of a partial differential equation (PDE) model (Bicher and Popper 2016).

978-1-5386-6572-5/18/$31.00 ©2018 IEEE

1.1 Introduction to GEPOC ABM

All mentioned population models have been sufficiently validated and are tested to produce equivalent

results. In the following sections, we will focus on the agent-based approach GEPOC ABM, as this model

became the center of population-based health-care research in DEXHELPP and has grown into a powerful

and versatile simulation tool for any kind of population-based research problem in Austria. The

coincidence of two important factors was responsible for this success:

• Intensive collaboration with health-care stakeholders provided the possibility for application of

GEPOC ABM as a base model for many diverse health-care related research problems.

• Continued research on population modeling and continuous improvement of GEPOC ABM in

collaboration with modeling and simulation experts from different institutions.

In this work we present the overall view of this versatile population model in detail for the

first time. Besides giving a formal model definition, we emphasize valuable lessons learned from

iteratively applying and improving the model. We present interesting technical as well as model-theoretic

challenges related to the model and its implementation and state our approaches to overcome

them.

2 BASIC MODEL DEFINITION AND IMPLEMENTATION

As mentioned, GEPOC ABM is an agent-based simulation model. It has been validated, firstly, to depict

the status quo of Austria’s population between 1991 and 2017 and, secondly, to make feasible prognoses

matching the forecasts of the Austrian Bureau of Statistics (on the aggregate level). GEPOC ABM is

defined via its initialization and its time dynamics:

Initial Setup: Given a certain start date of the simulation, an agent-based model with $N+1$ agents is

initialized. The first $N$ of them represent the inhabitants of Austria and will be denoted as

person-agents henceforth. Each person-agent is given a certain birth date and (biological) sex. We will

refer to them as female and male agents with a certain age. The remaining $(N+1)$-st agent plays the

role of the government and will be denoted as government-agent.

Time Dynamics: The model is updated in not-necessarily equidistant time-steps which are deﬁned

a-priori. Each time-step consists of two parts:

In the first part all person-agents are iterated in random order. For each addressed agent, the model

decides about death, emigration and birth of agents using an event-based strategy. First of all, random

numbers decide whether the addressed agent is scheduled to emigrate, die and/or (for female agents)

give birth in the regarded time-step. For each action scheduled this way, a uniformly distributed

random number samples a date for the scheduled action and adds it to an event list. After all possible

events have been regarded, the event list is sorted and processed in chronological order. Death and emigration

events lead to a removal of the agent (skipping all further planned events), while the birth event leads to

a newborn agent with the corresponding birth date being added to the model. This strategy is sketched in Figure 1.

In the second part, after all person-agents have been iterated, the government-agent generates a certain number of new

person-agents (representing immigrants) and adds them to the model. This concludes one model time-step.
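
The time-step described above can be sketched in Python as follows. This is a minimal illustration and not the GEPOC ABM implementation: the class `PersonAgent`, the function `step`, the constant probabilities and the sampling of immigrant birth dates are hypothetical simplifications, whereas the actual model is parametrized from data.

```python
import random
from dataclasses import dataclass


@dataclass
class PersonAgent:
    birth_date: float  # model time in days (hypothetical representation)
    female: bool


def step(agents, t0, dt, p_death, p_emigration, p_birth, n_immigrants, rng):
    """One model time-step of length dt starting at t0 (all names are assumptions)."""
    survivors, newborns = [], []
    for agent in rng.sample(agents, len(agents)):  # random iteration order
        events = []
        # Random numbers decide which actions are scheduled for this agent;
        # a uniformly distributed date inside the step is sampled for each.
        if rng.random() < p_death:
            events.append((t0 + rng.uniform(0, dt), "death"))
        if rng.random() < p_emigration:
            events.append((t0 + rng.uniform(0, dt), "emigration"))
        if agent.female and rng.random() < p_birth:
            events.append((t0 + rng.uniform(0, dt), "birth"))
        alive = True
        for date, kind in sorted(events):  # process in chronological order
            if kind == "birth":
                # Sex ratio of 0.5 is an assumption of this sketch.
                newborns.append(PersonAgent(date, rng.random() < 0.5))
            else:
                # Death or emigration removes the agent and skips later events.
                alive = False
                break
        if alive:
            survivors.append(agent)
    # The government-agent adds the immigrants at the end of the step
    # (birth dates are drawn arbitrarily here, purely as a placeholder).
    immigrants = [
        PersonAgent(t0 - rng.uniform(0, 80 * 365), rng.random() < 0.5)
        for _ in range(n_immigrants)
    ]
    return survivors + newborns + immigrants
```

Because every scheduled action carries a date, a birth that is sampled after the death of the same agent is never executed, which is the ordering guarantee the event list is meant to provide.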

This model definition differs from the original definition of GEPOC ABM (Bicher et al. 2015)

in two points. Firstly, the original model was updated in equidistant time-steps. This small enhancement

became relevant to satisfy the need to execute the model in monthly steps (which may take between 28 and

31 days). Secondly, the mechanism for agent updates switched from a classic probability-based (Markovian)

to an event-based approach. We will discuss the benefits of this strategy in Section 2.2 and take a look at

the implementation first.

[Figure 1 (flowchart of one simulation time-step): update simulation time; loop over person-agents in random order; for each agent, create random death, emigration and (for female agents) birth dates where applicable; sort the planned actions by date and loop over them, where death or emigration kills the agent and breaks the loop, and birth creates a new agent; if the loop finishes, the agent survives; after all person-agents have been iterated, immigrate new agents.]

Figure 1: Discrete-event motivated strategy encapsulated in a basically time-discrete update of the person-agents in GEPOC ABM.

For our application we found it more useful to implement the model from scratch than to use

existing ABM frameworks like NetLogo (Tisue and Wilensky 2004), AnyLogic (Grigoryev 2012), Mesa

(Masad and Kazil 2015), JADE (Bellifemine et al. 1999) or MASON (Luke et al. 2004). None of the

frameworks mentioned was capable of 1) dealing with the high total number of required agents, 2) loading and processing

all necessary parametrization data (with reasonable preprocessing time) and 3) providing sufficient flexibility

for all potential model extensions. Moreover, as we are dealing with very sensitive health-care data and

research questions, we wanted to stay in full control of all parts of the simulation and did not want to rely

on often loosely documented third-party frameworks that work nicely for scientific applications but reveal

shortcomings and bugs when it comes to real-world applications.

We decided to implement the model in Python 3, a (primarily) object-oriented programming

language. Firstly, most Python interpreters can be used free of charge and are platform independent,

which makes the model easily transferable. Secondly, Python requires proper

indentation, which makes the code easily readable. Thirdly, a very large number of freely available Python packages provide

high-performance algorithms and interfaces to almost any known data format.

2.1 Code Performance

Although packages like NumPy and SciPy provide highly efficient and vectorized algorithms to speed

up computation, Python (like other dynamically typed, interpreted languages) is known to execute

comparatively slowly. Therefore, execution of the simulation model with the full population of Austria (i.e.

running the model with 8-9 million agents) is very time and memory consuming. To give a quick example,

the execution of a 365-day time-step with 79,000 agents takes an Intel® Core™ i5-5200U processor about

2.02 sec without making use of multithreading. This number scales linearly with the number of agents and

time-steps.
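
Assuming the linear scaling stated above holds up to the full population, the reported timing can be extrapolated with a short calculation. The figure of 8.7 million agents is taken from the introduction; the function name is hypothetical.

```python
def extrapolate_runtime(seconds, agents, target_agents, steps=1):
    """Linear extrapolation of the run time in the number of agents and steps."""
    return seconds * (target_agents / agents) * steps


# 2.02 s were measured for one yearly step with 79,000 agents.
full_population = extrapolate_runtime(2.02, 79_000, 8_700_000)
print(f"{full_population:.0f} s per yearly step")  # roughly 222 s
```

Under this assumption a single yearly step with the full population takes several minutes on the quoted hardware, before any Monte Carlo replication.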

The easiest and most obvious solution to this problem is running the model with a reduced number of

agents (i.e. one tenth or one hundredth of Austria’s original population) instead. Afterwards the simulation

results can easily be rescaled to the original size. This strategy was quickly confirmed to be valid from the

modeling perspective: It is a direct consequence of the Law of Large Numbers that the aggregated simulation

results with full population match the rescaled aggregated simulation results with reduced population. The

only difference is the size of the stochastic fluctuations, which is proven to be larger when running the model

with a reduced number of agents. (Note that this result is not only valid for models without interaction, as in

this case, but also for a broad range of models with interaction. For more information see Bicher 2017;

Bicher and Popper 2015.) To compensate for the higher fluctuations with a downscaled population, the

simulation can be evaluated more often in Monte Carlo experiments, which increases computation time

to a smaller extent.
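
The size of these fluctuations can be illustrated for the simplest case of independent agents that each experience an event (for example death) with the same probability $p$ per step. This is a simplifying assumption made only for illustration and not the parametrization of GEPOC ABM; the function name is hypothetical. The event count is then binomially distributed, so the relative standard deviation of the rescaled result is $\sqrt{(1-p)/(np)}$ and shrinks by a factor $\sqrt{k}$ when $k$ Monte Carlo runs are averaged.

```python
import math


def relative_std(n_agents, p, runs=1):
    """Relative standard deviation of the (rescaled) mean event count.

    Assumes independent agents with identical event probability p,
    i.e. a binomially distributed count, averaged over `runs` replications.
    """
    return math.sqrt((1.0 - p) / (n_agents * p * runs))


full = relative_std(8_700_000, 0.01)
reduced = relative_std(87_000, 0.01)             # one hundredth of the agents
averaged = relative_std(87_000, 0.01, runs=100)  # mean of 100 Monte Carlo runs

print(reduced / full)   # fluctuations are about 10 times larger
print(averaged / full)  # about 1: averaging 100 runs compensates
```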

Surprisingly, the described strategy encountered harsh opposition from decision-makers, and the credibility of the model

suffered. When discussing the model’s internal logic, it is easier to communicate that an agent is a

statistical representative of one real person instead of 10 or 100. Hence, we had to make the model executable with

the full population in reasonable time.

Besides standard means for code optimization two interesting technical measures have been implemented

that ﬁnally improved performance of the code.

• The generation of new person-agents has a massive impact on the computation time due to the sampling of

multivariate random numbers with user-defined distribution functions. As this is needed very

often when generating the initial model population, a Markov-Chain Monte-Carlo (MCMC) sampling

algorithm was applied for this purpose. We made use of the performant implementation of this

algorithm in the PyMC package for Python 3 (Patil et al. 2010).

• As many applications of GEPOC ABM did not make use of agent-agent contacts or only

required very local contacts (see Section 3), we used Python’s native subprocess package to make

the simulation model capable of parallel execution. Hereby, the initial population is split into a

predefined number of parts which can be distributed among an arbitrary number of processor

cores. Hence, as long as it is sufficient that person-agents have a very limited range of contact

partners, GEPOC ABM can be executed fully parallelized.
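
The partitioning idea of the second point can be sketched as follows. The paper names Python's subprocess package; this sketch swaps in `concurrent.futures.ProcessPoolExecutor` from the standard library for brevity, and `simulate_part` is a hypothetical stand-in for running the model on one part of the population.

```python
import random
from concurrent.futures import ProcessPoolExecutor


def split_population(agents, n_parts):
    """Split the agent list into n_parts chunks of nearly equal size."""
    return [agents[i::n_parts] for i in range(n_parts)]


def simulate_part(job):
    """Hypothetical stand-in for one model run: agents survive with p = 0.99."""
    part, seed = job
    rng = random.Random(seed)  # separate random stream per part
    return [agent for agent in part if rng.random() < 0.99]


def run_parallel(agents, n_parts, n_workers, base_seed=0):
    """Simulate all parts in separate processes and merge the results."""
    parts = split_population(agents, n_parts)
    jobs = [(part, base_seed + i) for i, part in enumerate(parts)]
    with ProcessPoolExecutor(max_workers=n_workers) as pool:
        results = pool.map(simulate_part, jobs)
        return [agent for part in results for agent in part]
```

A call such as `run_parallel(list(range(100_000)), n_parts=8, n_workers=4)` distributes eight parts over four worker processes. On platforms that start worker processes by spawning, the call belongs under an `if __name__ == "__main__":` guard. The sketch only works because the parts do not interact, which mirrors the restriction to no or very local contacts.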

Our current work in this area focuses on improving the parallelization capabilities of GEPOC ABM

to allow limited contacts between person-agents in different threads, comparable to (Collier et al. 2015).

Summarizing, we learned that performance is still an issue in population models. Strategies to

cope with this have to take into account not only methods to increase performance but also stakeholder interests.

2.2 Time-Update Strategy

To be fully versatile as a generic framework, GEPOC ABM has to be capable of dealing with processes

on different time scales. While infectious diseases like influenza, for example, spread within a few days or weeks, it

usually requires many years or decades to observe the impact of demographic changes on the health-care

landscape.

The currently most prominent concept to overcome this problem is simulating the model in continuous

time, i.e. using a discrete-event strategy (Buss and Al Rowaei 2010). Hereby, agents emigrate

and immigrate, die and are born at corresponding event dates, which additionally schedule new future

events. After each event the simulation instantaneously skips to the next scheduled event and the

model time is advanced. For the multi-time-scale problem in GEPOC ABM this strategy would clearly

be preferable to a classic time-discrete update, as the mechanism is independent of the observed time scale

and scope. Yet we found two arguments why this type of update is not optimal for our applications (or at

least requires further research).

• Finding the next event to occur is always related to a sorting problem. With $N$ denoting the initial

number of agents in the model, the computational effort of the ABM consists of iteratively executing

the occurring events (resulting in a problem of $O(N)$) and correctly inserting the newly scheduled

events into the event list (e.g. using a standard divide-and-conquer algorithm with $O(\log N)$).

Therefore, the total computational effort of the model amounts to $O(N \log N)$, which is noticeably

larger than that of a time-discrete strategy with $O(N)$ effort. Although there has been progress in

reducing the computational effort of continuous-time population models by using internal model

logic (Reinhardt and Uhrmacher 2017; Warnke et al. 2016), it can never depend linearly on the

number of agents. Hence, this kind of update strategy is significantly slower (at least as long as

the model does not use agent-agent contacts).

• Discrete-event update is known to cause difficulties if there exists a global interaction level. We

explain this problem with a short example: Suppose GEPOC ABM is used to investigate the effects

of overpopulation. To this end, the population density of the country is assumed to have a negative

impact on the death rate. As the population density changes with every occurring event, it is

impossible for a person-agent to correctly define its own death date in advance. The only solution

to this problem would be to re-sample the death dates of all agents whenever the population density

changes. This leads to a massive overhead.
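
For comparison, a pure discrete-event update can be sketched with a binary heap as event list, where each insertion and each removal costs $O(\log N)$. The sketch uses Python's standard `heapq` module; the function name and the exponentially distributed death dates are illustrative assumptions and not part of GEPOC ABM.

```python
import heapq
import random


def run_discrete_event(n_agents, death_rate, t_end, rng):
    """Process death events in chronological order until t_end.

    Each agent schedules its death date in advance (exponentially
    distributed, an assumption made only for this sketch).
    """
    event_list = []
    for agent_id in range(n_agents):
        date = rng.expovariate(death_rate)
        heapq.heappush(event_list, (date, agent_id))  # O(log N) insertion
    alive = set(range(n_agents))
    t = 0.0
    while event_list and event_list[0][0] <= t_end:
        # The model time jumps directly to the next scheduled event.
        t, agent_id = heapq.heappop(event_list)       # O(log N) removal
        alive.discard(agent_id)
    return t, alive
```

If the death rate depended on the current population size, every processed event would invalidate the dates still stored in `event_list`, which is the re-sampling overhead described in the second point.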

The second option to update ABMs is applying discrete time-steps: Instead of deciding when a specific

event happens, the model iterates through time asking whether a specific event occurred in the regarded time

interval. Hereby so-called transition probabilities are used. For the multi-scale problem in GEPOC ABM

the simulation needs to be executable (and valid) with time-steps of arbitrary length. Hereby, two problems

occur:

• Firstly, it is mathematically impossible to correctly transform transition probabilities from one

time-step length to a different one without changing the (expected) simulation outcome. This is exhaustively

discussed in (Bicher 2017) and is best illustrated by a simple thought experiment: Say a female

agent has a probability $p_t$ to give birth to a child during a time interval of length $t$. Now, assume

that the time-step length should be halved to $t/2$. Hence, we are looking for a rescaled probability

$p_{t/2}$ so that two steps of the rescaled model lead to the same results as one step of the original

one. As is easily seen, this task is impossible to solve, as (independent of the choice of $p_{t/2}$) the rescaled

model makes it possible that two children are born within the regarded time interval.

• Secondly, the occurrence of two or more events in one model time-step leads to causality problems.

Especially in the case of population models it makes a crucial difference whether an agent dies before it

reproduces, emigrates before it dies, reproduces before it emigrates or vice versa. Hence, using discrete

time-steps always requires additional model logic.

Consequently neither of the two time-update strategies is optimally suited for a generic population model.

The proposed solution presented in the model deﬁnition and in Figure 1 can be interpreted as an event-based

strategy embedded in a time-discrete update. On the global level, there is a time-step that manages the

update of the time variable. For most transition probabilities we applied the approximation formula

$$p_{\Delta t'} = 1 - \left(1 - p_{\Delta t}\right)^{\Delta t'/\Delta t} \qquad (1)$$

to scale transition probabilities from one time-step length to a different one ($\Delta t \to \Delta t'$). This formula is motivated

by the geometric distribution.
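
Formula (1) translates directly into code. The sketch below uses a hypothetical function name and an illustrative yearly probability; it also checks the property the formula preserves, namely the probability of at least one event over the original interval (it does not preserve the number of events, in line with the thought experiment above).

```python
def rescale_probability(p, dt, dt_new):
    """Rescale a transition probability from step length dt to dt_new, formula (1)."""
    return 1.0 - (1.0 - p) ** (dt_new / dt)


p_year = 0.05  # illustrative yearly probability (assumption)
p_month = rescale_probability(p_year, 365.0, 30.0)

# Applying the rescaled probability over a full year of 30-day steps
# reproduces the probability of at least one event in the year.
p_back = 1.0 - (1.0 - p_month) ** (365.0 / 30.0)
```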

On the agent level, the Boolean statement that something happens is linked to an event with the occurrence

time at which it happens. Hereby, the ordering of events is clear from the start and illogical event sequences

are excluded. It is possible, e.g., to hospitalize, treat and release an agent in just one model time-step,

automatically generating plausible hospitalization and release dates. Hence, as an additional benefit, it is

not always necessary to use very small time-steps to investigate small time scopes. Summarizing,

we learned that there is no optimal time-update strategy for a generic population model. Event-oriented

concepts appear promising but require further research.

3 APPLICATIONS AND MODEL EXTENSIONS

GEPOC ABM has already proven its flexibility as a base model for population-based research in various

areas. Since its validation in 2015, GEPOC ABM has been used for several health-care related applications,

of which we want to explain the three largest in detail.

Vaccination Rates: Eradication of measles and polio is one of many goals the World Health Organization

(WHO) is trying to achieve by the year 2020. Besides other factors, high vaccination

coverage among the population plays a key role. If a high percentage (about 95% is estimated) of all

inhabitants is vaccinated, so-called herd-immunity effects will prevent potential epidemics from breaking

out, which in the long run leads to the full eradication of the disease. To keep track of the progress,

every country is obliged to report yearly to the WHO the percentage of vaccinated infants among their age cohort; we

will refer to this number as the “vaccination rate”.

Though the numbers of sold vaccination doses as well as the age of their recipients are (quite) well known in

Austria, calculation of these rates for reporting purposes is not as simple as it seems. Due to fluctuations

among the population, primarily caused by high immigrant/refugee numbers, a dynamic simulation model

was used to correctly determine the vaccination rates and improve the formerly used calculation method.

We extended GEPOC ABM to obtain a picture of the current MMR (measles, mumps, rubella) and

polio vaccination rates in Austria. According to the availability of doses (gained from data about actually sold

doses) and the vaccination regimen, each person-agent is assigned vaccinations. With specifically calculated

vaccination rates for regular immigrants and refugees, the model fully considers the effects of a fluctuating

population. The simulated numbers were reported by the Austrian Ministry of Health and Women’s Affairs

and can be accessed via the web page of the WHO or in two short reports about the current situation

in Austria (Bundesministerium für Gesundheit und Frauen 2017; Bundesministerium für Gesundheit und

Frauen 2016). Besides giving access to a more precise calculation method, GEPOC ABM additionally

provides deeper insights into the danger of measles outbreaks. For example, using accredited estimates for the

chance that a vaccination successfully immunizes the recipient and for the number of people who were immunized by past

illnesses, we are additionally able to give information about the percentage and distribution of immune

persons.

Re-hospitalization of Psychiatric Patients: Re-hospitalization rates of psychiatric patients are considered

a metric of quality of care. Yet risk factors which lead to high percentages of re-hospitalized

patients are still not fully understood and are a heavily researched area. In order to test the plausibility of

several risk factors commonly believed by domain experts, and to compare different types of health service

interventions in terms of differences in re-hospitalization outcomes, a simulation model was implemented.

GEPOC ABM was extended by several functionalities. First, person-agents were given a probability

to visit mental hospitals and have a stay of several days during which they are diagnosed. Afterwards,

every person-agent has a certain chance to become re-hospitalized, dependent on diagnosis, sex,

age and other risk factors which were key objects of the investigation. Assuming that the chance depends

on the mean distance to the nearest hospital, person-agents were assigned a residence (NUTS3 region).

Hereby, the impact of infrastructural changes could be tested. Moreover, assuming that the chance depends

on co-morbidities, diabetes mellitus was implemented as a background disease. This way the influence

of our aging society was also analyzed. More information about this model is found in (Zauner et al. 2017;

Bicher et al. 2017).

Number, Severity and Diagnosis of Stroke Incidences: Implementation of stroke units in hospitals

is a heavily discussed topic (Wilbacher 2005). On the one hand, these units are known to signiﬁcantly

decrease the risk of mortality and consequential damage in case of a stroke incident compared to regular

hospital units (Barnett 2000). On the other hand, operation of these specialized units is expensive, especially

when not in use. Therefore, DEXHELPP started a rigorous analysis of the need for stroke treatment

using a dynamic simulation model.

Person-agents in GEPOC ABM were extended by a chance to suffer from a stroke with a certain

severity and a specific type (diagnosis). This chance is implemented to depend on the person-agent’s

age, sex and residence district as well as on having had a previous stroke incident. Hereby, we were able to

observe stroke-related parameters which (in Austria) cannot be accessed from data, like the average number

of stroke incidences per person or the total number of stroke-caused deaths. The model is not yet fully

validated, but will contribute to improving services provided for stroke treatment by giving a very detailed

picture of the need.

Motivated by these three applications a couple of toolboxes have been developed that can optionally be

used to extend GEPOC ABM if needed. Hereby, certain parts that have been required for the case-studies

and were deemed to have potential use in future applications were made reusable in a more generic form.

We will present the two most interesting here.

3.1 Parametrization of Diseases via Incidence and Prevalence

Taking a closer look at the three applications presented above, the experienced modeler will quickly observe

that none of them relies on any contacts between person-agents. (Note that the first mentioned application

modeled measles vaccinations and not measles infections.) GEPOC ABM offers the possibility to implement

contacts, e.g. between persons/patients/hospitals/physicians, but the research problems defined by our

collaborating decision makers (e.g. Austrian Ministry of Health, Main Association of Social Insurances,

Gesundheit Österreich GmbH) have hardly required this functionality yet. Although we made use of contacts

in smaller and more academic studies (patients ↔ doctors in (Nowotny, K. 2018)), the three important

applications presented earlier taught us that simulation-based research in Health Technology Assessment,

Health System Research and Health Services Research does not necessarily rely on contacts or contact

networks. On the one hand, this can be considered good news, as GEPOC ABM can make full use of

parallelization. On the other hand, the dynamics of the resulting models are scientifically less interesting.

We can only speculate about the reasons why contact-based models are rarely needed in health-care applications.

One possible reason might be that the impact of non-transmittable diseases (e.g. cardiovascular diseases,

neurological diseases, chronic progressive diseases) on the health-care system is massive, even compared

to infectious diseases.

For this reason we decided to implement a toolbox that makes it possible to quickly extend GEPOC

ABM with a non-transmittable disease. We united the mechanism used for diabetes mellitus in the

re-hospitalization module and the mechanism for stroke incidences in the last application to form one generically

applicable model add-on. As diabetes is parametrized using prevalence data and stroke is parametrized

using incidence data, the generic module is capable of using data for both of these epidemiological key figures.

It is important to mention that the strategy only considers new cases and does not regard recovery

from the medical condition.

Incidence, or to be precise the incidence rate, is defined as a measure of the probability of at least one

occurrence of a certain medical condition in the observed time interval. An incidence rate of $I$ per year

implies that a person who did not show the regarded medical condition before has a probability of $I$ to

show the medical condition after one year. Incidence rates are often given as the average number of persons

showing the condition per 1,000 or 10,000, as this is easier to interpret.

Incidence rates can be used to extend GEPOC ABM in a very natural way. Every healthy person-agent

schedules the “medical condition”-event in the course of the regarded time-step with a probability directly

calculated from the incidence rate. In case GEPOC ABM is run with yearly steps, the incidence rate can

be taken directly, otherwise it is rescaled using formula (1). Although incidences are sufﬁcient to simulate

new cases it is necessary to know about the prevalence at least for the initial setup of the person-agents.

Hence, incidence rates alone are usually not sufﬁcient to parametrize the model.
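
A minimal sketch of the incidence-based extension could look as follows. The function name and its arguments are hypothetical; the yearly incidence rate is rescaled to the step length with formula (1), and the onset date is sampled uniformly within the step, in the same way as for the demographic events.

```python
import random


def schedule_condition(healthy, incidence_per_year, t0, dt_days, rng):
    """Return the onset date of the condition within the step, or None.

    incidence_per_year is the yearly probability of a first occurrence;
    it is rescaled to the step length with formula (1).
    """
    if not healthy:
        return None  # only new cases are modeled, recovery is not regarded
    p_step = 1.0 - (1.0 - incidence_per_year) ** (dt_days / 365.0)
    if rng.random() < p_step:
        return t0 + rng.uniform(0.0, dt_days)  # uniformly distributed onset date
    return None
```

The returned date can be added to the agent's event list, so that a death scheduled earlier in the same step still prevents the onset.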

Prevalence is a measure for the total number of persons suffering from a speciﬁc medical condition

and is usually given as a fraction of the total population. As for the incidence rate we often ﬁnd this number

described as number of cases per 1000, 10000 or 100000 persons to make it easier to depict.

In contrast to incidence rates, the extension of GEPOC ABM using prevalences is not that natural.

We found it most convenient to follow a two-phase strategy. First, the model time-step is executed as defined

in Section 2 (including immigration). Hereby, the total population $P$ and the fraction $F'$ of person-agents

suffering from the medical condition are counted directly after execution of all agent-events. Thereafter, the

known prevalence $F$ of the medical condition is compared with $F'$. If data and model are valid, $F' < F$ should

result, as the number of cases is only reduced in the first phase (deaths, emigrations, recoveries). Hence,

$(F - F')P$ describes the total number of person-agents that should suffer from the medical condition

according to the data, but do not show this behavior in the model so far. Therefore, in phase two, $(F - F')P$

healthy person-agents are randomly picked from the agent population to start suffering from the medical

condition. As is easily seen, this strategy becomes more accurate the smaller the used time-step and the more

prevalence data points are given. If the step width of the model time-steps is chosen smaller than the

time resolution of the data, it is useful to linearly interpolate the data points to avoid abrupt jumps of the

prevalence in the model.
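
The second phase and the interpolation can be sketched as follows. Both function names are hypothetical, and the agent population is reduced to a list of booleans for brevity; rounding the number of missing cases to an integer is an assumption of this sketch.

```python
import random


def interpolate_prevalence(t, t_a, f_a, t_b, f_b):
    """Linear interpolation between two prevalence data points."""
    return f_a + (f_b - f_a) * (t - t_a) / (t_b - t_a)


def apply_prevalence(sick, target_prevalence, rng):
    """Phase two: add new cases until the target prevalence is reached.

    sick is a list of booleans, one per person-agent, evaluated after all
    agent-events and the immigration of the current step (phase one).
    """
    population = len(sick)                      # P in the text
    current = sum(sick) / population            # F' in the text
    missing = round((target_prevalence - current) * population)
    healthy = [i for i, s in enumerate(sick) if not s]
    n_new = max(0, min(missing, len(healthy)))  # nothing to do if F' >= F
    for i in rng.sample(healthy, n_new):
        sick[i] = True                          # agent starts suffering
    return sick
```

With a population of 1,000 agents of whom 50 are affected and a target prevalence of 0.08, the sketch picks 30 healthy agents at random, which corresponds to $(F - F')P$.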

Clearly, in case of direct conflict the incidence strategy would be preferred, as it is the more natural

way of parametrizing a disease in an ABM. Yet incidence data for diseases is usually harder to get. The

strategy for parametrization via prevalence might seem unusual for an agent-based model, but gives perfect

control over the total number of cases and has proven to be perfectly suited for the simulation of chronic

diseases like diabetes mellitus. Summarizing, we learned that a lot of problems do not require

agent-agent contacts. It is important to have the possibility to model them, but it is equally important to be able to omit them if they are not

needed or applicable.

3.2 Giving Agents a Place to Live

As seen in the stroke and the re-hospitalization applications of GEPOC ABM, it is often necessary to

extend the person-agents’ properties by a residence. One could argue that this is a necessary

feature of population models in general, but it turns out to be a massive overhead if not needed. We decided

to generalize the findings of the two case studies that required agent residences in a generic Geography

toolbox that assigns sampled residences to person-agents.

In the course of this development a couple of problems soon occurred. Firstly, the administrative

landscape is permanently changing: Each year a couple of districts and municipalities are dissolved, joined

or reassembled. A very prominent example of this is the former district “Wien Umgebung”, which was

split up among four neighboring districts in 2016. Secondly, different partitions of Austria are not always

compatible. It happens quite often that smaller units are not uniquely contained in larger units. For example,

one quickly finds ZIP regions that belong to two or more different political districts. The administrative

regions for health-care service (“Versorgungsregionen”) even overlap with the Austrian federal states.

In order to develop a generic solution that works independently of the investigated partition of Austria,

we decided to sample residences in the form of GIS coordinates. This method is beneficial compared to sampling

regions, as a coordinate is always linked to one unique region per investigated partition. This region may

change with time if units are joined or separated, but it can always be found as long as the GPS outline of

the partition is known.

We implemented the following algorithm to sample a random GPS coordinate with respect to a given

partition of Austria (equivalent to the one presented in Section 3.3.1 in (Gallagher et al. 2018)):

1. Sample a random region the person-agent is planned to live in according to a given distribution.

2. Sample a uniformly distributed point inside the region according to its GPS outline.

Here, we worked hard to improve the performance of the latter step. Standard algorithms to sample a

uniformly distributed coordinate in a given region are based on a rejection algorithm, i.e., a uniformly

distributed point inside the bounding box of the polygon (or, to be precise, multi-polygon) is sampled and

accepted if it lies inside the regarded region. The strategy requires checking at least once whether the sampled point lies inside

the polygon, which requires as many scalar multiplications as there are corner points on the outline.

It is particularly inefficient if shapes are not 0-connected (as the district of “Amstetten” seen in Figure 2),

not 1-connected (as the district of “St. Pölten Land” seen in Figure 2), or elongated and diagonally oriented.
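To illustrate the rejection approach described above, the following minimal Python sketch samples a point in the bounding box and accepts it if a ray-casting test places it inside the polygon. It is not the GEPOC ABM implementation: the function names, the single-ring polygon representation and the strip-shaped test region are assumptions made purely for illustration.

```python
import random


def point_in_polygon(x, y, polygon):
    """Even-odd ray-casting test; polygon is a list of (x, y) corner points."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            # x-coordinate at which this edge crosses the horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def sample_by_rejection(polygon, rng):
    """Return an accepted point and the number of trials that were needed."""
    xs = [p[0] for p in polygon]
    ys = [p[1] for p in polygon]
    x_min, x_max, y_min, y_max = min(xs), max(xs), min(ys), max(ys)
    trials = 0
    while True:
        trials += 1
        x = rng.uniform(x_min, x_max)
        y = rng.uniform(y_min, y_max)
        if point_in_polygon(x, y, polygon):
            return (x, y), trials


# Hypothetical elongated, diagonally oriented region (a thin diagonal strip).
strip = [(0.0, 0.0), (0.1, 0.0), (1.0, 0.9), (1.0, 1.0), (0.9, 1.0), (0.0, 0.1)]
rng = random.Random(42)
results = [sample_by_rejection(strip, rng) for _ in range(2000)]
mean_trials = sum(trials for _, trials in results) / len(results)
print(f"average trials per accepted point: {mean_trials:.2f}")
```

The strip covers only 19 percent of its bounding box, so on average slightly more than five trials are needed per accepted point, and each trial costs one pass over all corner points.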

Hence, we decided to use a different strategy based on the idea that there exists an explicit formula

to calculate a uniformly distributed point inside a triangle. Given two independent uniformly distributed

random numbers $r_1$ and $r_2$ between 0 and 1 and three points $A, B, C \in \mathbb{R}^2$ forming a triangle, then

\[ x := A(1-\sqrt{r_1}) + B(1-r_2)\sqrt{r_1} + C\,r_2\sqrt{r_1} \tag{2} \]

is a uniformly distributed point inside $\triangle ABC$ (Osada et al. 2002). As we could not find a full proof for

this statement in the literature, we added it to the Appendix.
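Formula (2) translates almost directly into code. The following Python sketch is one possible reading of it; the function name `sample_in_triangle` and the example triangle are assumptions for illustration and are not taken from GEPOC ABM.

```python
import math
import random


def sample_in_triangle(a, b, c, rng):
    """Uniformly distributed point in triangle ABC according to formula (2)."""
    r1, r2 = rng.random(), rng.random()
    s = math.sqrt(r1)
    # The three weights are non-negative and sum to one.
    wa, wb, wc = 1.0 - s, (1.0 - r2) * s, r2 * s
    return (wa * a[0] + wb * b[0] + wc * c[0],
            wa * a[1] + wb * b[1] + wc * c[1])


# Hypothetical right triangle with centroid (4/3, 1).
A, B, C = (0.0, 0.0), (4.0, 0.0), (0.0, 3.0)
rng = random.Random(1)
points = [sample_in_triangle(A, B, C, rng) for _ in range(20000)]
mean_x = sum(p[0] for p in points) / len(points)
mean_y = sum(p[1] for p in points) / len(points)
print(f"sample mean: ({mean_x:.3f}, {mean_y:.3f})")
```

Every sample is accepted, so no point-in-polygon test is needed, and the sample mean should lie close to the centroid of the triangle.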

Using this formula, our strategy is as follows.

2.a Perform a Constrained Delaunay Triangulation (CDT) of the shape and calculate the areas of all

resulting triangles. Note that this has to be done only once for each region and can be reused.

2.b Pick one random triangle from the list of triangles, weighted by their area.

2.c Pick a uniformly distributed point inside the triangle according to formula (2).
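Steps 2.b and 2.c can be sketched in Python as follows. The triangulation of step 2.a is assumed to be supplied by an external CDT routine; here it is replaced by a hand-made triangulation of a hypothetical L-shaped region. The class name `TriangulatedRegion` and its layout are illustrative assumptions, not the authors' implementation.

```python
import bisect
import itertools
import math
import random


def triangle_area(a, b, c):
    """Area of triangle ABC from the 2D cross product."""
    return abs((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])) / 2.0


class TriangulatedRegion:
    """Stores a precomputed triangulation together with cumulative areas."""

    def __init__(self, triangles):
        self.triangles = list(triangles)
        areas = [triangle_area(*t) for t in self.triangles]
        # Computed once per region and reused for every sample (step 2.a).
        self.cumulative = list(itertools.accumulate(areas))
        self.total_area = self.cumulative[-1]

    def sample(self, rng):
        # Step 2.b: pick a triangle with probability proportional to its area.
        u = rng.random() * self.total_area
        idx = min(bisect.bisect_right(self.cumulative, u), len(self.triangles) - 1)
        a, b, c = self.triangles[idx]
        # Step 2.c: uniformly distributed point in that triangle, formula (2).
        r1, r2 = rng.random(), rng.random()
        s = math.sqrt(r1)
        wa, wb, wc = 1.0 - s, (1.0 - r2) * s, r2 * s
        return (wa * a[0] + wb * b[0] + wc * c[0],
                wa * a[1] + wb * b[1] + wc * c[1])


# Hypothetical L-shaped region: the square [0,2]^2 without (1,2]x(1,2],
# triangulated by hand as a fan around the corner (0, 0).
o = (0.0, 0.0)
region = TriangulatedRegion([
    (o, (2.0, 0.0), (2.0, 1.0)),
    (o, (2.0, 1.0), (1.0, 1.0)),
    (o, (1.0, 1.0), (1.0, 2.0)),
    (o, (1.0, 2.0), (0.0, 2.0)),
])
rng = random.Random(3)
samples = [region.sample(rng) for _ in range(30000)]
share_lower_left = sum(1 for x, y in samples if x <= 1.0 and y <= 1.0) / len(samples)
print(f"share of samples in the unit square [0,1]^2: {share_lower_left:.3f}")
```

The lower-left unit square makes up one third of the region's area, so about one third of the samples should fall into it. The binary search over cumulative areas keeps the triangle choice logarithmic in the number of triangles.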

The concept of the CDT is visualized for the two aforementioned districts in Figure 2. Experiments showed

that this version of the method is about ten times more efficient than the rejection algorithm. Figure 3 shows

100000 sampled residences according to a given distribution on municipality level (Austria is partitioned

into about 2700 of them). Highly populated areas, especially the large cities Vienna, Graz, Linz, Salzburg

and Innsbruck, are well visible. The influence of the Alps, which range from the south-west almost

to Vienna in the north-east, is also clearly visible.

Although the sampling algorithm works nicely, the Geography module of GEPOC ABM cannot yet

be considered a validated generic model extension, especially due to a lack of parametrization data.

First of all, joining and splitting of regions cause problems with standardized data storage and acquisition

for parametrization of the module. Secondly, data availability for parametrization of internal migration

of person-agents is unfortunately insufficient. We currently plan to include settlement information from

the Global Human Settlement Project (Florczyk et al. 2016) to make the population distribution even more

realistic. In summary, we learned that sampling solely residential regions (Federal States,

NUTS3 Regions, Political Districts, ...) is not sustainable. We require sampled coordinates.

4 CONCLUSION

As seen in the three case studies, GEPOC ABM has already proven its worth as a generic population base

module for different health-care related research problems. Due to our close collaboration with decision

makers, we are able to continuously improve and extend the model to make it easier to apply and more

flexible. In doing so, we learned valuable lessons about population modeling and modularity of simulation

models, which we shared in this work.

Still, there are many open questions which require further research. The parametrization of spatial

aspects, and especially of internal migration, involves data difficulties which we plan to solve in the

next years. The use of a large computation cluster to reduce calculation times is also planned for the near

future. Finally, we aim to apply the model to research problems apart from health care to gain additional

insights.

Figure 2: Constrained Delaunay Triangulation of districts “St. Pölten Land” (left) and “Amstetten” (right)

for GIS-coordinate sampling (status Jan 1st 2013). The colors of the triangles indicate their area.

Figure 3: Sampled residences for 100000 agents according to distribution for municipalities (Jan 1st 2013).

REFERENCES

Barnett, H. J. M. 2000. “The Imperative to Develop Dedicated Stroke Centers”. Journal of the American

Medical Association 283(23):3125.

Bellifemine, F., A. Poggi, and G. Rimassa. 1999. “JADE–A FIPA-compliant agent framework”. In Proceedings

of the Practical Applications of Intelligent Agents, 97–108.

Bicher, M. 2017. Classiﬁcation of Microscopic Models with Respect to Aggregated System Behaviour.

Dissertation, Institute for Analysis and Scientiﬁc Computing, TU Wien, Vienna, Austria.

Bicher, M., B. Glock, F. Miksch, G. Schneckenreither, and N. Popper. 2015. “Deﬁnition, Validation and

Comparison of Two Population Models for Austria”. In Proceedings of 4th UBT Annual International

Conference on Business, Technology and Innovation, edited by E. Hajrizi, 174–179. Durres, Albania:

UBT - Higher Education Institution.

Bicher, M., and N. Popper. 2015. “Spatial Effects in Stochastic Microscopic Models - Case Study and

Analysis”. IFAC-PapersOnLine 48(1):153–158.

Bicher, M., and N. Popper. 2016. “Mean-Field Approximation of a Microscopic Population Model for

Austria”. In Proceedings of the 9th EUROSIM Congress on Modelling and Simulation, 544–545. Oulu,

Finland.

Bicher, M., C. Urach, G. Zauner, C. Rippinger, and N. Popper. 2017. “Calibration of a Stochastic Agent-Based

Model for Re-Hospitalization Numbers of Psychiatric Patients”. In Proceedings of the 2017

Winter Simulation Conference, edited by W.K.V. Chan et al., 12. Piscataway, New Jersey: IEEE.

Bundesministerium für Gesundheit und Frauen 2016. “Kurzbericht: Evaluierung der Masern-Durchimpfungsraten”.

Technical report, BMGF, Vienna, Austria.

Bundesministerium für Gesundheit und Frauen 2017. “Kurzbericht: Evaluierung der Polio-Durchimpfungsraten”.

Technical report, BMGF, Vienna, Austria.

Buss, A., and A. Al Rowaei. 2010. “A comparison of the accuracy of discrete event and discrete time”.

In Proceedings of the 2010 Winter Simulation Conference, edited by B. Johansson et al., 1468–1477:

IEEE.

Collier, N., J. Ozik, and C. M. Macal. 2015. “Large-scale agent-based modeling with repast hpc: A case

study in parallelizing an agent-based model”. In European Conference on Parallel Processing, 454–465.

Springer.

Florczyk, A. J., S. Ferri, V. Syrris, T. Kemper, M. Halkia, P. Soille, and M. Pesaresi. 2016. “A new

European settlement map from optical remotely sensed data”. Journal of Selected Topics in Applied

Earth Observations and Remote Sensing 9(5):1978–1992.

Gallagher, S., L. F. Richardson, S. L. Ventura, and W. F. Eddy. 2018. “SPEW: Synthetic Populations and

Ecosystems of the World”. Journal of Computational and Graphical Statistics 0(0):1–12.

Grigoryev, I. 2012. AnyLogic 6 in three days: a quick course in simulation modeling. Hampton, NJ:

AnyLogic North America.

Luke, S., C. Ciofﬁ-Revilla, L. Panait, and K. Sullivan. 2004. “Mason: A new multi-agent simulation

toolkit”. In Proceedings of the 2004 swarmfest workshop, Volume 8, 316–327. Michigan, USA.

Masad, D., and J. Kazil. 2015. “MESA: an agent-based modeling framework”. In 14th PYTHON in Science

Conference, edited by K. Huff et al., 53–60.

Nowotny, K. 2018, June. “ECO - Landärzte gesucht: Immer mehr Orte ohne Ordination”. TV documentary.

Osada, R., T. Funkhouser, B. Chazelle, and D. Dobkin. 2002. “Shape distributions”. ACM Transactions on

Graphics (TOG) 21(4):807–832.

Patil, A., D. Huard, and C. J. Fonnesbeck. 2010. “PyMC: Bayesian stochastic modelling in Python”. Journal

of statistical software 35(4):1.

Reinhardt, O., and A. M. Uhrmacher. 2017, April. “An Efficient Simulation Algorithm for Continuous-Time

Agent-Based Linked Lives Models”. In ANSS 2017 Spring Simulation Multi-Conference. Virginia

Beach, Virginia.

Statistik Austria 2016. Statistisches Jahrbuch Österreich 2016. Verlag Österreich GmbH.

Tisue, S., and U. Wilensky. 2004. “NetLogo: A simple environment for modelling complexity”. In

International Conference on Complex Systems, Volume 21, 16–21. Boston, Massachusetts.

Warnke, T., O. Reinhardt, and A. M. Uhrmacher. 2016. “Population-based CTMCS and agent-based models”.

In 2016 Winter Simulation Conference (WSC), 1253–1264. Piscataway, New Jersey: IEEE.

Wilbacher, I. 2005. “Stroke Units - Österreich im Internationalen Vergleich”. Technical report, HVB EBM.

Zauner, G., C. Urach, M. Bicher, N. Popper, and F. Endel. 2017. “Spatial psychiatric hospitalization

modelling in an international setting - an agent based approach”. In Proceedings of the International

Workshop on Innovative Simulation for Health Care 2017. Barcelona, Spain. To Appear.

APPENDIX

Proof of statement (2).

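Proof. Based on two independent uniform random numbers $r_1, r_2$ with common density

\[ f_X : \mathbb{R}^2 \to \mathbb{R}^+ : (r_1, r_2)^T \mapsto 1 \]

we define the transformation

\[
\begin{aligned}
\varphi_{A,B,C} : \mathbb{R}^2 \to \mathbb{R}^2 : (r_1, r_2)^T \mapsto{} & A(1-\sqrt{r_1}) + B(1-r_2)\sqrt{r_1} + C\,r_2\sqrt{r_1} \\
={} & B + (A-B)(1-\sqrt{r_1}) + (C-B)\,r_2\sqrt{r_1}
\end{aligned}
\]

and aim to show that $\varphi_{A,B,C}$ uniformly maps the unit square $[0,1]^2$ onto the triangle $\triangle ABC$. Firstly, we

define $\varphi_{A,B,C}$ as the composition of two separate mappings. With

\[ \varphi_0 : \mathbb{R}^2 \to \mathbb{R}^2 : (r_1, r_2)^T \mapsto \begin{pmatrix} 1-\sqrt{r_1} \\ r_2\sqrt{r_1} \end{pmatrix} \]

we get

\[ \varphi_{A,B,C}(r_1, r_2) = B + \bigl((A-B), (C-B)\bigr)\,\varphi_0(r_1, r_2). \]

Here, an affine transformation is applied to the image of $\varphi_0$. As affine transformations (a) map triangles

onto triangles and (b) conserve the uniformity of a distribution, it is sufficient to show that $\varphi_0$ maps $(r_1, r_2)$

onto the triangle $\triangle(1,0)(0,0)(0,1)$ and that this mapping conserves the uniformity.

The first statement is trivially fulfilled. To show the second, we apply the transformation formula for

probability densities

\[ f_{\varphi_0}(y_1, y_2) = f_X\bigl(\varphi_0^{-1}(y_1, y_2)\bigr)\,\bigl|\det J_{\varphi_0^{-1}}(y_1, y_2)\bigr|. \]

We calculate

\[ \varphi_0^{-1}(y_1, y_2) = \begin{pmatrix} (1-y_1)^2 \\ \dfrac{y_2}{1-y_1} \end{pmatrix}, \quad\text{and}\quad J_{\varphi_0^{-1}}(y_1, y_2) = \begin{pmatrix} -2(1-y_1) & \dfrac{y_2}{(1-y_1)^2} \\ 0 & \dfrac{1}{1-y_1} \end{pmatrix}. \]

Therefore,

\[ f_{\varphi_0}(y_1, y_2) \equiv 2 = \frac{1}{\operatorname{Area}(\triangle(1,0)(0,0)(0,1))} \]

shows that the transformed density is (as well) constant. Therefore, the image of $\varphi_0$ and also the image of

$\varphi_{A,B,C}$ is uniformly distributed on the stated triangle, proving (2).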
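The result can also be checked empirically. The Python sketch below is an illustration added here, not part of the original proof. It pushes uniform random numbers through the reference mapping of the proof and, for contrast, through a hypothetical variant without the square root. It then measures the share of points in the corner triangle $y_1 + y_2 \le 0.5$, which covers one quarter of the reference triangle.

```python
import math
import random


def phi0(r1, r2):
    """Reference mapping of the proof: unit square onto triangle (1,0)(0,0)(0,1)."""
    s = math.sqrt(r1)
    return 1.0 - s, r2 * s


def naive(r1, r2):
    """Hypothetical variant without the square root, for comparison only."""
    return 1.0 - r1, r2 * r1


def fraction_in_corner(mapping, n, rng):
    """Share of mapped points with y1 + y2 <= 0.5."""
    hits = 0
    for _ in range(n):
        y1, y2 = mapping(rng.random(), rng.random())
        if y1 + y2 <= 0.5:
            hits += 1
    return hits / n


rng = random.Random(7)
frac_phi0 = fraction_in_corner(phi0, 100000, rng)
frac_naive = fraction_in_corner(naive, 100000, rng)
print(f"with square root: {frac_phi0:.3f}, without: {frac_naive:.3f}")
```

With the square root the share should be close to 0.25, in line with a constant density. Without it the share drops to about 0.15, because that variant concentrates points near the vertex $(1,0)$.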

AUTHOR BIOGRAPHIES

MARTIN BICHER is a research associate at TU Wien and a scientific employee at dwh Simulation

Services GmbH. He finished his PhD in Technical Mathematics at TU Wien in winter 2017. His doctoral

thesis was about mean-field behaviour of microscopic models. Email address: martin.bicher@tuwien.ac.at.

CHRISTOPH URACH studied Technical Mathematics at TU Wien and specialised in Mathematical

Modelling and Simulation in the field of HTA (Health Technology Assessment). He currently works at

dwh Simulation Services in the department of health economics, where he is developing applicable model

structures for the evaluation of health care interventions. He is also working on a PhD thesis supervised by

Prof. Dr. Felix Breitenecker. Email address: christoph.urach@dwh.at.

NIKI POPPER is CEO of dwh - Simulation Services GmbH and a research associate at TU Wien. He

is the responsible key researcher of the K-Project DEXHELPP and head of the corresponding association. His

research focus lies on the comparison of different modeling techniques. Email address: niki.popper@dexhelpp.at.
