5 Supervisor Behaviour
This section should be read together with the supervisor(3) manual page in STDLIB, where all details about the supervisor behaviour are given.
5.1 Supervision Principles
A supervisor is responsible for starting, stopping, and monitoring its child processes. The basic idea is that a supervisor keeps its child processes alive by restarting them when necessary.
Which child processes to start and monitor is specified by a list of child specifications. The child processes are started in the order specified by this list, and terminated in the reverse order.
5.2 Example
The callback module for a supervisor starting the server from gen_server Behaviour can look as follows:
-module(ch_sup).
-behaviour(supervisor).

-export([start_link/0]).
-export([init/1]).

start_link() ->
    supervisor:start_link(ch_sup, []).

init(_Args) ->
    SupFlags = #{strategy => one_for_one, intensity => 1, period => 5},
    ChildSpecs = [#{id => ch3,
                    start => {ch3, start_link, []},
                    restart => permanent,
                    shutdown => brutal_kill,
                    type => worker,
                    modules => [ch3]}],
    {ok, {SupFlags, ChildSpecs}}.
The SupFlags variable in the return value from init/1 represents the supervisor flags.
The ChildSpecs variable in the return value from init/1 is a list of child specifications.
5.3 Supervisor Flags
This is the type definition for the supervisor flags:
sup_flags() = #{strategy => strategy(),         % optional
                intensity => non_neg_integer(), % optional
                period => pos_integer()}        % optional
strategy() = one_for_all
           | one_for_one
           | rest_for_one
           | simple_one_for_one
- strategy specifies the restart strategy.
- intensity and period specify the maximum restart intensity.
5.4 Restart Strategy
The restart strategy is specified by the strategy key in the supervisor flags map returned by the callback function init:
SupFlags = #{strategy => Strategy, ...}
The strategy key is optional in this map. If it is not given, it defaults to one_for_one.
one_for_one
If a child process terminates, only that process is restarted.
Figure 5.1: One_For_One Supervision
one_for_all
If a child process terminates, all other child processes are terminated, and then all child processes, including the terminated one, are restarted.
Figure 5.2: One_For_All Supervision
rest_for_one
If a child process terminates, the rest of the child processes (that is, the child processes after the terminated process in start order) are terminated. Then the terminated child process and the rest of the child processes are restarted.
simple_one_for_one
See simple-one-for-one supervisors.
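As a rough illustration of the three general strategies, the following sketch computes which children would be restarted when one child terminates, given the child ids in start order. The module name restart_scope and the function affected/3 are hypothetical names introduced for this example; they are not part of the supervisor API, and simple_one_for_one is left out.

```erlang
-module(restart_scope).
-export([affected/3]).

%% affected(Strategy, TerminatedId, IdsInStartOrder) returns the ids
%% of the children that are restarted. Hypothetical helper, for
%% illustration only.
affected(one_for_one, Id, _Ids) ->
    %% Only the terminated child is restarted.
    [Id];
affected(one_for_all, _Id, Ids) ->
    %% All children are terminated and restarted.
    Ids;
affected(rest_for_one, Id, Ids) ->
    %% The terminated child and every child started after it.
    lists:dropwhile(fun(X) -> X =/= Id end, Ids).
```

For example, restart_scope:affected(rest_for_one, b, [a, b, c]) evaluates to [b, c].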
5.5 Maximum Restart Intensity
Supervisors have a built-in mechanism to limit the number of restarts that can occur in a given time interval. This is specified by the two keys intensity and period in the supervisor flags map returned by the callback function init:
SupFlags = #{intensity => MaxR, period => MaxT, ...}
If more than MaxR restarts occur in the last MaxT seconds, the supervisor terminates all the child processes and then itself.
When the supervisor terminates, the next higher-level supervisor takes some action. It either restarts the terminated supervisor or terminates itself.
The intention of the restart mechanism is to prevent a situation where a process repeatedly dies for the same reason, only to be restarted again.
The keys intensity and period are optional in the supervisor flags map. If they are not given, they default to 1 and 5, respectively.
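The rule can be pictured as a sliding time window over restart timestamps. The following is a minimal sketch under stated assumptions, not the code the supervisor itself uses: the module restart_window and the function add_restart/4 are hypothetical, timestamps are plain integers in seconds, and a restart exactly MaxT seconds old is assumed to still count.

```erlang
-module(restart_window).
-export([add_restart/4]).

%% Record a restart at time Now and check the limit.
%% Restarts holds the timestamps (in seconds) of earlier restarts.
%% Hypothetical helper, for illustration only.
add_restart(Now, Restarts, MaxR, MaxT) ->
    %% Keep only the restarts that fall within the last MaxT seconds.
    Recent = [T || T <- [Now | Restarts], Now - T =< MaxT],
    case length(Recent) > MaxR of
        true  -> {shutdown, Recent};  % limit exceeded, give up
        false -> {ok, Recent}         % still within the limit
    end.
```

With the default values MaxR = 1 and MaxT = 5, a second restart two seconds after the first exceeds the limit, while a second restart ten seconds later does not.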
5.6 Child Specification
The type definition for a child specification is as follows:
child_spec() = #{id => child_id(),       % mandatory
                 start => mfargs(),      % mandatory
                 restart => restart(),   % optional
                 shutdown => shutdown(), % optional
                 type => worker(),       % optional
                 modules => modules()}   % optional
child_id() = term()
mfargs() = {M :: module(), F :: atom(), A :: [term()]}
modules() = [module()] | dynamic
restart() = permanent | transient | temporary
shutdown() = brutal_kill | timeout()
worker() = worker | supervisor
- id is used to identify the child specification internally by the supervisor.
  The id key is mandatory.
  Note that this identifier has occasionally been called "name". As far as possible, the terms "identifier" or "id" are now used, but to keep backwards compatibility, some occurrences of "name" can still be found, for example in error messages.
- start defines the function call used to start the child process. It is a module-function-arguments tuple used as apply(M, F, A).
  It is to be (or result in) a call to any of the following:
  - supervisor:start_link
  - gen_server:start_link
  - gen_fsm:start_link
  - gen_event:start_link
  - A function compliant with these functions. For details, see the supervisor(3) manual page.
  The start key is mandatory.
- restart defines when a terminated child process is to be restarted.
  - A permanent child process is always restarted.
  - A temporary child process is never restarted (not even when the supervisor restart strategy is rest_for_one or one_for_all and a sibling death causes the temporary process to be terminated).
  - A transient child process is restarted only if it terminates abnormally, that is, with an exit reason other than normal, shutdown, or {shutdown,Term}.
  The restart key is optional. If it is not given, the default value permanent is used.
- shutdown defines how a child process is to be terminated.
  - brutal_kill means that the child process is unconditionally terminated using exit(Child, kill).
  - An integer time-out value means that the supervisor tells the child process to terminate by calling exit(Child, shutdown) and then waits for an exit signal back. If no exit signal is received within the specified time, the child process is unconditionally terminated using exit(Child, kill).
  - If the child process is another supervisor, it is to be set to infinity to give the subtree enough time to shut down. It is also allowed to set it to infinity if the child process is a worker. See the warning below.
  Warning: Be careful when setting the shutdown time to infinity when the child process is a worker. In this situation, the termination of the supervision tree depends on the child process, so it must be implemented in a safe way and its cleanup procedure must always return.
  The shutdown key is optional. If it is not given, and the child is of type worker, the default value 5000 is used; if the child is of type supervisor, the default value infinity is used.
- type specifies if the child process is a supervisor or a worker.
  The type key is optional. If it is not given, the default value worker is used.
- modules is to be a list with one element [Module], where Module is the name of the callback module, if the child process is a supervisor, gen_server, or gen_fsm. If the child process is a gen_event, the value is to be dynamic.
  This information is used by the release handler during upgrades and downgrades, see Release Handling.
  The modules key is optional. If it is not given, it defaults to [M], where M comes from the child's start {M,F,A}.
Example: The child specification to start the server ch3 in the previous example looks as follows:
#{id => ch3,
  start => {ch3, start_link, []},
  restart => permanent,
  shutdown => brutal_kill,
  type => worker,
  modules => [ch3]}
or simplified, relying on the default values:
#{id => ch3,
  start => {ch3, start_link, []},
  shutdown => brutal_kill}
Example: A child specification to start the event manager from the chapter about gen_event:
#{id => error_man,
  start => {gen_event, start_link, [{local, error_man}]},
  modules => dynamic}
Both server and event manager are registered processes which can be expected to be always accessible. Thus they are specified to be permanent.
ch3 does not need to do any cleaning up before termination. Thus, no shutdown time is needed; brutal_kill is sufficient. error_man can need some time for the event handlers to clean up, thus the shutdown time is set to 5000 ms (which is the default value).
Example: A child specification to start another supervisor:
#{id => sup,
  start => {sup, start_link, []},
  restart => transient,
  type => supervisor} % will cause default shutdown=>infinity
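The default values described in this section can be summarized in a short sketch that fills in the optional keys of a child specification. The module child_defaults and the function fill/1 are hypothetical names used for illustration; the supervisor applies its defaults internally and exports no such function.

```erlang
-module(child_defaults).
-export([fill/1]).

%% Return Spec with every optional key filled in with its documented
%% default. Hypothetical helper, for illustration only.
fill(#{id := _, start := {M, _F, _A}} = Spec) ->
    Type = maps:get(type, Spec, worker),
    %% The shutdown default depends on the child type.
    Shutdown = case Type of
                   worker     -> 5000;
                   supervisor -> infinity
               end,
    Defaults = #{restart => permanent,
                 shutdown => Shutdown,
                 type => Type,
                 modules => [M]},
    %% Keys that are present in Spec take precedence over the defaults.
    maps:merge(Defaults, Spec).
```

Applied to the supervisor example above, fill/1 yields a map with shutdown => infinity and modules => [sup], while the explicitly given restart => transient is kept.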
5.7 Starting a Supervisor
In the previous example, the supervisor is started by calling ch_sup:start_link():
start_link() -> supervisor:start_link(ch_sup, []).
ch_sup:start_link calls function supervisor:start_link/2, which spawns and links to a new process, a supervisor.
- The first argument, ch_sup, is the name of the callback module, that is, the module where the init callback function is located.
- The second argument, [], is a term that is passed as is to the callback function init. Here, init does not need any input data and ignores the argument.
In this case, the supervisor is not registered. Instead its pid must be used. A name can be specified by calling supervisor:start_link({local, Name}, Module, Args) or supervisor:start_link({global, Name}, Module, Args).
The new supervisor process calls the callback function ch_sup:init([]). init shall return {ok, {SupFlags, ChildSpecs}}:
init(_Args) ->
    SupFlags = #{},
    ChildSpecs = [#{id => ch3,
                    start => {ch3, start_link, []},
                    shutdown => brutal_kill}],
    {ok, {SupFlags, ChildSpecs}}.
The supervisor then starts all its child processes according to the child specifications in the start specification. In this case there is one child process, ch3.
supervisor:start_link is synchronous. It does not return until all child processes have been started.
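A locally registered variant can be sketched as follows. The module name named_sup is hypothetical, and the child list is left empty so that the example does not depend on any other module.

```erlang
-module(named_sup).
-behaviour(supervisor).

-export([start_link/0]).
-export([init/1]).

%% Register the supervisor locally under the module name, so that it
%% can be addressed by name instead of by pid.
start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

%% Default supervisor flags and no static children (illustration only).
init(_Args) ->
    {ok, {#{}, []}}.
```

After {ok, Pid} = named_sup:start_link(), the call whereis(named_sup) returns Pid, and the name can be used in place of the pid, for example in supervisor:which_children(named_sup).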
5.8 Adding a Child Process
In addition to the static supervision tree, dynamic child processes can be added to an existing supervisor with the following call:
supervisor:start_child(Sup, ChildSpec)
Sup is the pid, or name, of the supervisor. ChildSpec is a child specification.
Child processes added using start_child/2 behave in the same way as the other child processes, with one important exception: if a supervisor dies and is recreated, then all child processes that were dynamically added to the supervisor are lost.
5.9 Stopping a Child Process
Any child process, static or dynamic, can be stopped in accordance with the shutdown specification:
supervisor:terminate_child(Sup, Id)
The child specification for a stopped child process is deleted with the following call:
supervisor:delete_child(Sup, Id)
Sup is the pid, or name, of the supervisor. Id is the value associated with the id key in the child specification.
As with dynamically added child processes, the effects of deleting a static child process are lost if the supervisor itself restarts.
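The life cycle of a dynamically added child can be sketched in one self-contained module. Everything named here (dyn_sup, start_worker/0, demo/0, the child id w1) is hypothetical. The worker is a bare process created with spawn_link to keep the example short; a real child would normally be a gen_server or a similar behaviour.

```erlang
-module(dyn_sup).
-behaviour(supervisor).

-export([start_link/0, init/1]).
-export([start_worker/0, demo/0]).

start_link() ->
    supervisor:start_link(?MODULE, []).

%% No static children; all children are added with start_child/2.
init(_Args) ->
    {ok, {#{}, []}}.

%% Start function for the child. It runs in the supervisor process,
%% so spawn_link links the new process to the supervisor.
start_worker() ->
    Pid = spawn_link(fun() -> receive stop -> ok end end),
    {ok, Pid}.

demo() ->
    {ok, Sup} = start_link(),
    Spec = #{id => w1,
             start => {?MODULE, start_worker, []},
             shutdown => brutal_kill},
    {ok, Child} = supervisor:start_child(Sup, Spec),
    true = is_process_alive(Child),
    %% The specification of a running child cannot be deleted.
    {error, running} = supervisor:delete_child(Sup, w1),
    ok = supervisor:terminate_child(Sup, w1),
    false = is_process_alive(Child),
    %% Once the child is stopped, its specification can be removed.
    ok = supervisor:delete_child(Sup, w1),
    [] = supervisor:which_children(Sup),
    ok.
```

Calling dyn_sup:demo() returns ok if every step matches. The supervisor started by the demo stays linked to the calling process.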
5.10 Simplified one_for_one Supervisors
A supervisor with restart strategy simple_one_for_one is a simplified one_for_one supervisor, where all child processes are dynamically added instances of the same process.
The following is an example of a callback module for a simple_one_for_one supervisor:
-module(simple_sup).
-behaviour(supervisor).

-export([start_link/0]).
-export([init/1]).

start_link() ->
    supervisor:start_link(simple_sup, []).

init(_Args) ->
    SupFlags = #{strategy => simple_one_for_one,
                 intensity => 0,
                 period => 1},
    ChildSpecs = [#{id => call,
                    start => {call, start_link, []},
                    shutdown => brutal_kill}],
    {ok, {SupFlags, ChildSpecs}}.
When started, the supervisor does not start any child processes. Instead, all child processes are added dynamically by calling:
supervisor:start_child(Sup, List)
Sup is the pid, or name, of the supervisor. List is an arbitrary list of terms, which are added to the list of arguments specified in the child specification. If the start function is specified as {M, F, A}, the child process is started by calling apply(M, F, A++List).
For example, adding a child to simple_sup above:
supervisor:start_child(Pid, [id1])
The result is that the child process is started by calling apply(call, start_link, []++[id1]), or actually:
call:start_link(id1)
A child under a simple_one_for_one supervisor can be terminated with the following:
supervisor:terminate_child(Sup, Pid)
Sup is the pid, or name, of the supervisor and Pid is the pid of the child.
Because a simple_one_for_one supervisor can have many children, it shuts them all down asynchronously. This means that the children do their cleanup in parallel, and therefore the order in which they are stopped is not defined.
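Because the call module used above is not shown in this section, the following self-contained sketch uses a bare process as the child to show how the extra argument list reaches the start function. The names simple_demo, start_worker/1, and demo/0 are hypothetical, and the worker protocol (a get_tag request) exists only for this example.

```erlang
-module(simple_demo).
-behaviour(supervisor).

-export([start_link/0, init/1]).
-export([start_worker/1, demo/0]).

start_link() ->
    supervisor:start_link(?MODULE, []).

init(_Args) ->
    SupFlags = #{strategy => simple_one_for_one,
                 intensity => 0,
                 period => 1},
    ChildSpecs = [#{id => worker,
                    start => {?MODULE, start_worker, []},
                    shutdown => brutal_kill}],
    {ok, {SupFlags, ChildSpecs}}.

%% Called as apply(simple_demo, start_worker, [] ++ [Tag]).
start_worker(Tag) ->
    Pid = spawn_link(fun() -> loop(Tag) end),
    {ok, Pid}.

%% The worker answers get_tag requests with the tag it was started with.
loop(Tag) ->
    receive
        {From, get_tag} ->
            From ! {self(), Tag},
            loop(Tag)
    end.

demo() ->
    {ok, Sup} = start_link(),
    %% The list [id1] is appended to the arguments in the child spec.
    {ok, Pid} = supervisor:start_child(Sup, [id1]),
    Pid ! {self(), get_tag},
    Tag = receive {Pid, T} -> T after 1000 -> timeout end,
    %% Children of a simple_one_for_one supervisor are stopped by pid.
    ok = supervisor:terminate_child(Sup, Pid),
    false = is_process_alive(Pid),
    Tag.
```

Calling simple_demo:demo() returns id1, the term that was passed in the list given to start_child/2.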
5.11 Stopping
Since the supervisor is part of a supervision tree, it is automatically terminated by its supervisor. When asked to shut down, it terminates all child processes in reversed start order according to the respective shutdown specifications, and then terminates itself.
© 2010–2017 Ericsson AB
Licensed under the Apache License, Version 2.0.