Library Guidelines

This document outlines general guidelines, anti-patterns, and rules for those writing and publishing Elixir libraries meant to be consumed by other developers.

Getting started

You can create a new Elixir library by running the mix new command:

$ mix new my_library

The project name is given in the snake_case convention, where all letters are lowercase and words are separated by underscores. This is the same convention used by variables, function names and atoms in Elixir. See the Naming Conventions document for more information.

Every project has a mix.exs file, with instructions on how to build, compile, run tests, and so on. Libraries commonly have a lib directory, which includes Elixir source code, and a test directory. A src directory may also exist for Erlang sources.

For more information on running your project, see the official Mix & OTP guide or Mix documentation.

Applications with supervision tree

The mix new command also allows the --sup option to scaffold an application with a supervision tree out of the box. We talk about supervision trees later on when discussing one of the common anti-patterns when writing libraries.

Publishing

Writing code is only the first of many steps to publish a package. We strongly recommend that developers:

  • Choose a versioning scheme. Elixir requires versions to be in the format MAJOR.MINOR.PATCH, but the meaning of those numbers is up to you. Most projects choose Semantic Versioning.

  • Choose a license. The most common licenses in the Elixir community are the MIT License and the Apache License 2.0. The latter is also the one used by Elixir itself.

  • Run the code formatter. The code formatter formats your code according to a consistent style shared by your library and the whole community, making it easier for other developers to understand your code and contribute.

  • Write tests. Elixir ships with a test framework named ExUnit. The project generated by mix new includes sample tests and doctests.

  • Write documentation. The Elixir community is proud of treating documentation as a first-class citizen and making documentation easily accessible. Libraries contribute to the status quo by providing complete API documentation with examples for their modules, types and functions. See the Writing Documentation guide for more information. Projects like ExDoc can be used to generate HTML and EPUB documents from the documentation. ExDoc also supports "extra pages", like this one that you are reading. Such pages augment the documentation with tutorials, guides and references.

Projects are often made available to other developers by publishing a Hex package. Hex also supports private packages for organizations. If ExDoc is configured for the Mix project, publishing a package on Hex will also automatically publish the generated documentation to HexDocs.
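As a rough sketch, the recommendations above might translate into a mix.exs along these lines. The project name, description, ExDoc version requirement and repository URL are placeholders chosen for illustration, not values prescribed by this guide:

```elixir
defmodule MyLibrary.MixProject do
  use Mix.Project

  def project do
    [
      app: :my_library,
      # MAJOR.MINOR.PATCH, as required by Elixir
      version: "0.1.0",
      elixir: "~> 1.10",
      deps: deps(),
      # Metadata used by Hex when publishing (placeholder text)
      description: "A short description of what the library does",
      package: package(),
      # Settings read by ExDoc when generating documentation
      name: "MyLibrary",
      docs: [main: "readme", extras: ["README.md"]]
    ]
  end

  defp deps do
    [
      # Illustrative version requirement; check Hex for the current release
      {:ex_doc, "~> 0.22", only: :dev, runtime: false}
    ]
  end

  defp package do
    [
      licenses: ["Apache-2.0"],
      # Placeholder URL
      links: %{"GitHub" => "https://github.com/example/my_library"}
    ]
  end
end
```

With a configuration like this in place, mix format, mix test and mix docs cover the formatter, tests and documentation steps, and mix hex.publish publishes the package.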

Anti-patterns

In this section we document common anti-patterns to avoid when writing libraries.

Avoid using exceptions for control-flow

You should avoid using exceptions for control-flow. For example, instead of:

try do
  contents = File.read!("some_path_that_may_or_may_not_exist")
  {:it_worked, contents}
rescue
  File.Error ->
    :it_failed
end

you should prefer:

case File.read("some_path_that_may_or_may_not_exist") do
  {:ok, contents} -> {:it_worked, contents}
  {:error, _} -> :it_failed
end

As a library author, it is your responsibility to make sure users are not required to use exceptions for control-flow in their applications. You can follow the same convention as Elixir here, using the name without ! for returning :ok/:error tuples and appending ! for a version of the function which raises an exception.

It is important to note that a name without ! does not mean a function will never raise. For example, even File.read/1 can fail in case of bad arguments:

iex> File.read(1)
** (FunctionClauseError) no function clause matching in IO.chardata_to_string/1

The usage of :ok/:error tuples is about the domain that the function works on, in this case, file system access. Bad arguments, logical errors, and invalid options should raise regardless of the function name. If in doubt, prefer to return tuples instead of raising, as users of your library can always match on the results and raise if necessary.
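A minimal sketch of this convention in a library might look as follows. MyLib.Settings and its "key=value" format are hypothetical and exist only to illustrate the pairing of parse/1 and parse!/1:

```elixir
defmodule MyLib.Settings do
  # Hypothetical module: parses strings in the form "key=value".

  # Non-bang version: failures in the domain (malformed input text)
  # are returned as tagged tuples. A non-binary argument is a bad
  # argument and still raises a FunctionClauseError.
  def parse(string) when is_binary(string) do
    case String.split(string, "=", parts: 2) do
      [key, value] when key != "" -> {:ok, {key, value}}
      _ -> {:error, :invalid_format}
    end
  end

  # Bang version: same logic, but raises on domain failures.
  def parse!(string) do
    case parse(string) do
      {:ok, pair} ->
        pair

      {:error, reason} ->
        raise ArgumentError,
              "could not parse #{inspect(string)}: #{inspect(reason)}"
    end
  end
end
```

Callers who want to handle malformed input match on the result of parse/1, while callers who consider malformed input a bug use parse!/1.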

Avoid working with invalid data

Elixir programs should prefer to validate data as close to the end user as possible, so the errors are easy to locate and fix. This practice also saves you from writing defensive code in the internals of the library.

For example, imagine you have an API that receives a filename as a binary. At some point you will want to write to this file. You could have a function like this:

def my_fun(some_arg, file_to_write_to, options \\ []) do
  # ...some code...
  AnotherModuleInLib.invoke_something_that_will_eventually_write_to_file(file_to_write_to)
  # ...more code...
end

The problem with the code above is that, if the user supplies an invalid input, the error will be raised deep inside the library, which makes it confusing for users. Furthermore, when you don't validate the values at the boundary, the internals of your library are never quite sure which kind of values they are working with.

A better function definition would be:

def my_fun(some_arg, file_to_write_to, options \\ []) when is_binary(file_to_write_to) do

Elixir also leverages pattern matching and guards in function clauses to provide clear error messages in case invalid arguments are given.

This advice does not only apply to libraries but to any Elixir code. Every time you receive multiple options or work with external data, you should validate the data at the boundary and convert it to structured data. For example, if you provide a GenServer that can be started with multiple options, you want to validate those options when the server starts and rely only on structured data throughout the process life cycle. Similarly, if a database or a socket gives you a map of strings, after you receive the data, you should validate it and potentially convert it to a struct or a map of atoms.
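As an illustration of converting external data at the boundary, the sketch below turns a map with string keys into a struct. The MyLib.User struct, its fields and the :invalid_user reason are assumptions made up for this example:

```elixir
defmodule MyLib.User do
  # Hypothetical struct representing validated, structured data.
  @enforce_keys [:name, :age]
  defstruct [:name, :age]

  # Accepts a map of strings, as it could arrive from a database
  # or a socket, and validates it with pattern matching and guards.
  def from_external(%{"name" => name, "age" => age})
      when is_binary(name) and is_integer(age) and age >= 0 do
    {:ok, %__MODULE__{name: name, age: age}}
  end

  # Any other map is rejected here, at the boundary, so the rest
  # of the code only ever sees well-formed structs.
  def from_external(other) when is_map(other) do
    {:error, :invalid_user}
  end
end
```

Once the data has passed through from_external/1, internal functions can match on the struct and skip further defensive checks.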

Avoid application configuration

You should avoid using the application environment (see Application.get_env/2) as the configuration mechanism for libraries. The application environment is global, which means it becomes impossible for two dependencies to use your library in two different ways.

Let's see a simple example. Imagine that you implement a library that breaks a string in two parts based on the first occurrence of the dash - character:

defmodule DashSplitter do
  def split(string) when is_binary(string) do
    String.split(string, "-", parts: 2)
  end
end

Now imagine someone wants to split the string in three parts. You decide to make the number of parts configurable via the application environment:

def split(string) when is_binary(string) do
  parts = Application.get_env(:dash_splitter, :parts, 2)
  String.split(string, "-", parts: parts)
end

Now users can configure your library in their config/config.exs file as follows:

config :dash_splitter, :parts, 3

Once your library is configured, it will change the behaviour of all users of your library. If a library was expecting it to split the string in 2 parts, since the configuration is global, it will now split it in 3 parts.

The solution is to provide configuration as close as possible to where it is used and not via the application environment. In the case of a function, you could accept a keyword list as a new argument:

def split(string, opts \\ []) when is_binary(string) and is_list(opts) do
  parts = Keyword.get(opts, :parts, 2)
  String.split(string, "-", parts: parts)
end

In case you need to configure a process, the options should be passed when starting that process.
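For a process, that could look like the sketch below. MyLib.Counter and its :step option are hypothetical; the point is only that each process receives, and validates, its own options when it starts:

```elixir
defmodule MyLib.Counter do
  # Hypothetical process that counts in configurable steps.
  use GenServer

  # Options are supplied by whoever starts the process and are
  # validated once, before the process is running.
  def start_link(opts) when is_list(opts) do
    step = Keyword.get(opts, :step, 1)

    unless is_integer(step) and step > 0 do
      raise ArgumentError,
            "expected :step to be a positive integer, got: #{inspect(step)}"
    end

    GenServer.start_link(__MODULE__, step, Keyword.take(opts, [:name]))
  end

  def increment(server), do: GenServer.call(server, :increment)

  @impl true
  def init(step), do: {:ok, %{step: step, count: 0}}

  @impl true
  def handle_call(:increment, _from, %{step: step, count: count} = state) do
    new_count = count + step
    {:reply, new_count, %{state | count: new_count}}
  end
end
```

Two users of the library can then list {MyLib.Counter, step: 2} and {MyLib.Counter, step: 5} in their own supervision trees without affecting each other.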

The application environment should be reserved only for configurations that are truly global, for example, to control your application boot process and its supervision tree.

For all remaining scenarios, libraries should not force their users to use the application environment for configuration. If the user of a library believes that a certain parameter should be configured globally, then they can wrap the library functionality with their own application environment configuration.
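Such a wrapper could be sketched as follows. DashSplitter is repeated here in its keyword-list form so the example is self-contained; MyApp.Splitter and the :dash_parts key are hypothetical names owned by the application, not by the library:

```elixir
defmodule DashSplitter do
  # The library takes its configuration as an argument.
  def split(string, opts \\ []) when is_binary(string) and is_list(opts) do
    parts = Keyword.get(opts, :parts, 2)
    String.split(string, "-", parts: parts)
  end
end

defmodule MyApp.Splitter do
  # Hypothetical wrapper: the application, not the library,
  # decides that this setting is global for its own code.
  def split(string) do
    parts = Application.get_env(:my_app, :dash_parts, 2)
    DashSplitter.split(string, parts: parts)
  end
end
```

Other dependencies that call DashSplitter.split/2 directly are unaffected by whatever :my_app stores in its environment.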

Avoid compile-time application configuration

Assuming you need to use the application configuration and you cannot avoid it as explained in the previous section, you should also avoid compile-time application configuration. For example, instead of doing this:

@http_client Application.fetch_env!(:my_app, :http_client)

def request(path) do
  @http_client.request(path)
end

you should do this:

def request(path) do
  http_client().request(path)
end

defp http_client() do
  Application.fetch_env!(:my_app, :http_client)
end

That's because, by reading the application environment in the module body and storing it in a module attribute, we are effectively reading the configuration at compile time, which may become an issue when configuring the system later.

If, for some reason, you must read the application environment at compile time, use Application.compile_env/2. Read the "Compile-time environment" section of the Application docs for more information.
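A minimal sketch of that approach is shown below. It uses Application.compile_env/3, the variant that also takes a default value; MyApp.DefaultHTTPClient and MyApp.API are hypothetical modules introduced only for this example:

```elixir
defmodule MyApp.DefaultHTTPClient do
  # Hypothetical client used when nothing is configured.
  def request(path), do: {:ok, "GET " <> path}
end

defmodule MyApp.API do
  # Read once, in the module body, when this module is compiled.
  # Falls back to the default client if :http_client is not set.
  @http_client Application.compile_env(:my_app, :http_client, MyApp.DefaultHTTPClient)

  def request(path) do
    @http_client.request(path)
  end
end
```

Unlike a plain Application.fetch_env!/2 in the module body, this makes the compile-time dependency on the configuration explicit.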

Avoid use when an import is enough

A library should not provide use MyLib functionality if all use MyLib does is to import/alias the module itself. For example, this is an anti-pattern:

defmodule MyLib do
  defmacro __using__(_) do
    quote do
      import MyLib
    end
  end

  def some_fun(arg1, arg2) do
    ...
  end
end

Defining the __using__ macro above should be avoided because, when a developer writes:

defmodule MyApp do
  use MyLib
end

it allows use MyLib to inject any code into the MyApp module. For someone reading the code, it is impossible to assess the impact that use MyLib has on a module without looking at the implementation of __using__.

The following code is clearer:

defmodule MyApp do
  import MyLib
end

The code above says we are only bringing in the functions from MyLib so we can invoke some_fun(arg1, arg2) directly without the MyLib. prefix. Even more important, import MyLib says that we have an option to not import MyLib at all as we can simply invoke the function as MyLib.some_fun(arg1, arg2).

If the module you want to invoke a function on has a long name, such as SomeLibrary.Namespace.MyLib, and you find it verbose, you can leverage the alias/2 special form and still refer to the module as MyLib.
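For instance, assuming a module named SomeLibrary.Namespace.MyLib whose some_fun/2 simply returns its arguments (a stand-in body for illustration), the alias form could look like this:

```elixir
defmodule SomeLibrary.Namespace.MyLib do
  # Stand-in implementation for the example.
  def some_fun(arg1, arg2), do: {arg1, arg2}
end

defmodule MyApp do
  # Only shortens the name; no functions are imported and
  # no code is injected into MyApp.
  alias SomeLibrary.Namespace.MyLib

  def run do
    MyLib.some_fun(1, 2)
  end
end
```

Every call site still names the module the function comes from, which keeps the code easy to trace.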

While there are situations where use SomeModule is necessary, use should be skipped when all it does is to import or alias other modules. In a nutshell, alias should be preferred, as it is simpler and clearer than import, while import is simpler and clearer than use.

Avoid macros

Although the previous section could be summarized as "avoid macros", both topics are important enough to deserve their own sections.

To quote the official guide on Macros:

Even though Elixir attempts its best to provide a safe environment for macros, the major responsibility of writing clean code with macros falls on developers. Macros are harder to write than ordinary Elixir functions and it's considered to be bad style to use them when they're not necessary. So write macros responsibly.

Elixir already provides mechanisms to write your everyday code in a simple and readable fashion by using its data structures and functions. Macros should only be used as a last resort. Remember that explicit is better than implicit. Clear code is better than concise code.

When you absolutely have to use a macro, make sure that a macro is not the only way the user can interface with your library and keep the amount of code generated by a macro to a minimum. For example, the Logger module provides Logger.debug/2, Logger.info/2 and friends as macros that are capable of extracting environment information, but a low-level mechanism for logging is still available with Logger.bare_log/3.
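The same layering can be sketched in a small library. In the hypothetical MyLib.Trace below, all of the logic lives in the function format_with/2, and the macro format/1 only adds the caller's module before delegating to it:

```elixir
defmodule MyLib.Trace do
  # Hypothetical module for illustration.

  # Function API: usable without requiring any macro.
  def format_with(message, metadata)
      when is_binary(message) and is_list(metadata) do
    module = Keyword.get(metadata, :module, :unknown)
    "[#{inspect(module)}] #{message}"
  end

  # Thin macro: generates a single function call and captures
  # the caller's module, which a plain function cannot do.
  defmacro format(message) do
    quote do
      MyLib.Trace.format_with(unquote(message), module: __MODULE__)
    end
  end
end

defmodule MyLib.TraceExample do
  require MyLib.Trace

  def run, do: MyLib.Trace.format("started")
end
```

Users who prefer to avoid macros, or who need to build the metadata dynamically, can call format_with/2 directly.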

Avoid using processes for code organization

A developer must never use a process for code organization purposes. A process must be used to model runtime properties such as:

  • Mutable state and access to shared resources (such as ETS, files, and others)
  • Concurrency and distribution
  • Initialization, shutdown and restart logic (as seen in supervisors)
  • System messages such as timer messages and monitoring events

In Elixir, code organization is done by modules and functions; processes are not necessary. For example, imagine you are implementing a calculator and you decide to put all the calculator operations behind a GenServer:

def add(a, b) do
  GenServer.call(__MODULE__, {:add, a, b})
end

def handle_call({:add, a, b}, _from, state) do
  {:reply, a + b, state}
end

def handle_call({:subtract, a, b}, _from, state) do
  {:reply, a - b, state}
end

This is an anti-pattern not only because it complicates the calculator logic but also because you put the calculator logic behind a single process that will potentially become a bottleneck in your system, especially as the number of calls grows. Instead, just define the functions directly:

def add(a, b) do
  a + b
end

def subtract(a, b) do
  a - b
end

Use processes only to model runtime properties, never for code organization. And even when you think something could be done in parallel with processes, it is often best to let the callers of your library decide how to parallelize, rather than impose a certain execution flow on users of your code.
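As a sketch of that division of responsibilities, the library below exposes only a plain function, and the calling application chooses to run it concurrently with Task.async_stream/2. Calculator.square/1 and MyApp.Report are hypothetical names:

```elixir
defmodule Calculator do
  # Library code: a pure function with no processes involved.
  def square(n) when is_number(n), do: n * n
end

defmodule MyApp.Report do
  # Caller code: the application decides whether and how to
  # parallelize. Results come back in the order of the input.
  def squares(numbers) do
    numbers
    |> Task.async_stream(&Calculator.square/1)
    |> Enum.map(fn {:ok, result} -> result end)
  end
end
```

A different caller could just as well use Enum.map/2 and run everything sequentially, without any change to the library.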

Avoid spawning unsupervised processes

You should avoid spawning processes outside of a supervision tree, especially long-running ones. Instead, processes must be started inside supervision trees. This guarantees developers have full control over the initialization, restarts, and shutdown of the system.

If your application does not have a supervision tree, one can be added by changing def application inside mix.exs to include a :mod key with the application callback name:

def application do
  [
    extra_applications: [:logger],
    mod: {MyApp.Application, []}
  ]
end

and then defining a lib/my_app/application.ex file with the following template:

defmodule MyApp.Application do
  # See https://hexdocs.pm/elixir/Application.html
  # for more information on OTP Applications
  @moduledoc false

  use Application

  def start(_type, _args) do
    children = [
      # Starts a worker by calling: MyApp.Worker.start_link(arg)
      # {MyApp.Worker, arg}
    ]

    # See https://hexdocs.pm/elixir/Supervisor.html
    # for other strategies and supported options
    opts = [strategy: :one_for_one, name: MyApp.Supervisor]
    Supervisor.start_link(children, opts)
  end
end

This is the same template generated by mix new --sup.

Each process started with the application must be listed as a child under the Supervisor above. We call those "static processes" because they are known upfront. For handling dynamic processes, such as the ones started during requests and other user inputs, look at the DynamicSupervisor module.
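A minimal sketch of the dynamic case is shown below. MyApp.Sessions and the MyApp.SessionSupervisor name are assumptions for this example, and an Agent stands in for whatever process would be started per request:

```elixir
defmodule MyApp.Sessions do
  # Hypothetical module that starts one process per session.
  @supervisor MyApp.SessionSupervisor

  # Lets the application list MyApp.Sessions as a static child;
  # the DynamicSupervisor itself is known upfront.
  def child_spec(_arg) do
    DynamicSupervisor.child_spec(strategy: :one_for_one, name: @supervisor)
  end

  # Starts a process on demand, under the DynamicSupervisor.
  def start_session(initial_state) do
    DynamicSupervisor.start_child(@supervisor, {Agent, fn -> initial_state end})
  end
end
```

With MyApp.Sessions added to the children list of the application supervisor, each call to MyApp.Sessions.start_session/1 starts a supervised process instead of an orphaned one.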

One of the few times where it is acceptable to start a process outside of a supervision tree is with Task.async/1 and Task.await/2. Unlike Task.start_link/1, the async/await mechanism gives you full control over the life cycle of the spawned process, which is also why you must always call Task.await/2 after starting a task with Task.async/1. Even so, if your application is spawning multiple async processes, you should consider using Task.Supervisor for better visibility when instrumenting and monitoring the system.
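Both variants can be sketched side by side. MyApp.Doubler is a hypothetical module; in the supervised version, the Task.Supervisor is assumed to have been started elsewhere, typically as a child in the application's supervision tree:

```elixir
defmodule MyApp.Doubler do
  # Hypothetical module for illustration.

  # Unsupervised async/await: every task started with
  # Task.async/1 is awaited by the process that started it.
  def double_all(numbers) do
    numbers
    |> Enum.map(fn n -> Task.async(fn -> n * 2 end) end)
    |> Enum.map(&Task.await/1)
  end

  # Same flow, but the tasks run under a Task.Supervisor,
  # so they show up when inspecting the supervision tree.
  def double_all_supervised(supervisor, numbers) do
    numbers
    |> Enum.map(fn n -> Task.Supervisor.async(supervisor, fn -> n * 2 end) end)
    |> Enum.map(&Task.await/1)
  end
end
```

In both cases the caller awaits every task it starts; the supervised version only changes where the task processes live.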

© 2012 Plataformatec
Licensed under the Apache License, Version 2.0.
https://hexdocs.pm/elixir/1.10.4/library-guidelines.html