Dialyzing legacy code

Wednesday, 3rd July, 2019

** Introduction

Typespecs are handy documentation for Erlang functions, but they really come to life when used with Dialyzer. Dialyzer analyzes a codebase and checks that functions behave according to their typespecs. This post runs quickly through using dialyzer on an existing codebase, my own LIGA project.
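As a reminder of what a typespec looks like, here is a minimal sketch. The module spec_demo and the function mean/1 are made up for illustration and are not part of LIGA.

```erlang
-module(spec_demo).
-export([mean/1]).

%% Hypothetical example, not taken from LIGA.
%% The spec says: given a non-empty list of numbers, return a float.
-spec mean([number(), ...]) -> float().
mean(Numbers) ->
    %% '/' always yields a float in Erlang, which matches the spec.
    lists:sum(Numbers) / length(Numbers).
```

Dialyzer compares a spec like this one with what it can infer from the function body and from the function's callers.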

** Preparation

First, build a PLT (Persistent Lookup Table) for the project. List the Erlang applications the project uses, and give the PLT a project-specific location:

$ dialyzer --build_plt --apps kernel stdlib erts eunit --output_plt .liga.plt
  Compiling some key modules to native code... done in 0m25.63s
  Creating PLT .liga.plt ...
Unknown functions:
  compile:file/2
  compile:forms/2
  compile:noenv_forms/2
  compile:output_generated/1
  crypto:block_decrypt/4
  crypto:start/0
Unknown types:
  compile:option/0
 done in 0m24.01s
done (passed successfully)

This will output a dialyzer PLT in a newly-created local file .liga.plt.

More applications can be added after building:

$ dialyzer --add_to_plt --apps compiler crypto --plt .liga.plt

It is possible to run Dialyzer over a whole codebase in one sweep. The simplest way is to give dialyzer a list of directories to analyse, e.g.:

$ dialyzer -r src/ test/ --src

The --src flag tells dialyzer to find and check .erl files (default is to check compiled .beam files).

Build tools like erlang.mk have wrappers too, e.g.:

$ make dialyze

When dialyzing a legacy codebase, the above is likely to produce a lot of warnings, so going module-by-module might be more manageable. Here is a simple workflow:

  1. $ dialyzer src/liga_intmap.erl
  2. [… edit liga_intmap.erl as desired …]
  3. $ make
  4. $ dialyzer --add_to_plt -c ebin/liga_intmap.beam --plt .liga.plt

The last step adds the functions in the module to the PLT. Once the whole codebase has been done for the first time, future checks in batch mode (i.e., after changes to the code) should return with few or no warnings.

When dialyzing a codebase module-by-module, we check each module, make any desired changes, then recompile the code and add the module’s beam file (along with any other updated beam files) to the project’s PLT. Dialyzer will issue warnings for any “unknown functions” (i.e., functions in modules it doesn’t know about). To avoid as many of these as possible, we go through the modules working up the dependency tree, starting at leaf modules (those without dependencies).

Grapherl can render a dependency tree of Erlang modules as a .png file, e.g.:

$ grapherl -m ebin/ liga.png

LIGA module dependency graph
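If a picture is not needed, the same information can be read straight from the compiled beam files. The following is only a sketch: the module deps_demo and the function deps/1 are hypothetical names, and it relies on beam_lib:chunks/2 with the imports chunk, which lists the external functions a module calls.

```erlang
-module(deps_demo).
-export([deps/1]).

%% Hypothetical helper, not part of LIGA or grapherl.
%% Returns the module name stored in a beam file together with the
%% other modules it calls, according to the beam file's import table.
-spec deps(file:filename()) -> {module(), [module()]}.
deps(BeamFile) ->
    {ok, {Mod, [{imports, Imports}]}} = beam_lib:chunks(BeamFile, [imports]),
    %% Keep only the module part of each {Module, Function, Arity}.
    Called = lists:usort([M || {M, _F, _A} <- Imports]),
    {Mod, Called -- [Mod]}.
```

The result includes OTP modules such as erlang and lists, so to find the leaf modules of a project the list would still need to be filtered down to the project’s own modules.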

Here are some example warnings I got when dialyzing LIGA.

liga_labmap.erl

$ dialyzer src/liga_labmap.erl
  Checking whether the PLT .liga.plt is up-to-date... yes
  Proceeding with analysis...
liga_labmap.erl:77: The pattern  can never match the type 
 done in 0m0.14s
done (warnings were emitted)

Dialyzer doesn’t see macros: it analyses the code after the preprocessor has expanded them. As the line defining ?VERSION as original is commented out, the clause of versioned_weights/3 that matches original can never be selected, so it appears to be superfluous.
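A cut-down sketch of the same situation follows. The module version_demo and its functions are invented for illustration; only the idea of a commented-out ?VERSION definition comes from LIGA.

```erlang
-module(version_demo).
-export([weights/1]).

%% -define(VERSION, original).   % commented out, as described above
-define(VERSION, revised).       % 'revised' is a made-up alternative

-spec weights([number()]) -> [number()].
weights(Ws) ->
    %% After preprocessing this is simply versioned(revised, Ws).
    versioned(?VERSION, Ws).

%% versioned/2 is not exported and has a single call site, so an
%% analysis of the module can see that its first argument is always
%% the atom 'revised'. The first clause is therefore dead code.
versioned(original, Ws) -> Ws;
versioned(revised, Ws) -> [W * 2 || W <- Ws].
```

On a module like this I would expect dialyzer to report that the pattern in the first clause of versioned/2 can never match, although the exact wording may differ between versions.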

data_server.erl

$ dialyzer src/data_server.erl
  Checking whether the PLT .liga.plt is up-to-date... yes
  Proceeding with analysis...
data_server.erl:82: The created fun has no local return
data_server.erl:83: The call data_server:get_with_complement(Lab::any(),Ra::any(),{'nm',_},0) breaks the contract (atom(),any(),pcage(),non_neg_integer() | 'all') -> {[labelled_string()],[labelled_string()]}
 done in 0m0.22s
done (warnings were emitted)

“no local return”
– this can mean that the function never returns, for example because it loops forever, in which case the typespec can mark the function’s return type as no_return(). It can also (perhaps more often) mean that dialyzer has worked out that the function cannot return normally, because something inside it is bound to fail or to violate a typespec. In my experience the next warning gives a clue to the cause.

“breaks the contract”
– the types of the arguments at a call site do not match the typespec of the function being called. The error might just be in the typespec annotations, or two parts of the codebase might have gotten out of sync. In either case it is important to resolve.
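The two warnings often travel together, as in this sketch. The module contract_demo and its functions are hypothetical and much simpler than the data_server code.

```erlang
-module(contract_demo).
-export([run/0]).

%% Hypothetical example, not taken from LIGA.
%% The spec promises that the label is an atom.
-spec label(atom(), [term()]) -> [{atom(), term()}].
label(Lab, Items) ->
    [{Lab, Item} || Item <- Items].

%% The call passes a string where the spec demands an atom. The body
%% of label/2 would accept any term, so the code runs, but the call
%% is at odds with the spec.
run() ->
    label("en", [a, b]).
```

On this module I would expect dialyzer to say that the call in run/0 breaks the contract of label/2, and, because dialyzer trusts the spec, that run/0 has no local return. Either the spec or the caller needs to change; which one depends on what the code was meant to do.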

liga_writer.erl

This warning is from erlang.mk’s ‘make dialyze’.

liga_writer.erl:18: Expression produces a value of type 'true' | {'error','bad_directory'}, but this value is unmatched

The line in question is:

    code:add_path("."),

Here the code is using a library function (code:add_path/1, from the kernel application) for its side effects rather than for its return value. The function does return a value though, and the code would be safer and clearer if the expected value were matched:

    true = code:add_path("."),
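Where a crash on failure is not wanted, both documented return values can be handled explicitly instead. This is a sketch only: the module path_demo, the function add_dir/1 and the shape of the error tuple are assumptions of mine, not LIGA code.

```erlang
-module(path_demo).
-export([add_dir/1]).

%% Hypothetical wrapper, not taken from liga_writer.erl.
%% code:add_path/1 returns true | {error, bad_directory}; both
%% outcomes are matched here, so no value is left unmatched.
-spec add_dir(file:filename()) -> ok | {error, {bad_directory, file:filename()}}.
add_dir(Dir) ->
    case code:add_path(Dir) of
        true -> ok;
        {error, bad_directory} -> {error, {bad_directory, Dir}}
    end.
```

If the result really does not matter, writing _ = code:add_path(".") states that intent and also silences the warning, which belongs to the class enabled by dialyzer’s -Wunmatched_returns option.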

** Conclusion

Dialyzer is quite simple to use, and helps improve the coherence and clarity of a codebase. As well as the documentation, the Dialyzer chapter of Learn You Some Erlang is worth a read.
