
Debugging Julia with Address Sanitizer

Address sanitizer is a useful tool for debugging various memory problems, from invalid accesses to mismanagement or leaks. It is similar to Valgrind’s memcheck, but uses compile-time instrumentation to lower the cost.

In this post I’ll explain how to use Clang’s address sanitizer (or ASAN) with Julia. This is somewhat tricky, as the Julia compiler itself uses LLVM for code generation. Long story short, this implies that all instances of LLVM (i.e. the one Julia is compiled with, and the one used for code generation) have to match up exactly for the instrumentation to work as expected.

LLVM toolchain

We’ll start by building a toolchain to compile Julia with. As mentioned before, all LLVM instances in play have to match up exactly for instrumentation to work, so we’ll use Julia’s build infrastructure to generate an LLVM toolchain.

Start by checking out Julia, and creating an out-of-tree build directory:

$ git clone https://github.com/JuliaLang/julia
$ cd julia
$ make O=sanitize_toolchain configure

This build will need to provide clang, so create a Make.user in the build directory containing BUILD_LLVM_CLANG=1. In addition, LLVM does not build its sanitizers with autotools, so add override LLVM_USE_CMAKE=1 to that file as well. And because that triggers LLVM bug #23649, also add USE_LLVM_SHLIB=0.
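As a minimal sketch, the three settings above can be written out in one go. It assumes the toolchain build directory is called sanitize_toolchain and sits inside the Julia checkout; the BUILD_DIR variable is only a convenience introduced for this example.

```shell
# Assumption: the toolchain build directory is ./sanitize_toolchain,
# relative to the Julia checkout. Override BUILD_DIR if yours differs.
BUILD_DIR="${BUILD_DIR:-sanitize_toolchain}"
mkdir -p "$BUILD_DIR"

# Write the Make.user described above (this overwrites an existing file).
cat > "$BUILD_DIR/Make.user" <<'EOF'
# build clang alongside LLVM
BUILD_LLVM_CLANG=1
# the sanitizers are not built by LLVM's autotools build
override LLVM_USE_CMAKE=1
# work around LLVM bug #23649
USE_LLVM_SHLIB=0
EOF
```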

Now execute make install-llvm from the deps subfolder of the build directory. When it finishes, check whether the binaries were written to usr/bin instead of usr/tools (probably due to a bug in LLVM’s build scripts); if so, move them to usr/tools.
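The check-and-move step could be scripted roughly as follows. The helper name relocate_tools is made up for this example, and it assumes that everything in usr/bin belongs to the toolchain, which should hold for a build directory that has only run install-llvm.

```shell
# Hypothetical helper: move anything found in <build-dir>/usr/bin
# over to <build-dir>/usr/tools. Does nothing if usr/bin is absent or empty.
relocate_tools() {
    prefix="$1/usr"
    [ -d "$prefix/bin" ] || return 0
    mkdir -p "$prefix/tools"
    for f in "$prefix"/bin/*; do
        # an unmatched glob stays literal, so skip non-existent entries
        [ -e "$f" ] || continue
        mv "$f" "$prefix/tools/"
    done
}
```

After make install-llvm has finished, calling relocate_tools with the path of the toolchain build directory performs the move; running it a second time is harmless.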

Sanitized Julia

Now that we have a working toolchain, we’ll use it to compile a sanitized version of the Julia compiler and libraries. Start by creating a new out-of-tree build directory using make O=sanitize configure. This time, our Make.user will be significantly more complex:

TOOLCHAIN=$(BUILDROOT)/sanitize_toolchain/usr/tools

# use our new toolchain
USECLANG=1
override CC=$(TOOLCHAIN)/clang
override CXX=$(TOOLCHAIN)/clang++
export ASAN_SYMBOLIZER_PATH=$(TOOLCHAIN)/llvm-symbolizer

# enable ASAN
override SANITIZE=1
override LLVM_SANITIZE=1

# autotools doesn't have a self-sanitize mode
override LLVM_USE_CMAKE=1

# make the GC use regular malloc/frees, which are intercepted by ASAN
override WITH_GC_DEBUG_ENV=1

# default to a debug build for better line number reporting
override JULIA_BUILD_MODE=debug

Now kick off the build using make from the sanitize build directory. Barring any memory issues triggered during system image generation, this should yield a sanitized julia binary and system image.

Running the test-suite

The test-suite is a beast, and because ASAN keeps track of a lot of information, it easily takes over 128GiB of memory to run it to completion. Instead, we’ll tune ASAN to consume less memory at the expense of accuracy and report detail.

However, Julia already configures default ASAN options, which we need to copy when specifying a different set. Do so by setting the ASAN_OPTIONS environment variable to detect_leaks=0:allow_user_segv_handler=1:fast_unwind_on_malloc=0:malloc_context_size=2. This copies the aforementioned default values, and caps backtrace collection.
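In a POSIX shell, setting the variable for the current session looks like this; the value is copied verbatim from the paragraph above.

```shell
# malloc_context_size=2 limits the backtrace stored for each
# allocation/deallocation to two frames, which is what saves memory.
ASAN_OPTIONS="detect_leaks=0:allow_user_segv_handler=1:fast_unwind_on_malloc=0:malloc_context_size=2"
export ASAN_OPTIONS
```

Any test run started from the same shell afterwards picks up these options.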

Using CUDA packages

If you thought all that was convoluted, prepare for some more. ASAN uses so-called shadow memory to store information about memory allocations. There is a correspondence between regular memory addresses and their shadow counterpart, and this mapping is fixed in order to keep the instrumentation overhead low. Sadly, the default shadow memory location overlaps with fixed memory allocated by CUDA (presumably for its unified virtual address space).
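To make the fixed mapping concrete: ASAN maps every 8 bytes of application memory to one shadow byte, at (address >> 3) + offset. The sketch below evaluates that formula for the two x86_64 offsets named in the patch further down. The numeric values are my recollection of the constants in LLVM’s AddressSanitizer.cpp and should be treated as assumptions, and the example address is arbitrary.

```shell
# Shadow address for an application address: (addr >> 3) + offset.
# Usage: shadow_addr <addr> <offset>   (helper name is hypothetical)
shadow_addr() {
    printf '0x%x\n' "$(( ($1 >> 3) + $2 ))"
}

# Offset values assumed from LLVM's source; verify against your version.
SMALL_OFFSET=0x7fff8000          # kSmallX86_64ShadowOffset
DEFAULT_OFFSET=$(( 1 << 44 ))    # kDefaultShadowOffset64

ADDR=0x7f0000001000              # arbitrary example address

shadow_addr "$ADDR" "$SMALL_OFFSET"
shadow_addr "$ADDR" "$DEFAULT_OFFSET"
```

Switching the offset moves the whole shadow region to a different part of the address space, which is how a different choice can avoid the overlap with CUDA’s allocations.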

Because the shadow memory location is fixed, we need to patch both instances of LLVM (the easiest way is to add a patch to llvm.mk) and have them pick a different shadow offset:

--- lib/Transforms/Instrumentation/AddressSanitizer.cpp
+++ lib/Transforms/Instrumentation/AddressSanitizer.cpp
@@ -359,7 +359,7 @@
       if (IsKasan)
         Mapping.Offset = kLinuxKasan_ShadowOffset64;
       else
-        Mapping.Offset = kSmallX86_64ShadowOffset;
+        Mapping.Offset = kDefaultShadowOffset64;
     } else if (IsMIPS64)
       Mapping.Offset = kMIPS64_ShadowOffset64;
     else if (IsAArch64)
--- projects/compiler-rt/lib/asan/asan_mapping.h
+++ projects/compiler-rt/lib/asan/asan_mapping.h
@@ -146,7 +146,7 @@
 #  elif SANITIZER_IOS
 #    define SHADOW_OFFSET kIosShadowOffset64
 #  else
-#   define SHADOW_OFFSET kDefaultShort64bitShadowOffset
+#   define SHADOW_OFFSET kDefaultShadowOffset64
 #  endif
 # endif
 #endif

Note that you might need to change a different macro for your platform.

Sanitizing older versions of Julia

If you want to sanitize older versions of Julia, from before the switch to LLVM 3.9, there are yet other issues: only LLVM 3.9 is compatible with recent versions of glibc, while the CMake build system of LLVM 3.7 doesn’t export all necessary public symbols. You can work around these issues by using a sufficiently old system, and either overriding the LLVM version to 3.8 (by specifying override LLVM_VER=3.8.1 in the Make.user of both build directories) or preventing LLVM from generating a shared library (by specifying USE_LLVM_SHLIB=0 in the Make.user of the final Julia build).