Yardanico's blog

Amalgamating Nim programs

Posted at Mar 19, 2021

# Intro

Over the last few years, there have been a couple of threads on the Nim forum about making a single, more or less self-contained C file from a Nim program. Most of the time, the goal was to take part in a competition that doesn’t accept Nim, or to write a C assignment in Nim.

Today I finally decided to try to do this myself and, to my surprise, succeeded.

# Starting out

Amalgamation is a technique for combining a whole project (usually written in C or C++) into a few source files, usually just one. It can make distributing a library easier and can even improve performance, although you can usually achieve similar results by enabling Link-Time Optimization.

The most popular library out there that can be amalgamated is probably SQLite. In SQLite’s case, this is achieved by carefully separating the code into different files that still work when combined into a single file.
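To see why that care is needed, here is a minimal sketch of what goes wrong with naive concatenation. The two C files and their function names are made up for illustration; they are not taken from SQLite or Nim.

```shell
# Two hypothetical translation units that each define a file-local helper
# with the same name. Compiled separately they are fine, because `static`
# limits each helper to its own file.
workdir=$(mktemp -d)
cat > "$workdir/a.c" <<'EOF'
static int helper(void) { return 1; }
int a_value(void) { return helper(); }
EOF
cat > "$workdir/b.c" <<'EOF'
static int helper(void) { return 2; }
int b_value(void) { return helper(); }
EOF

# Naive "amalgamation": plain concatenation into one file.
cat "$workdir/a.c" "$workdir/b.c" > "$workdir/all.c"

# The combined file now contains two definitions of `helper` in a single
# translation unit, which a C compiler rejects as a redefinition.
dupes=$(grep -c '^static int helper' "$workdir/all.c")
echo "definitions of helper in all.c: $dupes"
```

Hand-written amalgamations avoid this kind of clash through careful naming, while a merging tool has to resolve it for you.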

It is certainly possible to adjust Nim’s C backend to output C files that can be combined into a single one and compiled, but that would take some effort. So I didn’t go this route and instead searched for a program that can amalgamate separate C files by itself. I found one: C Intermediate Language (CIL).

The project is aimed at creating a simplified subset of C, but it has a feature called “merger” that we’ll be using.

# Building CIL

First of all, we need to compile CIL itself. I chose to use a maintained fork instead of the original so that it’s more likely that it works :)

For building CIL, I just followed the installation section in the README. Here’s the listing of what I did (might or might not work for you):

# Install OCaml and opam on Arch
$ sudo pacman -Sy ocaml opam

# Clone the repo
$ git clone https://github.com/goblint/cil && cd cil

# Create a local opam environment ("switch"). This will take a while
$ opam switch create . 

# Add local opam environment to the current shell
# This command is specific to the fish shell; in POSIX shells use $(...) instead of (...)
$ eval (opam env)
# Configure the compilation, the prefix is the _opam directory
# in CWD (because we use `opam switch`)
# Again, in POSIX shells replace (...) with $(...)
$ ./configure --prefix=(opam config var prefix)

# Build CIL
$ make
# This make install fails for me, but it's needed for commands below
$ make install
# To fix the error in the previous step we need to remove the dataslices dir
$ rm -r _opam/lib/goblint-cil/dataslices/
# Finally install it into currentdir/_opam/bin/
$ make install

# Add it to our PATH (your directory will of course be different)
$ set PATH /home/dian/Stuff/cil/_opam/bin $PATH
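The last line of the listing uses fish syntax. For POSIX shells, a rough equivalent might look like the sketch below. `CIL_BIN` is a hypothetical variable introduced here, and the default assumes you run it from the CIL checkout.

```shell
# In POSIX shells, the fish line `eval (opam env)` becomes: eval "$(opam env)"

# CIL_BIN is a hypothetical variable; point it at your own _opam/bin directory.
CIL_BIN="${CIL_BIN:-$PWD/_opam/bin}"

# Prepend the directory only if it is not on PATH already.
case ":$PATH:" in
  *":$CIL_BIN:"*) ;;
  *) PATH="$CIL_BIN:$PATH"; export PATH ;;
esac

# Check whether cilly is reachable now.
if command -v cilly >/dev/null 2>&1; then
  echo "cilly found at $(command -v cilly)"
else
  echo "cilly is not on PATH yet; check that the build installed into $CIL_BIN"
fi
```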

# Patching nimbase.h

Now we’re almost ready to use it for Nim programs! There’s one thing left - we need to patch nimbase.h (it’s a file that provides some generic defines for different compilers on different platforms). You can usually find nimbase.h in your_nim_dist/lib/nimbase.h.

diff --git a/lib/nimbase.h b/lib/nimbase.h
index a83bd3006..27bd48c77 100644
--- a/lib/nimbase.h
+++ b/lib/nimbase.h
@@ -75,10 +75,10 @@ __AVR__
 #endif
 /* ------------------------------------------------------------------------- */
 
-#if defined(__GNUC__) && !defined(__ZEPHYR__)
+//#if defined(__GNUC__) && !defined(__ZEPHYR__)
 /* Zephyr does some magic in it's headers that override the GCC stdlib. This breaks that. */
-#  define _GNU_SOURCE 1
-#endif
+//#  define _GNU_SOURCE 1
+//#endif
 
 #if defined(__TINYC__)
 /*#  define __GNUC__ 3
@@ -196,7 +196,7 @@ __AVR__
 #  define N_LIB_EXPORT_VAR  __declspec(dllexport)
 #  define N_LIB_IMPORT  extern __declspec(dllimport)
 #else
-#  define N_LIB_PRIVATE __attribute__((visibility("hidden")))
+#  define N_LIB_PRIVATE
 #  if defined(__GNUC__)
 #    define N_CDECL(rettype, name) rettype name
 #    define N_STDCALL(rettype, name) rettype name
@@ -325,8 +325,8 @@ namespace USE_NIM_NAMESPACE {
 typedef unsigned char NIM_BOOL; // best effort
 #endif
 
-NIM_STATIC_ASSERT(sizeof(NIM_BOOL) == 1, ""); // check whether really needed
-NIM_STATIC_ASSERT(CHAR_BIT == 8, "");
+//NIM_STATIC_ASSERT(sizeof(NIM_BOOL) == 1, ""); // check whether really needed
+//NIM_STATIC_ASSERT(CHAR_BIT == 8, "");
   // fail fast for (rare) environments where this doesn't hold, as some implicit
   // assumptions would need revisiting (e.g. `uint8` or https://github.com/nim-lang/Nim/pull/18505)
 
@@ -547,7 +547,7 @@ static inline void GCGuard (void *ptr) { asm volatile ("" :: "X" (ptr)); }
 #endif
 
 // Test to see if Nim and the C compiler agree on the size of a pointer.
-NIM_STATIC_ASSERT(sizeof(NI) == sizeof(void*) && NIM_INTBITS == sizeof(NI)*8, "");
+//NIM_STATIC_ASSERT(sizeof(NI) == sizeof(void*) && NIM_INTBITS == sizeof(NI)*8, "");
 
 #ifdef USE_NIM_NAMESPACE
 }

The main change is that we no longer define _GNU_SOURCE, because CIL doesn’t support most GNU-specific constructs. It also can’t handle the NIM_STATIC_ASSERT macro, so the static assertions are commented out, and N_LIB_PRIVATE is defined as empty so that the visibility attribute is dropped.

Save the patch above as nimbase.diff and apply it with git apply /path/to/nimbase.diff while in the Nim distribution directory.
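Since this modifies a file inside your Nim installation, it may be worth checking the patch and keeping a backup first. The helper below is a sketch of one way to do that. `apply_nimbase_patch` and the `.orig` backup name are my own choices, not part of Nim or CIL.

```shell
# Hypothetical helper: verify the patch applies cleanly, back up nimbase.h,
# then apply the patch.
# $1 = Nim distribution directory
# $2 = absolute path to the patch (absolute because we cd into $1)
apply_nimbase_patch() {
  nim_dir=$1
  patch_file=$2
  if [ ! -f "$nim_dir/lib/nimbase.h" ]; then
    echo "no lib/nimbase.h under $nim_dir" >&2
    return 1
  fi
  if [ ! -f "$patch_file" ]; then
    echo "patch not found: $patch_file" >&2
    return 1
  fi
  (
    cd "$nim_dir" || exit 1
    # --check only tests whether the patch would apply; it changes nothing.
    git apply --check "$patch_file" || exit 1
    cp lib/nimbase.h lib/nimbase.h.orig || exit 1
    git apply "$patch_file"
  )
}
```

Call it as `apply_nimbase_patch /path/to/Nim /path/to/nimbase.diff`. To undo the change later, copy lib/nimbase.h.orig back over lib/nimbase.h.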

# Hello, Amalgamation!

Now that we have all the necessary things prepared, we can test out our amalgamation setup.

First of all, since Nim expects the C compiler to be a single executable file, we need to create a wrapper script, cilly.sh, that calls cilly (CIL’s main tool) with all the needed arguments:

#!/usr/bin/env sh
cilly --noPrintLn --merge --keepmerged "$@"

Save it somewhere in your $PATH and make it executable (chmod +x cilly.sh).
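If you prefer to script that step, a small sketch could look like this. `install_cilly_wrapper` is a hypothetical helper name, and the target directory is whatever directory on your PATH you choose.

```shell
# Hypothetical helper: write the wrapper script into the given directory
# and mark it executable. $1 = a directory that is on your PATH.
install_cilly_wrapper() {
  bin_dir=$1
  mkdir -p "$bin_dir" || return 1
  # Quoted 'EOF' keeps "$@" literal inside the generated script.
  cat > "$bin_dir/cilly.sh" <<'EOF'
#!/usr/bin/env sh
cilly --noPrintLn --merge --keepmerged "$@"
EOF
  chmod +x "$bin_dir/cilly.sh"
}
```

For example, `install_cilly_wrapper "$HOME/.local/bin"` would work if that directory is on your PATH.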

Now let’s create our first amalgamated Nim program:

# hello.nim
echo "Hello, Amalgamation!"

We also need to specify that we want to use cilly.sh as our C compiler, so let’s go ahead and create the configuration for our Nim file:

# hello.nim.cfg
gcc.exe="cilly.sh"
gcc.linkerexe="cilly.sh"

We could’ve provided those on the CLI, but it’s better to create a separate config file.
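For reference, the CLI variant would pass the same two settings as switches. The wrapper below is only a sketch; `build_with_cilly` is a hypothetical name, and it assumes both nim and cilly.sh are on your PATH.

```shell
# Hypothetical helper: same settings as hello.nim.cfg, passed on the
# command line instead. $1 = the Nim source file to compile.
build_with_cilly() {
  src=$1
  if [ ! -f "$src" ]; then
    echo "no such file: $src" >&2
    return 1
  fi
  nim c --gcc.exe:cilly.sh --gcc.linkerexe:cilly.sh -d:danger "$src"
}
```

The config file has the advantage that a plain nim c hello.nim keeps working without having to remember the switches.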

And let’s finally compile our Nim program and create an amalgamation:

nim c -d:danger hello.nim

There might be some warnings, but that’s fine. After the compilation we’ll have a new file called hello_comb.c in our directory, and that’s the amalgamation that we wanted to get!
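A quick way to check that the amalgamation really is self-contained is to build it with a regular C compiler, without Nim or CIL involved. This is a sketch under assumptions: `build_amalgamation` is a hypothetical helper, and the -ldl flag is my guess at what a Nim program typically needs on Linux; your program may need other libraries or none at all.

```shell
# Hypothetical helper: compile the merged C file on its own.
# $1 = amalgamated C file (e.g. hello_comb.c), $2 = output binary name.
# CC can be set to pick a specific compiler; it defaults to cc.
build_amalgamation() {
  src=$1
  out=$2
  if [ ! -f "$src" ]; then
    echo "amalgamation not found: $src" >&2
    return 1
  fi
  # -ldl is an assumption, see the note above.
  "${CC:-cc}" "$src" -o "$out" -ldl
}
```

For example, `build_amalgamation hello_comb.c hello_standalone` followed by `./hello_standalone`. Keep in mind that the merged file already contains the system declarations of the machine it was generated on, so it is not expected to build everywhere.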

Now, if you check the line count of the file, you might be surprised - it’s 6.6K lines long (at the time of writing).

Why is that? The main reason is that Nim treats C as a backend and generates many of its own types and functions. The line count is actually not that large, considering that CIL merges all the C files of your Nim program together with the system includes. To reduce it even more, you might want to consider using some of these options:

After adding those two options, the resulting line count is down to 3.4K lines of code. You can check out the resulting C file in this gist.
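To compare the size of two amalgamations yourself, something like the following sketch is enough. The helper names and the idea of keeping a copy of the first build are my own assumptions.

```shell
# Print the number of lines in a file, without surrounding whitespace.
line_count() {
  wc -l < "$1" | tr -d '[:space:]'
}

# Print both counts and the difference between them.
# $1 = amalgamation from the first build, $2 = amalgamation from the second.
compare_line_counts() {
  before=$(line_count "$1")
  after=$(line_count "$2")
  echo "before: $before lines, after: $after lines, saved: $((before - after))"
}
```

For example, copy the first hello_comb.c to hello_comb_before.c, rebuild with the extra options, and run `compare_line_counts hello_comb_before.c hello_comb.c`.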

# Afterword

While this is an interesting matter, I think that amalgamations are rarely useful. It’s really hard to make them portable, and they can lead to more weird bugs. That said, you can use them if you really, really need to have a single C file.

Some relevant discussions:

Thanks for reading!