Sunday 29 November 2015

GNU make and java

Today I had another go at using meta-rules with GNU make to create an 'automatic makefile' system for building Java applications. Again I just "had a look" and now the day is all gone.

This is the current input for termz.

java_PROGRAMS=termz

termz_VERSION=0.0
termz_DISTADD=Makefile java.make \
 jni/Makefile \
 jni/termz-jni.c

# compiling
termz_SOURCES_DIRS=src
termz_RESOURCES_DIRS=src
termz_LIBADD=../zcl/dist/zcl.jar

# packaging (runtime)
termz_RT_MAIN=au.notzed.termz.TermZ
termz_RT_LIBADD=../zcl/dist/zcl.jar

# native targets
termz_PLATFORMS=gnu-amd64
# native libs, internal or external.  path:libname
termz_RT_JNIADD=../zcl/jni/bin:zcl jni/bin:termz

# Manually hook in the jni build
jni/bin/gnu-amd64/libtermz.so: build/termz_built
	$(MAKE) -C jni TARGET=gnu-amd64

clean::
	$(MAKE) -C jni clean

include java.make

make (jar)

This builds the class files in build/termz, then the jni libraries via the manual hook. In another staging area (build/termz_dist) it merges the classes and the resource files (stripping the leading paths properly). It uses javapackager to create an executable jar file from this staged tree which includes references to the RT_LIBADD libs (moved to lib/). Finally it copies the RT_LIBADD jar files into bin/lib/ and the native libraries into bin/lib/{platform} so that the jar can be executed directly - only java.library.path must be set.

Not shown, but an alternative target type is java_JARS instead of java_PROGRAMS. In that case jar is used to package up the class and resource files directly, so no staging is needed.

make tar

This tars up all the sources, resources, and DISTADD into a tar file. No staging is required and all files are taken 'in-place'. The tar file extracts to "termz-0.0/".

make clean

Blows away build and bin and (indirectly) jni/bin. All 'noise' is stored in these directories.

I'm using metaprogramming so that each base makefile can define multiple targets of different types. This is all using straight gnu make - there is no preprocessing or other tools required.
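As a rough illustration of the mechanism (not the real java.make, which is much larger), the core of this kind of metaprogramming is a template that is expanded once per program with $(call) and then fed to $(eval). The template name and the echo recipe below are hypothetical; the variable names follow the termz example above.

```make
# Normally set by the calling Makefile before "include java.make"
java_PROGRAMS=termz
termz_VERSION=0.0
termz_SOURCES_DIRS=src

.PHONY: all

# Hypothetical cut-down template. The first argument is the program name.
# A doubled dollar survives the call expansion and is only resolved
# when the generated text is evaluated as makefile source.
define java_program_template
$1_JAR ?= bin/$1-$$($1_VERSION).jar
all: $$($1_JAR)
$$($1_JAR): ; @echo would build $$@ from $$($1_SOURCES_DIRS)
endef

# Instantiate the template once for every listed program
$(foreach prog,$(java_PROGRAMS),$(eval $(call java_program_template,$(prog))))
```

Running make on this sketch should just print the jar name it would build for termz. Adding a second name to java_PROGRAMS gives a second, independent set of variables and rules, which is the whole point of going through a template.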

java.make does all the "magic" of course. While it's fairly straightforward ... it can also be a little obtuse and hairy at times. But the biggest difficulty is just deciding what to implement and the conventions to use for each of them. Even the variable names themselves.

There were a couple of messier problems that needed solving, although I'd solved the first one last time I looked at this some time ago.

class-relative names

For makefile operation the filenames need to be specified absolutely or relative to the Makefile itself. But for other operations such as building source jars one needs the class-relative name. The easiest approach is just to hard-code this to "src/" but I decided I wanted to add more flexibility than this allows.

The way I solved this problem was to have a separate variable which defines the possible roots of any sources or resources. Depending on what sort of representation I need I can then match and manipulate on these roots to form the various outputs. The only names specified by the user are the filenames themselves.

For example when forming a jar file in-place I need to be able to convert a resource name such as "src/au/notzed/termz/fonts/misc-fixed-semicondensed-6x13.png" into the argument sequence for jar, "-C" "src" "au/notzed/termz/fonts/...", so that it appears in the correct location in the jar file.

I use this macro:

# Call with $1=root list, $2=file list
define JAR_deroot=
    $$(foreach root,$1,\
 $$(patsubst $$(root)/%,-C $$(root) %,\
  $$(filter $$(root)/%,$2))) \
   $$(filter-out $$(addsuffix /%,$1),$2)
endef
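The doubled dollars in the macro above are there because it is expanded from inside a template that later goes through $(eval). To experiment with the same logic on its own, a single-dollar variant can be called directly. This is only a sketch: the name JAR_deroot_direct, the "res" root and the file names are made up for the test.

```make
# Stand-alone variant for direct use with $(call): $1=root list, $2=file list
JAR_deroot_direct = $(strip \
  $(foreach root,$1,\
    $(patsubst $(root)/%,-C $(root) %,$(filter $(root)/%,$2))) \
  $(filter-out $(addsuffix /%,$1),$2))

# Hypothetical inputs: two roots, one file outside any root
ROOTS := src res
FILES := src/au/notzed/termz/a.png res/icons/b.png README

# Printed while the makefile is being read
$(info $(call JAR_deroot_direct,$(ROOTS),$(FILES)))

# Dummy target so make has something to do
all: ;
```

This prints "-C src au/notzed/termz/a.png -C res icons/b.png README": files under a known root are rewritten into a -C pair, and anything else passes through untouched.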

I'll list the relevant bits of the template which lead up to this being used.

# Default if not set
$1_RESOURCES_ROOTS ?= $$($1_RESOURCES_DIRS)

# Searches for any files which aren't .java
$1_RESOURCES_SCAN := $$(if $$($1_RESOURCES_DIRS),$$(shell find $$($1_RESOURCES_DIRS) \
  \( -type d -name CVS -o -name '.*' \) -prune \
   -o -type f -a \! -name '*java' -a \! -name '*~' -a \! -name '.*' -print))

# Merge with any supplied explicitly
$1_RES:=$$($1_RESOURCES_SCAN) $$($1_RESOURCES)

# Build the jar
$$($1_JAR): $(stage)/$1_built
        ...
        jar cf ... \
          $(call JAR_deroot,$$($1_RESOURCES_ROOTS),$$($1_RES)) \
        ...

At this point I don't care about portability with the use of things like find. Perhaps I will look into guile-ified make in the future (like the current maintainer, I'm a firm believer in using make as the portability layer).

And some example output. The last 2 lines are the result of this JAR_deroot macro.

jar cf bin/termz-0.0.jar   -C build/termz . \
 -C src au/notzed/termz/fonts/misc-fixed-6x13-iso-8859-1.png \
 -C src au/notzed/termz/cl/render-terminal.cl 

A lot of this stuff is there to make using it easier. For example if you just want to find all the non-java files in a given directory root and that contains the package names already you can just specify name_RESOURCES_DIRS and that's it. But you could also list each file individually (there are good reasons to do this for "real" projects), put the resources in other locations or scatter them about and it all "just works".
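For instance, a calling Makefile that lists its resources explicitly and keeps some of them under a second root might look something like the fragment below. It only uses variables that appear in the template excerpt above; the "data" directory is hypothetical, and the rest of java.make is assumed to behave as described.

```make
java_PROGRAMS=termz

termz_SOURCES_DIRS=src

# Roots to strip when forming jar-relative names ("data" is hypothetical)
termz_RESOURCES_ROOTS=src data
# Explicit resource list, used here in place of a directory scan
termz_RESOURCES=src/au/notzed/termz/fonts/misc-fixed-6x13-iso-8859-1.png \
 data/au/notzed/termz/cl/render-terminal.cl

include java.make
```

With termz_RESOURCES_DIRS left unset the scan is skipped, and both files should end up in the jar under au/notzed/termz/ because each one matches one of the listed roots.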

How?

Just looking at the JAR_deroot macro ... what is it doing? It's more or less doing the following pseudo-java, but in an implicit/functional/macro sort of way and using make's functions. It's not my favourite type of programming it has to be said, so i'm sure experts may scoff.

  StringBuilder sb = new StringBuilder();
  // convert to root-relative paths, emitted as "-C root relative"
  for (String root: resourcerootslist) {
    for (String path: resourcelist) {
      if (path.startsWith(root+"/")) {
        // strip the "root/" prefix (String.replace() is literal, not a regex)
        String relative = path.substring(root.length() + 1);

        sb.append("-C ").append(root)
          .append(' ').append(relative).append(' ');
      }
    }
  }

  // include any which are not under one of the roots
resource:
  for (String path: resourcelist) {
    for (String root: resourcerootslist) {
      if (path.startsWith(root+"/")) {
         continue resource;
      }
    }
    // path is already relative to pwd
    sb.append(path).append(' ');
  }

  return sb.toString().trim();

This is obviously just a literal translation for illustrative purposes. Although one might notice the brevity of the make solution despite the apparent verbosity of each part.

Phew.

going native

Native libraries posed a similar but slightly different problem. I still need to know the physical file location but I also need to know the architecture, so I can properly form a multi-architecture runtime tree. I ummed and aahed over whether to create some basic system for storing native libraries inside jar files and resolving them at runtime, but I decided that it just isn't a good idea for a lot of significant reasons. Instead I will always store the libraries on disk and let loadLibrary() resolve the names via java.library.path. As it is still convenient to support multi-architecture installs, or at least distribution and testing, I decided on a simple naming scheme that puts a directory per architecture under lib/ and places any architecture-specific files there. This then only requires a simple java.library.path setup and prevents name clashes.

Ok, so the problem is then how to define both the architecture set and the library set in a way that lets make synthesise all the file-names, extensions (dll vs so), relative paths and manifest entries in a practical yet relatively flexible manner?

Lookup tables of course ... and some really messy and hard to read macro use. Oh well can't have everything.

Libraries are specified by a pair of values, the location of the directory containing the architecture name(s), and the base name of the library - in terms of System.loadLibrary(). These are encoded in the strings by joining them with a colon so that they can be specified in a single variable. The final piece is the list of platforms supported by the build, and each library must be present for all platforms - which is probably an unnecessary and inconvenient restriction in hindsight.

This is the bit of code which converts the list of libraries + platform names into platform-specific names in the correct locations. I'm not going to bother to explain this one in detail. It's pretty simple just yuck to read.

#
# lookup tables for platform native extensions
#
# - is remapped to _, so usage is:
#
#  $(_$(subst -,_,$(platform))_prefix) = library prefix
#  $(_$(subst -,_,$(platform))_suffix) = library suffix
#
_gnu_amd64_prefix=lib
_gnu_amd64_suffix=.so
_gnu_amd32_prefix=lib
_gnu_amd32_suffix=.so
_mingw32_amd64_prefix=
_mingw32_amd64_suffix=.dll
_mingw32_amd32_prefix=
_mingw32_amd32_suffix=.dll

# Actual jni libraries for dependencies
$1_JAR_JNI=$$(foreach p,$$($1_PLATFORMS), \
  $$(foreach l,$$($1_RT_JNIADD), \
   $$(firstword $$(subst :, ,$$l))/$$(p)/$$(_$$(subst -,_,$$p)_prefix)$$(lastword $$(subst :, ,$$l))$$(_$$(subst -,_,$$p)_suffix)))
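To see what that expression produces without the template machinery, here is a stand-alone sketch with single dollars. The helper name jni_path and the plain PLATFORMS, RT_JNIADD and JNI_LIBS variables are only for this test; the lookup tables are copied from above.

```make
_gnu_amd64_prefix=lib
_gnu_amd64_suffix=.so
_mingw32_amd64_prefix=
_mingw32_amd64_suffix=.dll

PLATFORMS := gnu-amd64 mingw32-amd64
RT_JNIADD := ../zcl/jni/bin:zcl jni/bin:termz

# $1=platform, $2=dir:libname
# Splits the pair on the colon, then looks up prefix and suffix
# through a computed variable name such as _gnu_amd64_prefix.
jni_path = $(firstword $(subst :, ,$2))/$1/$(_$(subst -,_,$1)_prefix)$(lastword $(subst :, ,$2))$(_$(subst -,_,$1)_suffix)

JNI_LIBS := $(foreach p,$(PLATFORMS),$(foreach l,$(RT_JNIADD),$(call jni_path,$p,$l)))

# Printed while the makefile is being read
$(info $(JNI_LIBS))

# Dummy target so make has something to do
all: ;
```

The list that comes out is ../zcl/jni/bin/gnu-amd64/libzcl.so, jni/bin/gnu-amd64/libtermz.so, ../zcl/jni/bin/mingw32-amd64/zcl.dll and jni/bin/mingw32-amd64/termz.dll, i.e. every library for every platform, which is exactly the restriction mentioned earlier.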

Thinking about it now as I'm typing it in, a simpler solution is possibly in order even if it means slightly more typing in the calling Makefile. But such is the way of the metamake neophyte and why it takes so long to get anywhere. This is already the 2nd approach I tried; you can get lost in this stuff all too easily. I was thinking I would need some of this extra information to automatically invoke the jni makefile as required, but I probably don't, or I can synthesise it from the path-names if they are restricted in a similar fashion, and so I can just get away with listing the physical library paths themselves.

Simple but restricted and messy to implement:

termz_PLATFORMS=gnu-amd64
termz_RT_JNIADD=../zcl/jni/bin:zcl jni/bin:termz

vs more typing, more flexibility, more consistency with other file paths, and a simpler implementation:

termz_RT_JNIADD=../zcl/jni/bin/gnu-amd64/libzcl.so \
  ../zcl/jni/bin/mingw32-amd64/zcl.dll \
  jni/bin/gnu-amd64/libtermz.so

Ahh, what's better? Does it matter? But nothing matters. Nothing matters.

Ok the second one is objectively better here isn't it?

After another look I came up with this to extract the platform directory name component:

$(lastword $(subst /, ,$(dir $(path))))
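As a sketch of how that could be used with the explicit path list, the following derives both the platform set and a runtime location for each library. The helper name jni_platform and the JNI_INSTALLED and PLATFORMS variables are hypothetical; the bin/lib/{platform} layout is the one described earlier for the jar target.

```make
RT_JNIADD := ../zcl/jni/bin/gnu-amd64/libzcl.so \
  ../zcl/jni/bin/mingw32-amd64/zcl.dll \
  jni/bin/gnu-amd64/libtermz.so

# $1=path to a native library; result is the name of its parent directory
jni_platform = $(lastword $(subst /, ,$(dir $1)))

# Where each library would land in the runtime tree
JNI_INSTALLED := $(foreach l,$(RT_JNIADD),bin/lib/$(call jni_platform,$l)/$(notdir $l))

# The platform set falls out of the paths; sort also removes duplicates
PLATFORMS := $(sort $(foreach l,$(RT_JNIADD),$(call jni_platform,$l)))

$(info $(JNI_INSTALLED))
$(info $(PLATFORMS))

# Dummy target so make has something to do
all: ;
```

The first line printed is bin/lib/gnu-amd64/libzcl.so bin/lib/mingw32-amd64/zcl.dll bin/lib/gnu-amd64/libtermz.so and the second is gnu-amd64 mingw32-amd64, so a separate platform list and the prefix/suffix lookup tables would no longer be needed.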

make'n it work

One other fairly large drawback of programming make this way is the abysmal error reporting. If you're lucky you get a reference to the line which expands the macro. So it's a lot of hit and miss debugging, but that's something I've been doing since my Commodore days as a kid and it's how I usually work if I can get away with it (which only works if building is fast, and is why I find build speed so important in the first place).
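One trick that can take some of the guesswork out of it is to print the text a template generates instead of (or before) evaluating it, so the makefile source that make actually sees can be read directly. A minimal sketch, with a hypothetical template and a hypothetical DEBUG_TEMPLATE switch:

```make
java_PROGRAMS := termz

.PHONY: all

# Hypothetical stand-in for a real template
define java_template
$1_JAR ?= bin/$1.jar
all: $$($1_JAR)
$$($1_JAR): ; @echo would build $$@
endef

# With "make DEBUG_TEMPLATE=1" the generated text is printed first
ifdef DEBUG_TEMPLATE
$(foreach p,$(java_PROGRAMS),$(info $(call java_template,$p)))
endif

$(foreach p,$(java_PROGRAMS),$(eval $(call java_template,$p)))
```

GNU make's own options help a little too: make -p dumps the variables and rules that resulted from all the expansion, and --warn-undefined-variables catches misspelt variable names inside a template.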

And if you think all of that looks pretty shit try looking at any of the dumb tools created in the java world in the last 20 years. Jesus.

damn work

It seems I hadn't had enough of the terminal after the post last night - I was up till 3am poking at it - basically another whole full-time day. I created a custom terminfo/termcap - basically just started with xterm and deleted shit I didn't think I cared for (like anything mouse or old bandwidth savers that looked like too much effort). But I looked up each obtuse entry and did some basic testing to make sure each function I left in worked as well as I could tell. Despite the documentation there are again a lot of details missing and I had to repeatedly cross-check with xterm behaviour. Things like the way limiting the scroll region works. And I just had a lot of bugs anyway from shoddy maths I needed to fix. I then went to look at some changes and netbeans had been too fat so I went down the rabbit-hole of writing meta makefiles so I could do some small tests in emacs ... and never got that far.

I really needed a proper break from work-like-activities this weekend too, and a lot more sleep. At least I did water the garden, mow the lawn, wash my undies, and run the dishwasher.
