In-JAR signing for the JVM

The latest Conveyor update introduces a new solution for an old problem faced by many JVM apps: how to code sign native libraries bundled inside JARs?

Problem

Both Windows and macOS want all native code in an app to be signed. If you don’t do it then you get security errors on macOS, and on Windows virus scanners may interfere with your program or even simply delete it from disk without warning.

In the JVM ecosystem it’s typical to distribute native libraries inside JARs along with a bit of loader code that extracts the right file to somewhere on disk. This is convenient for the developer and works well with the Maven infrastructure. Unfortunately other code signing and packaging tools like jpackage will miss native libraries inside JARs. This prevents notarization on macOS, and thus prevents the app from being distributable, because Apple’s notarization servers do in fact look inside zips for unsigned code.
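To make the pattern concrete, here is a minimal sketch of the kind of loader code such libraries ship. It is an illustration only: the class name, the "native/<os>-<arch>/" resource layout and the library name are all assumptions, since every library invents its own conventions.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Locale;

// Hypothetical loader written for illustration; not taken from any real library.
final class BundledNativeLoader {
    private BundledNativeLoader() {}

    // Builds a resource path such as "native/linux-x86_64/libdemo.so".
    // The directory layout is an assumption made for this sketch.
    static String resourcePath(String osName, String osArch, String libName) {
        String os = osName.toLowerCase(Locale.ROOT);
        String arch = osArch.toLowerCase(Locale.ROOT);
        String osPart;
        String fileName;
        if (os.contains("mac") || os.contains("darwin")) {
            osPart = "osx";
            fileName = "lib" + libName + ".dylib";
        } else if (os.contains("win")) {
            osPart = "windows";
            fileName = libName + ".dll";
        } else {
            // Everything else is treated as Linux to keep the sketch short.
            osPart = "linux";
            fileName = "lib" + libName + ".so";
        }
        String archPart = (arch.equals("amd64") || arch.equals("x86_64")) ? "x86_64" : arch;
        return "native/" + osPart + "-" + archPart + "/" + fileName;
    }

    // Copies the bundled library out of the JAR to a temporary file, then loads it.
    static void load(String libName) throws IOException {
        String resource = resourcePath(
                System.getProperty("os.name"), System.getProperty("os.arch"), libName);
        try (InputStream in =
                     BundledNativeLoader.class.getClassLoader().getResourceAsStream(resource)) {
            if (in == null) {
                throw new UnsatisfiedLinkError("No bundled library at " + resource);
            }
            String suffix = resource.substring(resource.lastIndexOf('.'));
            Path target = Files.createTempFile("native-" + libName, suffix);
            target.toFile().deleteOnExit();
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
            System.load(target.toAbsolutePath().toString());
        }
    }
}
```

The file that reaches System.load is a copy of whatever bytes sit inside the JAR, so if the packaging tool never looked inside the JAR, the code that ends up being loaded is unsigned.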

Extracting libraries at runtime has other costs too: it increases startup time, pollutes the user’s home directory, bloats downloads by shipping code for machines the user doesn’t have, can trigger anti-virus scanners on Windows and can break Apple’s library validation security system.

Previous solution

Until now Conveyor solved these problems by searching JARs for native libraries, extracting those that target the right OS and CPU architecture whilst deleting the rest, placing the libraries in the JVM’s lib directory (where the other JNI libraries used by the Java platform are found) and then signing the results. This works well and yields a fully signed app with optimized download and startup times. Unfortunately it has a downside: it can break third party libraries that don’t expect their native components to already be available for loading without extra work.
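The first step of that process, finding native libraries inside a JAR, can be sketched as follows. This is not Conveyor's implementation: the class name is hypothetical and matching on file extension is a deliberately crude heuristic chosen for the example (it would miss versioned names such as libfoo.so.1, for instance).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical scanner written for illustration.
final class NativeEntryScanner {
    private NativeEntryScanner() {}

    // Extensions commonly used for JNI libraries; an assumption of this sketch.
    private static final List<String> EXTENSIONS = List.of(".dll", ".dylib", ".jnilib", ".so");

    // True if the entry name ends in one of the extensions above (case-insensitive).
    static boolean looksNative(String entryName) {
        String name = entryName.toLowerCase(Locale.ROOT);
        for (String ext : EXTENSIONS) {
            if (name.endsWith(ext)) {
                return true;
            }
        }
        return false;
    }

    // Lists the entries of a JAR that look like native libraries, in archive order.
    static List<String> nativeEntries(Path jar) throws IOException {
        List<String> found = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(Files.newInputStream(jar))) {
            for (ZipEntry e = zip.getNextEntry(); e != null; e = zip.getNextEntry()) {
                if (!e.isDirectory() && looksNative(e.getName())) {
                    found.add(e.getName());
                }
            }
        }
        return found;
    }
}
```

A real tool then has to decide which OS and CPU each match targets, which the file name alone often does not reveal.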

This breakage is usually fixable by supplying a system property that tells the Java library where its pre-extracted native code can be found, but it can be easy to forget and in a few libraries it’s not even possible. If a library isn’t compatible with extraction then the app will unexpectedly break when packaged - the exact scenario we don’t want.
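From the library's side, honoring such a property might look like the sketch below. The property name com.example.demo.library.path and the class are hypothetical, and the example assumes the property holds the absolute path of the library file. A library whose loader has no branch like this is one that cannot be made compatible with extraction from the outside.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

// Hypothetical loader written for illustration.
final class NativePathOverride {
    private NativePathOverride() {}

    // Hypothetical property name; real libraries each define their own.
    static final String PROPERTY = "com.example.demo.library.path";

    // Returns the pre-extracted library if the property value names an existing file.
    static Optional<Path> preExtracted(String propertyValue) {
        if (propertyValue == null || propertyValue.isBlank()) {
            return Optional.empty();
        }
        Path candidate = Paths.get(propertyValue);
        return Files.isRegularFile(candidate)
                ? Optional.of(candidate.toAbsolutePath())
                : Optional.empty();
    }

    // Prefers the pre-extracted file; otherwise runs the usual extract-from-JAR path.
    static void load(Runnable extractFromJarAndLoad) {
        Optional<Path> lib = preExtracted(System.getProperty(PROPERTY));
        if (lib.isPresent()) {
            System.load(lib.get().toString());
        } else {
            extractFromJarAndLoad.run();
        }
    }
}
```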

New solution

In-JAR signing

In the February update we introduced a new feature that makes shipping JVM apps even easier: Conveyor can now sign code inside JARs, without extracting the libraries. In this mode it also won’t delete any libraries, so the changes made to the app are minimal. Apps just work this way, and it’s now the default for new projects. You can still turn on the previous behavior using app.jvm.extract-native-libraries = true and we recommend you do that whenever possible.

Please note that existing projects aren’t affected by this change to the defaults, because it only applies when the conveyor.compatibility-level key is set to 7 or higher. New projects get this key set automatically to the current Conveyor major version, while existing projects keep the level they already declare, so upgrading is safe and won’t risk breaking anything.

Sysprops project

We also started collecting system properties needed by popular libraries when extract-native-libraries = true in a shared config file. You can copy/paste the parts you need, or import them all directly like this:

include required("https://raw.githubusercontent.com/hydraulic-software/conveyor/master/configs/jvm/extract-native-libraries.conf")

The exact system property needed by each library varies but can usually be found in the README. When the property must hold the absolute path of the library or library directory, the <libpath> token is useful. It will be replaced at runtime with the path of the directory where the library can be found. In other cases libraries can be told to do a search on the “system path”, which for packaged apps means finding libraries in the same place as the bundled JVM.
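One common mechanism behind a “system path” search is System.loadLibrary, which looks for the platform-specific file name of a library in the directories listed in the java.library.path system property. The sketch below reimplements that lookup in simplified form so the steps are visible. The class name is hypothetical and the real JVM logic has more cases than this.

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

// Hypothetical helper written for illustration; a simplification of the JVM's own search.
final class SystemPathLookup {
    private SystemPathLookup() {}

    // Searches a path list in java.library.path format for the given library name.
    static Optional<Path> find(String libraryPath, String libName) {
        // Maps "demo" to the platform's file name, e.g. "libdemo.so" on Linux.
        String fileName = System.mapLibraryName(libName);
        for (String dir : libraryPath.split(File.pathSeparator)) {
            if (dir.isEmpty()) {
                continue;
            }
            Path candidate = Paths.get(dir, fileName);
            if (Files.isRegularFile(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }
}
```

Calling SystemPathLookup.find(System.getProperty("java.library.path"), "demo") shows where a library named demo would be picked up, if anywhere. This also illustrates why a library that uses nonstandard file names cannot be found this way.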

OS identification

When extracting libraries, how exactly do we know which libraries are for which OS? There are no standards and every library comes up with its own resource paths and loader logic.

The answer is that Conveyor has a sophisticated OS sniffer module. Detecting Mac and Windows libraries isn’t hard but the pain really starts with UNIX libraries. There are a lot of different UNIX variants out there and no standard way to identify them or mark binaries as being intended for them. Despite that, Conveyor can identify UNIX binaries for the following:

  1. Linux with glibc
  2. Linux with musl
  3. Statically linked Linux
  4. Android (as distinct from regular Linux)
  5. FreeBSD
  6. OpenBSD
  7. NetBSD
  8. Solaris
  9. HP-UX
  10. AIX
  11. IRIX
  12. Tru64
  13. Modesto
  14. OpenVMS
  15. Tandem NonStop
  16. DragonFly BSD

That’s a lot of pretty obscure UNIXen but Java has been around a long time, and JARs do sometimes include binaries for operating systems you might not have thought about.
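As a rough idea of what sniffing involves, the sketch below classifies a binary from its first few bytes. It is not Conveyor's module: the class name and the returned labels are made up for the example. The values it relies on are standard, namely the "MZ" prefix of Windows PE files, the Mach-O magic numbers, and the EI_OSABI byte at offset 7 of an ELF header.

```java
// Hypothetical sniffer written for illustration; real detection needs far more than this.
final class BinarySniffer {
    private BinarySniffer() {}

    // Classifies a binary from the first bytes of the file (8 bytes are enough here).
    static String sniff(byte[] header) {
        if (header.length >= 2 && header[0] == 'M' && header[1] == 'Z') {
            return "Windows (PE)";
        }
        if (header.length < 4) {
            return "unknown";
        }
        int magic = ((header[0] & 0xFF) << 24) | ((header[1] & 0xFF) << 16)
                | ((header[2] & 0xFF) << 8) | (header[3] & 0xFF);
        switch (magic) {
            case 0xFEEDFACE: // 32-bit Mach-O, big-endian byte order
            case 0xFEEDFACF: // 64-bit Mach-O, big-endian byte order
            case 0xCEFAEDFE: // 32-bit Mach-O, little-endian byte order
            case 0xCFFAEDFE: // 64-bit Mach-O, little-endian byte order
                return "macOS (Mach-O)";
            case 0xCAFEBABE: // shared by universal Mach-O binaries and Java class files
                return "macOS universal binary or Java class file";
            case 0x7F454C46: // 0x7F 'E' 'L' 'F'
                return header.length >= 8 ? elfOsAbi(header[7] & 0xFF) : "ELF (truncated)";
            default:
                return "unknown";
        }
    }

    // Maps the ELF EI_OSABI byte to a label.
    private static String elfOsAbi(int osAbi) {
        switch (osAbi) {
            case 0:  return "ELF, unspecified ABI (needs deeper inspection)";
            case 1:  return "HP-UX (ELF)";
            case 2:  return "NetBSD (ELF)";
            case 3:  return "Linux (ELF)";
            case 6:  return "Solaris (ELF)";
            case 7:  return "AIX (ELF)";
            case 8:  return "IRIX (ELF)";
            case 9:  return "FreeBSD (ELF)";
            case 10: return "Tru64 (ELF)";
            case 11: return "Modesto (ELF)";
            case 12: return "OpenBSD (ELF)";
            case 13: return "OpenVMS (ELF)";
            case 14: return "NonStop (ELF)";
            default: return "ELF, other ABI " + osAbi;
        }
    }
}
```

Many Linux libraries leave the OSABI byte at 0, so a header check alone cannot separate glibc, musl, static or Android binaries. Telling those apart typically needs other evidence from deeper in the file, such as the requested interpreter or the libraries it links against.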

Platform specific Gradle dependencies

Most Java libraries bundle native code for every possible OS into a single JAR, which is why Conveyor knows how to customize the JARs for each target platform. Sometimes though libraries split each supported platform into a separate JAR and expect users to configure their build to depend on the right ones. This is especially common for GUI toolkits that have large native components.

The Conveyor Gradle plugin has special support for this. You tell Conveyor which JARs should go into which packages by adding dependencies to machine-specific dependency configurations. The build is also configured so the normal dependency set reflects the machine being used to run Gradle. You can use it like this:

dependencies {
    val conscryptVersion = "2.5.2"
    windowsAmd64("org.conscrypt:conscrypt-openjdk:$conscryptVersion:windows-x86_64")
    macAmd64("org.conscrypt:conscrypt-openjdk:$conscryptVersion:osx-x86_64")
    linuxAmd64("org.conscrypt:conscrypt-openjdk:$conscryptVersion:linux-x86_64")
}

Now the right dependency will be picked regardless of what machine you run Gradle on, and the config generated by the task (see gradle printConveyorConfig) will allow cross-building of packages.

Limitations and future work

There are a few ways this feature could be improved:

  1. We could delete off-target native libraries even when signing in-place.
  2. We could detect specific libraries and import the right system properties automatically, rather than just collecting them in one place.
  3. We could do something about libraries that don’t work with library extraction at all, e.g. due to not using the correct library names for every OS.

The best outcome would be if the JVM community standardized on a way to ship native code. The current situation came about largely due to the limitations of popular build systems combined with lack of support in the JVM. With Project Panama the JVM is getting a fancy new foreign function interface; a nice further upgrade would be for someone to teach build systems and the JVM how to identify and work with the libraries themselves.
