Requirements for a Configuration Library

For an application developer, configuring the system for actual operation is usually not regarded as a complicated issue. Sometimes the code needs some settings which cannot, or should not, be hard-coded, so an external source is needed to provide these values. It’s not rocket science.

Still, there are some straightforward, nice-to-have features that would make this task even easier, yet are not readily found in mainstream configuration libraries.

API

Arguably the most important aspect should be ease of use. It should be trivial to get the right value at the right time, safely. Let’s start with this design:

connection.setTimeout( config.get("net.timeout") );

We need to set a connection timeout, which we get from the “net.timeout” configuration setting. This approach is very similar to how java.util.Properties works:

connection.setTimeout( properties.getProperty("net.timeout") );

There are, however, a couple of problems with this design:

  1. What should be the type of the returned value? It cannot always be a String!
  2. What happens if the value is not available?
  3. How to store the timeout without coupling the value to its usage?

Typed values

The first additional requirement should be to allow for different types of configuration values. Not all values are Strings. The approach most frameworks take, including Commons Configuration, Netflix Archaius and Typesafe’s Config, is to offer multiple get() methods for the different types of values, like this:

connection.setTimeout( config.getLong("net.timeout") );

This solves the first problem of typed values, but introduces new problems:

  • It decouples the type from the setting (key), which arguably should not be decoupled. The timeout will always be a “long”; it cannot be anything else, so the type is inherently coupled to the key.
  • It makes it hard to extend the existing vocabulary of types. For example, what if we want to introduce Date settings?
  • It makes the API more complex, with one getter per supported type.

Typed-er values

Let’s specify our requirement in more detail to explicitly disallow the decoupling of type and key, and allow for external extension of supported types. One possible design would look like this:

public interface Configuration {
    <T> T get(Key<T> key);
}

In this design, the key is explicitly linked to the returned type, so it is not possible to use the setting key with the wrong return type, as it was in the previous design. Also, as the key is just an interface, it is left open for external implementations. Any type can be easily implemented.

The connection example would then look like this:

public static final Key<Long> NET_TIMEOUT_KEY = ...;

...
connection.setTimeout( config.get(NET_TIMEOUT_KEY) );

If we assume that the implementation of the Configuration itself would probably have to support different storage/retrieval mechanisms, including file and database perhaps, it would be easier to have all settings be mapped to a String form. This is an easy task if the available types are predetermined by the library, as in the libraries above, but here there might be any number of types, even complex ones that may contain multiple fields. To get around this problem, let’s extend the Key to be able to serialize the value it represents:

public interface Key<T> {
   String getName();

   String serialize(T value);

   T deserialize(String serializedValue);
}

This way, any Configuration implementation would only have to concentrate on storing and retrieving Strings, for keys as well as for values.
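
As a minimal sketch of how these two interfaces might fit together, the following shows a key for Long values and a map-backed store. LongKey and MapConfiguration are hypothetical names chosen for illustration, not part of any existing library, and the interfaces are declared package-private only so that the example fits into a single file:

```java
import java.util.HashMap;
import java.util.Map;

// The two interfaces as designed in the article up to this point.
interface Key<T> {
    String getName();
    String serialize(T value);
    T deserialize(String serializedValue);
}

interface Configuration {
    <T> T get(Key<T> key);
}

// Hypothetical key type for Long settings.
final class LongKey implements Key<Long> {
    private final String name;

    LongKey(String name) { this.name = name; }

    @Override public String getName() { return name; }
    @Override public String serialize(Long value) { return Long.toString(value); }
    @Override public Long deserialize(String serializedValue) {
        return Long.valueOf(serializedValue.trim());
    }
}

// Hypothetical in-memory store; it only ever sees Strings.
final class MapConfiguration implements Configuration {
    private final Map<String, String> values = new HashMap<>();

    <T> void set(Key<T> key, T value) {
        values.put(key.getName(), key.serialize(value));
    }

    @Override
    public <T> T get(Key<T> key) {
        String raw = values.get(key.getName());
        // A missing setting yields null in this sketch.
        return raw == null ? null : key.deserialize(raw);
    }
}
```

The store never needs to know about Long at all; a file-based or database-based implementation could replace the map without touching the key.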

Defaults

What happens if the configuration setting is not found in the backing storage? Should the API return null, for example? The practical problem with that is that the caller would then always have to check for null, for every setting. This would be cumbersome and error-prone:

Long timeoutMillis = config.get(NET_TIMEOUT_KEY);
if (timeoutMillis != null) {
   connection.setTimeout(timeoutMillis);
} else {
   connection.setTimeout(1000L);
}

This is ugly, and far from our original goal of doing it as a one-liner. An alternative would be to use Optional to signify that the setting might be missing:

import java.util.Optional;

public interface Configuration {
    <T> Optional<T> get(Key<T> key);
}

Which could be used the following way:

connection.setTimeout( config.get(NET_TIMEOUT_KEY).orElse(1000L) );

The libraries above take a similar but slightly different approach, requiring the default value right in the getter method:

connection.setTimeout( config.get(NET_TIMEOUT_KEY, 1000L) );

It is a little more compact, but arguably less readable. Also, this approach may require an additional method to determine whether a setting is given or not:

public interface Configuration {
    boolean isSet(Key<?> key); // In place of Optional.isPresent()

    <T> T get(Key<T> key, T defaultValue);
}

There is a third approach, if we realize that the default value, similarly to the type of the value, is not related to the usage, but to the setting (key) itself. In other words, the default value needs to be specified for the “net timeout” setting, and not for the connection.setTimeout() call. This means the default value can be moved to the Key:

public interface Key<T> {
   String getName();

   T getDefault(); // Get default value for this key

   String serialize(T value);

   T deserialize(String serializedValue);
}

In which case the code becomes simple again:

connection.setTimeout( config.get(NET_TIMEOUT_KEY) );
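
A minimal sketch of how an implementation might honor the key's default, assuming a simple map of Strings as backing storage. LongKey and MapConfiguration are hypothetical names used only for this illustration:

```java
import java.util.HashMap;
import java.util.Map;

interface Key<T> {
    String getName();
    T getDefault();
    String serialize(T value);
    T deserialize(String serializedValue);
}

interface Configuration {
    <T> T get(Key<T> key);
}

// Hypothetical key type that carries its own default value.
final class LongKey implements Key<Long> {
    private final String name;
    private final Long defaultValue;

    LongKey(String name, Long defaultValue) {
        this.name = name;
        this.defaultValue = defaultValue;
    }

    @Override public String getName() { return name; }
    @Override public Long getDefault() { return defaultValue; }
    @Override public String serialize(Long value) { return Long.toString(value); }
    @Override public Long deserialize(String serializedValue) {
        return Long.valueOf(serializedValue.trim());
    }
}

// Hypothetical map-backed store that falls back to the key's default.
final class MapConfiguration implements Configuration {
    private final Map<String, String> values;

    MapConfiguration(Map<String, String> values) {
        this.values = new HashMap<>(values);
    }

    @Override
    public <T> T get(Key<T> key) {
        String raw = values.get(key.getName());
        // The default lives with the key, so no call site has to supply one.
        return raw == null ? key.getDefault() : key.deserialize(raw);
    }
}
```

With a key declared once as new LongKey("net.timeout", 1000L), every caller gets the same fallback without repeating the literal.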

Units

There is an aspect of the above connection example which we haven’t addressed yet. The timeout value is not really a Long, it is a Duration. The difference might seem small, but it is significant. If the setting is a long value, we still don’t know in what unit it is given. Is it seconds? Milliseconds? Hours?

Usually we end up with settings like this:

net.timeout = 30 // Is this seconds or milliseconds?

There is no way to decide what this means, other than to look into the code. This might not be possible for colleagues from Operations, for example. In other words, the configuration is coupled to the implementation of the place where this setting is used.

How could we decouple the usage from declaration? Simply write the unit used in the configuration into the configuration itself:

net.timeout = 30ms

This unit is completely independent of the unit used in the code. The Typesafe Config’s file format HOCON allows this specifically for durations and sizes (B, KB, MB, etc.), but is not extendable to other units.

Since our current design allows extensions, it would be easy to write a DurationKey which can parse and emit such declarations, or a similar key for any other unit type.
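
One way such a DurationKey might look is sketched below. The set of accepted suffixes (ms, s, m, h) and the choice to always emit milliseconds are assumptions made for this example, not something prescribed by HOCON or any other format:

```java
import java.time.Duration;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

interface Key<T> {
    String getName();
    T getDefault();
    String serialize(T value);
    T deserialize(String serializedValue);
}

// Hypothetical key for settings such as "net.timeout = 30ms".
final class DurationKey implements Key<Duration> {
    // A whole number followed by a unit suffix, e.g. "30ms" or "5 s".
    private static final Pattern FORMAT = Pattern.compile("(\\d+)\\s*(ms|s|m|h)");

    private final String name;
    private final Duration defaultValue;

    DurationKey(String name, Duration defaultValue) {
        this.name = name;
        this.defaultValue = defaultValue;
    }

    @Override public String getName() { return name; }
    @Override public Duration getDefault() { return defaultValue; }

    @Override
    public String serialize(Duration value) {
        // Always emit milliseconds; sub-millisecond precision is dropped.
        return value.toMillis() + "ms";
    }

    @Override
    public Duration deserialize(String serializedValue) {
        Matcher matcher = FORMAT.matcher(serializedValue.trim());
        if (!matcher.matches()) {
            throw new IllegalArgumentException(
                "Not a duration with a unit: " + serializedValue);
        }
        long amount = Long.parseLong(matcher.group(1));
        switch (matcher.group(2)) {
            case "ms": return Duration.ofMillis(amount);
            case "s":  return Duration.ofSeconds(amount);
            case "m":  return Duration.ofMinutes(amount);
            default:   return Duration.ofHours(amount); // "h" is the only suffix left
        }
    }
}
```

A value without a unit, such as a bare "30", is rejected in this sketch, which forces the configuration to state its unit explicitly.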

To use such settings, the unit-independent Duration must be converted to the desired unit:

connection.setTimeout( config.get(NET_TIMEOUT_KEY).toMillis() );

This call would not have to change, even if we changed the setting in the configuration to a seconds-based value.

Dynamic settings

Dynamic settings are values that can change at any time during runtime. Normally they are thought of as settings that must trigger an immediate change in the application's behavior, i.e. the new value must be read and applied as soon as it is available.

This usually takes the form of registering callbacks to handle changed values, like in the case of Netflix’s Archaius:

config.getIntProperty("net.timeout", 1000).addCallback(
   () -> { ...re-apply new value... }
);

As this feature would complicate the API, and might also be quite difficult to implement correctly in a distributed environment, it must be evaluated whether it is worth the effort at all. There are some arguments to be made that such a feature is usually not needed.

The first argument is that pieces of code that regularly use configuration values should just query those values directly each time, instead of storing them. For example, each time a connection is created, ask for the current value of the timeout directly. This puts some burden on the configuration implementor to provide a very efficient query/cache mechanism internally, but relieves the user from having to specifically prepare for updates, and from having to think about threading issues with asynchronously changing values.
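
A minimal sketch of such an internal cache, assuming that some external component (not shown) calls reload() whenever the backing storage changes. CachingConfiguration, Snapshot, reload() and LongKey are hypothetical names for this illustration:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

interface Key<T> {
    String getName();
    T getDefault();
    String serialize(T value);
    T deserialize(String serializedValue);
}

interface Configuration {
    <T> T get(Key<T> key);
}

final class LongKey implements Key<Long> {
    private final String name;
    private final Long defaultValue;
    LongKey(String name, Long defaultValue) { this.name = name; this.defaultValue = defaultValue; }
    @Override public String getName() { return name; }
    @Override public Long getDefault() { return defaultValue; }
    @Override public String serialize(Long value) { return Long.toString(value); }
    @Override public Long deserialize(String serializedValue) { return Long.valueOf(serializedValue.trim()); }
}

final class CachingConfiguration implements Configuration {
    // One fixed set of raw Strings plus the values already parsed from it.
    private static final class Snapshot {
        final Map<String, String> raw;
        final Map<String, Object> parsed = new ConcurrentHashMap<>();
        Snapshot(Map<String, String> raw) { this.raw = new HashMap<>(raw); }
    }

    private volatile Snapshot current;

    CachingConfiguration(Map<String, String> raw) { current = new Snapshot(raw); }

    // Swaps in new values atomically; the old parsed cache is discarded with its snapshot.
    void reload(Map<String, String> raw) { current = new Snapshot(raw); }

    @Override
    @SuppressWarnings("unchecked")
    public <T> T get(Key<T> key) {
        Snapshot snapshot = current; // read once, so raw and parsed values belong together
        Object value = snapshot.parsed.computeIfAbsent(key.getName(), name -> {
            String text = snapshot.raw.get(name);
            return text == null ? key.getDefault() : key.deserialize(text);
        });
        // Safe as long as each setting name is used with a single key type.
        return (T) value;
    }
}
```

Callers simply invoke get() every time they need the value. After the first lookup in a snapshot, each call is a single map read, and a reload is picked up by the next call without any callback.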

There are settings, however, which are only asked for at a few specific times, for example on startup, such as the maximum size of a pool. To change such a setting dynamically, the pool has to prepare for the change explicitly. This should, however, not be much more complicated than writing the callback handler above anyway.
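
As a rough sketch of what "preparing for the change" could mean, the pool below looks up its limit on every acquire instead of storing it at startup. BoundedPool, IntKey, MutableConfiguration and the setting name "pool.maxSize" are all hypothetical, and the pool only counts leases rather than managing real resources:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

interface Key<T> {
    String getName();
    T getDefault();
    String serialize(T value);
    T deserialize(String serializedValue);
}

interface Configuration {
    <T> T get(Key<T> key);
}

final class IntKey implements Key<Integer> {
    private final String name;
    private final Integer defaultValue;
    IntKey(String name, Integer defaultValue) { this.name = name; this.defaultValue = defaultValue; }
    @Override public String getName() { return name; }
    @Override public Integer getDefault() { return defaultValue; }
    @Override public String serialize(Integer value) { return Integer.toString(value); }
    @Override public Integer deserialize(String serializedValue) { return Integer.valueOf(serializedValue.trim()); }
}

// Store whose values can be changed while the application runs.
final class MutableConfiguration implements Configuration {
    private final Map<String, String> values = new ConcurrentHashMap<>();

    void put(String name, String value) { values.put(name, value); }

    @Override
    public <T> T get(Key<T> key) {
        String raw = values.get(key.getName());
        return raw == null ? key.getDefault() : key.deserialize(raw);
    }
}

final class BoundedPool {
    static final Key<Integer> MAX_SIZE = new IntKey("pool.maxSize", 2);

    private final Configuration config;
    private int inUse;

    BoundedPool(Configuration config) { this.config = config; }

    synchronized boolean tryAcquire() {
        // The limit is read on each call, so a changed setting applies to the next acquire.
        if (inUse >= config.get(MAX_SIZE)) {
            return false;
        }
        inUse++;
        return true;
    }

    synchronized void release() {
        if (inUse > 0) {
            inUse--;
        }
    }
}
```

Lowering the limit below the number of leases currently in use does not revoke anything in this sketch; it only blocks new acquisitions until enough leases are released.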

Summary

In summary, the requirements this article sets out for a generic configuration library are:

  • API should be easy enough to use inline
  • It should support multiple setting value types
  • The setting key and type should be coupled
  • Value types should be extendable externally
  • Always require default values, do not return null on missing settings
  • Decouple default values from call sites
  • Allow for implementing specific custom units for values (such as duration, length, size, etc.)
  • Allow for getting settings efficiently, possibly thousands of times per second.

One possible interface design:

public interface Key<T> {
   String getName();

   T getDefault();

   String serialize(T value);

   T deserialize(String serializedValue);
}

public interface Configuration {
   boolean isSet(Key<?> key);

   <T> T get(Key<T> key);
}