I recently had the pleasure of reviewing some code that had a function which looked something like the following:
private ServiceCode getServiceCode(String serviceName, String errorMessage) {
    if (serviceName == null) {
        return null;
    }
    ServiceCode fromCache = serviceCodeCache.get(serviceName);
    if (fromCache != null) {
        // MISSING_CODE marks names that were already looked up without success
        return fromCache == MISSING_CODE ? null : fromCache;
    }
    ServiceCode serviceCode = serviceCodeRepository.findOne(serviceName);
    if (serviceCode == null) {
        serviceCodeCache.put(serviceName, MISSING_CODE);
        LOG.error(errorMessage);
    } else {
        serviceCodeCache.put(serviceName, serviceCode);
    }
    return serviceCode;
}
The purpose of the function was simple enough: return the serviceCode given the service's name, preferably from a cache, and log an error if the service code is not available from the database, to aid debugging. All of the requirements here make sense: since the service codes are static data, they can easily be cached, and since there could be hundreds of services, we don't want to load all the codes into memory up front either. Also, if there was a problem with the data, i.e. a service did not have a code (or the service name was incorrect), logging would help with debugging problems downstream.
A pure function
I am a big fan of immutability and functional programming because it results in code that is easy to reason about. A pure function needs to adhere to the following properties:
- it may not modify any state
- it may not depend on any non-deterministic state
The first property is rather straightforward: for a function to be pure, it cannot go around causing side-effects such as modifying global variables. The second property, however, is not quite so obvious at first glance. What it is trying to say is that for a function to be pure, it needs to return the same value for the same parameters. As an example, if my function signature looks like int getNumber(Random rand), then it isn't pure because it depends on an input that is non-deterministic. For an in-depth discussion on the relationship between immutability and pure functions see here.
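To make the two properties concrete, here is a minimal sketch. Only the getNumber(Random rand) signature comes from the text above; the class name PurityDemo, the counter field, and the add and increment methods are hypothetical names made up for illustration.

```java
import java.util.Random;

final class PurityDemo {
    // Hypothetical shared state, used only to illustrate a side-effect.
    static int counter = 0;

    // Pure: the result depends only on the arguments and nothing is modified.
    static int add(int a, int b) {
        return a + b;
    }

    // Impure (breaks the first property): modifies state outside the function.
    static int increment() {
        counter++;
        return counter;
    }

    // Impure (breaks the second property): the result depends on the hidden
    // state of rand, and calling nextInt also advances that state.
    static int getNumber(Random rand) {
        return rand.nextInt(100);
    }

    public static void main(String[] args) {
        System.out.println(add(2, 3));                // always 5
        System.out.println(increment());              // depends on earlier calls
        System.out.println(getNumber(new Random()));  // varies between runs
    }
}
```

Calling add(2, 3) a thousand times tells you nothing new after the first call, which is exactly what makes pure functions easy to reason about and to test.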
Making it pure
Without further ado, let's try to make the above function pure. Note that in this exercise we will allow ourselves a little leeway and state that the logger is part of the environment, so writing to it does not count as a side-effect. Moreover, we will assume that the serviceCodeRepository is a read-only global object, so reading from it does not count against purity either. Here is the result:
private ServiceCode getServiceCode(
    String serviceName,
    Cache<String, ServiceCode> serviceCodeCache,
    String errorMessage
) {
    if (serviceName == null) {
        return null;
    }
    ServiceCode fromCache = serviceCodeCache.get(serviceName);
    if (fromCache != null) {
        return fromCache == MISSING_CODE ? null : fromCache;
    }
    // The cache is only read here; it is never written to
    ServiceCode serviceCode = serviceCodeRepository.findOne(serviceName);
    if (serviceCode == null) {
        LOG.error(errorMessage);
    }
    return serviceCode;
}
Note the absence of the cache put when we retrieve the serviceCode from the repository. Modifying the contents of an input would make the function impure. Therefore, the caller now needs to ensure that the cache is populated with the serviceCode, or with the marker MISSING_CODE in case of a null result.
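To see what this asks of the caller, here is a rough sketch under a few assumptions: a plain Map stands in for both the Cache type and the serviceCodeRepository, the errorMessage parameter and logging are left out for brevity, and the names CallerSketch, REPOSITORY, and lookupAndCache are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

final class CallerSketch {
    // Hypothetical stand-in for the article's ServiceCode type.
    static final class ServiceCode {
        final String value;
        ServiceCode(String value) { this.value = value; }
    }

    // Marker cached for names that have no code (compared by identity).
    static final ServiceCode MISSING_CODE = new ServiceCode("<missing>");

    // Assumption: a read-only map stands in for serviceCodeRepository.
    static final Map<String, ServiceCode> REPOSITORY =
            Map.of("billing", new ServiceCode("B42"));

    // The lookup from above: it reads the cache but never writes to it.
    static ServiceCode getServiceCode(String serviceName, Map<String, ServiceCode> cache) {
        if (serviceName == null) {
            return null;
        }
        ServiceCode fromCache = cache.get(serviceName);
        if (fromCache != null) {
            return fromCache == MISSING_CODE ? null : fromCache;
        }
        return REPOSITORY.get(serviceName);
    }

    // What every caller now has to remember to do around each lookup.
    static ServiceCode lookupAndCache(String serviceName, Map<String, ServiceCode> cache) {
        ServiceCode code = getServiceCode(serviceName, cache);
        if (serviceName != null) {
            cache.put(serviceName, code == null ? MISSING_CODE : code);
        }
        return code;
    }

    public static void main(String[] args) {
        Map<String, ServiceCode> cache = new HashMap<>();
        System.out.println(lookupAndCache("billing", cache).value); // B42
        System.out.println(lookupAndCache("unknown", cache));       // null
    }
}
```

The caller has to know about MISSING_CODE and about when to write to the cache, both of which are details of how the lookup works internally.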
The case for data encapsulation
So we managed to make the above function pure, but why does it still feel like I just came back from a 4 day camping trip where there were no showers available? Let's try removing the last parameter; in most cases we would be using the same constant string anyway.
private ServiceCode getServiceCode(String serviceName, Cache<String, ServiceCode> serviceCodeCache) {
    if (serviceName == null) {
        return null;
    }
    ServiceCode fromCache = serviceCodeCache.get(serviceName);
    if (fromCache != null) {
        return fromCache == MISSING_CODE ? null : fromCache;
    }
    ServiceCode serviceCode = serviceCodeRepository.findOne(serviceName);
    if (serviceCode == null) {
        // The message is now built here instead of being passed in
        LOG.error("No serviceCode found for {}", serviceName);
    }
    return serviceCode;
}
We've taken the original requirements, made a pure function, even pushed out some responsibility of cache management to the caller and yet it feels like we're being chased by a bear in the woods.
The problem here, as I see it, is the cognitive load of calling this function. Due to its purity, the caller now needs to know about its implementation details to be able to use it successfully. Making the original function pure did not help us, so let's start from scratch and follow basic data encapsulation to see if that will allow us to sneak past the bear:
private ServiceCode getServiceCode(String serviceName) {
    if (serviceName == null) {
        return null;
    }
    ServiceCode fromCache = serviceCodeCache.get(serviceName);
    if (fromCache != null) {
        return fromCache == MISSING_CODE ? null : fromCache;
    }
    ServiceCode serviceCode = serviceCodeRepository.findOne(serviceName);
    if (serviceCode == null) {
        // Cache the miss as well, so we don't hit the repository again
        serviceCodeCache.put(serviceName, MISSING_CODE);
        LOG.error("No serviceCode found for {}", serviceName);
    } else {
        serviceCodeCache.put(serviceName, serviceCode);
    }
    return serviceCode;
}
Following data encapsulation principles, we got rid of the last parameter so that the caller does not need to know the implementation details of the function anymore. This made it much simpler to call and, more importantly, gives us that squeaky clean feeling. Unfortunately, it has a few side-effects, such as updating a cache, but using this function is very straightforward: ask and ye shall receive!
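From the caller's side, this might look like the sketch below. It rests on several assumptions: the class name ServiceCodeLookup is hypothetical, a HashMap stands in for the cache (so it is not thread-safe), a Function stands in for the serviceCodeRepository, and System.err replaces the logger.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

final class ServiceCodeLookup {
    // Hypothetical stand-in for the article's ServiceCode type.
    static final class ServiceCode {
        final String value;
        ServiceCode(String value) { this.value = value; }
    }

    private static final ServiceCode MISSING_CODE = new ServiceCode("<missing>");

    // The cache is private: callers never see it or the MISSING_CODE marker.
    private final Map<String, ServiceCode> serviceCodeCache = new HashMap<>();
    // Assumption: a function stands in for serviceCodeRepository.findOne.
    private final Function<String, ServiceCode> repository;

    ServiceCodeLookup(Function<String, ServiceCode> repository) {
        this.repository = repository;
    }

    ServiceCode getServiceCode(String serviceName) {
        if (serviceName == null) {
            return null;
        }
        ServiceCode fromCache = serviceCodeCache.get(serviceName);
        if (fromCache != null) {
            return fromCache == MISSING_CODE ? null : fromCache;
        }
        ServiceCode serviceCode = repository.apply(serviceName);
        if (serviceCode == null) {
            serviceCodeCache.put(serviceName, MISSING_CODE);
            System.err.println("No serviceCode found for " + serviceName);
        } else {
            serviceCodeCache.put(serviceName, serviceCode);
        }
        return serviceCode;
    }

    public static void main(String[] args) {
        ServiceCodeLookup lookup = new ServiceCodeLookup(
                name -> "billing".equals(name) ? new ServiceCode("B42") : null);
        // The caller only asks; caching happens behind the scenes.
        System.out.println(lookup.getServiceCode("billing").value); // B42
        System.out.println(lookup.getServiceCode("unknown"));       // null
    }
}
```

The function is no longer pure, but the impurity is contained in one place instead of being spread across every call site.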
Thus, we can conclude that while immutability is something that we can and should strive for, it isn't quite a silver bullet, and sometimes it makes sense to prioritize data encapsulation (or keeping it simple) over immutability :)