By the Wikipedia definition, observability (in control theory) is a measure of how well internal states of a system can be inferred from knowledge of its external outputs. This idea translates well to software and has been borrowed for that context. As a working definition, let us consider observability to be the extent to which we can determine the state and operating mechanisms (which we will jointly call “internals” going forward) of a software system at runtime from the outputs it produces.
In this sense, a software system which does not provide any indication of its internals, a complete black box, would be one with the least observability. A software system which reveals every detail of its internals, a complete white box, would be considered a perfectly observable system. Any real software system falls in between.
How does it help with reliability?
Why would we consider observability to be a quality metric that helps us with reliability? Reliability is about confidence that software behaves in accordance with its specified or implicit requirements, or with our own intentions. To gain such confidence, we need to be able to determine how the software actually behaves. In some limited circumstances a black-box assessment is sufficient, e.g., when the software only needs to solve a given task. As software engineers, however, we usually cannot ignore the internals. Broken software behind a currently working interface is not what we want. We want our software to follow the paths we have envisioned and to be “correct” in a much broader sense.
For a more concrete example, I have been working on some stock picking tools in my free time. Everything is very much based on probability theory, and while the outputs of the tools often yield a convincing-looking statistical property, the devil is often in the details. I quickly learned that at most steps it makes sense to visualize the distribution of the data and put the final result into the context of the intermediate steps. For example, in the figure below it is clear that the mode of my predictions differs from the median in the observed data, which is indicated by a vertical line. There is a significant left-skew.

Such surprising observations have often been a great starting point to improve my models.
But also for more “low-level” or “enterprise” problems than a Jupyter notebook, getting the right cues at the right time can help tremendously by:
- Cutting debugging times, because you know more about an issue when it gets reported.
- Surfacing odd behavior that warrants an investigation.
- Helping developers less familiar with the code to get an intuition about how the software operates.
So next, we discuss what can be done to improve observability.
How to Achieve Observability?
The best thing about observability is that every bit helps. Sure, there is a limit beyond which the flood of information becomes overwhelming, but most software doesn’t tell you enough about itself. There is the concept of the “Three Pillars of Observability” (logs, metrics, and traces), and that certainly is a great framework. In this section, I will just present some ideas which I found to be important, but starting with good intentions and a decent plan (which you can later improve) will probably get you far. If a structured framework like the three pillars helps you with that, then more power to you.
Logging
A good log always helps. Make sure errors and unexpected events are marked clearly. Most of the time, a logging framework with loads of functionality can help you with that. Don’t be afraid to log many things at verbose logging levels. Find a mechanism that ensures you receive a log file every time an issue is reported, be it an automated process or an organizational provision.
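As a rough illustration, here is a minimal sketch using Python's standard logging module. The logger name "myapp", the function name configure_logging, and the log format are my own assumptions for this example. The idea is that the file receives everything down to DEBUG, so a reported issue comes with full detail, while the console only shows warnings and errors.

```python
import logging
import os
import sys
import tempfile


def configure_logging(log_path, verbose=False):
    """Write all records to a file; show only warnings and above on stderr."""
    logger = logging.getLogger("myapp")  # hypothetical application name
    logger.setLevel(logging.DEBUG)
    logger.handlers.clear()  # avoid duplicate handlers on repeated calls

    fmt = logging.Formatter("%(asctime)s %(levelname)-8s %(name)s: %(message)s")

    # The file gets the verbose output that you want attached to bug reports.
    file_handler = logging.FileHandler(log_path, mode="w", encoding="utf-8")
    file_handler.setLevel(logging.DEBUG)
    file_handler.setFormatter(fmt)
    logger.addHandler(file_handler)

    # The console stays quiet unless something is wrong (or verbose is set).
    console = logging.StreamHandler(sys.stderr)
    console.setLevel(logging.DEBUG if verbose else logging.WARNING)
    console.setFormatter(fmt)
    logger.addHandler(console)
    return logger


if __name__ == "__main__":
    log = configure_logging(os.path.join(tempfile.gettempdir(), "myapp.log"))
    log.debug("loaded %d entries", 42)  # only ends up in the file
    try:
        1 / 0
    except ZeroDivisionError:
        # exception() logs at ERROR level and appends the traceback
        log.exception("unexpected failure while computing ratio")
```

Errors stand out through their level, and the traceback lands in the same file as the verbose context that preceded it.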
Make it Visual
If your software creates complex structures or states, find or implement a mechanism to get a visual representation that helps with understanding the state. Be it figures, dashboards, or graphs, every bit helps.
Even if you are working on an embedded system or are running a headless server, you can probably store a CSV file and visualize it with a bit of Python afterwards. Be creative here.
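The CSV route could look something like the following sketch. It only uses the standard library and renders a crude text histogram, so it also works on a machine without any plotting packages. The column names and the helper names (write_samples, read_values, text_histogram) are made up for this example. If matplotlib is available, feeding the same values into a proper histogram plot is probably the nicer option.

```python
import csv
import os
import tempfile
from collections import Counter


def write_samples(path, samples):
    """Dump one measurement per row; this is the part that runs on the target."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["iteration", "value"])
        for i, value in enumerate(samples):
            writer.writerow([i, value])


def read_values(path):
    """Read the 'value' column back as floats."""
    with open(path, newline="") as f:
        return [float(row["value"]) for row in csv.DictReader(f)]


def text_histogram(values, bins=5, width=40):
    """Return a simple horizontal bar chart of the value distribution."""
    if not values:
        raise ValueError("no values to plot")
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero if all values are equal
    # Map each value to a bin index; the maximum goes into the last bin.
    counts = Counter(min(int((v - lo) / span * bins), bins - 1) for v in values)
    peak = max(counts.values())
    lines = []
    for b in range(bins):
        left_edge = lo + b * span / bins
        bar = "#" * round(counts[b] / peak * width)
        lines.append(f"{left_edge:10.2f} | {bar} {counts[b]}")
    return "\n".join(lines)


if __name__ == "__main__":
    path = os.path.join(tempfile.gettempdir(), "samples.csv")
    write_samples(path, [1.0, 1.2, 1.1, 3.5, 1.3, 9.8, 1.0])
    print(text_histogram(read_values(path)))
```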
Monitor What You Care About
Have some easy monitoring utilities to take measurements of the things you care about. Examples include execution times of critical sections, memory usage after initialization or in proportion to load, or the maximum iteration count until an event is finished. Sometimes even call counts or call frequencies can be helpful.
These don’t have to be complicated or even perfect. A small helper class or sometimes even just a macro can do the job. Keep it a small snippet that you can just place wherever you need it.
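One possible shape for such a helper, sketched in Python: a context manager that records how often a section ran and how long each run took. The class name Probe and its methods are hypothetical, and the timings are plain wall-clock measurements, so treat the numbers as hints rather than benchmarks.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class Probe:
    """Collects call counts and wall-clock durations per label."""

    def __init__(self):
        self.durations = defaultdict(list)

    @contextmanager
    def measure(self, label):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record the duration even if the measured block raised.
            self.durations[label].append(time.perf_counter() - start)

    def report(self):
        lines = []
        for label, times in sorted(self.durations.items()):
            lines.append(
                f"{label}: calls={len(times)} "
                f"total={sum(times):.6f}s max={max(times):.6f}s"
            )
        return "\n".join(lines)


if __name__ == "__main__":
    probe = Probe()
    for n in (10_000, 100_000, 1_000_000):
        with probe.measure("sum_of_squares"):
            sum(i * i for i in range(n))
    print(probe.report())
```

Wrapping a critical section in a with statement is a single line, which makes it easy to drop in and just as easy to remove again.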
Provide a Way to Look at the Interpretation of Binary
At least optionally, whenever you are dealing with binary data that your software needs to interpret, your software should have a means to tell you what it thinks was in that data. Consider a message with a 1-byte identifier and 4 bytes of data. 0x1e:BE:02:3C:AF is probably much less clear than COMMAND_UPDATE - FLAG_PRIORITY new value: 2 @ entry: 15535. This helps especially if your software's interpretation is wrong.
The same applies to sufficiently complex plain text files (like large JSON files).
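For the 5-byte message from above, such a helper might look like the sketch below. The field layout is an assumption I am making for illustration: one identifier byte, one flag byte, one value byte, and a big-endian 16-bit entry index (0x3CAF is 15535). The two lookup tables are likewise made up to match the example.

```python
import struct

# Assumed mappings, chosen only to match the example message.
MESSAGE_IDS = {0x1E: "COMMAND_UPDATE"}
FLAGS = {0xBE: "FLAG_PRIORITY"}


def describe(message: bytes) -> str:
    """Return a human-readable interpretation of a 5-byte message."""
    if len(message) != 5:
        raise ValueError(f"expected 5 bytes, got {len(message)}")
    # Assumed layout: id (u8), flag (u8), value (u8), entry (u16, big-endian)
    msg_id, flag, value, entry = struct.unpack(">BBBH", message)
    # Unknown codes are shown in hex instead of failing, so that a wrong
    # interpretation is visible rather than hidden.
    name = MESSAGE_IDS.get(msg_id, f"UNKNOWN_ID(0x{msg_id:02X})")
    flag_name = FLAGS.get(flag, f"UNKNOWN_FLAG(0x{flag:02X})")
    return f"{name} - {flag_name} new value: {value} @ entry: {entry}"


if __name__ == "__main__":
    print(describe(bytes.fromhex("1EBE023CAF")))
    # prints "COMMAND_UPDATE - FLAG_PRIORITY new value: 2 @ entry: 15535"
```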
Summary
Observability is an essential precondition for reliability, because only if you can observe the internals of your software can you make a complete statement about its behavior. Hopefully, this article has convinced you that observability is an important part of a reliable software system. And most importantly, you should be convinced that you have the means to improve the observability of the software you work on.