Software Quality in Academia

Most people who have studied or worked in an academic setting have come across software that is hard to understand and hard to maintain. I just handed in my master's thesis, and I came across many instances of low-quality code during my studies. The most obvious shortcoming was an almost complete absence of tests. I rarely found tests in the source code accompanying scientific publications. On a side note, not once in my years of study was I required to provide a single test case.

Another big issue is that academic source code often lacks abstractions. The research tackles a very narrow problem, and reuse of the source code for other (related) problems does not seem to be intended. Interoperability is also often an issue (likely also because of the narrow research scope). Researchers could often easily turn their code base into a library, but they do not.

I acknowledge that there are already good posts on these issues out there, often from people who are much more involved in academia. I liked the article “On the quality of academic software” by Daniel Lemire a lot. “Why Academic Software Sucks” by Matthias Döring is also a good (short) read. However, I would like to discuss the incentive structure in more detail, to identify causes and solutions.


Academic researchers, be they post-docs, professors, or people working on a bachelor's, master's or PhD thesis, are judged by their publications or submissions. They are not judged by the code that supports their research. This incentivizes them to focus on academic writing instead of writing code. There is in general nothing wrong with that. Academic writing can act as incredibly accurate and concise documentation of a project. However, if low-quality software inhibits further research or reproduction, there is a problem.

Many people working at universities have told me that academia is an incredibly competitive field (especially before you become a professor). The pressure to publish breakthrough results at a high pace is immense. In addition to the inherent quality issues arising from such pressure, this also fosters an environment of individualism and distrust. As a result, researchers often work on their software in complete isolation and have no interest in the success of their peers.

Academic researchers also often receive no economic benefit from providing high-quality software. Their research is mostly funded by (public) research grants, and those grants often include no requirements on software quality.

Possible Solutions

Universities operate under very specific social, political and economic conditions. Those conditions are unlikely to change abruptly, so anything I can propose as a possible solution likely cannot be implemented by any individual (contrary to the general claim this blog is based on). However, maybe these points help to identify positive change where it arises.

It should become the norm in academia to deliver high-quality source code and interoperable, robust software. Publications should be judged together with their supporting software, as that software is required to reproduce and extend the results of the publication.

Cooperation in academia should be encouraged. Working together on a problem helps to deliver better results. That holds both for the delivered software and for the overall results of the research. This can especially come in the form of inter-university open source projects, where an effort is made to keep a common codebase well maintained.

Finally, I believe that where public funds are granted for software-centered research, high-quality open source software should be the result. Such quality requirements should be part of research grants.


History Comments

Comments are a useful tool to give contextual information directly in the source code. They are most typically used as clarification comments and documentation comments (according to the top Google result for code comments). However, there is a third use that I have found comments to be useful for: documenting history!


Is shorter Code always better?

As developers, we are always searching for ways to make our code more concise, structured and understandable. Therefore, shorter code is usually preferred over longer code. This post explains why we generally find shorter code better and what some exceptions are.


Communicating Code

A big part of being a programmer is describing code to peers. However, describing abstract and formalized code constructs in natural language is not always an easy task. This article highlights some techniques that help to communicate code to fellow programmers. The focus is on communicating the code “itself” rather than its behavior, which is a whole different topic.