Categories
General

Software Quality in Academia

Most people who have studied or worked in an academic setting have come across software that is hard to understand and hard to maintain. I just handed in my master's thesis and came across many instances of low-quality code during my studies. The most obvious shortcoming was an almost complete absence of tests: I rarely found tests in the source code accompanying scientific publications. On a side note, not once in my years of study was I required to provide any test case whatsoever.

Another big issue is that academic source code often lacks abstractions. The research tackles a very narrow problem, and reuse of the source code for other (related) problems does not seem to be intended. Interoperability is also often an issue (likely also because of the narrow research scope). Researchers could often easily turn their code base into a library, but do not do so.

I acknowledge that there are already good posts on these issues out there, often from people who are much more involved in academia. I liked the article “On the quality of academic software” by Daniel Lemire a lot. “Why Academic Software Sucks” by Matthias Döring is also a good (short) read. However, I would especially like to discuss the incentive structure in more detail, to identify causes and solutions.

Incentives

Academic researchers, be they post-docs, professors, or people working on a bachelor's, master's or PhD thesis, are judged by their publications or submissions. They are not judged by the code that supports their research. This incentivizes them to focus on academic writing instead of writing code. In general, there is nothing wrong with that: academic writing can act as incredibly accurate and concise documentation for a project. However, if low-quality software inhibits further research or reproduction, there is a problem.

Many people working at universities have told me that academia is an incredibly competitive field (especially before you become a professor). The pressure to publish breakthrough results at a high pace is immense. In addition to the inherent quality issues arising from such pressure, this also fosters an environment of individualism and distrust. As a result, researchers often work on their software in complete isolation and have no interest in the success of their peers.

Academic researchers also often receive no economic benefit from providing high-quality software. Their research is mostly funded by (public) research grants, and software quality requirements are rarely part of those grants.

Possible Solutions

Universities operate under very specific social, political and economic conditions. Those conditions are unlikely to change abruptly, so everything I can propose as a possible solution can likely not be implemented by any individual (contrary to the general claim this blog is based on). However, maybe these points help to identify positive change where it arises.

It should become the norm in academia to deliver high-quality source code and interoperable, robust software. Publications should be judged together with their supporting software, as that software is required to reproduce and extend the results of the publication.

Cooperation in academia should be encouraged. Working together on a problem helps to deliver better results. That holds both for the delivered software and for the overall results of the research. This can especially take the form of inter-university open source projects, where an effort is made to keep a shared codebase well maintained.

Lastly, I believe that where public funds are granted for software-centered research, high-quality open source software should be the result. Such quality requirements should be part of research grants.

History Comments

Comments are a useful tool for giving contextual information directly in the source code. They are most typically used as clarification comments and documentation comments (top Google result for code comments). However, there is a third use I have found for comments: documenting history!

Is Shorter Code Always Better?

As developers, we are always searching for ways to make our code more concise, structured and understandable. Therefore, shorter code is usually preferred over longer code. This post explains why we generally find shorter code better and what some exceptions are.

Communicating Code

A big part of being a programmer is describing code to peers. However, describing abstract and formalized code constructs in natural language is not always an easy task. This article highlights some techniques that help communicate code to fellow programmers. The focus is on communicating the code “itself” rather than its behavior, which is a whole different topic.

Technical Debt

Technical debt sounds scary. From what we know about financial debt, all debt carries the inherent risk that it starts consuming all the income/revenue one has. One becomes insolvent, and that is not at all what one wants to be. The same thing may happen with technical debt. If all development efforts are used up by maintenance tasks and no actual progress can be made, a project or organization reaches technical insolvency.

constexpr Variables

constexpr variables are a powerful feature of C++. They have the potential to improve runtime performance and to convert some runtime errors into compile-time errors. However, they are used sparingly in code bases due to the constraints they impose on the developer. This post will argue that their use, especially in the context of templates, should be considered more often.

Property-Based Testing

Property-based testing is probably the technique that has changed the way I code the most. It is a powerful tool to describe the properties (that is, the behavior) of the units software consists of. It greatly reduces the amount of test code that needs to be written, while increasing the coverage of edge cases at the same time. We will start by finding out that we actually have a problem with existing test methods.
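To make the idea concrete, here is a minimal hand-rolled sketch that uses only Python's standard library. The run-length encoding functions and the helper `check_round_trip` are hypothetical names chosen for illustration and do not come from the post; dedicated libraries such as Hypothesis additionally automate input generation and shrinking of failing cases.

```python
import random
from itertools import groupby


def rle_encode(text):
    """Run-length encode a string into (character, run length) pairs."""
    return [(ch, len(list(group))) for ch, group in groupby(text)]


def rle_decode(pairs):
    """Invert rle_encode by repeating each character by its run length."""
    return "".join(ch * count for ch, count in pairs)


def check_round_trip(trials=1000, seed=0):
    """Check two properties on many randomly generated inputs."""
    rng = random.Random(seed)  # fixed seed keeps failures reproducible
    for _ in range(trials):
        length = rng.randint(0, 20)
        # A tiny alphabet makes repeated characters (long runs) likely.
        text = "".join(rng.choice("ab ") for _ in range(length))
        encoded = rle_encode(text)
        # Property 1: decoding the encoding yields the original input.
        assert rle_decode(encoded) == text, repr(text)
        # Property 2: neighbouring runs never share the same character.
        assert all(a[0] != b[0] for a, b in zip(encoded, encoded[1:])), repr(text)
    return trials


check_round_trip()
```

Instead of listing individual input/output pairs, the test states what must hold for every input and lets the random generator look for counterexamples, including edge cases such as the empty string.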

Having a Development Guideline

A software development guideline is a document that determines important rules for the processes, tools, design and architecture used by a team of software developers (programmers, QA folks, product managers…). It is my firm belief that every team of software developers should have such a guideline for the reasons that will be laid out in this article.

The Type of User Input

In console applications, user input mostly comes in the form of arguments provided by the user's call of the program. We will explore how argparse, Python 3's built-in argument parser, converts this input to an appropriate type. There are some quirks that provide additional insights into (duck) typing and user input validation.
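As a small sketch of the kind of conversion involved (the option names `--count`, `--size` and `--verbose` are made up for illustration, and these may not be the quirks the post itself covers): argparse calls whatever callable is passed as `type` on the raw string.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="demo")
    # The raw string is passed through int(), so "3" becomes 3.
    parser.add_argument("--count", type=int, default=1)
    # A *string* default is converted with the same callable.
    parser.add_argument("--size", type=int, default="10")
    # Pitfall: bool("False") is True, because any non-empty string is truthy.
    parser.add_argument("--verbose", type=bool, default=False)
    return parser


def parse(argv):
    """Parse an explicit argument list instead of sys.argv."""
    return build_parser().parse_args(argv)


args = parse(["--count", "3", "--verbose", "False"])
print(args.count, args.size, args.verbose)
```

Here `args.verbose` ends up as `True` even though the user typed `False`. An input that `int()` rejects, such as `--count abc`, makes argparse print a usage error and exit with status 2. For boolean switches, `action="store_true"` is usually the safer choice.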

100% Test Coverage

Unit tests are vital in bigger projects. They allow spotting regressions as changes are being made and give the largest share of confidence that everything is in an okay state. But how many unit tests are enough unit tests? In this post I will argue that the optimal coverage rate is 100%.

But let’s start with the bad stuff. Let’s start with the reasons we do not achieve 100% test coverage by default.