getattr() Considered Harmful

While Hynek already considered “Considered Harmful” was getting old in 2016, and so he named this blog post hasattr() – A Dangerous Misnomer, instead of hasattr() considered harmful, meanwhile it is 2022 and I still like that phrase.

So here we go…

`getattr()` considered harmful

The setting is a CLI application with 100% test coverage, and even branch coverage is activated.

coverage 101

I assume you know what coverage is. 100% test coverage means that the test suite covers all lines of code of your library or application. That does not mean they are functionally correct, but this means they are all run. This eliminates a couple of error classes, e.g. syntax errors.

But we still have a problem with e.g. the following code:

if function_call():
    do_something()

When in our test suite function_call is true, both lines will be executed and a code coverage of 100% is shown.

But we may have never tested the case when function_call is false.

hello branch coverage

When you activate branch coverage, coverage.py will show an error for above code example:

line x didn’t jump to line y, because the condition on line x was never false

Excellent! We are safe… right?

hello `getattr()`

Back to our CLI application.

A couple of weeks ago I renamed a flag for the application’s run command from --output to --output-directory.

Easy change, clear improvement, no fear as we have 100% test coverage.

hello regression

Fast forward a couple of weeks.

My colleague created another merge proposal to fix a regression I introduced with the above renaming.

I forgot to also update the flag for the run_one command.

How was this possible when we have 100% test coverage?

Let’s have a look at the following code…

_run_job(args.job, job, provider, getattr(args, "output", None))

If you have a close look at above line, you’ll notice that there is a hidden branch, ie the getattr covers two different outcomes:

args has an attribute output
args has no attribute output

This could have been written as:

if hasattr(args, "output"):
    _run_job(args.job, job, provider, args.output)
else:
    _run_job(args.job, job, provider, None)

Certainly, then coverage, at least with activated branch coverage, would have raised the alarm!

more gotchas

Please note that there are more idioms in Python which impede coverage from being reliable:

x = {"a": 1, "b": 2}.get("z", 99)

x = condition and 0 or 1

x = 0 if condition else 1

I won’t list multiple statements on one line, combined with a semicolon, as flake8, which you hopefully use, will prevent this.

And there might be more.

Do you know more examples? Please contact me https://jugmac00.github.io/

what to do

Well, you could avoid using the above mentioned idioms. :-/

You could use mutation testing, which mutates the code and raises alarm if this does not cause a test failure. But that would be another topic.

At very least - be aware of these “invisible” test or coverage gaps!