Python may be easy but it’s a goddamn mess

By industry leaders and academic researchers alike, Python is touted as one of the absolute best languages for novice programmers. And they’re not wrong — but that doesn’t mean that it doesn’t confuse the shit out of programming newbies anyway.

Take dynamic typing as an example. It seems amazing at first: Python literally figures out by itself what sort of value a variable might take, and you don’t need to waste another line of code by telling it. This makes everything go faster!

At first. Then you mess it up on one single line — yes, one! — and your whole project crashes before it’s finished running.
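Here's a minimal sketch of that failure mode. The function and the data are made up for illustration. Everything runs fine until a string sneaks into a list of numbers, and Python only notices once it reaches the offending line:

```python
def total_price(prices):
    """Sum a list of prices; assumes every entry is a number."""
    total = 0
    for price in prices:
        total = total + price  # raises TypeError if price is a string
    return total

print(total_price([3, 4.5, 2]))  # 9.5

try:
    total_price([3, "4.5", 2])  # one entry is a string by accident
except TypeError as error:
    print("Crashed at runtime:", error)
```

Nothing complains when the function is defined or when the list is built. The error only shows up when the addition is actually executed.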

To be fair, many other languages use dynamic typing. But in the case of Python, this is only the beginning of the shit-list.

Reading code gets messy with implicitly declared variables

When I started my PhD a couple of years ago, I wanted to extend an existing piece of software written by a colleague. I understood the basic idea of what it was doing, and my colleague had even written a paper about it as documentation.

But I still needed to read through thousands of lines of Python code to make sure I knew which part did what, and where I could put the new features I had in mind. That’s where it got problematic…

The whole code was littered with variables that were declared nowhere. In order to understand what every variable was there for, I had to search for it throughout the whole file and, more often, across the whole project.

Add the complication that a variable is often called one thing inside a function, but then something else when the function is actually called… And the fact that a variable can be intertwined with one class which is tied to another variable from another class which influences a totally different class… You get the idea.

I’m hardly alone in this experience. The Zen of Python clearly says that explicit is better than implicit. But it’s so easy to introduce variables implicitly in Python that, especially in large projects, sh*t hits the fan very quickly.

Mutable types are hiding everywhere — even in functions

In Python, you can define functions with optional arguments — that is, arguments that you don’t need to state explicitly afterwards — by providing a default value. Like so:

def add_five(a, b=0):
    return a + b + 5

That’s a silly example, I know. But the idea is that you can call the function with one argument now, or two, and it works anyways:

add_five(3) # returns 8
add_five(3,4) # returns 12

This works because the expression b=0 is defining b as an integer, and integers are immutable. Now consider this:

def add_element(list=[]):
    list.append("foo")
    return list
add_element() # returns ["foo"], as expected

So far, so good. But what happens if you execute it again?

add_element() # returns ["foo", "foo"]! wtf!

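Python evaluates a default value only once, when the function is defined, so every call without an argument reuses the very same list. Because the [“foo”] list from the first call still exists, Python just appends to that one. This can happen because lists, unlike integers, are mutable types.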

“Insanity is doing the same thing over and over again and expecting different results,” so goes the common saying (it’s often misattributed to Albert Einstein). One could also say, Python plus optional arguments plus mutable objects is insanity.
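The usual way out is to use None as the default and build the list inside the function. Here's a minimal sketch; the name add_element_safe and the parameter name items are my own choices for illustration:

```python
def add_element_safe(items=None):
    # Create a fresh list on every call instead of sharing one default list
    if items is None:
        items = []
    items.append("foo")
    return items

print(add_element_safe())  # ['foo']
print(add_element_safe())  # ['foo'] again, no leftovers from the first call
```

Since None is immutable, there is nothing left that could carry state over from one call to the next.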

Class variables aren’t safe from danger

If you thought that such problems are limited to the — admittedly not so ubiquitous — case of mutable objects as optional arguments, you’re mistaken.

If you do object-oriented programming — that is almost everyone — classes are everywhere in your Python code. And one of the most useful features of classes of all time is… (drumroll)
… inheritance.

Which is just a fancy word for saying that if you have a parent class with some properties, you can create children which inherit the same properties. Like this:

class parent(object):
    x = 1
class firstchild(parent):
    pass
class secondchild(parent):
    pass
print(parent.x, firstchild.x, secondchild.x) # returns 1 1 1

This isn’t a particularly brainy example, so don’t copy it into your code projects. The point is, the child classes inherit the fact that x = 1, so we can call it and get the same result for the child classes as for the parent.
And if we change the x attribute of a child, it should change that child only. Like when you dyed your hair as a teen; that didn’t change your parents’ or your siblings’ hair either. This works:

firstchild.x = 2
print(parent.x, firstchild.x, secondchild.x) # returns 1 2 1

And what happened when you were little and mommy dyed her hair? Your hair didn’t change, right?

parent.x = 3
print(parent.x, firstchild.x, secondchild.x) # returns 3 2 3

Ew.

This happens because of how Python looks up attributes along its Method Resolution Order. A child class only gets an x of its own once you assign one to it. firstchild did, secondchild didn’t, so secondchild.x is still looked up on the parent. So, in Python-world, if you don’t protest in advance, mommy dyes your hair whenever she does hers.
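If you want to see where an attribute actually lives, you can peek into each class's own namespace with vars(). This is a small sketch with renamed classes (Parent, FirstChild, SecondChild are just my capitalized versions of the ones above):

```python
class Parent:
    x = 1

class FirstChild(Parent):
    pass

class SecondChild(Parent):
    pass

FirstChild.x = 2  # creates a new attribute on FirstChild itself
Parent.x = 3      # SecondChild has no x of its own, so it sees this change

print("x" in vars(FirstChild))   # True: FirstChild owns an x
print("x" in vars(SecondChild))  # False: lookup falls through to Parent
print(Parent.x, FirstChild.x, SecondChild.x)  # 3 2 3
```

Giving every child its own value up front is the "protest in advance" that keeps later changes on the parent from leaking through.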

Scopes go inside out sometimes

This next one I’ve stumbled over so many times.

In Python, if you’re defining a variable inside a function, this variable won’t work outside the function. One says it’s out of scope:

def myfunction(number):
    basenumber = 2
    return basenumber*number
basenumber
## Oh no! This is the error:
# Traceback (most recent call last):
#   File "<stdin>", line 1, in <module>
# NameError: name 'basenumber' is not defined

This should be rather intuitive (and no, I didn’t stumble over that part).

But what about the other way around? I mean, what if I define a variable outside a function, and then reference it inside a function?

x = 2
def add_5():
    x = x + 5
    print(x)
add_5()
## Oh dear...
# Traceback (most recent call last):
#   File "<stdin>", line 1, in <module>
#   File "<stdin>", line 2, in add_5
# UnboundLocalError: local variable 'x' referenced before assignment

Strange, right? If Albert lives in a world which contains trees, and Albert lives inside of a house, surely Albert still knows what trees look like? (The tree is x, Albert’s house is add_5() and Albert is 5…)

I’ve stumbled over this so many times while trying to define functions in one class that get called from another class. It took me a while to get to the root of the problem.

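The reason is that assigning to x anywhere inside the function makes x a local variable for the whole function body. The x inside the function is therefore different from the x outside, and x + 5 tries to read the local x before it has a value. Like if Albert dreams about turning the trees orange: that doesn’t make the trees orange, of course.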

Luckily, there’s a simple solution to this problem. Just slap a global before x!

x = 2
def add_5():
    global x
    x = x + 5
    print(x)
add_5() # works!

So if you thought scopes only shield variables inside functions from the outside world, think again. The outside world gets protected from local variables in Python, in the same way that Albert can’t color trees orange with the power of his thoughts.
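If global feels too heavy-handed, you can often avoid it by passing the value in and handing the result back. The sketch below is one possible way to do that; the helper show_x is a made-up name that only demonstrates that reading an outside variable works without any declaration:

```python
x = 2

def show_x():
    # Reading a variable from the outside works without "global"
    return x

def add_5(number):
    # Take a value in, hand a new value back; nothing outside is touched
    return number + 5

x = add_5(x)     # the reassignment happens out here, in plain sight
print(show_x())  # 7
```

It's only the assignment inside the function that causes trouble, so keeping assignments outside sidesteps the whole issue.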

Modifying lists while iterating over them

Eh, well… yes, I’ve managed to run into such bullsh*ttery myself a couple of times.

Consider this:

mynumbers = [x for x in range(10)]
# this is [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
for x in range(len(mynumbers)):
    if mynumbers[x]%3 == 0:
        mynumbers.remove(mynumbers[x])
## Ew!
# Traceback (most recent call last):
#   File "<stdin>", line 2, in <module>
# IndexError: list index out of range

This loop doesn’t work because it deletes an element of the list every so often, so the list gets shorter while the loop still counts up to the original length of 10. Sooner or later the index points past the end of the list, because the element that used to be there no longer exists!

One dirty but handy workaround is to assign a silly value to all elements that you want to delete, and then remove them in a second step.
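As a rough sketch, that two-step workaround could look like this. Using None as the silly value is my own arbitrary choice; any value that can't occur in your data would do:

```python
mynumbers = list(range(10))

# Step 1: mark the unwanted elements instead of removing them,
# so the list keeps its length while we loop over the indices
for i in range(len(mynumbers)):
    if mynumbers[i] % 3 == 0:
        mynumbers[i] = None

# Step 2: drop the marked elements once the index loop is done
while None in mynumbers:
    mynumbers.remove(None)

print(mynumbers)  # [1, 2, 4, 5, 7, 8]
```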

But there’s a much better solution:

mynumbers = [x for x in range(10) if x%3 != 0]
# that's what we wanted! [1, 2, 4, 5, 7, 8]

Just one line of code!

Note that we’ve already used a list comprehension in the broken example above, to create the list.

It’s the expression in the square brackets [], and is basically a short form of a for loop. List comprehensions are often a little bit faster than regular loops, which is cool if you’re handling large datasets.

Here, we’re just adding an if clause to tell the list comprehension that it shall not include the numbers that are divisible by 3.
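Spelled out as a regular loop, the one-liner is roughly equivalent to this sketch:

```python
mynumbers = []
for x in range(10):
    if x % 3 != 0:  # skip everything that is divisible by 3
        mynumbers.append(x)

print(mynumbers)  # [1, 2, 4, 5, 7, 8]
```

Instead of deleting from an existing list, it builds a new one, so there's never an index that could run past the end.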

Unlike some of the phenomena described above, this isn’t a case of Python madness. Even if beginners may stumble over this at first, this is Python genius.

Some light on the horizon

Back in the day, coding wasn’t the only pain when it came to Python-related woes.

Python also used to be incredibly slow at execution, running anywhere from 2 to 10 times slower than most other languages.

This has gotten a lot better now. The NumPy package, for example, is incredibly fast at handling arrays, matrices, and the like.

Multiprocessing has gotten much easier with Python, too. This lets you use all your 2 or 16 or however many cores of your computer, instead of just one. I’ve been running processes on 20 cores at a time and, boi, it’s saved me weeks of compute time already.
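To give you an idea, here's a minimal sketch using the multiprocessing module from the standard library. The function slow_square is a made-up stand-in for whatever expensive computation you have, and the worker count of 2 is an arbitrary assumption:

```python
from multiprocessing import Pool

def slow_square(number):
    # Stand-in for an expensive computation (hypothetical example)
    return number * number

def parallel_squares(numbers, workers=2):
    # Spread the inputs over several worker processes
    with Pool(processes=workers) as pool:
        return pool.map(slow_square, numbers)

if __name__ == "__main__":
    # The guard keeps worker processes from re-running this part
    print(parallel_squares(range(8)))  # [0, 1, 4, 9, 16, 25, 36, 49]
```

For a toy function like this one, starting the processes costs more than it saves. It only pays off when each call takes a noticeable amount of time.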

Also, as machine learning has picked up steam over the past few years, Python has shown that it has places to go. Packages like PyTorch and TensorFlow make this dead easy, and other languages are struggling to keep up.

However, the fact that Python has become better over the years doesn’t guarantee a rosy future.

Python still isn’t idiot-proof. Use it with caution.

This article was originally published on Medium. You can read it here.