Hi, I'm Harlin and welcome to my blog. I write about Python, Alfresco and other cheesy comestibles.

Python - A Simple Method for Refactoring Your Code For Extensibility

I am talking to the beginning Python programmer here. Or to someone who has avoided object-oriented programming and stuck with procedural programming. Or to someone who mainly uses Python to write scripts, though you can definitely use these ideas for bigger applications. In fact, you will want to, but if you're already writing bigger apps, you may be doing this already. What I want to do is show you a way to optimize your Python programs. This doesn't necessarily mean speeding them up but rather optimizing your code for organization and readability. And when your code is organized and readable, it is much easier to find ways to speed up its execution.

The scripts that follow show a progression of optimizing code in the way I've described. The program is very simple: it asks the user for a name and prints a greeting that uses it. Each script has the same functionality but is written in a progressively better way.

Should I avoid premature optimization or not?

If you've read much on the subject of programming you have probably come across this saying:

"Premature optimization is the root of all evil (or at least most of it) in programming."

This saying is attributed to Donald Knuth, a famous computer scientist. What he says has some truth to it, but I think it's been blown way out of proportion. Knuth was warning about chasing small performance gains too early, not about keeping code organized. Let me explain.

If I'm creating a full-fledged application from the ground up, I am going to optimize the organization and readability from the beginning as much as I possibly can, but I'm not going to worry about making it perfect.

If I'm writing a script to do one main thing well, then I'm just going to start off with a procedural script and refactor from there. I'm going to go through a sickeningly simple script and show you my thought process as I go.

Write one module to do something really well

Here is my demo script in all its glory (demo1.py):

#!/usr/bin/env python3

# Ask user's name.
name = input('What is your name? ')

# Render a greeting to the user.
print('Hello {}, it is nice to meet you!'.format(name))

That's it. It doesn't do anything else. If you run it from the command line, it will look like this:

# ./demo1.py 
What is your name? Harlin
Hello Harlin, it is nice to meet you!

This is written in a procedural style. There are no functions or classes yet, just statements that run from top to bottom. It's very simple to follow.

The script does only two things:

  1. Ask the user's name
  2. Greet the user with "Hello Harlin, it is nice to meet you!"

Honestly, this would be fine as is. And if your script did only a little more than this, it would be totally OK to write it in this procedural fashion.

Of course, if you needed to add functionality, you would have to squeeze more statements in between the existing lines. After about 50 lines or so, which can happen pretty quickly for most scripts, it gets pretty tedious and difficult to read.

Once it gets to be about 200 lines or so, it gets to be kind of a pain in the ass to add, modify or take anything away. And after around 1000 lines, it would be mostly unmaintainable and whoever took over your script when you moved on to some other job would probably hunt you down.

So, we can't make this script fully readable and extensible in one step, but we can at least start working toward some functions.

Refactor module to use "if __name__ == '__main__'"

For now, let's just add something new (demo2.py):

#!/usr/bin/env python3

if __name__ == '__main__':
    # Ask user's name.
    name = input('What is your name? ')

    # Render a greeting to the user.
    print('Hello {}, it is nice to meet you!'.format(name))

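We didn't change any functionality. It still works exactly as before, but now the code sits inside the "if __name__ == '__main__'" block (often called the main guard). This isn't a function yet. It just tells Python to run the code only when the file is executed directly and not when it is imported. Functions are good, though, and adding one is the next step.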
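
As a rough illustration of what the guard checks: Python sets __name__ to '__main__' when a file is run directly and to the module's name when it is imported. The helper describe_context() below is a made-up name for this sketch and is not part of the demo scripts.

```python
def describe_context(module_name):
    """Describe how a module with this __name__ value is being used."""
    if module_name == '__main__':
        return 'run directly as a script'
    return 'imported as module {!r}'.format(module_name)


if __name__ == '__main__':
    # Only reached when this file is executed, not when it is imported.
    print(describe_context(__name__))
```

Run as a script, this prints a line. Imported from another file, it prints nothing, because the guarded block is skipped.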

Wrap your script into a function

This time, we'll keep our main guard but take the main code and put it into its own function called greet() (demo3.py):

#!/usr/bin/env python3


def greet():
    # Ask user's name.
    name = input('What is your name? ')

    # Render a greeting to the user.
    print('Hello {}, it is nice to meet you!'.format(name))


if __name__ == '__main__':
    greet()

Now, not only does it still work as before, but if we wanted to, we could import this module from another script and call its function like so:

import demo3

demo3.greet()

We probably wouldn't need to reuse a greeter, but we could without having to write the same code again. This is called reusability. Functions and modules already give us that, and it is also a big advantage of object-oriented code, as we'll see. Now we're getting somewhere.

Use messages instead of string literals when it makes sense to do so

Now, this part may be more optional for you, but personally, I prefer to put my string literals into constants. Python doesn't have true constants. Instead, the convention (see PEP 8) is to name a variable in all caps to tell other programmers who work with our code that we don't want its value to change.

Here's my example (demo4.py):

#!/usr/bin/env python3

ASK_NAME_MSG = 'What is your name? '
GREETING = 'Hello {}, it is nice to meet you!'


def greet():
    # Ask user's name.
    name = input(ASK_NAME_MSG)

    # Render a greeting to the user.
    print(GREETING.format(name))


if __name__ == '__main__':
    greet()

All we've done here is move the name asking string to a variable called ASK_NAME_MSG and the greeting string to another variable called GREETING.

This is up to you but when your script gets to be bigger and bigger, you will want to avoid repeating yourself -- especially when it comes to string literals.
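
Here is a small sketch of where that pays off. It assumes a second function, greet_all(), which is hypothetical and not part of the demo scripts, and it returns the text instead of printing it just to keep the example short. Both functions use the same message, so changing the wording means editing one line.

```python
GREETING = 'Hello {}, it is nice to meet you!'


def greet(name):
    # Build the greeting for a single person.
    return GREETING.format(name)


def greet_all(names):
    # Hypothetical helper: same message, several people.
    return [GREETING.format(name) for name in names]


print(greet('Harlin'))
print(greet_all(['Ann', 'Bob']))
```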

Our next refactoring involves breaking our greet() function into two separate parts ...

Break big function into smaller functions and use run()

Here, we'll create a new function called get_name() and change our greet() function to only handle the greeting (demo5.py):

#!/usr/bin/env python3

ASK_NAME_MSG = 'What is your name? '
GREETING = 'Hello {}, it is nice to meet you!'


def get_name():
    return input(ASK_NAME_MSG)


def greet(name):
    print(GREETING.format(name))


def run():
    # Ask user's name.
    name = get_name()

    # Render a greeting to the user.
    greet(name)


if __name__ == '__main__':
    run()

As you can see, the get_name() function simply gets the user's name and nothing else. Also, the greet() function only greets and doesn't do anything else. I've also created what you'll find to be something very standard in many Python scripts: a function called run().
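
A side benefit of the split is that each piece can be checked on its own. The sketch below is one possible way to do that with the standard library. The check_get_name() and check_greet() helpers are names I made up for illustration. The first swaps out input() so nobody has to type anything, and the second captures what greet() prints.

```python
import contextlib
import io
from unittest import mock

ASK_NAME_MSG = 'What is your name? '
GREETING = 'Hello {}, it is nice to meet you!'


def get_name():
    return input(ASK_NAME_MSG)


def greet(name):
    print(GREETING.format(name))


def check_get_name():
    # Temporarily replace the built-in input() with a fake that answers 'Harlin'.
    with mock.patch('builtins.input', return_value='Harlin') as fake_input:
        result = get_name()
    # The fake records how it was called, so we can confirm the prompt text.
    fake_input.assert_called_once_with(ASK_NAME_MSG)
    return result


def check_greet():
    # Capture everything greet() prints into a string buffer.
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        greet('Harlin')
    return buffer.getvalue()
```

With the original single greet() function, the prompt and the printing were tangled together, so neither could be checked without the other.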

Why run()? Well, it's not required, but it matches a convention you'll see elsewhere in Python. With multithreading, for example, you start a thread object like so:

thread = MyThreadedClass()
thread.start()

A thread object is started with start(). Internally, start() spins up a new thread, and that thread calls the object's run() method. You can name your own function run() or start() or initiate() if you like, but run() is a convention shared by Python features such as threading.Thread and multiprocessing.Process.
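
To make that concrete, here is a minimal sketch of a thread class. GreeterThread and its name_to_greet attribute are invented for this example (Thread already has its own name attribute, so I avoided reusing it). We override run(), but we call start(), and the greeting is stored instead of printed so it is easy to inspect afterwards.

```python
import threading


class GreeterThread(threading.Thread):
    def __init__(self, name_to_greet):
        super().__init__()
        self.name_to_greet = name_to_greet
        self.message = None

    def run(self):
        # Called for us in the new thread once start() is invoked.
        self.message = 'Hello {}, it is nice to meet you!'.format(
            self.name_to_greet)


thread = GreeterThread('Harlin')
thread.start()  # launches the thread, which then calls run()
thread.join()   # wait for run() to finish
print(thread.message)
```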

Now, here's where we can take everything we've done so far and turn our greeting script into a Greeter object.

Combine common functions in an object

Once we've created a number of related functions, it eventually makes sense to create a class that groups them together. A couple of huge benefits of object-oriented programming (OOP) are reusability (ooh, we already mentioned that) and extensibility (that is, being able to add new functionality without rewriting what's there -- and yep, we already talked about that too).

And no worries. This refactoring won't be much of a stretch since we've done quite a bit of the work already (demo6.py):

#!/usr/bin/env python3

ASK_NAME_MSG = 'What is your name? '
GREETING = 'Hello {}, it is nice to meet you!'


class Greeter:
    def get_name(self):
        return input(ASK_NAME_MSG)

    def greet(self, name):
        print(GREETING.format(name))

    def run(self):
        # Ask user's name.
        name = self.get_name()

        # Render a greeting to the user.
        self.greet(name)


if __name__ == '__main__':
    app = Greeter()
    app.run()

So here, we're still using the constant strings (or messages) at the top. We haven't changed those. We're also using the exact same functions except that now they are members of our new class called Greeter, and each one takes self as its first parameter. Once they become members of a class, they are no longer called functions but methods.

And in the main guard, we create a Greeter object (an instance of the class) called app and then use app to call the run() method.
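
One way to see the extensibility payoff is a subclass. In the sketch below, ShoutingGreeter is a hypothetical class I added for illustration. It changes only greet() and inherits get_name() and run() untouched, so nothing in the original Greeter has to be rewritten.

```python
ASK_NAME_MSG = 'What is your name? '
GREETING = 'Hello {}, it is nice to meet you!'


class Greeter:
    def get_name(self):
        return input(ASK_NAME_MSG)

    def greet(self, name):
        print(GREETING.format(name))

    def run(self):
        name = self.get_name()
        self.greet(name)


class ShoutingGreeter(Greeter):
    # Hypothetical extension: override one method, inherit the rest.
    def greet(self, name):
        print(GREETING.format(name).upper())


# Call greet() directly here so the example runs without typing a name.
ShoutingGreeter().greet('Harlin')
```

Calling ShoutingGreeter().run() instead would ask for the name first and then shout the greeting, using the inherited run() and get_name().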

As I mentioned, I don't refactor every big script or application this way, but if you think about it, working in small steps like this makes getting to object-oriented code a much more organized process. When you organize your code well, it becomes more readable and more maintainable without much extra effort.

As mentioned already, when you get your code to this level of maintainability, it becomes much easier to add functionality (think plugins!) without having to rewrite everything. Hopefully this was helpful for you. Please leave a comment. I'd love to hear your thoughts on this subject and any methods you use to write cleaner and more extensible code.
