Introduction to Sourcery

About this Tutorial

Welcome!

If you are new to Sourcery, you are in the right place - this tutorial is aimed at people just starting out.

If you're familiar with Sourcery already, you may instead want to browse the Reference material, the Guides, or the Further Reading.

In this tutorial, you'll learn:

  • What Sourcery does, and how it helps you write better Python code
  • How to install Sourcery in VSCode
  • How to use Sourcery interactively in VSCode

Non-VSCode users

If you are following this tutorial for the first time, we encourage you to install VSCode and follow along, even if you don't usually use VSCode for programming. This will help you understand what Sourcery can do, so you can make best use of it in your main programming workflow.

Sourcery can also be installed in JetBrains IDEs, as a command-line interface, and as a GitHub bot. See full details at Guides: Install and use Sourcery.

Tutorial Setup

Objectives

In this section, you will:

  • Learn how to install Sourcery in VSCode
  • Get the tutorial project

Requirements

  • Windows, macOS, or Linux operating system (special instructions for M1 machines can be found here)
  • VSCode installed (download at the official website)
  • git installed (installation instructions on the official website)

Installing Sourcery

  1. Open VSCode, and click the icon in the sidebar for "Extensions".

    Location of VSCode Extensions

  2. Search for "Sourcery" (note the "u" in the spelling) and click "Install" to install the extension.

    Location of Install Button

Getting the Tutorial Project

  1. Open a VSCode window, and from the main screen select "Clone Git Repository"

    Cloning a git Repository

  2. In the prompt, paste the URL

    https://github.com/sourcery-ai/sourcery-tutorial-introduction.git
    

    and press Enter or click "Clone from URL".

    Prompt for repository

  3. A system file window will appear. Select the location you want for the tutorial folder.

  4. When prompted, click "Open" or "Open in New Window".

    Open in new window prompt

What is "Refactoring"?

Objectives

In this section, you'll learn:

  • What refactoring is and why it is useful
  • How you can refactor in small, systematic steps
  • How Sourcery can automate some types of refactoring for you

Refactoring code means improving its readability and maintainability without changing its behaviour. Readable, maintainable code is typically short, self-explanatory, and practical.

The process of refactoring usually involves making a number of small, individual changes to a piece of code. Let's take a look at an example. Here is a function which can be readily refactored:

def make_squares(n: int) -> list[int]:
    """Returns a list of square numbers up to `n ** 2`."""
    squares = []
    for i in range(n):
        square = i ** 2
        squares.append(square)
    return squares

Here are the steps we might take to improve the code:

  1. Inline the variable square - it's only used once. "Inlining" means replacing an assigned variable with its value wherever it is used.
    def make_squares(n: int) -> list[int]:
        """Returns a list of square numbers up to `n ** 2`."""
        squares = []
        for i in range(n):
            squares.append(i ** 2)
        return squares
    
  2. Convert the for loop to a comprehension. Python's list comprehension syntax is elegant, succinct, and considered good practice.
    def make_squares(n: int) -> list[int]:
        """Returns a list of square numbers up to `n ** 2`."""
        squares = [i ** 2 for i in range(n)]
        return squares
    
  3. Inline the variable squares.
    def make_squares(n: int) -> list[int]:
        """Returns a list of square numbers up to `n ** 2`."""
        return [i ** 2 for i in range(n)]
    

Final result:

def make_squares(n: int) -> list[int]:
    """Returns a list of square numbers up to `n ** 2`."""
    return [i ** 2 for i in range(n)]

This code is shorter, easier to understand, and almost exactly matches its description in the docstring.
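
One way to gain confidence that a refactoring has not changed behaviour is to keep both versions side by side and compare their outputs. The sketch below does this for the example above. The names `make_squares_original` and `make_squares_refactored` are ours, chosen only for illustration.

```python
def make_squares_original(n: int) -> list[int]:
    """The loop-based version from the start of this section."""
    squares = []
    for i in range(n):
        square = i ** 2
        squares.append(square)
    return squares


def make_squares_refactored(n: int) -> list[int]:
    """The comprehension-based version after all three steps."""
    return [i ** 2 for i in range(n)]


# Compare the two versions over a range of inputs, including the empty case.
for n in range(10):
    assert make_squares_original(n) == make_squares_refactored(n)

print(make_squares_refactored(4))  # [0, 1, 4, 9]
```

A check like this is no substitute for a proper test suite, but it illustrates what "without changing its behaviour" means in practice.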

Maintainability of well-refactored code

Refactored code is often easier to change. Let's imagine we want to abstract the type of collection the function above returns (set, list, etc.). In the new version, we can make the type a parameter, with minimal changes. The same abstraction would be quite difficult with the original code, due to its reliance on the list interface.

import typing

Constructor = typing.Callable[[typing.Iterator[int]], typing.Iterable[int]]

def make_squares(n: int, constructor: Constructor = list) -> typing.Iterable[int]:
    """Returns an iterable (list by default) of square numbers up to `n ** 2`."""
    return constructor(i ** 2 for i in range(n))

make_squares(3)
# [0, 1, 4]

make_squares(3, set)
# {0, 1, 4}

How Sourcery Helps You Refactor

Sourcery is a refactoring engine. It scans your code for patterns that can be improved (like the one above), and makes suggestions about how to improve them, by using a set of small, built-in "Rules" (see "What are Rules" below) to generate complex changes. Because it composes refactorings from small individual steps, Sourcery can propose changes that go beyond linting.

This automates a lot of the manual effort spent improving the style of your code, allowing you to focus on building features.

Here is what Sourcery reports when run as a command line tool over the code shown above:

screenshot

Now that you know what Sourcery can do in theory, let's see how it works in practice.

Objectives

In this section, you'll learn:

  • How to see the changes Sourcery can suggest
  • What types of suggestion Sourcery can make
  • Where to find more information about each of the suggestions

What are "Rules"?

Open the __main__.py file from the introduction tutorial project:

Location of the __main__.py file

This file contains a single function, create_playlist, which fetches some tracks and arranges them into an approximately half-hour playlist by selecting tracks at random until there are no songs short enough to fit the remaining time.

There are a number of ways this function could be improved, and Sourcery has already made a start for us!

Each of the underlined lines in the code is associated with one or more Rules Sourcery has matched. Some of these are Refactorings (in the sense described above), some are Suggestions, and some are Comments.

Every default rule has a unique ID, and can be browsed in this documentation.

Seeing Rules in the Editor

The Sourcery plugin uses wiggly underlines to mark pieces of code it thinks could be improved, such as this one:

Underlined line of code in the editor

Blue lines correspond to Suggestions and Comments, and yellow lines are used for Refactorings.

If you hover over these lines with your cursor, you'll see a pop-up window with some additional detail about the rule; see the explanations below.

Important

Sourcery only underlines the first line of the code it thinks should change, to avoid visual spam.

Seeing Rules in the Problems Panel

As well as being shown as a wiggly underline in your code, Sourcery's matched Rules can be found in VSCode's "Problems" panel.

To open the problems panel, click "View" in the main toolbar and then select "Problems":

Toolbar method for opening the Problems Panel

The problems panel shows a summary of all the Rules that Sourcery has matched in your open documents.

The problems panel, showing matched Rules

You can click on any of these to jump to their location in the document; this is convenient if you have large files and don't want to scroll through looking for refactorings.

How to Read Rule Details

Using your cursor, hover over the line in the editor which reads

tracks_which_fit = []

You should see a pop-up that looks like this:

Sourcery rule details view

Let's look at what this means.

This line gives you an overview of the change. It may describe the change as "refactored", "making a change", or "identifying an issue" depending on the type of change. It will also try to tell you which function or lines of code will change.

Sourcery rule details header

The bullet point list shows a breakdown of each of the individual rules Sourcery has matched to make the change. In this case, only one rule was used, list-comprehension.

Sourcery rule breakdown

Additional Detail

Some changes may include additional detail below this line that describes the rule. Try hovering over the line

list = starting_tracks

to see an example.

These lines show the complete code change Sourcery is proposing - it will delete four lines (shown in red) and introduce a new one (shown in green). A couple of unchanged lines are shown in gray for context.

Sourcery rule diff

These final lines show all the diagnostics VSCode has identified for this line, including a repeat of the main issue reported by Sourcery. If you have other plugins, there may be more here.

VSCode diagnostics

In the next section, we'll look at each of the issues Sourcery has identified in turn, and explain what they mean.

Explanation of the Tutorial Project's Rules

Default Mutable Arguments

The first problem Sourcery has highlighted is in the definition of the function itself, the line that reads

def create_playlist(starting_tracks = []):

If you hover over this line in the editor, you should see something like this:

Hovering over the first suggestion in VSCode

The problem here is the use of a mutable default argument, in this case [], in the definition of a function. This is a common pitfall for Python beginners (see the explanation in The Hitchhiker's Guide to Python), and can cause confusing bugs.
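
To make the pitfall concrete, here is a minimal sketch that is separate from the tutorial project. The functions `add_track_buggy` and `add_track_fixed` are hypothetical names used only for this illustration. The second one uses the common `None` sentinel pattern.

```python
def add_track_buggy(track, playlist=[]):
    # The default list is created once, when the function is defined,
    # and is then shared by every call that omits `playlist`.
    playlist.append(track)
    return playlist


def add_track_fixed(track, playlist=None):
    # Using None as a sentinel gives each call its own fresh list.
    if playlist is None:
        playlist = []
    playlist.append(track)
    return playlist


first = add_track_buggy("Song A")
second = add_track_buggy("Song B")
print(second)           # ['Song A', 'Song B'] - "Song A" leaked in from the first call
print(first is second)  # True - both names refer to the same shared list

print(add_track_fixed("Song A"))  # ['Song A']
print(add_track_fixed("Song B"))  # ['Song B']
```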

Sourcery's unique ID for this Rule is default-mutable-arg, which you can see in the screenshot above. The documentation for default-mutable-arg can be found here.

Although this change is likely to be one you want, it is possible you are deliberately using a mutable default, or that your other code relies on this behaviour in some way. As a result, this change has a small chance of breaking existing code, so Sourcery doesn't consider this a Refactoring, but instead a Suggestion.

In the pop-up, Sourcery suggests the most common way of solving this problem, and we'll see how to apply this solution in the next section, "Accepting Sourcery's Rules".

Avoid Built-in Shadows

The next problem occurs on the line:

list = starting_tracks

If you hover over this line in the editor, you should see something like this:

Hovering over the second comment in VSCode

This code is not incorrect, but Sourcery recognises this pattern as bad practice - you should not reassign Python's built-in variables. However, because variable naming is very domain-driven, Sourcery doesn't have any precise suggestion for what to use instead. Rules of this type, which don't have replacements, are called Comments, and this specific Rule has the ID avoid-builtin-shadow.
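
As a rough illustration of why shadowing a built-in is risky, the sketch below shows what happens when the built-in is needed later in the same scope. The function names are hypothetical and are not part of the tutorial project.

```python
def count_tracks_shadowed(tracks):
    list = tracks  # shadows the built-in `list` inside this function
    try:
        # `list` now refers to the tracks, not the built-in type,
        # so calling it fails.
        return len(list(tracks))
    except TypeError as error:
        return f"TypeError: {error}"


def count_tracks_renamed(tracks):
    tracks_for_playlist = tracks  # a descriptive name; the built-in is untouched
    return len(list(tracks_for_playlist))


print(count_tracks_shadowed(["Song A", "Song B"]))  # TypeError: 'list' object is not callable
print(count_tracks_renamed(("Song A", "Song B")))   # 2
```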

List Comprehension

The next highlighted problem refers to the lines

tracks_which_fit = []
for track in tracks_all:
    if track.duration_seconds < duration_seconds_remaining:
        tracks_which_fit.append(track)

Hover over the first of these lines in the editor, and you'll see something like this:

Hovering over the third refactoring in VSCode

This Rule, called list-comprehension, finds for-loops in your code that could be replaced with Python's comprehension syntax. Comprehension syntax is typically more concise and is often considered more "Pythonic".

Making this change would not affect your code's behaviour, so Sourcery considers this a Refactoring. We'll see how to make this change automatically in the next section, Accepting Sourcery's Rules.

The result will be the following code:

tracks_which_fit = [track for track in tracks_all if track.duration_seconds < duration_seconds_remaining]

Simplify Length Comparison

In the editor, hover over the line

if len(tracks_which_fit) == 0:

You will see something like this:

Hovering over the fourth refactoring in VSCode

Python collections are "True"-like (or "Truthy") when they have some elements in them, and "False"-like ("Falsy") if they are empty. Because of this, it is slightly more concise to write the condition like

if not tracks_which_fit:

which is what Sourcery suggests.

This Refactoring follows a convention in the PEP 8 Style Guide for Python Code.
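
The following sketch checks that the two ways of writing the condition agree for Python's built-in collections. The helper names `is_empty_explicit` and `is_empty_implicit` are ours, for illustration only.

```python
def is_empty_explicit(items) -> bool:
    # The original style: compare the length to zero.
    return len(items) == 0


def is_empty_implicit(items) -> bool:
    # The suggested style: rely on the truthiness of the collection.
    return not items


samples = [[], ["Song A"], (), ("Song A",), {}, {"a": 1}, set(), {"Song A"}, "", "abc"]
for sample in samples:
    assert is_empty_explicit(sample) == is_empty_implicit(sample)

print(is_empty_implicit([]))          # True
print(is_empty_implicit(["Song A"]))  # False
```

This equivalence holds for the built-in collection types shown. Objects that define their own `__bool__` method may behave differently.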

Augmented Assignment

The last Refactoring Sourcery has identified is on the line

duration_seconds_remaining = duration_seconds_remaining - track_selected.duration_seconds

Hover over this line in the editor, and you'll see something like this:

Hovering over the fifth refactoring in VSCode

When you're adding to, subtracting from, or doing (almost) any other operation on a variable, and the variable's own value is the left-hand operand, you can use an augmented assignment statement instead.

Sourcery's name for this Refactoring is aug-assign.
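
A minimal sketch of the two forms side by side, using a made-up track length of 215 seconds (an assumption for illustration, not a value from the tutorial project):

```python
duration_seconds_remaining = 1800
track_duration_seconds = 215  # hypothetical track length

# Long form: the variable name appears on both sides of the assignment.
long_form = duration_seconds_remaining
long_form = long_form - track_duration_seconds

# Augmented assignment: the same subtraction, written once.
augmented = duration_seconds_remaining
augmented -= track_duration_seconds

print(long_form, augmented)  # 1585 1585
```

For numbers, as here, the two forms give the same result.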

Accepting Sourcery's Rules

Now that you've learned how to understand the information Sourcery presents, let's see how to implement the changes Sourcery proposes.

Objectives

Learn how to:

  • Accept proposed changes in the editor as you type
  • Accept proposed changes from the problems panel
  • Deal with Sourcery's Comments

Accepting Changes in the Editor

Place your cursor on the highlighted line

tracks_which_fit = []

A lightbulb icon will appear to the left of the line. Click on it to bring up a menu. The first menu item contains a brief description of the Refactoring. Click it to apply the change:

Accepting a proposed change from the quick-fix hover menu

Sourcery will change the line you've selected into

tracks_which_fit = [track for track in tracks_all if track.duration_seconds < duration_seconds_remaining]

and will remove the lines below. The entire for loop has now been changed into an equivalent comprehension.

Code Formatting

Sourcery performs minimal code formatting, to avoid conflicting with your project's formatter. You'll probably want to run a code formatting step (often "Ctrl/Cmd + Shift + I") to transform your code into the neater version:

tracks_which_fit = [
    track
    for track in tracks_all
    if track.duration_seconds < duration_seconds_remaining
]

Accepting Changes in the Problems Panel

With the problems panel open (see above), right-click on the Problem labeled "Sourcery - Replace mutable default arguments with None".

Accepting a refactoring in the problems panel

In the editor, you'll see Sourcery has changed the beginning of the function into this:

def create_playlist(starting_tracks=None):
    if starting_tracks is None:
        starting_tracks = []
    ...

A trap avoided!

How to deal with Comments

If you try to "Accept" the Comment "Don't assign to built-in variable list" in the same way as above, you'll find the relevant menu item is missing. This is because Sourcery doesn't know how best to replace this code - that's why it's a Comment.

Let's deal with the issue by hand. Find all the instances of list in the function and replace them with tracks_for_playlist.

Warning

tracks_all = list(fetch_tracks())

is a legitimate usage of the list built-in, so don't replace that!

The Comment Sourcery identified will disappear.

Skipping Sourcery's Rules

Sourcery's rules are opinionated, and you may disagree with them.

Objectives

In this section, you'll learn how to:

  • Ignore a specific instance of a rule violation
  • Disable a rule altogether

Skip a Rule Violation Once

If you don't think a rule is right for your code, you can "skip" it. Sourcery will add a comment to your code that lets it know not to search for that rule in the current function.

To see how this works, place your cursor on the line

if len(tracks_which_fit) == 0:

to which Sourcery wants to apply the rule simplify-len-comparison.

A lightbulb icon will appear to the left of the line. Click on it to bring up a menu. The second menu item says "Sourcery - skip suggested refactoring in this code block". Choose this option.

Skipping a proposed change from the quick-fix hover menu

Two things will happen. First, Sourcery will insert a Python comment near the start of the function which says

# sourcery skip: simplify-len-comparison

Second, the highlighted Refactoring will disappear.

You can make Sourcery search for the simplify-len-comparison Refactoring again at any time by removing the Python comment.

Disable a Rule

Sometimes, you may disagree with Sourcery completely. You can permanently disable a refactoring on your system. We'll do this now. (Don't worry! We'll also re-enable it afterwards.)

Move your cursor to the line

duration_seconds_remaining = duration_seconds_remaining - track_selected.duration_seconds

and click on the lightbulb icon which appears. Select the third option in the menu - "Sourcery - Never show me this refactoring". The underline will disappear.

What just happened?

If you disable a refactoring, Sourcery will update your personal configuration file. The location of this file is system-dependent, but the format is the same.

Using the reference here, locate your system Sourcery config file. For example, this is where it can be found in Ubuntu:

Files location of the sourcery.yaml file in Ubuntu

If you open it in a text editor, it should look something like this:

rule_settings:
  disable:
  - aug-assign

This configuration tells Sourcery to always skip the aug-assign Refactoring. Sourcery loads this configuration on every session.

To re-enable the refactoring, delete the contents of the file, and save it.

For more detail, see the reference on the configuration file.

Note

This option won't appear if Sourcery combines two or more rules into a refactoring. It only shows up when a single rule is matched.

Conclusion

If you've followed along so far, your __main__.py file should look like this:

import random
from .playlist import Playlist
from .tracks import fetch_tracks


def create_playlist(starting_tracks=None):
    # sourcery skip: simplify-len-comparison
    if starting_tracks is None:
        starting_tracks = []
    tracks_all = list(fetch_tracks())
    tracks_for_playlist = starting_tracks
    duration_seconds_remaining = 1800

    while duration_seconds_remaining > 0:
        tracks_which_fit = [
            track
            for track in tracks_all
            if track.duration_seconds < duration_seconds_remaining
        ]

        if len(tracks_which_fit) == 0:
            break
        track_selected = random.choice(tracks_which_fit)
        if track_selected in tracks_for_playlist:
            continue
        duration_seconds_remaining = duration_seconds_remaining - track_selected.duration_seconds
        tracks_for_playlist.append(track_selected)

    return Playlist(tracks_for_playlist)


if __name__ == "__main__":
    playlist = create_playlist()
    print(playlist)

There's still work to do (see below), but using Sourcery you have been able to:

  • Identify and fix a common mistake involving mutable default arguments
  • Simplify a for loop using comprehensions
  • Avoid a confusing assignment to a built-in variable
  • Skip or disable refactorings that you dislike

Next Steps

If you normally use VSCode for coding, you're all set. We're improving Sourcery all the time, so watch out for updates to the plugin to get the latest features and refactorings.

If you normally code in some other environment, check out our installation instructions for other platforms.

If you like what you've seen so far and want to take the next steps, check out our tutorial for writing custom rules.

You may also want to browse our reference for configuring Sourcery.

If you've got questions about the plugin, have a look at our FAQs.

Finally, if you have any feedback on this tutorial or on Sourcery, feel free to raise an issue on GitHub or just reach out by email.

Taking Refactoring Further

Here at Sourcery, we love refactoring, and I couldn't leave this code alone without trying to refactor it further. This is as far as I could get "simplifying" the code in __main__.py to achieve the same result:

import random
import typing
from itertools import accumulate, takewhile

from more_itertools import last

from sourcery_tutorial_introduction.playlist import Playlist
from sourcery_tutorial_introduction.tracks import Track, MAGIC_TRACKS

T = typing.TypeVar("T")


def shuffle(items: typing.Sequence[T]) -> typing.Sequence[T]:
    return random.sample(items, len(items))


def create_playlist(
    tracks: typing.Sequence[Track],
    condition: typing.Callable[[Playlist], bool],
) -> Playlist:
    """Create the largest Playlist from `tracks` (in order) that satisfies `condition`."""
    return last(
        takewhile(
            condition,
            accumulate(
                tracks,
                Playlist.with_track,
                initial=Playlist.empty(),
            ),
        )
    )


if __name__ == "__main__":
    tracks = shuffle(list(MAGIC_TRACKS))
    condition = lambda playlist: playlist.duration_seconds < 1800
    result = create_playlist(tracks, condition)
    print(result.duration_seconds)

There are a couple of principles at work here:

  • dependency injection - don't hard-code values (including conditions) and don't fetch data that could be passed in
  • abstraction - calculations belong on the dataclass in question
  • declarative programming - avoiding imperative loops and mutations can make code easier to debug
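
To show how `accumulate` and `takewhile` combine in this style, here is a self-contained sketch. The `Track` and `Playlist` classes below are simplified, hypothetical stand-ins written for this example. They are not the classes from the tutorial project, whose implementation is not shown here. The sketch also uses plain list indexing instead of `more_itertools.last`, so that it needs only the standard library.

```python
from dataclasses import dataclass
from itertools import accumulate, takewhile


@dataclass(frozen=True)
class Track:
    # Hypothetical stand-in for the tutorial project's Track class.
    title: str
    duration_seconds: int


@dataclass(frozen=True)
class Playlist:
    # Hypothetical stand-in for the tutorial project's Playlist class.
    tracks: tuple = ()

    @property
    def duration_seconds(self) -> int:
        return sum(track.duration_seconds for track in self.tracks)

    def with_track(self, track: Track) -> "Playlist":
        # Return a new Playlist rather than mutating this one.
        return Playlist(self.tracks + (track,))


tracks = [Track("A", 700), Track("B", 600), Track("C", 400), Track("D", 300)]

# accumulate yields every intermediate playlist: empty, [A], [A, B], ...
growing = accumulate(tracks, Playlist.with_track, initial=Playlist())

# takewhile keeps playlists only while the injected condition holds.
fitting = list(takewhile(lambda p: p.duration_seconds < 1800, growing))

result = fitting[-1]  # the largest playlist that still satisfies the condition
print([t.title for t in result.tracks], result.duration_seconds)  # ['A', 'B', 'C'] 1700
```

Because the empty playlist always satisfies the condition in this sketch, `fitting` is never empty and indexing with `[-1]` is safe.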

This kind of refactoring exercise is how we figure out how Sourcery should work! If you have suggestions for refactorings, do reach out. We love to hear them.