Introduction to Sourcery¶
About this Tutorial¶
Welcome!
If you are new to Sourcery, you are in the right place - this tutorial is aimed at people just starting out.
If you're familiar with Sourcery already, you may instead want to browse the Reference material, the Guides, or the Further Reading.
In this tutorial, you'll learn:
- What Sourcery does, and how it helps you write better Python code
- How to install Sourcery in VSCode
- How to use Sourcery interactively in VSCode
Non-VSCode users
If you are following this tutorial for the first time, we encourage you to install VSCode and follow along, even if you don't usually use VSCode for programming. This will help you understand what Sourcery can do, so you can make best use of it in your main programming workflow.
Sourcery can also be installed in JetBrains IDEs, as a command-line interface, and as a GitHub bot. See full details at Guides: Install and use Sourcery.
Tutorial Setup¶
Objectives
In this section, you will:
- Learn how to install Sourcery in VSCode
- Get the tutorial project
Requirements¶
- Windows, macOS, or Linux operating system (special instructions for M1 machines can be found here)
- VSCode installed (download at the official website)
- git installed (installation instructions on the official website)
Installing Sourcery¶
- Open VSCode, and click the icon in the sidebar for "Extensions"
- Search for "Sourcery" (note the "u" in the spelling) and click "Install" to install the extension
Getting the Tutorial Project¶
- Open a VSCode window, and from the main screen select "Clone Git Repository"
- In the prompt, paste the URL https://github.com/sourcery-ai/sourcery-tutorial-introduction.git and press Enter or click "Clone from URL".
- A system file window will appear. Select the location you want for the tutorial folder.
- When prompted, click "Open" or "Open in New Window".
What is "Refactoring"?¶
Objectives
In this section, you'll learn:
- What refactoring is and why it is useful
- How you can refactor in small, systematic steps
- How Sourcery can automate some types of refactoring for you
Refactoring code means improving its readability and maintainability without changing its behaviour. Readable, maintainable code is typically short, self-explanatory, and practical.
The process of refactoring usually involves making a number of small, individual changes to a piece of code. Let's take a look at an example. Here is a function which can be readily refactored:
def make_squares(n: int) -> list[int]:
    """Returns a list of square numbers up to `n ** 2`."""
    squares = []
    for i in range(n):
        square = i ** 2
        squares.append(square)
    return squares
Here are the steps we might take to improve the code:
- Inline the variable square - it's only used once. "Inlining" means replacing an assigned variable with its value wherever it is used.
      def make_squares(n: int) -> list[int]:
          """Returns a list of square numbers up to `n ** 2`."""
          squares = []
          for i in range(n):
              squares.append(i ** 2)
          return squares
- Convert the for loop to a comprehension. Python's list comprehension syntax is elegant, succinct, and considered good practice.
      def make_squares(n: int) -> list[int]:
          """Returns a list of square numbers up to `n ** 2`."""
          squares = [i ** 2 for i in range(n)]
          return squares
- Inline the variable squares.
      def make_squares(n: int) -> list[int]:
          """Returns a list of square numbers up to `n ** 2`."""
          return [i ** 2 for i in range(n)]
Final result:
def make_squares(n: int) -> list[int]:
    """Returns a list of square numbers up to `n ** 2`."""
    return [i ** 2 for i in range(n)]
This code is shorter, easier to understand, and almost exactly matches its description in the docstring.
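Because refactoring must not change behaviour, it can be reassuring to check that the first and last versions agree. The sketch below is only an illustration: the names make_squares_original and make_squares_refactored are invented here so that both versions can live side by side.

```python
def make_squares_original(n: int) -> list[int]:
    """The starting version, with the explicit loop."""
    squares = []
    for i in range(n):
        square = i ** 2
        squares.append(square)
    return squares


def make_squares_refactored(n: int) -> list[int]:
    """The final version, after all three refactoring steps."""
    return [i ** 2 for i in range(n)]


# Both versions should return the same list for every input we try.
for n in range(10):
    assert make_squares_original(n) == make_squares_refactored(n)

print(make_squares_refactored(4))  # [0, 1, 4, 9]
```

A check like this does not prove equivalence for every input, but it catches most accidental behaviour changes.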
Maintainability of well-refactored code
Refactored code is often easier to make changes to.
Let's imagine we want to abstract the type of collection the function above returns (set, list, etc.).
In the new version, we can make the type a parameter, with minimal changes.
The same abstraction would be quite difficult with the original code, due to its reliance on the list interface.
import typing

Constructor = typing.Callable[[typing.Iterator[int]], typing.Iterable[int]]


def make_squares(n: int, constructor: Constructor = list) -> typing.Iterable[int]:
    """Returns an iterable (list by default) of square numbers up to `n ** 2`."""
    return constructor(i ** 2 for i in range(n))


make_squares(3)
# [0, 1, 4]
make_squares(3, set)
# {0, 1, 4}
How Sourcery Helps You Refactor¶
Sourcery is a refactoring engine. It scans your code for patterns that can be improved (like the one above), and makes suggestions about how to improve them, by using a set of small, built-in "Rules" (see "What are Rules" below) to generate complex changes. Because it composes refactorings from small individual steps, Sourcery can propose changes that go beyond linting.
This automates a lot of the manual effort spent improving the style of your code, allowing you to focus on building features.
Here is what Sourcery reports when run as a command line tool over the code shown above:
Navigating Sourcery's Rules¶
Now that you know what Sourcery can do in theory, let's see how it works in practice.
Objectives
In this section, you'll learn:
- How to see the changes Sourcery can suggest
- What types of suggestion Sourcery can make
- Where to find more information about each of the suggestions
What are "Rules"?¶
Open the __main__.py file from the introduction tutorial project:
This file contains a single function, create_playlist, which fetches some tracks and arranges them into an approximately half-hour playlist by selecting tracks at random until there are no songs short enough to fit the remaining time.
There are a number of ways this function could be improved, and Sourcery has already made a start for us!
Each of the underlined lines in the code is associated with one or more Rules Sourcery has matched. Some of these are Refactorings (in the sense described above), some are Suggestions, and some are Comments.
Every default rule has a unique ID, and can be browsed in this documentation.
Seeing Rules in the Editor¶
The Sourcery plugin uses wiggly underlines to mark pieces of code it thinks could be improved, such as this one:
Blue lines correspond to Suggestions and Comments, and yellow lines are used for Refactorings.
If you hover over these lines with your cursor, you'll see a pop-up window with some additional detail about the rule; see the explanations below.
Important
Sourcery only underlines the first line of the code it thinks should change, to avoid visual spam.
Seeing Rules in The Problems Panel¶
As well as being shown as a wiggly underline in your code, Sourcery's matched Rules can be found in VSCode's "Problems" panel.
To open the problems panel, click "View" in the main toolbar and then select "Problems":
The problems panel shows a summary of all the Rules that Sourcery has matched in your open documents.
You can click on any of these to jump to their location in the document; this is convenient if you have large files and don't want to scroll through looking for refactorings.
How to Read Rule Details¶
Using your cursor, hover over the line in the editor which reads
tracks_which_fit = []
You should see a pop-up that looks like this:
Let's look at what this means.
This line gives you an overview of the change. It may describe the change as "refactored", "making a change", or "identifying an issue" depending on the type of change. It will also try to tell you which function or lines of code will change.
The bullet point list shows a breakdown of each of the individual rules Sourcery has matched to make the change.
In this case, only one rule was used, list-comprehension.
Additional Detail
Some changes may include additional text below this line that describes the rule in more detail. Try hovering over the line
list = starting_tracks
to see an example.
These lines show the complete code change Sourcery is proposing - it will delete four lines (shown in red) and introduce a new one (shown in green). A couple of unchanged lines are shown in gray for context.
These final lines show all the diagnostics VSCode has identified for this line, including a repeat of the main issue reported by Sourcery. If you have other plugins, there may be more here.
In the next section, we'll look at each of the issues Sourcery has identified in turn, and explain what they mean.
Explanation of the Tutorial Project's Rules¶
Default Mutable Arguments¶
The first problem Sourcery has highlighted is in the definition of the function itself, the line that reads
def create_playlist(starting_tracks = []):
If you hover over this line in the editor, you should see something like this:
The problem here is the use of a mutable default argument, in this case [], in the definition of a function.
This is a common pitfall for Python beginners (see the explanation in The Hitchhiker's Guide to Python), and can cause confusing bugs.
Sourcery's unique ID for this Rule is default-mutable-arg, which you can see in the screenshot above.
The documentation for default-mutable-arg can be found here.
Although this change is likely to be one you want, it is possible you are deliberately using a mutable default, or that your other code relies on this behaviour in some way. As a result, this change has a small chance of breaking existing code, so Sourcery doesn't consider this a Refactoring, but instead a Suggestion.
In the pop-up, Sourcery suggests the most common way of solving this problem, and we'll see how to apply this solution in the next section, "Accepting Sourcery's Rules".
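To see why a mutable default can cause confusing bugs, here is a minimal sketch. The functions add_track_buggy and add_track_fixed are invented for this illustration and are not part of the tutorial project.

```python
def add_track_buggy(track, playlist=[]):
    # The default list is created once, when the function is defined,
    # and then shared by every call that omits `playlist`.
    playlist.append(track)
    return playlist


def add_track_fixed(track, playlist=None):
    # The usual fix: default to None and create a fresh list per call.
    if playlist is None:
        playlist = []
    playlist.append(track)
    return playlist


first = add_track_buggy("song a")
second = add_track_buggy("song b")
print(second)           # ['song a', 'song b']
print(first is second)  # True - both calls returned the same list

fixed_first = add_track_fixed("song a")
fixed_second = add_track_fixed("song b")
print(fixed_second)     # ['song b']
```

The second call to the buggy version "remembers" the track from the first call, which is rarely what the caller expects.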
Avoid Built-in Shadows¶
The next problem occurs on the line:
list = starting_tracks
If you hover over this line in the editor, you should see something like this:
This code is not incorrect, but Sourcery recognises this pattern as bad practice - you should not reassign Python's built-in variables.
However, because variable naming is very domain-driven, Sourcery doesn't have any precise suggestion for what to use instead.
Rules of this type, which don't have replacements, are called Comments, and this specific Rule has the ID avoid-builtin-shadow.
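The following sketch shows one way shadowing a built-in can go wrong later in the same function. The function name shadowing_example is invented for this illustration; the tutorial project's own code may never hit this error.

```python
def shadowing_example(starting_tracks):
    list = starting_tracks  # from here on, `list` is a local variable
    try:
        # This looks like a call to the built-in, but it actually tries
        # to call the list object assigned above.
        return list(range(3))
    except TypeError as error:
        return f"TypeError: {error}"


print(shadowing_example([]))  # TypeError: 'list' object is not callable
```

Renaming the variable removes the ambiguity and keeps the built-in available for the rest of the function.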
List Comprehension¶
The next highlighted problem refers to the lines
tracks_which_fit = []
for track in tracks_all:
    if track.duration_seconds < duration_seconds_remaining:
        tracks_which_fit.append(track)
Hover over the first of these lines in the editor, and you'll see something like this:
This Rule, called list-comprehension, finds for-loops in your code that could be replaced with Python's comprehensions syntax.
Comprehension syntax is typically more concise and is often considered more "Pythonic".
Making this change would not affect your code's behaviour, so Sourcery considers this a Refactoring. We'll see how to make this change automatically in the next section, Accepting Sourcery's Rules.
The result will be the following code:
tracks_which_fit = [track for track in tracks_all if track.duration_seconds < duration_seconds_remaining]
Simplify Length Comparison¶
In the editor, hover over the line
if len(tracks_which_fit) == 0:
You will see something like this:
Python collections are "True"-like (or "Truthy") when they have some elements in them, and "False"-like ("Falsy") if they are empty. Because of this, it is slightly more concise to write the condition like
if not tracks_which_fit:
which is what Sourcery suggests.
This Refactoring follows a convention in the PEP 8 Style Guide for Python Code.
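As a quick illustration of truthiness, the sketch below checks that the two forms of the condition agree for several built-in collections. The helper name describe is invented for this example.

```python
def describe(tracks_which_fit):
    # An empty collection is falsy, so `not` is True exactly when it is empty.
    if not tracks_which_fit:
        return "no tracks fit"
    return "some tracks fit"


# The explicit length check and the truthiness check give the same answer
# for Python's built-in collections.
for collection in ([], [1], (), (1,), {}, {"a": 1}, set(), {1}, "", "x"):
    assert (len(collection) == 0) == (not collection)

print(describe([]))          # no tracks fit
print(describe(["a song"]))  # some tracks fit
```

Note that this equivalence is a property of the built-in collections; a custom class can define its own truthiness.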
Augmented Assignment¶
The last Refactoring Sourcery has identified is on the line
duration_seconds_remaining = duration_seconds_remaining - track_selected.duration_seconds
Hover over this line in the editor, and you'll see something like this:
When you update a variable using its own current value as the left-hand operand (adding to it, subtracting from it, or applying almost any other binary operator), you can use an augmented assignment statement instead, for example writing x -= y in place of x = x - y.
Sourcery's name for this Refactoring is aug-assign.
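Here is a small sketch of augmented assignment, using made-up numbers. For numbers (as in the tutorial project) the two spellings give the same result. The second half shows a general Python detail worth knowing: for mutable objects such as lists, += updates the object in place, whereas the long form builds a new object.

```python
# With numbers, both spellings produce the same value.
duration_seconds_remaining = 1800
duration_seconds_remaining = duration_seconds_remaining - 215
long_form = duration_seconds_remaining

duration_seconds_remaining = 1800
duration_seconds_remaining -= 215
short_form = duration_seconds_remaining

print(long_form, short_form)  # 1585 1585

# With lists, `+=` mutates the existing list, so other names that
# refer to the same list see the change.
rebound = [1, 2]
rebound_alias = rebound
rebound = rebound + [3]   # creates a new list; the alias is untouched

in_place = [1, 2]
in_place_alias = in_place
in_place += [3]           # extends the same list; the alias sees it

print(rebound_alias)   # [1, 2]
print(in_place_alias)  # [1, 2, 3]
```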
Accepting Sourcery's Rules¶
Now that you've learned how to understand the information Sourcery presents, let's see how to implement the changes Sourcery proposes.
Objectives
Learn how to:
- Accept proposed changes in the editor as you type
- Accept proposed changes from the problems panel
- Deal with Sourcery's Comments
Accepting Changes in the Editor¶
Place your cursor on the highlighted line
tracks_which_fit = []
A lightbulb icon will appear to the left of the line. Click on it to bring up a menu. The first menu item contains a brief description of the Refactoring. Click it to apply the change:
Sourcery will change the line you've selected into
tracks_which_fit = [track for track in tracks_all if track.duration_seconds < duration_seconds_remaining]
and will remove the lines below. The entire for loop has now been changed into an equivalent comprehension.
Code Formatting
Sourcery performs minimal code formatting, to avoid conflicting with your project's formatter. You'll probably want to run a code formatting step (often "Ctrl/Cmd + Shift + I") to transform your code into the neater version:
tracks_which_fit = [
    track
    for track in tracks_all
    if track.duration_seconds < duration_seconds_remaining
]
Accepting Changes in the Problems Panel¶
With the problems panel open (see above), right-click on the Problem labeled "Sourcery - Replace mutable default arguments with None".
In the editor, you'll see Sourcery has changed the beginning of the function into this:
def create_playlist(starting_tracks=None):
    if starting_tracks is None:
        starting_tracks = []
    ...
A trap avoided!
How to deal with Comments¶
If you try to "Accept" the Comment "Don't assign to built-in variable list" in the same way as above, you'll find the relevant menu item is missing. This is because Sourcery doesn't know how best to replace this code - that's why it's a Comment.
Let's deal with the issue by hand.
Find all the instances of list in the function and replace them with tracks_for_playlist.
Warning
tracks_all = list(fetch_tracks()) is a legitimate usage of the list built-in, so don't replace that!
The Comment Sourcery identified will disappear.
Skipping Sourcery's Rules¶
Sourcery's rules are opinionated, and you may disagree with them.
Objectives
In this section, you'll learn how to:
- Ignore a specific instance of a rule violation
- Disable a rule altogether
Skip a Rule Violation Once¶
If you don't think a rule is right for your code, you can "skip" it. Sourcery will add a comment to your code that lets it know not to search for that rule in the current function.
To see how this works, place your cursor on the line
if len(tracks_which_fit) == 0:
to which Sourcery wants to apply the rule simplify-len-comparison.
A lightbulb icon will appear to the left of the line. Click on it to bring up a menu. The second menu item says "Sourcery - skip suggested refactoring in this code block". Choose this option.
Two things will happen. First, Sourcery will insert a Python comment near the start of the function which says
# sourcery skip: simplify-len-comparison
Second, the highlighted Refactoring will disappear.
You can make Sourcery search for the simplify-len-comparison Refactoring again at any time by removing the Python comment.
Disable a Rule¶
Sometimes, you may disagree with Sourcery completely. You can permanently disable a refactoring on your system. We'll do this now. (Don't worry! We'll also re-enable it afterwards.)
Move your cursor to the line
duration_seconds_remaining = duration_seconds_remaining - track_selected.duration_seconds
and click on the lightbulb icon which appears. Select the third option in the menu - "Sourcery - Never show me this refactoring". The underline will disappear.
What just happened?
If you disable a refactoring, Sourcery will update your personal configuration file. The location of this file is system-dependent, but the format is the same.
Using the reference here, locate your system Sourcery config file. For example, this is where it can be found in Ubuntu:
If you open it in a text editor, it should look something like this:
rule_settings:
  disable:
    - aug-assign
This configuration tells Sourcery to always skip the aug-assign Refactoring.
Sourcery loads this configuration on every session.
To re-enable the refactoring, delete the contents of the file, and save it.
For more detail, see the reference on the configuration file.
Note
This option won't appear if Sourcery combines two or more rules into a refactoring. It only shows up when a single rule is matched.
Conclusion¶
If you've followed along so far, your __main__.py file should look like this:
import random

from .playlist import Playlist
from .tracks import fetch_tracks


def create_playlist(starting_tracks=None):
    # sourcery skip: simplify-len-comparison
    if starting_tracks is None:
        starting_tracks = []
    tracks_all = list(fetch_tracks())
    tracks_for_playlist = starting_tracks
    duration_seconds_remaining = 1800
    while duration_seconds_remaining > 0:
        tracks_which_fit = [
            track
            for track in tracks_all
            if track.duration_seconds < duration_seconds_remaining
        ]
        if len(tracks_which_fit) == 0:
            break
        track_selected = random.choice(tracks_which_fit)
        if track_selected in tracks_for_playlist:
            continue
        duration_seconds_remaining = duration_seconds_remaining - track_selected.duration_seconds
        tracks_for_playlist.append(track_selected)
    return Playlist(tracks_for_playlist)


if __name__ == "__main__":
    playlist = create_playlist()
    print(playlist)
There's still work to do (see below), but using Sourcery you have been able to:
- Identify and fix a common mistake involving mutable default arguments
- Simplify a for loop using comprehensions
- Avoid a confusing assignment to a built-in variable
- Skip two refactorings that you dislike
Next Steps
If you normally use VSCode for coding, you're all set. We're improving Sourcery all the time, so watch out for updates to the plugin to get the latest features and refactorings.
If you normally code in some other environment, check out our installation instructions for other platforms.
If you like what you've seen so far and want to take the next steps, check out our tutorial for writing custom rules.
You may also want to browse our reference for configuring Sourcery.
If you've got questions about the plugin, have a look at our FAQs.
Finally, if you have any feedback on this tutorial or on Sourcery, feel free to raise an issue on GitHub or just reach out by email.
Taking Refactoring Further
Here at Sourcery, we love refactoring, and I couldn't leave this code alone without trying to refactor it further.
This is as far as I could get "simplifying" the code in __main__.py to achieve the same result:
import random
import typing
from itertools import accumulate, takewhile

from more_itertools import last

from sourcery_tutorial_introduction.playlist import Playlist
from sourcery_tutorial_introduction.tracks import Track, MAGIC_TRACKS

T = typing.TypeVar("T")


def shuffle(items: typing.Sequence[T]) -> typing.Sequence[T]:
    return random.sample(items, len(items))


def create_playlist(
    tracks: typing.Sequence[Track],
    condition: typing.Callable[[Playlist], bool],
) -> Playlist:
    """Create the largest Playlist from `tracks` (in order) that satisfies `condition`."""
    return last(
        takewhile(
            condition,
            accumulate(
                tracks,
                Playlist.with_track,
                initial=Playlist.empty(),
            ),
        )
    )


if __name__ == "__main__":
    tracks = shuffle(list(MAGIC_TRACKS))
    condition = lambda playlist: playlist.duration_seconds < 1800
    result = create_playlist(tracks, condition)
    print(result.duration_seconds)
There are a few principles at work here:
- dependency injection - don't hard-code values (including conditions) and don't fetch data that could be passed in
- abstraction - calculations belong on the dataclass in question
- declarative programming - avoiding imperative loops and mutations can make code easier to debug
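The code above relies on Playlist.with_track, Playlist.empty, and Playlist.duration_seconds, which are not shown in this tutorial. The sketch below is a guess at how such a class could look, so that the pipeline can be tried without the tutorial project. The Track and Playlist definitions here are hypothetical stand-ins, and the small last helper replaces more_itertools.last so the example needs only the standard library.

```python
from dataclasses import dataclass
from itertools import accumulate, takewhile
from typing import Callable, Iterable, Sequence, Tuple, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class Track:
    # Hypothetical stand-in for the tutorial project's Track.
    title: str
    duration_seconds: int


@dataclass(frozen=True)
class Playlist:
    # Hypothetical stand-in: an immutable playlist whose methods
    # return new Playlist objects instead of mutating this one.
    tracks: Tuple[Track, ...] = ()

    @classmethod
    def empty(cls) -> "Playlist":
        return cls()

    def with_track(self, track: Track) -> "Playlist":
        return Playlist(self.tracks + (track,))

    @property
    def duration_seconds(self) -> int:
        return sum(track.duration_seconds for track in self.tracks)


def last(items: Iterable[T]) -> T:
    """Return the final item of a non-empty iterable (stdlib-only helper)."""
    *_, final = items
    return final


def create_playlist(
    tracks: Sequence[Track],
    condition: Callable[[Playlist], bool],
) -> Playlist:
    # accumulate yields the empty playlist, then one playlist per added track;
    # takewhile stops at the first playlist that fails the condition.
    growing = accumulate(tracks, Playlist.with_track, initial=Playlist.empty())
    return last(takewhile(condition, growing))


demo_tracks = [Track("a", 1000), Track("b", 700), Track("c", 500)]
demo = create_playlist(demo_tracks, lambda playlist: playlist.duration_seconds < 1800)
print(demo.duration_seconds)  # 1700
```

Because the duration calculation lives on the Playlist class and the condition is passed in, create_playlist itself contains no loop, no mutation, and no hard-coded limit.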
This kind of refactoring exercise is how we figure out how Sourcery should work! If you have suggestions for refactorings, do reach out. We love to hear them.