Writing your first custom rules¶
By the end of this tutorial you will have written two custom Sourcery rules and used them to review your code.
Prerequisites¶
- You have installed Sourcery in your IDE
Introduction¶
Have you ever seen the same code smell come up again and again in different code reviews? Wouldn't it be great if you could write a rule to make sure that it could never happen again?
Sourcery can be extended with your own custom rules that allow you to do this.
Creating your first rule¶
Open or create .sourcery.yaml
in your IDE and paste your first rule:
rules:
- id: raise-not-implemented
description: NotImplemented is not an Exception, raise NotImplementedError instead
pattern: raise NotImplemented
Save this file, then open a new Python file and paste this problematic code:
class ExampleBaseClass:
def abstract_method(self):
raise NotImplemented
Sourcery highlights the line where this issue occurs. Hover your mouse on the highlight to see the problem:
Including a replacement in your rule¶
This is really nice, but what would be even nicer is to have Sourcery fix this
for you. Update your .sourcery.yaml
rule by adding a replacement
key:
rules:
- id: raise-not-implemented
description: NotImplemented is not an Exception, raise NotImplementedError instead
pattern: raise NotImplemented
replacement: raise NotImplementedError
Save this file, return to the example code and hover over the highlight:
Not only is the issue highlighted but the fix is suggested too!
Let's accept the suggestion:
- Move the cursor to the broken code, this will show the lightbulb 💡
- Click the lightbulb to show Sourcery options
- Select the first option
and Sourcery fixes our code:
Congratulations, you've written your first rule!
Dissecting a rule¶
Let's look closer at our rule:
rules:
- id: raise-not-implemented
description: NotImplemented is not an Exception, raise NotImplementedError instead
pattern: raise NotImplemented
replacement: raise NotImplementedError
Firstly, we can see that our rule lives inside a rules
sequence. This can
contain multiple rules as we'll see later on.
Each rule has the following keys:
Field | Type | Required | Description |
---|---|---|---|
id | string | Required | Unique, descriptive identifier, e.g. raise-not-implemented |
description | string | Required | A description of why this rule triggered and how to resolve it |
pattern | string | Required | Search for code matching this expression |
condition | string | Optional | Filter pattern matches to those that pass this condition |
replacement | string | Optional | Replace the matched code with this |
More powerful patterns using captures¶
Let's take it up a level. Add a second rule to .sourcery.yaml
rules:
- id: raise-not-implemented
description: NotImplemented is not an Exception, raise NotImplementedError instead
pattern: raise NotImplemented
replacement: raise NotImplementedError
- id: remove-open-r
description: Files are opened in read mode `r` by default
pattern: open(${file}, "r")
replacement: open(${file})
The new feature here is that we are now using the capture ${file}
to match any
code that is used as the first argument to open
. This captured code is then
used in the replacement.
Paste this code into your example file to see this in action:
def print_sourcery_yaml():
with open(".sourcery.yaml", "r") as f:
print(f.read())
Now hover over the highlighted line:
Here we can see that ${file}
has captured "sourcery.yaml"
and used it in the
replacement.
Finally use the quickfix lightbulb to fix your code.
Beautiful!
Filter pattern matches with a condition¶
Let's say we want to stop global variables being declared. First let's write down what a global variable is:
- The declaration happens at the top level of the module
- The variable is not a constant using uppercase like
DAYS_IN_WEEK = 7
- Let's allow private globals starting with
_
Here's a rule to identify global variables:
rules:
- id: no-global-variables
pattern: ${var} = ${value}
condition: |
pattern.in_module_scope()
and not var.is_upper_case()
and not var.starts_with("_")
description: Don't declare `${var}` as a global variable
Notice that the condition
field exactly mirrors what our definition of a
global variable is.
The first check uses a special name, pattern
, to refer to the entire match.
The check pattern.in_module_scope()
ensures that the match is in the module
scope, and not inside a function or class body. The next two checks refer to the
captured variable var
. The condition not var.is_upper_case()
skips variables
whose name is in upper case, like DAYS_IN_WEEK
, whereas the check
not var.starts_with("_")
skips variables whose name starts with an underscore.
Check out the conditions reference to see what conditions are available.
Writing Tests for a Rule¶
Custom rule testing allows you to check that your rules are working as expected by providing example snippets that you would wish Sourcery to refactor or not in your codebase.
We strongly recommend that you write tests for your rules to ensure that they work exactly how you expect them to. You may even use tests to write your rule in a TDD fashion by first writing tests, and then implementing a pattern that matches them.
Always try to provide realistic examples from your codebase to make sure that Sourcery will help you in the most effective manner possible.
How to write tests
You can add tests to your rule by providing a tests
field, which must be a
list of either match
or no-match
examples. Match examples may be optionally
accompanied by a expect
key if the rule contains a replacement
.
Match examples
Match examples are marked by the key match
. Sourcery will test that your rule
is correctly applied to the example code you provide, and will issue an error if
it does not. Use match
examples to check that your rule appears correctly
where you expect it to.
For example:
rules:
- id: do-not-call-print
pattern: print(${arg})
description: Do not call `print` - prefer using logging functions instead
tests:
- match: print(5)
- match: print("Hey, you should use `logger.info`!")
Match-expect examples
If your rule contains a replacement
field, you can also optionally provide a
expect
field to the match examples. Sourcery will then check that not only
your rule is applied to the match
field, but that it also outputs the code in
expect
:
rules:
- id: do-not-call-print
pattern: print(${arg})
replacement: log(${arg})
description: Do not call `print` - prefer using logging functions instead
tests:
- match: print("Match this AND check for the replacement")
expect: log("Match this AND check for the replacement")
- match: print("Match this, but DO NOT check for the replacement")
Note
There is no dash (-
) before the expect
key.
No-match examples
Use no-match
examples to check that your rule is not too general, appearing in
cases where it should not. This is the opposite of match
examples: Sourcery
will test that your rule does not get applied to the example you provide, and
will issue an error if it does. Use them to eliminate false-positives by
including snippets that you know that are correct and should not be refactored
by your rule.
In the following example, we include a match
example where we do want
do-not-call-print
to be applied. However, we also include a no-match
test
because we know that calls to Logger.print
are fine in our codebase, and
should not be flagged as issues by Sourcery:
rules:
- id: do-not-call-print
pattern: print(${arg})
description: Do not call `print` - prefer using logging functions instead
tests:
- match: print("This is bad, we don't want calls to `print`")
- no-match: custom_logger.print("But it is OK if `print` is just a method")
More examples
You can have multiple (or even zero) match
and no-match
tests, and they can
appear in any order. In addition, test examples can be multiline strings. This
way, all the following examples are valid:
rules:
- id: do-not-call-print
pattern: print(${arg})
description: Do not call `print` - prefer using logging functions instead
tests:
- match: print(3)
- match: print(10)
- match: |
def validate_processing_mode(mode):
if mode == "legacy-mode":
print("Do not use 'legacy-mode' anymore")
else:
print(f"Mode '{mode}' successfully validated")
- id: replace-deprecated-function-with-new-one
pattern: my_deprecated_function(${arg})
replacement: new_function(${arg}, mode='legacy')
description: |
Function `my_deprecated_function` is deprecated.
Use `new_function` with `mode='legacy'` instead.
tests:
- no-match: this_function_is_ok(3)
- match: my_deprecated_function(3)
- match: |
if processing_mode == LEGACY_MODE:
my_deprecated_function(data)
expect: |
if processing_mode == LEGACY_MODE:
new_function(data, mode='legacy')
- no-match: print("Note that `do-not-call-print` is not triggered here")
Note that rules are never applied to other rules' tests. This helps you to ensure that the behavior you are testing derives solely from the rule you are writing.
Note
The values for match
, expect
and no-match
must always be valid Python code. Sourcery will issue parsing errors if they are
not.
Failing tests
If any of your tests fails, this indicates that your rule is not behaving as expected. Sourcery will issue configuration errors for each failing test in your configuration file. Those errors appear in your IDE's problems panel, or inline with your YAML file if you use extensions like Error Lens.
For example, the following rule has a failing test:
rules:
- id: replace-map-with-generator
description: |
Calling map with a lambda is confusing, prefer writing a generator instead
pattern: 'map(lambda ${arg}: ${expr}, items)'
tests:
- match: 'map(lambda x: x**2, items)' # SUCCESS
- match: 'map(lambda x: x**2, other_items)' # ERROR: (rules -> 0 -> tests -> 1 -> match) Match example did not trigger
For this file, Sourcery will issue
(rules -> 0 -> tests -> 1 -> match) Match example did not trigger
. This error
can be navigated as:
- in the
rules
field - the first rule (corresponding to index
0
since we use 0-based indexing) - in the
tests
field - the second test (corresponding to index
1
) - in the
match
key
Indeed, in the example above, the problem is that our pattern is not general
enough: it accept any one-element lambdas, but it requires the iterable to be a
variable literally named items
. To solve this, you can replace the literal
items
in the capture with a general capture ${items}
:
rules:
- id: replace-map-with-generator
description: Calling map with a lambda is confusing, prefer writing a generator
instead.
pattern: 'map(lambda ${arg}: ${expr}, ${items})'
tests:
- match: 'map(lambda x: x**2, items)' # SUCCESS
- match: 'map(lambda x: x**2, other_items)' # SUCCESS
The following list contains the most common errors you may get while writing your rules:
- Invalid syntax: this is normally caused by syntax errors in the Python code you provided as example
- Match example did not trigger: Sourcery was unable to find a match for
your rule in the
match
example you provided - Match example output did not match expected result. Got
code
: this happens in rules with replacements where Sourcery was able to find a match in your example, but the replaced code is not equal to what you expected in theexpect
field - No-match example triggered: Sourcery found an unexpected match in the
no-match
example you provided
Conclusion¶
You've written three rules; one that matches exact code, a second that captures code to reuse it in the replacement, and a third that filters the matches based on a condition.
Now go forth and write your own rules. Enjoy!
Next Steps¶
- Check out our rules documentation
- Run these rules on the command line