Custom Rule Pattern Syntax

This page describes Sourcery's specialised syntax for writing patterns for custom rules (see Reference: Rule Pattern Configuration).

Overview

| Name             | Syntax           | Example              | Description                                |
|------------------|------------------|----------------------|--------------------------------------------|
| Basic            |                  | raise NotImplemented | Normal Python code matches (leniently)     |
| Capture          | ${x}             | x = ${var}           | Match (and capture) any single expression  |
| Capture Multiple | ${x*} or ${x+}   | [${items+}]          | Match (and capture) multiple expressions   |
| Match Anything   | ...              | print(...)           | Match any Python code                      |
| Match Missing    | !!!              | key: !!! = value     | Match where some optional code is missing  |

Complete Example

The pattern in the example below makes use of all the available syntax, and will match any class without a docstring.

pattern: |
  class ${cls}(${bases*}):
      """!!!"""
      ...
  • class matches a literal class definition
  • ${cls} matches any class name
  • ${bases*} matches zero or more base classes, as well as keyword arguments
  • """!!!""" matches a missing docstring
  • ... matches any other code, in this case the class's body

Note

The above example could have been written using ... in place of ${bases*} if the bases weren't needed for replacement, but I've written it this way to illustrate the use.
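
To make the intent of this pattern concrete, the following sketch uses Python's standard ast module to find the same thing the pattern describes: classes without a docstring. It is only an illustration of what the pattern selects, not a description of how Sourcery matches code. The names SOURCE and classes_without_docstring are made up for this example.

```python
import ast

# Hypothetical sample code: one documented class, one undocumented class.
SOURCE = '''
class Documented(Base, metaclass=Meta):
    """Has a docstring."""
    x = 1

class Undocumented(Base):
    x = 1
'''


def classes_without_docstring(source: str) -> list[str]:
    """Return the names of all classes in `source` that lack a docstring."""
    tree = ast.parse(source)
    return [
        node.name
        for node in ast.walk(tree)
        if isinstance(node, ast.ClassDef) and ast.get_docstring(node) is None
    ]


print(classes_without_docstring(SOURCE))  # ['Undocumented']
```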

Syntaxes

Basic Patterns

Any valid Python code is also a valid pattern. It matches code with the same structure, subject to the lenient matching described below.

Examples

pattern: |
  raise NotImplemented
pattern: |
  import pdb; pdb.set_trace()
pattern: |
  list()

Lenient Patterns

Sourcery employs a lenient matching strategy to simplify writing patterns. See "Match Missing" Using !!! below for how to suppress lenient matching in your pattern.

The following code elements are leniently matched.

Note

In the following examples, the tests field is used to show examples of what the pattern matches. See Reference: Rule Configuration tests Field for more information.

Decorators:

pattern: |
  def f():
    pass
tests:
  - match: |
      def f():
          pass
  - match: |
      @example
      def f():
          pass
  - match: |
      @first_decorator
      @second_decorator
      def f():
          pass

Return annotations:

pattern: |
  def f():
    pass
tests:
  - match: |
      def f():
          pass
  - match: |
      def f() -> int:
          pass
  - match: |
      def f() -> list[str]:
          pass

Docstrings (for both functions and classes):

pattern: |
  def f():
    pass
tests:
  - match: |
      def f():
          pass
  - match: |
      def f():
          """Example docstring"""
          pass

Async declarations (function definitions only; a pattern containing a for loop, or any other construct, does not leniently match its async form):

pattern: |
  def f():
    pass
tests:
  - match: |
      def f():
          pass
  - match: |
      async def f():
          pass

Assignment type annotations:

pattern: |
  x = 3
tests:
  - match: |
      x = 3
  - match: |
      x: int = 3
  - match: |
      x: Number = 3

Parameter type annotations:

pattern: |
  def add(a, b):
    return a + b
tests:
  - match: |
      def add(a, b):
          return a + b
  - match: |
      def add(a: int, b: int):
          return a + b
  - match: |
      def add(a: Vec[Number], b: Number):
          return a + b
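
One way to think about lenient matching is that the optional syntax is erased from both the pattern and the code before they are compared. The sketch below is a toy model of that idea built on Python's ast module. It is an assumption made for illustration and says nothing about Sourcery's actual implementation; LenientNormalizer and leniently_equal are hypothetical names.

```python
import ast


class LenientNormalizer(ast.NodeTransformer):
    """Toy model: erase optional syntax so that snippets compare equal."""

    def _normalize_def(self, node):
        self.generic_visit(node)  # normalize nested statements first
        node.decorator_list = []  # decorators
        node.returns = None       # return annotation
        params = node.args.posonlyargs + node.args.args + node.args.kwonlyargs
        params += [p for p in (node.args.vararg, node.args.kwarg) if p is not None]
        for param in params:
            param.annotation = None  # parameter type annotations
        # Drop a leading docstring, but never leave the body empty.
        if ast.get_docstring(node) is not None and len(node.body) > 1:
            node.body = node.body[1:]
        return node

    def visit_FunctionDef(self, node):
        return self._normalize_def(node)

    def visit_AsyncFunctionDef(self, node):
        # Both node types declare the same fields, so copy them across.
        plain = ast.FunctionDef(**{name: getattr(node, name) for name in node._fields})
        return self._normalize_def(plain)

    def visit_AnnAssign(self, node):
        self.generic_visit(node)
        if node.value is None:  # bare declaration such as `x: int`
            return node
        return ast.Assign(targets=[node.target], value=node.value, type_comment=None)


def leniently_equal(pattern: str, code: str) -> bool:
    """Compare two snippets after erasing the optional syntax."""
    def normalized(source):
        return ast.dump(LenientNormalizer().visit(ast.parse(source)))
    return normalized(pattern) == normalized(code)


print(leniently_equal("def f():\n    pass", "@example\nasync def f() -> int:\n    pass"))
print(leniently_equal("x = 3", "x: int = 3"))
print(leniently_equal("def f():\n    pass", "def g():\n    pass"))
```

The first two calls print True and the last prints False, mirroring the tests above: decorators, annotations, docstrings and async are ignored, while a different function name is not.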

Qualified Name Resolution

Variable names typically match exactly. For example, see the tests in the following rule:

rules:
  - id: avoid-logging-debug
    description: Remove logging.debug calls from code
    pattern: logging.debug
    tests:
      - match: logging.debug(DEBUG_MESSAGE)
      - no-match: logging.info(INFO_MESSAGE)
      - no-match: lg.debug(DEBUG_MESSAGE)  # NOTE: this line does not match literally

However, if your code contains import statements, qualified names will be resolved. For example, the statement import logging as lg will allow Sourcery to infer that lg refers to the logging module. Other kinds of imports are also used to resolve names in the tested code:

rules:
  - id: avoid-logging-debug
    description: Remove logging.debug calls from code
    pattern: logging.debug
    tests:
        # literal matches still happen, so you do not need to import `logging`
        # here:
      - match: logging.debug(DEBUG_MESSAGE)
      - no-match: logging.info(INFO_MESSAGE)
        # NOTE: the following test matches because of name resolution.
        # The name `lg` refers to `logging`, so `lg.debug` resolves to
        # `logging.debug`.
      - match: |
          import logging as lg

          lg.debug(DEBUG_MESSAGE)
        # NOTE: the following test matches because of name resolution.
        # The name `dbug` refers to `logging.debug` because of the aliased
        # import.
      - match: |
          import logging.debug as dbug

          dbug(DEBUG_MESSAGE)
        # NOTE: the following test matches because of name resolution.
        # The name `debug` refers to `logging.debug` because it was directly
        # imported from the module `logging`.
      - match: |
          from logging import debug

          debug(DEBUG_MESSAGE)
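
The reasoning behind name resolution can be checked in plain Python: an aliased or direct import binds a new name to the same object, so treating lg.debug and debug as logging.debug is sound. The short sketch below only demonstrates that Python behaviour and is not part of Sourcery. The import logging.debug as dbug form from the tests is left out because debug is a function rather than a submodule, so that statement raises an error when actually executed.

```python
import logging
import logging as lg
from logging import debug

# Each alias is bound to the very same function object as `logging.debug`.
same_via_alias = lg.debug is logging.debug
same_via_from_import = debug is logging.debug

print(same_via_alias, same_via_from_import)  # True True
```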

Capturing Expressions Using ${var}

A pattern can contain one or more captures. A capture will match any single expression.

A capture is specified with the specialised syntax ${<capture name>}.

Examples

Note

In the following examples, the tests field is used to show examples of what the pattern matches. See Reference: Rule Configuration tests Field for more information.

Capture a single variable:

pattern: |
  print(${var})
tests:
  - match: print("hello")
  - match: print(my_variable)
  - match: 'print({"protocol": "http"})'

Capture two variables:

pattern: |
  def ${f}(self, ${arg}):
      pass
tests:
  - match: |
      def example(self, input):
          pass
  - match: |
      def test_template(self, fixture: float) -> None:
          pass
  - match: |
      @log("info")
      def length(self, limit):
          pass

Capture Names

A capture's name can't be a Python keyword or the name of a built-in function.

For example, the following capture names are invalid:

  • in
  • input
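
A rough pre-check for capture names can be written with the standard keyword and builtins modules. This is a sketch under assumptions: is_probably_valid_capture_name is a hypothetical helper, it treats every callable in builtins as a "built-in function", and it knows nothing about Sourcery's reserved names.

```python
import builtins
import keyword


def is_probably_valid_capture_name(name: str) -> bool:
    """Approximate the rule above: no keywords, no built-in function names."""
    if not name.isidentifier():
        return False
    if keyword.iskeyword(name):
        return False
    # Assumption: any callable exposed by `builtins` counts as a built-in function.
    return not callable(getattr(builtins, name, None))


for candidate in ["in", "input", "var", "cls"]:
    print(candidate, is_probably_valid_capture_name(candidate))
```

Here in and input are rejected, while var and cls are accepted.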

There are also reserved names that cannot appear in patterns:

Capturing Multiple Expressions Using ${var*} and ${var+}

A capture can take the optional suffix * or +, which tells Sourcery to consume multiple expressions. These suffixes are analogous to their regex counterparts:

  • * captures zero or more expressions
  • + captures one or more expressions
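
For readers less familiar with the regex behaviour this refers to, the following sketch shows the difference using Python's re module. It demonstrates regular expressions only, as an analogy: by the definition above, a * capture accepts an empty sequence of expressions, while a + capture needs at least one.

```python
import re

star = re.compile(r"x*")  # zero or more repetitions
plus = re.compile(r"x+")  # one or more repetitions

# Map each input to (matches with *, matches with +).
results = {
    text: (bool(star.fullmatch(text)), bool(plus.fullmatch(text)))
    for text in ["", "x", "xxx"]
}
print(results)
```

Only the empty string tells the two apart: it matches x* but not x+.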

Examples

Note

In the following examples, the tests field is used to show examples of what the pattern matches. See Reference: Rule Configuration tests Field for more information.

Capture zero or more arguments (and keyword arguments) to a function call using *:

pattern: print(${args*})
tests:
  - match: print()
  - match: print("hello!")
  - match: 'print("Your name is: ", name)'
  - match: print(f"Error in {__file__}", file=sys.stderr)
  - match: print(13, 14, 15, 16, sep=";", end="")

Capture one or more statements in a function definition using +:

pattern: |
  def repeat(f, repeats):
      ${statements+}
tests:
  - match: |
      def repeat(f, repeats):
          # one statement
          pass
  - match: |
      def repeat(f, repeats: int):
          # two statements
          if not f:
              return
          for repeat in range(repeats):
              f()

"Match Anything" Using ...

Use the specialised syntax ... to match any Python code without capturing it.

Examples

Note

In the following examples, the tests field is used to show examples of what the pattern matches. See Reference: Rule Configuration tests Field for more information.

Match any function called "get":

pattern: |
  def get(...):
      ...
tests:
  - match: |
      def get(self, key):
          return self.dict.get(key)
  - match: |
      def get():
          print("get")
  - match: |
      def get(one: int = 1, two: int = 2):
          return one or two

Match any class definition with no base classes (see Capturing Expressions above for reference on the ${cls} syntax):

pattern: |
  class ${cls}:
      ...
tests:
  - match: |
      class Language:
          alphabet: Alphabet
          grammar: Grammar

          def translate(phrase):
              ...
  - no-match: |
      class TranslationError(Exception):
          pass

Match any call to print (a simpler alternative to the print(${args*}) pattern shown above, for when the arguments do not need to be captured):

pattern: |
  print(...)
tests:
  - match: print()
  - match: print("hello!")
  - match: 'print("Your name is: ", name)'
  - match: print(f"Error in {__file__}", file=sys.stderr)
  - match: print(13, 14, 15, 16, sep=";", end="")

Warning

Do not use a pattern like

pattern: ...

This will match any Python code, which is probably not what you want.

"Match Missing" Using !!!

Use the specialised syntax !!! to match cases where some optional piece of code is missing.

Optional Code in Python

Python has a number of optional syntax elements; that is, the code still compiles if they are omitted. These include:

  • type annotations
  • docstrings
  • base class specification in class definitions
  • decorators
  • async declarations

You can use the !!! syntax to match cases where some optional code is not present. See below for examples.

Examples

Match missing docstrings:

pattern: |
  def display(x):
      """!!!"""
      print(x)
tests:
  - match: |
      def display(x):
          print(x)
  - no-match: |
      def display(x):
          """Wrapper around built-in print"""
          print(x)

Match missing return annotations (see "Match Anything" Using ... above for reference on the ... syntax):

pattern: |
  def display(x) -> !!!:
      ...
tests:
  - match: |
      def display(x):
          print(x)
  - no-match: |
      def display(x) -> None:
          print(x)
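
The notion of "missing" optional code in the two examples above can also be expressed with Python's ast module, where an absent docstring or return annotation shows up as None. The sketch below is an illustration only and is not how Sourcery evaluates !!!; SOURCE and missing_optional_parts are hypothetical names.

```python
import ast

# Hypothetical sample code based on the tests above.
SOURCE = '''
def display(x):
    print(x)

def documented(x) -> None:
    """Wrapper around built-in print"""
    print(x)
'''


def missing_optional_parts(source: str) -> dict[str, list[str]]:
    """Map each function name to the optional syntax it leaves out."""
    report = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            missing = []
            if ast.get_docstring(node) is None:
                missing.append("docstring")
            if node.returns is None:
                missing.append("return annotation")
            report[node.name] = missing
    return report


print(missing_optional_parts(SOURCE))
```

In this sample, display is reported as missing both a docstring and a return annotation, while documented is missing neither.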