How To Create Custom Rules for Libraries?
The majority of Sourcery's default and optional rules are about Python built-in functions. But you can create rules for libraries as well.
A custom rule helps you follow the best practices for your libraries. It can remind you to:
- set optional but recommended parameters
- avoid functions or options with known issues
- replace deprecated functions with new ones
The guidelines here are valid for all kinds of libraries:
- The modules of the Python standard library.
- PyPI packages.
- Your internal libraries.
TLDR
- `pattern` can contain a fully qualified name.
- `replacement` shouldn't contain a fully qualified name.
Rules Without a Replacement
Let's say you want to write a rule that flags all usages of a deprecated parameter. For example, you want to ensure that calls to `pandas.read_csv()` don't use the deprecated argument `prefix`.
In such a rule, you can refer to the `pandas.read_csv` function. Sourcery will recognize it, even if you import `pandas` with an alias like `pd`.
rules:
  - id: deprecated-csv-prefix
    description: Argument `prefix` for `read_csv` has been deprecated
    pattern: pandas.read_csv(..., prefix=${pre}, ...)
    explanation: |
      See the [pandas docs for read_csv](https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html)
    tests:
      - match: |
          import pandas
          df = pandas.read_csv("data.csv", prefix="ab")
      - match: |
          import pandas as pd
          df = pd.read_csv("data.csv", prefix="ab")
      - no-match: |
          import pandas as pd
          df = pd.read_csv("data.csv")
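To make the alias handling more concrete, here is a minimal Python sketch of the general idea: map each imported local name back to its fully qualified name, then compare calls by that name. This is not Sourcery's implementation. The helper `qualified_calls` is hypothetical, and it only understands plain `import` and `from ... import` statements.

```python
import ast


def qualified_calls(source: str) -> list[str]:
    """Return the fully qualified names of all calls found in `source`.

    Hypothetical helper for illustration only: it resolves names bound by
    import statements and leaves every other name unchanged.
    """
    tree = ast.parse(source)
    aliases: dict[str, str] = {}  # local name -> fully qualified name

    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for item in node.names:
                if item.asname:
                    # `import pandas as pd` binds pd -> pandas
                    aliases[item.asname] = item.name
                else:
                    # `import a.b` binds only the top-level name `a`
                    top = item.name.split(".")[0]
                    aliases[top] = top
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            for item in node.names:
                # `from streamlit import x as y` binds y -> streamlit.x
                aliases[item.asname or item.name] = f"{node.module}.{item.name}"

    calls = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Call):
            parts = []
            func = node.func
            # Unwind attribute chains such as `pd.read_csv`
            while isinstance(func, ast.Attribute):
                parts.append(func.attr)
                func = func.value
            if isinstance(func, ast.Name):
                head = aliases.get(func.id, func.id)
                calls.append(".".join([head, *reversed(parts)]))
    return calls


aliased = 'import pandas as pd\ndf = pd.read_csv("data.csv", prefix="ab")\n'
print(qualified_calls(aliased))  # ['pandas.read_csv']
```

With this kind of resolution, both `pandas.read_csv(...)` and `pd.read_csv(...)` resolve to the same name, which is why a single `pattern` written with the fully qualified name can cover both test cases above.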
Rules With an Empty Replacement
It's also possible to delete a piece of code matching a pattern by providing an empty replacement. In these rules, you can again reference the library with its fully qualified name ("dot syntax").
For example, this optional rule (contributed by lasinludwig) removes all usages of the debugging function `streamlit.experimental_show`:
- id: flag-streamlit-show
  description: Don't use Streamlit's `experimental_show` in production code
  explanation: |
    `st.experimental_show` should be used only for debugging purposes. See the [Streamlit docs](https://docs.streamlit.io/library/api-reference/utilities/st.experimental_show). Use `st.write()` in production
  pattern: streamlit.experimental_show(...)
  replacement: ''
  tests:
    - match: |
        import streamlit as st
        st.experimental_show(df)
    - match: |
        import streamlit as st
        def some_function():
            st.experimental_show(df)
    - match: |
        import streamlit
        streamlit.experimental_show(df)
    - no-match: |
        import streamlit as st
        st.snow(df)
    - match: |
        from streamlit import experimental_show
        experimental_show(something)
    - match: |
        from streamlit import experimental_show as exp_show
        exp_show(something)
    - no-match: other_package.experimental_show()
    - no-match: |
        import st_other_package as st
        def some_function():
            st.experimental_show(df)
  tags:
    - no-debug
    - streamlit
Library Rules With a Replacement
This is where things get tricky.
Caveat: The `replacement` field is interpreted literally, even if it contains dots.
If your convention is to always import a library with a specific alias, you can use that alias in the `replacement`. For example, this works if you always import `pandas` with the alias `pd`:
- id: deprecated_csv_squeeze_false
  description: The parameter `squeeze` for `pandas.read_csv()` has been deprecated
  pattern: pd.read_csv(${before*}, squeeze=False, ${after*})
  replacement: pd.read_csv(${before}, ${after})
  explanation: |
    False is the default value for squeeze.
    You can just omit this argument.
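Because the replacement text is inserted literally, the rewritten code is only valid in files where the alias is actually bound. The following Python sketch shows one way to check that convention before relying on such a rule. The helper `binds_name` is hypothetical and not part of Sourcery; it only inspects import statements.

```python
import ast


def binds_name(source: str, name: str) -> bool:
    """Check whether an import statement in `source` binds the local name `name`.

    Hypothetical helper for illustration only.
    """
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            for item in node.names:
                # `import a.b` binds `a`; `import a as b` binds `b`
                bound = item.asname or item.name.split(".")[0]
                if bound == name:
                    return True
    return False


with_alias = 'import pandas as pd\ndf = pd.read_csv("data.csv", squeeze=False)\n'
without_alias = 'import pandas\ndf = pandas.read_csv("data.csv", squeeze=False)\n'

# A replacement starting with `pd.` only makes sense where `pd` is bound.
print(binds_name(with_alias, "pd"))     # True
print(binds_name(without_alias, "pd"))  # False
```

In the second file, a literal `pd.read_csv(...)` would refer to a name that was never imported, which is why an alias-based `replacement` depends on a strict import convention.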
Conclusion
Custom rules help make the "you should remember" caveats of internal and external libraries explicit.
- In the `pattern` field, you can use fully qualified names. Sourcery will resolve the imports and flag any code that matches the `pattern`.
- Defining a `replacement` for library rules is trickier. It works only with simple replacements or if your code follows strict import conventions.