Pandas: Avoid inplace

Sourcery suggestion id: pandas-avoid-inplace

Available starting with version 1.1.0

Description

Don't use inplace for methods that always create a copy under the hood.

Before

import pandas as pd

df = pd.DataFrame(
    [["Python", 190], ["JavaScript", 33],],
    columns=["Language", "Number of rules"],
)
df.sort_values("Language", inplace=True)

After

import pandas as pd

df = pd.DataFrame(
    [["Python", 190], ["JavaScript", 33],],
    columns=["Language", "Number of rules"],
)
df = df.sort_values("Language")

Before

import pandas as pd

df = pd.DataFrame(
    [["Python", 190], ["JavaScript", 33],],
    columns=["Language", "Number of rules"],
)
df.copy().sort_values("Language", inplace=True)

After

import pandas as pd

df = pd.DataFrame(
    [["Python", 190], ["JavaScript", 33],],
    columns=["Language", "Number of rules"],
)
df.copy().sort_values("Language")

Explanation

Some DataFrame methods can never operate inplace. Their operation (like reordering rows) requires copying, so they create a copy even if you provide inplace=True.

For these methods, inplace doesn't bring a performance gain.

Here, inplace=True is only "syntactic sugar for reassigning the new result to the calling DataFrame/Series."
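The following is a minimal sketch of that equivalence, assuming a pandas version that still accepts the inplace keyword for sort_values. The variable names df_inplace, df_reassigned, and result are illustrative only. It shows that the inplace call returns None while leaving the caller in the same state that plain reassignment produces.

```python
import pandas as pd

df_inplace = pd.DataFrame(
    [["Python", 190], ["JavaScript", 33]],
    columns=["Language", "Number of rules"],
)
# Independent copy with identical content, used for the reassignment variant.
df_reassigned = df_inplace.copy()

# With inplace=True the method returns None and updates the caller.
result = df_inplace.sort_values("Language", inplace=True)

# Without inplace the method returns a new, sorted DataFrame.
df_reassigned = df_reassigned.sort_values("Language")

print(result is None)
print(df_inplace.equals(df_reassigned))
```

Both variants end with the same sorted data; only the reassignment form makes the data flow visible at the call site.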

Drawbacks of using inplace:

  • You can't use method chaining with inplace=True
  • The inplace keyword complicates type annotations (because the return value depends on the value of inplace)
  • Using inplace=True produces code that mutates the state of an object and thus has side effects. Such code can introduce subtle bugs and is harder to debug.
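As a sketch of the method-chaining point in the list above: because each call returns a new DataFrame when inplace is omitted, several of the affected methods can be combined into one expression. The extra rows (a duplicate and a row with a missing language) and the name cleaned are made up for illustration.

```python
import pandas as pd

# Illustrative data: one duplicate row and one row with a missing value.
df = pd.DataFrame(
    [["Python", 190], ["JavaScript", 33], ["Python", 190], [None, 5]],
    columns=["Language", "Number of rules"],
)

# Each step returns a new DataFrame, so the calls can be chained.
cleaned = (
    df.dropna()
    .drop_duplicates()
    .sort_values("Language")
    .reset_index(drop=True)
)

print(cleaned)
print(len(df))  # the original DataFrame is left untouched
```

With inplace=True every step would return None, so the chain would fail at the second call.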

PDEP-8

PDEP-8 proposes deprecating the inplace option for methods that can never operate inplace.

Best practice: Explicitly reassign the result to the caller DataFrame.

E.g.

df = df.sort_values("Language")

When the caller is an expression rather than a variable, inplace has no useful effect anyway.

df.copy().sort_values("Language", inplace=True)

copy creates a new DataFrame object that isn't assigned to any variable. inplace=True modifies this temporary copy, not the df object.

In this case, the only observable effect of inplace is that the expression returns None instead of a new DataFrame.

Thus, inplace should be omitted for clarity.

df.copy().sort_values("Language")
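A short sketch of this behavior, again assuming a pandas version that still accepts inplace for sort_values. The names result and sorted_copy are illustrative. It shows that the inplace call on a temporary copy returns None and leaves df unsorted, while the call without inplace yields a usable result.

```python
import pandas as pd

df = pd.DataFrame(
    [["Python", 190], ["JavaScript", 33]],
    columns=["Language", "Number of rules"],
)

# The temporary copy is sorted and then discarded; the call returns None.
result = df.copy().sort_values("Language", inplace=True)

# Without inplace the sorted DataFrame is returned and can be kept.
sorted_copy = df.copy().sort_values("Language")

print(result is None)
print(df["Language"].tolist())           # original order is preserved
print(sorted_copy["Language"].tolist())  # sorted order
```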

DataFrame Methods Affected

These DataFrame methods always create a copy under the hood even if you provide the inplace keyword. In PDEP-8, they are mentioned as "Group 4" methods.

  • dropna
  • drop_duplicates
  • sort_values
  • sort_index
  • eval
  • query
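The same reassignment pattern applies to the other methods in the list above. The sketch below uses query, eval, and sort_index on made-up data; the column names language, rules, ignored, and active are hypothetical and chosen without spaces so they can be used directly in query and eval expressions.

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame(
    {
        "language": ["Python", "JavaScript", "Go"],
        "rules": [190, 33, 5],
        "ignored": [10, 3, 0],
    }
)

# Instead of df.query("rules > 10", inplace=True)
df = df.query("rules > 10")

# Instead of df.eval("active = rules - ignored", inplace=True)
df = df.eval("active = rules - ignored")

# Instead of df.sort_index(ascending=False, inplace=True)
df = df.sort_index(ascending=False)

print(df)
```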