In my experience working across data systems, Python pipelines, and AI workflows, I’ve realized something important:
Most production issues don’t come from a lack of knowledge; they come from anti-patterns we ignore.
These aren’t beginner mistakes.
They’re subtle habits that creep in when we’re moving fast, under pressure, or overconfident.
Here are the anti-patterns I’ve personally encountered and how I think about fixing them today.
Taking a Jupyter notebook and directly turning it into production code.
# messy notebook-style code
import pandas as pd

df = pd.read_csv("data.csv")
df = df[df["age"] > 18]
df["score"] = df["a"] * df["b"]
print(df.head())
With the rise of Databricks and notebook culture, I have seen this in multiple orgs and projects.
Better approach
import pandas as pd

def load_data(path: str) -> pd.DataFrame:
    # Read the raw CSV into a DataFrame.
    return pd.read_csv(path)

def transform_data(df: pd.DataFrame) -> pd.DataFrame:
    # Filter on a copy so the caller's DataFrame is not modified.
    df = df[df["age"] > 18].copy()
    df["score"] = df["a"] * df["b"]
    return df

def main():
    df = load_data("data.csv")
    df = transform_data(df)
    return df
My rule now:
“If it’s going to production, it deserves structure.”
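One payoff of that structure is that the transformation step can be checked on its own, without touching a real file. Here is a minimal sketch of such a check. The sample rows and the `check_transform_data` helper are hypothetical, made up for illustration, and `transform_data` is repeated only so the snippet runs by itself.

```python
import pandas as pd


def transform_data(df: pd.DataFrame) -> pd.DataFrame:
    # Same logic as the refactored function, repeated so this sketch is self-contained.
    df = df[df["age"] > 18].copy()
    df["score"] = df["a"] * df["b"]
    return df


def check_transform_data() -> pd.DataFrame:
    # Hypothetical in-memory sample; column names follow the example in this post.
    sample = pd.DataFrame({"age": [17, 18, 30], "a": [1, 2, 3], "b": [10, 20, 30]})
    result = transform_data(sample)

    # Only rows with age strictly greater than 18 survive the filter.
    assert result["age"].tolist() == [30]
    # score is the product of columns a and b for the surviving row.
    assert result["score"].tolist() == [90]
    # The input frame is untouched because the function works on a copy.
    assert "score" not in sample.columns
    return result


check_transform_data()
```

In a real project a check like this would probably live in a test file and run under whatever test runner the team already uses. A check this small is hard to write against the notebook-style version, because there the logic is tangled up with reading the file.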
Huge, unreadable SQL queries with nested subqueries everywhere.