Common Anti-Patterns in Python, SQL, Data Engineering & AI

In my experience working across data systems, Python pipelines, and AI workflows, I’ve realized something important:

Most production issues don’t come from a lack of knowledge; they come from anti-patterns we ignore.

These aren’t beginner mistakes.

They’re subtle habits that creep in when we’re moving fast, under pressure, or overconfident.

Here are the anti-patterns I’ve personally encountered and how I think about fixing them today.

1. The “Notebook-to-Production” Trap

Taking a Jupyter notebook and directly turning it into production code.

# messy notebook-style code: everything lives at the top level
import pandas as pd

df = pd.read_csv("data.csv")
df = df[df["age"] > 18]
df["score"] = df["a"] * df["b"]
print(df.head())

With the rise of Databricks and notebook culture, I have seen this in multiple orgs and projects, and the same problems keep showing up:

Hidden state issues
No modularity
Hard to debug in pipelines
Zero testability
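
To make the hidden state point concrete, here is a minimal sketch. The function name `run_scaling_cell` and the sample values are hypothetical; the function simply stands in for a notebook cell that mutates a shared DataFrame, so running it twice mimics re-executing the same cell.

```python
import pandas as pd

# Shared top-level state, as in a notebook session (hypothetical sample values).
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})


def run_scaling_cell() -> None:
    # Stands in for a notebook cell: it reads and mutates the module-level df.
    df["a"] = df["a"] * 10


run_scaling_cell()
run_scaling_cell()  # re-running the "cell" silently compounds the change

print(df["a"].tolist())  # [100, 200], not the [10, 20] a single run would give
```

Nothing in the code records how many times the cell ran, which is why results from a long notebook session can be hard to reproduce in a pipeline.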

Better approach

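import pandas as pd


def load_data(path: str) -> pd.DataFrame:
    # Keep I/O in one place so it can be swapped out in tests.
    return pd.read_csv(path)


def transform_data(df: pd.DataFrame) -> pd.DataFrame:
    # Copy the filtered rows so the caller's frame is never mutated.
    df = df[df["age"] > 18].copy()
    df["score"] = df["a"] * df["b"]
    return df


def main() -> pd.DataFrame:
    df = load_data("data.csv")
    return transform_data(df)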
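
One payoff of this structure is testability. As a rough sketch, `transform_data` can be checked against a tiny hand-built DataFrame, with no CSV file involved. The sample values and the test function name are made up for illustration, and the function is repeated here only so the snippet runs on its own.

```python
import pandas as pd


def transform_data(df: pd.DataFrame) -> pd.DataFrame:
    # Same logic as the refactored function, repeated so this runs standalone.
    df = df[df["age"] > 18].copy()
    df["score"] = df["a"] * df["b"]
    return df


def test_transform_data_filters_and_scores() -> None:
    # Hypothetical input: only the last row has age strictly greater than 18.
    raw = pd.DataFrame({"age": [17, 18, 30], "a": [1, 2, 3], "b": [10, 20, 30]})
    result = transform_data(raw)

    assert result["age"].tolist() == [30]
    assert result["score"].tolist() == [90]
    # The caller's frame is left untouched because of .copy().
    assert "score" not in raw.columns


test_transform_data_filters_and_scores()
```

The check uses plain `assert`, so it runs as a script; a test runner such as pytest would also collect it by name.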

My rule now:

“If it’s going to production, it deserves structure.”

2. Writing SQL Like It’s Excel

Huge, unreadable SQL queries with nested subqueries everywhere.