
# Introduction
AI coding tools are getting remarkably good at writing Python code that works. They can build entire applications and implement complex algorithms in minutes. However, code generated by AI is often difficult to maintain.
If you are using a tool like Claude Code, GitHub Copilot, or Cursor's agent mode, you may have experienced this. AI helps you ship working code faster, but the costs appear later. You've probably had to pick apart a bloated function just to understand how it works, a few weeks after it was generated.
The problem is not that AI writes bad code (although that sometimes happens). The problem is that AI optimizes for "works now" and meeting the requirements in your prompt, whereas you need code that is readable and maintainable over the long term. This article shows you how to bridge that gap with Python-specific strategies.
# Avoiding the Blank Canvas Trap
The biggest mistake developers make is asking AI to start from scratch. AI agents work best with constraints and guidelines.
Before you write your first prompt, set up the project basics yourself. This means choosing your project structure, setting up your core libraries, and implementing some working examples to set the tone. This may seem counterintuitive, but it helps the AI write code that better aligns with what you need in your application.
Start by creating some features manually. If you’re building an API, implement a full endpoint with all the patterns you want: dependency injection, proper error handling, database access, and validation. This becomes the reference implementation.
Let’s say you write this first endpoint manually:
```python
from fastapi import APIRouter, Depends, HTTPException
from sqlalchemy.orm import Session

router = APIRouter()

# Assume get_db and the User model are defined elsewhere in the project

@router.get("/users/{user_id}")
async def get_user(user_id: int, db: Session = Depends(get_db)):
    user = db.query(User).filter(User.id == user_id).first()
    if not user:
        raise HTTPException(status_code=404, detail="User not found")
    return user
```
When AI sees this pattern, it understands how we handle dependencies, how we query the database, and how we handle missing records.
The same applies to your project structure. Create your directories, set up your imports, and configure your test framework. AI should not make these architectural decisions.
# Make Python's Type System Do the Heavy Lifting
Python's dynamic typing is flexible, but that flexibility becomes a liability when an AI is writing your code. Make type hints a necessary guardrail rather than a nice-to-have in your application code.
Strict typing catches AI mistakes before they reach production. When you require type hints on every function signature and run mypy in strict mode, the AI cannot take shortcuts. It cannot return ambiguous types or accept parameters that could be either strings or lists.
More importantly, strict types force better designs. An AI agent writing a function that accepts `data: dict` can make many assumptions about what is in that dictionary. But a function that accepts `data: UserCreateRequest`, where `UserCreateRequest` is a Pydantic model, has exactly one interpretation.
```python
# This constrains AI to write correct code
from pydantic import BaseModel, EmailStr

class UserCreateRequest(BaseModel):
    name: str
    email: EmailStr
    age: int

class UserResponse(BaseModel):
    id: int
    name: str
    email: EmailStr

def process_user(data: UserCreateRequest) -> UserResponse:
    ...

# Rather than this
def process_user(data: dict) -> dict:
    ...
```
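To see the difference in practice, here is a minimal sketch (assuming Pydantic is installed; the model and field names are illustrative) showing how a typed model rejects malformed input that a plain dict would silently pass along:

```python
from pydantic import BaseModel, ValidationError

class UserCreateRequest(BaseModel):
    name: str
    age: int

# Well-formed input parses into a typed object
user = UserCreateRequest(name="Ada", age=36)
print(user.age)  # 36

# Malformed input fails loudly at the boundary instead of
# propagating a bad dict deeper into the application
try:
    UserCreateRequest(name="Ada", age="not a number")
except ValidationError as exc:
    print(f"rejected with {len(exc.errors())} validation error(s)")
```

With the `dict` version, the bad `age` value would only surface later, wherever some downstream function finally tries to do arithmetic on it.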
Use libraries that enforce contracts: SQLAlchemy 2.0+ with typed models and FastAPI with response models are excellent options. These are not just good practices; those constraints are what keep AI on track.
Set mypy to strict mode and make passing type checks non-negotiable. When AI generates code that fails type checking, it will iterate until it passes. This automatic feedback loop produces better code than any amount of prompt engineering.
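One way to make that feedback loop automatic is to pin the settings in `pyproject.toml`, so every run of the tools uses the same rules. The exact option set below is a sketch, not a prescription:

```toml
[tool.mypy]
strict = true            # bundles disallow_untyped_defs, no_implicit_optional, etc.
warn_return_any = true   # flag functions that leak Any back to callers
warn_unused_ignores = true

[tool.ruff]
line-length = 100
```

Committing this file means the AI agent and every human contributor are checked against the same bar, rather than whatever flags happen to be passed on the command line.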
# Creating documentation to guide AI
Most projects have documentation that developers ignore. For AI agents, you need documentation they will actually use: a single file of clear, specific guidelines at the project root.
Create a CLAUDE.md or agents.md file at your project root. Don't make it too long. Focus on what's unique about your project rather than general Python best practices.
Your AI guidelines should specify:
- Project structure and where different types of code live
- Which libraries to use for common tasks
- Typical patterns to follow (point to example files)
- Patterns that are explicitly forbidden
- Test requirements
Here is an example agents.md file:
```markdown
# Project Guidelines

## Structure
/src/api - FastAPI routers
/src/services - business logic
/src/models - SQLAlchemy models
/src/schemas - Pydantic models

## Patterns
- All services inherit from BaseService (see src/services/base.py)
- All database access goes through repository pattern (see src/repositories/)
- Use dependency injection for all external dependencies

## Standards
- Type hints on all functions
- Docstrings using Google style
- Functions under 50 lines
- Run `mypy --strict` and `ruff check` before committing

## Never
- No bare except clauses
- No type: ignore comments
- No mutable default arguments
- No global state
```
The key is to be specific. Don't just say "follow best practices"; point to the exact file that demonstrates the pattern. Don't just say "handle errors properly"; show the error handling pattern you want.
# Writing Prompts That Point to Examples
Generic prompts produce generic code. Specific prompts that reference your existing codebase produce more maintainable code.
Instead of telling the AI to "add authentication", walk it through the implementation in the context of your patterns. Here's an example of a prompt that points to examples:
```text
Implement JWT authentication in src/services/auth_service.py. Follow the same
structure as UserService in src/services/user_service.py. Use bcrypt for
password hashing (already in requirements.txt).

Add authentication dependencies to src/api/dependencies.py following the
pattern of get_db.

Create a Pydantic schema in src/schemas/auth.py similar to user.py.

Add pytest tests in tests/test_auth_service.py using fixtures from conftest.py.
```
Notice how each instruction points to an existing file or pattern. You're not asking the AI to design an architecture; you are asking it to implement a new feature within the one you have established.
When AI generates code, review it according to your patterns. Does it use the same dependency injection approach? Does it follow the same error handling? Does it handle imports the same way? If not, point out the discrepancy and ask to align it with the existing pattern.
# Planning before implementation
AI agents move fast, and that speed becomes a liability when it comes at the expense of structure. Use planning mode or ask for an implementation plan before any code is written.
A planning step forces the AI to think about dependencies and structure. It also gives you a chance to catch architectural problems – such as circular dependencies or redundant services – before they are implemented.
Ask for a plan that specifies:
- Which files will be created or modified
- What dependencies exist between components
- Which existing patterns will be followed
- What tests are needed
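As an illustration, a plan covering those four points for an authentication feature might look like the sketch below. The file names are hypothetical and mirror the prompt example earlier in the article:

```text
Plan: JWT authentication

Files:
  - src/services/auth_service.py   (new; follows UserService structure)
  - src/api/dependencies.py        (modified; adds get_current_user)
  - src/schemas/auth.py            (new Pydantic schemas)

Dependencies:
  - auth_service uses user_service for lookups; no new circular imports

Patterns:
  - Repository pattern for database access; dependency injection for
    the password hasher and token settings

Tests:
  - tests/test_auth_service.py: token issue, token verify, expired
    token, tampered signature
```

A plan at this level of detail takes the AI a few seconds to produce and takes you under a minute to review, which is a cheap price for catching a structural mistake early.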
Review this plan as you would a design document. Check that the AI understands your project structure. Verify that it is using the correct libraries, and confirm that it is not re-inventing something that already exists.
If the plan looks good, let the AI execute it. If not, fix the plan before any code is written. It's easier to fix a bad plan than it is to fix bad code.
# Asking AI to write tests that actually test
AI is great, and very fast, at writing tests. However, it does not write useful tests unless you are specific about what "useful" means.
The default AI testing behavior is to test the happy path and nothing else. You get tests that verify the code works when everything goes right, which is exactly when you don't need tests.
Specify your testing requirements clearly. For each feature, require:
- A happy path test
- Validation error tests that check what happens with invalid input
- Edge case tests for empty values, None, boundary conditions, and more
- Error handling tests for database failures, external service failures, and similar conditions
Point the AI to your existing test files as an example. If you already have good test patterns, the AI will write useful tests as well. If you don’t have good tests yet, write some yourself first.
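A sketch of what the first three categories can look like, using a hypothetical `create_user` service function (invented here so the tests are self-contained and runnable):

```python
import pytest

def create_user(name: str, age: int) -> dict:
    """Hypothetical service function, defined inline for illustration."""
    if not name:
        raise ValueError("name must not be empty")
    if not 0 <= age <= 150:
        raise ValueError("age out of range")
    return {"name": name, "age": age}

def test_happy_path():
    assert create_user("Ada", 36) == {"name": "Ada", "age": 36}

def test_validation_error():
    # Invalid input should raise, not return a half-built user
    with pytest.raises(ValueError):
        create_user("Ada", -1)

def test_edge_cases():
    with pytest.raises(ValueError):
        create_user("", 36)                       # empty value
    assert create_user("Ada", 150)["age"] == 150  # boundary condition
```

For the fourth category, error handling, you would typically mock the failing dependency (for example with `unittest.mock`) and assert that the failure surfaces as the exception your patterns prescribe.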
# Systematically validating the output
After AI generates code, don't just check whether it runs. Run it through a checklist.
Your verification checklist should include questions such as:
- Does it pass mypy in strict mode?
- Does it follow the patterns of existing code?
- Are all functions under 50 lines?
- Do tests cover edge cases and errors?
- Are there type hints on all functions?
- Does it use the specified libraries correctly?
Automate what you can. Install pre-commit hooks that run mypy, ruff, and pytest. If AI-generated code fails these checks, it does not get committed.
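A pre-commit configuration along these lines wires the three tools into every commit. The `rev` values are placeholders; pin them to whatever releases are current when you set this up:

```yaml
# .pre-commit-config.yaml
repos:
  - repo: https://github.com/pre-commit/mirrors-mypy
    rev: v1.11.0          # placeholder; pin to a current release
    hooks:
      - id: mypy
        args: [--strict]
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.6.0           # placeholder
    hooks:
      - id: ruff
  - repo: local
    hooks:
      - id: pytest
        name: pytest
        entry: pytest
        language: system
        pass_filenames: false
```

Running the full test suite on every commit can be slow on large projects; a common compromise is to keep mypy and ruff in the hook and run pytest in CI instead.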
For what you can't automate, you'll start noticing common anti-patterns after reviewing enough AI code: functions that do too much, error handling that swallows exceptions, or validation logic mixed with business logic.
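For example, here is the "swallows exceptions" anti-pattern next to a narrower alternative; `load_config` is an invented helper used only for illustration:

```python
import json

def load_config_bad(path: str) -> dict:
    # Anti-pattern AI often produces: every failure, including a syntax
    # error in the JSON file, is silently hidden behind an empty dict
    try:
        with open(path) as f:
            return json.load(f)
    except Exception:
        return {}

def load_config(path: str) -> dict:
    # Narrower version: only the expected "file missing" case is handled;
    # malformed JSON still raises, so the bug stays visible
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return {}

print(load_config("/nonexistent/app.json"))  # {}
```

Both functions behave identically on the happy path and on a missing file, which is exactly why this anti-pattern slips through happy-path-only tests.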
# Implementing a Practical Workflow
Let us now put together everything we have discussed so far.
You start a new project. You spend time setting up the structure, choosing and installing libraries, and writing some example features. You create a CLAUDE.md with your guidelines and define strict Pydantic models.
Now you ask the AI to implement a new feature. You write a detailed prompt pointing to your examples. The AI formulates a plan. You review and approve it. The AI writes the code. You run type checking and tests. Everything passes. You review the code against your patterns. It matches. You commit.
The total time from prompt to commit may only be about 15 minutes for a feature that would take you an hour to write manually. But more importantly, the code you get is easier to maintain – it follows the patterns you’ve established.
The next feature gets faster because the AI has more examples to learn from. Over time the code becomes more consistent as each new feature reinforces existing patterns.
# Wrapping up
With AI coding tools proving to be extremely useful, your job as a developer or data professional is changing. You’re now spending less time writing code and more time on:
- Designing a system and choosing an architecture
- Creating a reference implementation of the pattern
- Writing constraints and guidelines
- Reviewing AI outputs and maintaining quality bars
The skill that matters most is not writing code fast. Rather, it is designing systems that force AI to write maintainable code, and knowing which practices scale and which create technical debt. I hope you find this article useful even if Python is not your programming language of choice. Tell us what else you think can be done to keep AI-generated Python code maintainable. Keep exploring!
Bala Priya C is a developer and technical writer from India. She likes to work at the intersection of mathematics, programming, data science, and content creation. Her areas of interest and expertise include DevOps, data science, and natural language processing. She loves reading, writing, coding, and coffee! Currently, she is learning and sharing her knowledge with the developer community by writing tutorials, how-to guides, opinion pieces, and more. Bala also creates engaging resource overviews and coding tutorials.