Architecture reviews can easily become 50-page documents that sit unread in a shared drive. I've developed a process that produces actionable, prioritized findings that teams can actually use.

The Goal

An architecture review should answer one question: what should we fix first?

Not "what could theoretically go wrong" or "what would be ideal in a perfect world." Teams have limited time and budget. The review should make the path forward obvious.

Phase 1: Discovery (2-4 hours)

Before I look at any diagrams or code, I ask questions:

Business context:

  • What's the growth trajectory? 2x this year? 10x?
  • What would an outage cost? Per minute? Per hour?
  • Are there compliance requirements (SOC 2, HIPAA, PCI)?
  • What's the budget for infrastructure changes?

Operational reality:

  • What keeps you up at night?
  • What has broken recently?
  • Where do you feel uncertain?

These answers shape everything. A startup growing 10x needs different advice than a stable business optimizing costs.

Phase 2: Investigation (4-8 hours)

Now I dig into the actual system. I'm looking for four categories of issues:

Single Points of Failure (SPOF)

Anything where one failure takes down the whole system:

  • Single database instance
  • One availability zone
  • Critical services with no redundancy
  • Single NAT gateway handling all egress
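To see why these items matter, here's a back-of-the-envelope availability calculation (a simplified sketch assuming independent failures, which is optimistic in practice, and a hypothetical `availability` helper):

```python
def availability(failure_prob: float, replicas: int = 1) -> float:
    """Probability the component is up, assuming each replica fails
    independently and any one replica can serve traffic."""
    return 1 - failure_prob ** replicas

# A single database instance that is down 0.1% of the time:
single = availability(0.001)                  # 99.9% up: ~8.8 hours of downtime/year
# Two independent replicas, either of which can serve traffic:
redundant = availability(0.001, replicas=2)   # 99.9999% up: ~32 seconds/year
print(single, redundant)
```

Real replicas share failure modes (same region, same bad deploy), so the improvement is smaller than the math suggests, but the direction holds: redundancy multiplies small failure probabilities together.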

Scalability Blockers

Patterns that prevent horizontal scaling:

  • In-memory session storage
  • Synchronous processing that should be async
  • Database connections per request (connection pool exhaustion)
  • Monolithic deployments that can't scale components independently
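The connection-per-request item is worth a concrete illustration. The sketch below is a minimal, hypothetical pool (SQLite in-memory connections stand in for a real database client; production code would use the driver's built-in pooling). The point is the mechanics: a bounded pool reuses connections, so exhaustion surfaces as a fast, visible timeout instead of an overloaded database.

```python
import queue
import sqlite3

class ConnectionPool:
    """Minimal pool: at most `size` connections, reused across requests."""
    def __init__(self, size: int = 5):
        self._pool = queue.Queue(maxsize=size)
        for _ in range(size):
            self._pool.put(sqlite3.connect(":memory:", check_same_thread=False))

    def acquire(self, timeout: float = 1.0):
        # Blocks instead of opening a new connection; if the pool is
        # exhausted this raises queue.Empty after `timeout` seconds.
        return self._pool.get(timeout=timeout)

    def release(self, conn) -> None:
        self._pool.put(conn)

pool = ConnectionPool(size=2)
c1 = pool.acquire()
c2 = pool.acquire()
pool.release(c1)
c3 = pool.acquire()   # reuses c1's connection instead of opening a third
```

Contrast with the anti-pattern: `sqlite3.connect(...)` (or its Postgres equivalent) inside every request handler, where each request adds a connection until the database hits its limit.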

Security Gaps

Not a full pentest, but the obvious stuff:

  • Secrets in environment variables vs. secrets manager
  • Overly permissive IAM policies
  • Tenant isolation that relies only on application code
  • Missing encryption at rest or in transit
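"Overly permissive IAM policies" usually means wildcards. A quick mechanical check catches the worst cases; this is an illustrative sketch (the `wildcard_findings` helper is hypothetical, and real policy analysis also needs to consider conditions, NotAction, and prefix wildcards like `s3:*`):

```python
import json

def wildcard_findings(policy_json: str) -> list[str]:
    """Flag Allow statements that grant all actions or all resources."""
    findings = []
    statements = json.loads(policy_json).get("Statement", [])
    if isinstance(statements, dict):   # a single statement may be unwrapped
        statements = [statements]
    for i, stmt in enumerate(statements):
        if stmt.get("Effect") != "Allow":
            continue
        actions = stmt.get("Action", [])
        resources = stmt.get("Resource", [])
        actions = [actions] if isinstance(actions, str) else actions
        resources = [resources] if isinstance(resources, str) else resources
        if "*" in actions:
            findings.append(f"Statement {i}: Action '*' (every API call allowed)")
        if "*" in resources:
            findings.append(f"Statement {i}: Resource '*' (every resource in scope)")
    return findings

policy = '{"Statement": [{"Effect": "Allow", "Action": "*", "Resource": "*"}]}'
print(wildcard_findings(policy))   # flags both the Action and the Resource
```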

Cost Inefficiency

Money being left on the table:

  • Oversized instances (high CPU allocation, low actual usage)
  • Unused resources (orphaned snapshots, idle load balancers)
  • Reserved instance opportunities
  • Data transfer patterns that could be optimized
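For oversized instances, a rough savings estimate makes the finding concrete. This sketch assumes cost scales linearly with capacity (approximately true when halving an instance size halves its price) and targets a conservative 40% average CPU; the numbers and the `rightsizing_estimate` helper are illustrative:

```python
def rightsizing_estimate(hourly_cost: float, avg_cpu_pct: float,
                         target_cpu_pct: float = 40.0) -> float:
    """Rough monthly savings from downsizing an instance so that its
    average CPU lands at target_cpu_pct. Assumes linear cost scaling."""
    if avg_cpu_pct >= target_cpu_pct:
        return 0.0  # already near target; no downsizing win
    needed_fraction = avg_cpu_pct / target_cpu_pct
    monthly_cost = hourly_cost * 730  # ~hours per month
    return monthly_cost * (1 - needed_fraction)

# A $0.384/hour instance averaging 8% CPU:
print(round(rightsizing_estimate(0.384, 8.0)))   # ~$224/month of the ~$280 is waste
```

Even a crude estimate like this turns "instance looks oversized" into "this finding is worth about $2,700/year," which is what gets it prioritized.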

Phase 3: Prioritization

This is where most reviews fail. They list 30 findings without helping the team decide what to do.

I prioritize using two dimensions:

Severity: How bad is it if this thing breaks?

  • Critical: Total outage, data breach, compliance violation
  • High: Major degradation, significant risk
  • Medium: Partial impact, moderate risk
  • Low: Minor inconvenience, theoretical risk

Effort: How hard is it to fix?

  • S (Small): A day or less, no coordination needed
  • M (Medium): A week, maybe some coordination
  • L (Large): A sprint or more, cross-team coordination
  • XL (Extra Large): Multi-sprint project, significant architecture change

The magic quadrant is Critical/High severity + Small/Medium effort. These are the wins. Fix them first.
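The two-dimensional ranking is simple enough to mechanize. A minimal sketch (the scale values and the `prioritize` helper are my own illustration, not a formal scoring model):

```python
SEVERITY_RANK = {"critical": 0, "high": 1, "medium": 2, "low": 3}
EFFORT_RANK = {"S": 0, "M": 1, "L": 2, "XL": 3}

def prioritize(findings: list[dict]) -> list[dict]:
    """Order findings so Critical/High severity + S/M effort come first."""
    def key(f):
        sev = SEVERITY_RANK[f["severity"]]
        eff = EFFORT_RANK[f["effort"]]
        quick_win = 0 if sev <= 1 and eff <= 1 else 1  # the magic quadrant
        return (quick_win, sev, eff)
    return sorted(findings, key=key)

findings = [
    {"id": "F1", "severity": "low", "effort": "S"},
    {"id": "F2", "severity": "critical", "effort": "XL"},
    {"id": "F3", "severity": "critical", "effort": "S"},
]
print([f["id"] for f in prioritize(findings)])   # ['F3', 'F2', 'F1']
```

F3 jumps the queue: it is severe *and* cheap. F2 is just as severe but lands later because a multi-sprint fix can't be the first thing a team does.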

Phase 4: Roadmap

Instead of a flat list, I group findings into phases:

Phase 1: Quick wins (Week 1)

Small-effort fixes that reduce immediate risk. Often achievable without changing core architecture.

Phase 2: Foundation (Weeks 2-4)

Medium-effort work that enables future scaling. Usually involves adding redundancy and decoupling.

Phase 3: Strategic (Month 2+)

Larger architectural changes. These only make sense once the foundation is solid.

This gives the team a clear path. They can see progress in week one, which builds momentum for the harder work.

The Deliverable

My architecture review includes:

  1. Executive summary (1 paragraph) — the single most important thing
  2. System overview — how I understand the architecture
  3. Findings table — severity, category, effort for each issue
  4. Detailed findings — problem, impact, recommendation for each
  5. Prioritized roadmap — phased plan with effort estimates

The whole thing fits in 5-10 pages. Readable in 15 minutes.

Example

Here's what this looks like in practice: Sample Architecture Review

It's a fictional B2B SaaS company, but the findings are based on real patterns I've seen repeatedly.

What I've Learned

Start with business context. Technical findings mean nothing without understanding what the business needs.

Prioritize ruthlessly. Five high-impact recommendations beat 30 theoretical concerns.

Make it actionable. Every finding should have a clear next step, not just a vague "consider improving."

Right-size the effort. A $500 review for a startup looks different from a $5,000 review for an enterprise. Scope appropriately.


Need an architecture review? See my packages or get in touch.
