Good logs save you at 3 AM. Bad logs make 3 AM worse. Here's how to log effectively.

Use Log Levels Correctly

DEBUG: Verbose details for development. Never in production.

logger.debug(f"Cache lookup for key={key}, hit={hit}")

INFO: Normal operations worth recording.

logger.info(f"User {user_id} logged in")

WARNING: Something unexpected but recoverable.

logger.warning(f"Retry attempt {attempt}/3 for API call")

ERROR: Something failed and needs attention.

logger.error(f"Payment failed for order {order_id}", exc_info=True)

CRITICAL: System is broken, wake someone up.

logger.critical("Database connection pool exhausted")

Structure Your Logs

Plain text is hard to search. Use structured logging:

# Bad
logger.info(f"User {user_id} purchased {product} for ${amount}")
 
# Good
logger.info(
    "Purchase completed",
    extra={
        "user_id": user_id,
        "product_id": product.id,
        "amount": amount,
        "currency": "USD",
    }
)

Structured logs become queryable data:

SELECT * FROM logs WHERE user_id = '123' AND amount > 100
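Note that `extra` fields become attributes on the `LogRecord`, so they only show up if your formatter surfaces them. A minimal sketch (field names are illustrative):

```python
import logging

# Any record through this handler must supply user_id, so real setups
# usually pair this with a filter that sets a default, or a JSON formatter.
logging.basicConfig(format="%(levelname)s %(message)s user_id=%(user_id)s")
logger = logging.getLogger("shop")

# extra={...} attaches fields to the LogRecord for the format string to pick up
logger.info("Purchase completed", extra={"user_id": "123"})
```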

What to Log

Log at boundaries:

  • Incoming requests
  • Outgoing API calls
  • Database queries (at DEBUG level)
  • Background job starts/completions
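One way to log every outgoing call consistently is a decorator at the boundary. A sketch, with a stand-in function for the real API call:

```python
import functools
import logging
import time

logger = logging.getLogger(__name__)

def log_call(fn):
    """Hypothetical decorator: log start, duration, and failures of a boundary call."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.monotonic()
        logger.info("Call started", extra={"call": fn.__name__})
        try:
            result = fn(*args, **kwargs)
            elapsed_ms = round((time.monotonic() - start) * 1000)
            logger.info("Call finished",
                        extra={"call": fn.__name__, "duration_ms": elapsed_ms})
            return result
        except Exception:
            logger.error("Call failed", extra={"call": fn.__name__}, exc_info=True)
            raise
    return wrapper

@log_call
def fetch_exchange_rate(currency):  # stands in for a real outgoing API call
    return 1.08
```

Decorate once and every boundary call gets the same start/finish/failure lines.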

Log decisions:

if user.is_premium:
    logger.info("Applying premium discount", extra={"user_id": user.id})
    apply_discount()

Log failures with context:

try:
    process_payment(order)  # whatever might raise PaymentError
except PaymentError as e:
    logger.error(
        "Payment processing failed",
        extra={
            "order_id": order.id,
            "amount": order.total,
            "error_code": e.code,
        },
        exc_info=True,
    )

What Not to Log

Secrets:

# NEVER
logger.info(f"Authenticating with password={password}")
logger.info(f"API key: {api_key}")

High-volume noise:

# Don't log inside tight loops
for item in million_items:
    logger.debug(f"Processing {item}")  # Million log lines
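Instead of a line per item, log a periodic summary. A sketch, with the per-item work elided:

```python
import logging

logger = logging.getLogger(__name__)

def process_batch(items, report_every=10_000):
    """Sketch: one summary line every report_every items, not one per item."""
    processed = 0
    for item in items:
        # ... real per-item work would go here ...
        processed += 1
        if processed % report_every == 0:
            logger.info("Batch progress", extra={"processed": processed})
    logger.info("Batch complete", extra={"processed": processed})
    return processed
```

A million items at `report_every=10_000` produces about a hundred lines instead of a million.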

Personally identifiable information (PII):

# Bad
logger.info(f"User email: {user.email}, SSN: {user.ssn}")
 
# Good
logger.info("User created", extra={"user_id": user.id})

Add Request Context

Trace requests across your system:

import logging
import uuid
from contextvars import ContextVar
 
request_id: ContextVar[str] = ContextVar("request_id")
 
# Middleware
def add_request_id(request):
    request_id.set(str(uuid.uuid4()))
 
# In your logger
class RequestIdFilter(logging.Filter):
    def filter(self, record):
        record.request_id = request_id.get("")
        return True

Now every log line includes request_id. Follow one request through the entire system.
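For the id to actually appear, the filter has to be attached to a handler and the format string has to reference it. A self-contained wiring sketch (handler setup and format are illustrative):

```python
import logging
import uuid
from contextvars import ContextVar

request_id: ContextVar[str] = ContextVar("request_id", default="-")

class RequestIdFilter(logging.Filter):
    def filter(self, record):
        record.request_id = request_id.get()
        return True

handler = logging.StreamHandler()
handler.setFormatter(logging.Formatter(
    "%(asctime)s %(levelname)s [%(request_id)s] %(message)s"))
handler.addFilter(RequestIdFilter())

logger = logging.getLogger("app")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

request_id.set(str(uuid.uuid4()))  # done once per request in middleware
logger.info("Order created")       # this line now carries the request id
```

The `default="-"` keeps log lines valid even outside a request context.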

Log Configuration

import logging
 
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(name)s %(message)s",
    datefmt="%Y-%m-%d %H:%M:%S",
)
 
# Quiet noisy libraries
logging.getLogger("urllib3").setLevel(logging.WARNING)
logging.getLogger("sqlalchemy").setLevel(logging.WARNING)

Production vs Development

Development: DEBUG level, console output, readable format.

Production: INFO level, JSON format, shipped to log aggregator.

import os

if os.getenv("ENV") == "production":
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter())  # e.g. python-json-logger's formatter
    logger.addHandler(handler)
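If you don't want the dependency, a JSON formatter is small enough to sketch with the stdlib. A minimal version (field names are my choice, not a standard):

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Minimal sketch; production code often uses python-json-logger instead."""

    # attributes every LogRecord has, so we can spot extra={...} fields
    _STANDARD = set(vars(logging.LogRecord("", 0, "", 0, "", (), None))) | {"message"}

    def format(self, record):
        payload = {
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # carry over any extra={...} fields attached to the record
        payload.update({k: v for k, v in vars(record).items()
                        if k not in self._STANDARD})
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)
```

Anything passed via `extra` lands in the JSON output, which is what makes the logs queryable downstream.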

Log Aggregation

Logs on disk don't help when you have 50 servers. Use a log aggregator:

  • ELK Stack (Elasticsearch, Logstash, Kibana)
  • Datadog
  • Papertrail
  • CloudWatch Logs

Search across all logs. Set up alerts for error patterns.

My Rules

  1. Log levels matter. Use them consistently.
  2. Structure everything. JSON beats plain text.
  3. Add context. Request ID, user ID, relevant data.
  4. No secrets. Ever.
  5. Test your logs. If something fails, do your logs tell you why?
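Rule 5 can be checked mechanically. A sketch using pytest's `caplog` fixture; `charge()` and its behavior are hypothetical:

```python
import logging

logger = logging.getLogger("payments")

def charge(order_id, amount):
    if amount <= 0:
        logger.error("Invalid charge amount",
                     extra={"order_id": order_id, "amount": amount})
        return False
    return True

def test_failed_charge_is_logged(caplog):
    # assert not just that charge() fails, but that the failure is logged
    with caplog.at_level(logging.ERROR, logger="payments"):
        assert charge("ord_1", -5) is False
    assert any("Invalid charge amount" in r.getMessage() for r in caplog.records)
```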

When production breaks, logs are your debugger. Invest in them.
