Logging Best Practices
You ship a feature. It works in development. It works in staging. Then it breaks in production, and you have no idea why because your logs are a mess of console.log statements that tell you nothing useful.
Good logging is one of those boring infrastructure things that feels unnecessary until you really need it. Then it's the difference between debugging in minutes versus hours.
What Makes a Good Log
Every log entry should answer: what happened, when, where, and why it matters. A log that just says "Error" tells you nothing. A log that says "Payment failed for user 12345: card declined by Stripe, error code card_insufficient_funds at 2025-10-07T14:32:00Z" tells you everything.
Structure your logs with consistent fields:
- Timestamp with timezone
- Log level
- Service or component name
- Message describing what happened
- Relevant context: IDs, values, error details
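The fields above can be captured in a small helper. This is a minimal sketch; the type and function names are illustrative, not from any particular library.

```typescript
// Sketch of a log entry carrying the fields listed above.
interface LogEntry {
  timestamp: string;                // ISO 8601, UTC
  level: "debug" | "info" | "warn" | "error" | "fatal";
  service: string;                  // component emitting the log
  message: string;                  // what happened
  context: Record<string, unknown>; // IDs, values, error details
}

function makeEntry(
  level: LogEntry["level"],
  service: string,
  message: string,
  context: Record<string, unknown> = {}
): LogEntry {
  return { timestamp: new Date().toISOString(), level, service, message, context };
}

// The payment-failure example from above as structured data:
const entry = makeEntry("error", "payments", "Payment failed: card declined", {
  user_id: 12345,
  provider: "Stripe",
  error_code: "card_insufficient_funds",
});
console.log(JSON.stringify(entry));
```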
Log Levels and When to Use Them
Most logging frameworks support these levels:
DEBUG: Detailed information for debugging, usually disabled in production. "User 123 preferences loaded: {theme: dark, language: en}"
INFO: Normal operations worth recording. "User 123 logged in" or "Payment of $99 processed successfully"
WARN: Something unexpected that didn't break anything. "API rate limit at 80%, consider increasing quota" or "Retry attempt 2 of 3 for external service"
ERROR: Something broke and needs attention. "Failed to save order: database connection timeout"
FATAL: The system can't continue. "Unable to connect to database on startup, shutting down"
The most common mistake is logging everything at INFO or ERROR. Use levels intentionally so you can filter effectively.
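Filtering by level comes down to a simple ordering check. A minimal sketch, with an assumed five-level ordering matching the list above:

```typescript
// Only emit entries at or above the configured threshold.
const LEVELS = ["debug", "info", "warn", "error", "fatal"] as const;
type Level = (typeof LEVELS)[number];

function shouldLog(entryLevel: Level, threshold: Level): boolean {
  return LEVELS.indexOf(entryLevel) >= LEVELS.indexOf(threshold);
}

// With a production threshold of "info", DEBUG noise is filtered out
// while warnings and errors still get through.
console.log(shouldLog("debug", "info")); // false
console.log(shouldLog("error", "info")); // true
```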
Structured Logging
Stop writing logs as human-readable sentences. Write them as structured data, usually JSON.
Instead of: "User 123 purchased item SKU-456 for $29.99"
Write: {"event": "purchase", "user_id": 123, "sku": "SKU-456", "amount": 29.99, "currency": "USD"}
Structured logs can be parsed, filtered, and analyzed by log management tools. You can query for all purchases over $100 or all errors from a specific user. You can't do that with free-form text.
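One way to emit the purchase event above as a JSON line (one object per line is the format most log tools expect). The field names mirror the example and are not a standard schema:

```typescript
// Emit a structured event as a single JSON line.
function logEvent(event: string, fields: Record<string, unknown>): string {
  const line = JSON.stringify({ event, ...fields });
  console.log(line); // one JSON object per line, easy for tools to parse
  return line;
}

logEvent("purchase", { user_id: 123, sku: "SKU-456", amount: 29.99, currency: "USD" });
```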
Context Is Everything
A log without context is often useless. Include identifiers that let you trace the full picture:
- Request ID: trace a request across services
- User ID: see everything one user did
- Session ID: track a user's session
- Correlation ID: link related operations
When debugging, you should be able to filter by a request ID and see every log entry related to that specific request. If you can't, your logging needs more context.
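One way to make this automatic is a child logger that binds the request ID once, so every entry it emits carries it. This is a hypothetical API, loosely modeled on the `child()` pattern found in libraries like pino:

```typescript
// A context-carrying logger: child() binds fields that every later
// entry includes automatically.
type Fields = Record<string, unknown>;

class ContextLogger {
  constructor(private bound: Fields = {}) {}

  child(extra: Fields): ContextLogger {
    return new ContextLogger({ ...this.bound, ...extra });
  }

  info(message: string, fields: Fields = {}): string {
    const line = JSON.stringify({ level: "info", message, ...this.bound, ...fields });
    console.log(line);
    return line;
  }
}

// Bind the request ID once at the edge; everything downstream carries it.
const root = new ContextLogger({ service: "checkout" });
const reqLog = root.child({ request_id: "req-abc123", user_id: 123 });
reqLog.info("cart loaded");
reqLog.info("payment started", { amount: 29.99 });
```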
What to Log
Log at boundaries: when requests come in, when responses go out, when you call external services, when important business events happen.
Good things to log:
- Incoming requests (method, path, relevant params)
- Outgoing responses (status code, latency)
- External service calls (what service, latency, success/failure)
- Business events (user signed up, order placed, payment processed)
- Errors with stack traces and context
- Performance metrics (slow queries, timeout warnings)
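Boundary logging for an external service call can be wrapped in one helper that records latency and outcome. A sketch, where `call` stands in for any outbound request:

```typescript
// Wrap an external call: log latency and success/failure at the boundary.
async function withCallLogging<T>(
  service: string,
  call: () => Promise<T>
): Promise<T> {
  const start = Date.now();
  try {
    const result = await call();
    console.log(JSON.stringify({
      level: "info", event: "external_call", service,
      latency_ms: Date.now() - start, ok: true,
    }));
    return result;
  } catch (err) {
    console.log(JSON.stringify({
      level: "error", event: "external_call", service,
      latency_ms: Date.now() - start, ok: false, error: String(err),
    }));
    throw err; // log, then let the caller handle it
  }
}
```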
What Not to Log
Never log sensitive data:
- Passwords or authentication tokens
- Credit card numbers
- Personal data you don't need (full addresses, social security numbers)
- API keys or secrets
If you accidentally log sensitive data, it lives forever in your log storage. Scrub sensitive fields before logging, or use a logging library that does it automatically.
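A minimal scrubber replaces known sensitive fields before anything is written. The field list here is illustrative; tune it to your own data:

```typescript
// Redact known sensitive field names before logging.
const SENSITIVE = new Set(["password", "card_number", "api_key", "token", "ssn"]);

function scrub(fields: Record<string, unknown>): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(fields)) {
    out[key] = SENSITIVE.has(key) ? "[REDACTED]" : value;
  }
  return out;
}

// The password value never reaches log storage.
console.log(JSON.stringify(scrub({ user_id: 123, password: "hunter2" })));
```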
Centralized Log Management
Logs spread across multiple servers are useless when you're debugging. You need centralized log management.
Options range from free to expensive:
- ELK Stack (Elasticsearch, Logstash, Kibana) if you self-host
- Cloud solutions: Datadog, Loggly, Papertrail, CloudWatch
- Open source: Loki with Grafana
Pick something that lets you search, filter, and alert on your logs. The specific tool matters less than having one.
Log Rotation and Retention
Logs add up fast. A busy service can generate gigabytes daily. You need a retention policy.
Keep detailed logs for a short period, like 30 days, then archive or delete. Keep aggregated metrics longer. Make sure your log storage has automatic rotation so you don't fill up disks.
Alerting on Logs
Set up alerts for patterns that indicate problems:
- Spike in error rate
- Specific error types that should never happen
- Response time degradation
- Rate limit usage approaching its threshold
Don't alert on every error. You'll get alert fatigue and start ignoring them. Alert on patterns that actually require attention.
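The "alert on patterns, not individual errors" idea reduces to checking the error rate over a recent window. A simplified sketch (real systems would use a sliding time window):

```typescript
// Alert when the fraction of errors in the recent window crosses a
// threshold, rather than on every single error.
function errorRateExceeds(outcomes: boolean[], threshold: number): boolean {
  if (outcomes.length === 0) return false;
  const errors = outcomes.filter((ok) => !ok).length;
  return errors / outcomes.length > threshold;
}

// 2 errors out of 10 requests = 20% error rate, above a 10% threshold.
console.log(errorRateExceeds(
  [true, true, false, true, true, true, false, true, true, true], 0.1
)); // true
```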
Logging in Development vs Production
Development logs can be verbose. You want all the detail to debug locally.
Production logs should be more selective. DEBUG level is usually off. You're optimizing for signal-to-noise ratio so important events don't get buried.
Make log levels configurable so you can temporarily increase verbosity in production when debugging a specific issue.
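A common way to make the level configurable is an environment variable. `LOG_LEVEL` is a widespread convention, not a standard; a sketch with a safe fallback:

```typescript
// Read the log level from an env-style object, falling back to "info"
// on missing or invalid values. In a real service you'd pass process.env.
const LEVELS = ["debug", "info", "warn", "error", "fatal"];

function levelFromEnv(env: Record<string, string | undefined>): string {
  const raw = (env.LOG_LEVEL ?? "info").toLowerCase();
  return LEVELS.includes(raw) ? raw : "info";
}

console.log(levelFromEnv({ LOG_LEVEL: "debug" })); // "debug"
console.log(levelFromEnv({}));                     // "info" (the default)
```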
Performance Considerations
Logging isn't free. Writing to disk or sending to a remote service takes time.
Don't log in tight loops. Batch writes when possible. Use async logging so you don't block request handling.
In hot paths, consider sampling. Log 1% of successful requests but 100% of errors. You still get visibility without the overhead.
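The sampling decision is a one-liner. This sketch uses `Math.random()`, which is non-deterministic; hashing the request ID is an alternative if you want all-or-nothing logging per request:

```typescript
// Keep every error, but only a fraction (default 1%) of successes.
function shouldSample(isError: boolean, successRate = 0.01): boolean {
  if (isError) return true;           // always log failures
  return Math.random() < successRate; // sample a fraction of successes
}
```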
Start Simple, Improve Incrementally
You don't need perfect logging from day one. Start with basic structured logging at service boundaries. Add context as you feel the pain of missing it. Improve based on what you actually need when debugging real issues.
The best logging strategy is one your team actually uses. Overly complex systems get ignored. Simple, consistent logging gets you 90% of the value.