Go Performance Optimization Guide
A comprehensive guide to writing high-performance Go applications — from memory management and concurrency patterns to networking and profiling.
Welcome to the Go Performance Optimization Guide — a practical, in-depth resource for engineers who want to write faster, leaner, and more efficient Go code.
Go is a fast compiled language with a lightweight runtime and strong concurrency primitives. But writing truly performant Go requires understanding how the compiler, runtime, and garbage collector interact with your code. This guide covers the patterns, tools, and techniques that separate good Go code from great Go code.
What's New in Go
Go 1.25 — Green Tea GC
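Experimental Green Tea garbage collector (the Go team expects 10-40% less GC overhead in GC-heavy programs), cgroup-aware GOMAXPROCS defaults, runtime/trace.FlightRecorder, and the experimental encoding/json/v2 package.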
Go 1.24 — Swiss Tables
Swiss Table-based map implementation (map operations up to 60% faster in microbenchmarks), a rewritten sync.Map, runtime.AddCleanup, and testing.B.Loop.
Go 1.23 — Iterators
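Range-over-function iterators, the unique package for interning comparable values such as strings, overlapping of stack frame slots to reduce stack usage, and PGO-guided alignment of hot loop blocks.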
Optimization Topics
Memory Management
Preallocation, stack vs heap, object pooling, GC tuning, struct alignment, and map performance.
Concurrency Patterns
Worker pools, atomic operations, context management, batching, and caching patterns.
I/O & Data Handling
Buffered I/O, zero-copy, string optimization, JSON performance, and regexp tuning.
Compiler & Runtime
Build flags, escape analysis, inlining, BCE, CGO performance, and unsafe patterns.
Networking Performance
HTTP/2, gRPC, connection pooling, TLS, DNS, QUIC, and handling 10K+ concurrent connections.
Profiling & Benchmarking
pprof, benchmarking best practices, and execution tracing.
Go Internals
The GMP scheduler model, memory allocator, GC deep dive, goroutine stacks, map internals, select, panic/recover, compiler SSA, runtime bootstrap, syscalls, and generics.
Ecosystem & Production
Database Performance
Connection pooling, prepared statements, pgx vs database/sql, ORM overhead, batch operations, and Redis pipelining.
Serialization & Encoding
JSON, Protocol Buffers, MessagePack, FlatBuffers — benchmarks, zero-copy techniques, and custom marshalers.
Logging Performance
Zero-alloc loggers: slog vs zerolog vs zap, sampling, async logging, and hot path impact.
Container & Cloud
GOMAXPROCS in containers, GOMEMLIMIT, Kubernetes tuning, Docker optimization, and serverless cold starts.
OS-Level Tuning
Linux kernel params, TCP tuning, file descriptors, huge pages, NUMA, io_uring, and CPU governors.
Observability Overhead
OpenTelemetry cost, Prometheus metrics, distributed tracing, sampling strategies, and custom counters.
Data Structures
Bloom filters, ring buffers, lock-free queues, skip lists, bitsets, B-trees, tries, and arena allocation.
Architecture Patterns
Pipelines, fan-out/fan-in, rate limiting, backpressure, load shedding, circuit breakers, and graceful shutdown.
Binary & Build Optimization
ldflags, trimpath, UPX, PGO, build caching, cross-compilation, and dependency management.
Real-World Case Studies
From profiling to production — API gateways, data pipelines, CLI tools, Kafka processors, and memory-constrained services.
Who Is This For?
This guide is aimed at Go developers who are past the basics and want to understand what happens beneath the surface. Whether you're optimizing a hot path in a microservice, reducing tail latency in an API gateway, or squeezing every last allocation out of a data pipeline, you'll find actionable advice here.
How to Use This Guide
Each article is self-contained. You can read them in order for a full picture, or jump straight to the topic you need. Every section includes Go code examples with benchmarks so you can measure the impact yourself.
Start with Memory Preallocation if you're new to Go optimization, or jump to Profiling with pprof if you need to diagnose a specific performance issue.
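As a taste of the measure-it-yourself approach, here is a minimal sketch of the kind of comparison the Memory Preallocation article is about. It is not taken from that article: the function names appendGrow and appendPrealloc, the sink variable, and the slice size are hypothetical and chosen only for illustration. It uses testing.AllocsPerRun from the standard library to count heap allocations, so it runs as an ordinary program without a _test.go file.

```go
package main

import (
	"fmt"
	"testing"
)

// appendGrow (hypothetical name) builds a slice of n ints without a
// capacity hint, so append must grow the backing array as it fills up.
func appendGrow(n int) []int {
	var s []int
	for i := 0; i < n; i++ {
		s = append(s, i)
	}
	return s
}

// appendPrealloc (hypothetical name) reserves capacity for n elements up
// front, so the appends in the loop never need to reallocate.
func appendPrealloc(n int) []int {
	s := make([]int, 0, n)
	for i := 0; i < n; i++ {
		s = append(s, i)
	}
	return s
}

// sink keeps the results reachable so the compiler cannot discard the work.
var sink []int

func main() {
	const n = 10_000 // arbitrary size, assumed for illustration

	// AllocsPerRun reports the average number of heap allocations per call.
	grow := testing.AllocsPerRun(100, func() { sink = appendGrow(n) })
	pre := testing.AllocsPerRun(100, func() { sink = appendPrealloc(n) })

	fmt.Printf("no capacity hint: %.0f allocs/run\n", grow)
	fmt.Printf("preallocated:     %.0f allocs/run\n", pre)
}
```

The exact counts depend on the Go version and its slice growth policy, so treat the numbers as something to observe on your own machine rather than a fixed result. For timing data, the same two functions can be wrapped in regular benchmarks and run with go test -bench.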