
How to Set Up Continuous Profiling at Scale with Pyroscope 2.0

Learn to set up continuous profiling with Pyroscope 2.0: deploy server, configure agents, ingest via OTLP, analyze flame graphs, and optimize costs.

Introduction

Continuous profiling is becoming a standard part of the observability stack, and for good reason. It tells you why your code is slow or expensive, not just that it is. Metrics show high CPU, logs show slow requests, traces pinpoint the service, but only a profile reveals which function and which line are burning cycles. As systems grow more complex, this level of visibility becomes essential. OpenTelemetry recently declared its Profiles signal alpha, making profiling a first-class observability signal. Now Pyroscope 2.0, a ground-up rearchitecture of the open-source continuous profiling database, makes profiling more cost-effective at scale, with native support for OpenTelemetry Protocol (OTLP) profiling. This guide walks you through setting up continuous profiling with Pyroscope 2.0, from understanding the benefits to deploying and optimizing.


What You Need

  • A running application or service (e.g., a microservice, web app, or backend process)
  • Pyroscope 2.0 server (download from GitHub or use Docker image)
  • Profiling agents (language-specific, e.g. pyroscope-java or pyroscope-go) or OTLP-compatible agents
  • OpenTelemetry Collector (optional, for OTLP ingestion)
  • Basic knowledge of observability (metrics, traces, profiles)
  • Access to cloud or on-prem infrastructure (for deployment)

Step-by-Step Guide

Step 1: Understand the Case for Always-On Profiling

Before diving into setup, recognize why continuous profiling matters. It cuts infrastructure costs by revealing exactly which functions consume CPU and memory, enabling targeted optimizations instead of overprovisioning. It accelerates root cause analysis: compare profiles from before and after a regression to pinpoint the changed code paths in minutes, without reproducing the issue in staging. Profiling also closes an observability gap: distributed tracing shows where wall-clock time goes, while profiling shows where the CPU spends that time. And because it is always on, Pyroscope captures tail-latency events such as p99 spikes as they happen.

Step 2: Deploy Pyroscope 2.0 Server

Pyroscope 2.0 rearchitects the original Cortex-based database for better scalability. Deploy it using Docker:

  1. Pull the image: docker pull grafana/pyroscope:latest
  2. Run with default config: docker run -d --name pyroscope -p 4040:4040 grafana/pyroscope
  3. Open http://localhost:4040 to verify the UI loads.
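
Putting those commands together, here is a minimal sketch of a run command with a named volume so profiles survive container restarts. The /data mount path is an assumption; verify the storage path your Pyroscope version actually uses.

    # Sketch: run Pyroscope 2.0 with persistent profile storage.
    # The /data path is assumed; check your version's documentation.
    docker run -d \
      --name pyroscope \
      -p 4040:4040 \
      -v pyroscope-data:/data \
      grafana/pyroscope:latest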

For production, use Kubernetes via Helm charts (see Tips).

Step 3: Configure Profiling Agents

Install agents in your applications. For example, in a Java service using the Pyroscope Java agent:

  1. Add the JAR: -javaagent:/path/to/pyroscope.jar
  2. Set environment variables: PYROSCOPE_SERVER_ADDRESS=http://localhost:4040, PYROSCOPE_APPLICATION_NAME=my-service
  3. Restart the service. Profiles will begin flowing.
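
As a concrete sketch, the pieces above fit together like this when launching a Java service. The JAR path, service name, and tag values are placeholders, and the PYROSCOPE_LABELS variable for attaching tags is an assumption to confirm against the Java agent's documentation.

    # Sketch: launch a Java service with the Pyroscope agent attached.
    # All paths, names, and label values below are placeholders.
    export PYROSCOPE_SERVER_ADDRESS=http://localhost:4040
    export PYROSCOPE_APPLICATION_NAME=my-service
    export PYROSCOPE_LABELS="region=us-east-1,version=1.2.3"  # assumed variable; verify
    java -javaagent:/path/to/pyroscope.jar -jar my-service.jar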

For languages without a native agent, use the OpenTelemetry SDK with the profiling signal enabled, sending to the OTLP endpoint.

Step 4: Ingest Profiles via OpenTelemetry Protocol (OTLP)

Pyroscope 2.0 natively supports OTLP profiling, which lets you ingest profiles using the emerging standard without running a separate vendor-specific agent. To set it up:

  1. Deploy an OpenTelemetry Collector with the profiling receiver enabled.
  2. Configure the collector to export profiles to Pyroscope's OTLP endpoint (e.g., localhost:4317; a configuration sketch follows this list).
  3. Ensure your application is instrumented with OpenTelemetry SDKs that generate profile data (currently alpha).
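
A hedged sketch of such a collector configuration is below. Profiles support in the Collector is still experimental, so the profiles pipeline, the component names, and the Pyroscope endpoint address are all assumptions to verify against your Collector and Pyroscope versions.

    # Sketch: OpenTelemetry Collector pipeline that receives OTLP profiles
    # from instrumented apps and forwards them to Pyroscope. Component
    # names and endpoints are assumptions; verify for your versions.
    receivers:
      otlp:
        protocols:
          grpc:
            endpoint: 0.0.0.0:4317   # apps send OTLP profiles here
    exporters:
      otlp/pyroscope:
        endpoint: "pyroscope:4317"   # placeholder for Pyroscope's OTLP ingest address
        tls:
          insecure: true
    service:
      pipelines:
        profiles:
          receivers: [otlp]
          exporters: [otlp/pyroscope]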

This approach future-proofs your observability pipeline and aligns with OpenTelemetry’s roadmap.

Step 5: Analyze Profiles in the UI

Navigate to the Pyroscope web interface. You can:

  • View live flame graphs for each service.
  • Compare two time ranges (e.g., before and after a deploy) to diff CPU or memory usage.
  • Identify top consumers: the functions, packages, or lines responsible for resource usage.

Use the search bar to filter by application, type (cpu, memory, goroutines), or tags.
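
Tag filtering typically uses Prometheus-style label selectors. The selector below is illustrative, and the label names are placeholders that may differ in your setup.

    # Illustrative filter for the query bar (label names are placeholders):
    {service_name="my-service", region="us-east-1"}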

Step 6: Use Profiles for Root Cause Analysis

When an incident occurs, profiling helps you find the root cause fast:

  1. Identify the affected service from metrics/traces.
  2. Open Pyroscope and select the service.
  3. Choose a time range covering the incident (use the compare feature against a baseline).
  4. Look for new functions or increased CPU/memory in the diff.
  5. Drill down to the exact line of code causing the regression.

This eliminates the need for ad-hoc logging or guesswork.

Step 7: Optimize Infrastructure Costs

Continuous profiling provides data-driven insights for cost reduction:

  • Identify high-CPU functions and optimize algorithms or add caching.
  • Detect memory leaks and inefficient allocations.
  • Right-size resources by analyzing historical consumption patterns.

Pyroscope 2.0’s rearchitecture reduces storage and query costs, making it feasible to profile all your services continuously without prohibitive expense.

Tips for Success

  • Start small: Profile one critical service first to understand overhead and value.
  • Use tags: Annotate profiles with version, region, or environment tags to filter and compare effectively.
  • Monitor agent overhead: Pyroscope agents are designed to be low-impact, but always test in staging.
  • Combine signals: Integrate Pyroscope with Grafana or your existing observability stack to correlate profiles with metrics and traces.
  • Scale with Kubernetes: Deploy the Pyroscope server with a StatefulSet and persistent volumes for production (see the Helm sketch after this list).
  • Leverage OTLP: Even if your language has a native agent, consider using OTLP for consistency with OpenTelemetry.
  • Set retention policies: Configure data retention based on your compliance and cost needs (default is 30 days).
  • Stay updated: Pyroscope 2.0 is actively developed; watch the release notes for new features.
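
For the Kubernetes tip above, a minimal sketch using Grafana's Helm chart repository might look like the following; the chart name, release name, and namespace are assumptions to check against the current chart documentation.

    # Sketch: install Pyroscope on Kubernetes via Helm.
    # Chart and release names are assumed; verify in the Grafana chart repo.
    helm repo add grafana https://grafana.github.io/helm-charts
    helm repo update
    helm install pyroscope grafana/pyroscope --namespace pyroscope --create-namespace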

By following these steps, you’ll gain deep code-level visibility into your production systems, reduce infrastructure costs, and accelerate incident response—all with a cost-effective, open-source solution.