Syed Jafer K

Its all about Trade-Offs

Learning Notes #44 – Initial Steps on Distributed Tracing and Observability

A few days back, I read a blog post on Jaeger (https://medium.com/@achanandhi.m/why-we-need-jaeger-8ba21278b5c9) by Achanandhi M, which made me look into distributed tracing. I needed this kind of tracing for my current project, inside a single service: I wanted a trace_id passed across the classes so that each request could be isolated from the others in the logs.

What is Distributed Tracing?

Distributed tracing is a method that helps you see what happens when a request is sent to an application. It shows you the path the request takes as it moves through different parts of the system, like different services or databases. This is helpful for:

  • Finding where delays or errors happen.
  • Understanding how different parts of your system work together.
  • Making your application faster and more reliable.
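
To make the "finding where delays happen" point concrete, here is a minimal sketch of a trace as a tree of timed steps (spans). The Span class, the step names, and the timings are all made up for illustration, and the self-time calculation assumes child steps run one after another without overlapping. Real tracing tools record much more than this.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Span:
    """One timed step of a request (hypothetical, simplified model)."""
    name: str
    start_ms: int
    end_ms: int
    children: List["Span"] = field(default_factory=list)

    @property
    def duration_ms(self) -> int:
        return self.end_ms - self.start_ms

    @property
    def self_time_ms(self) -> int:
        # Time spent in this step itself, excluding its child steps.
        # Assumes the children do not overlap in time.
        return self.duration_ms - sum(c.duration_ms for c in self.children)


def walk(span):
    """Yield the span and all of its descendants."""
    yield span
    for child in span.children:
        yield from walk(child)


def slowest_step(root):
    """Return the span with the largest self time in the trace."""
    return max(walk(root), key=lambda s: s.self_time_ms)


# Made-up trace: an API call that hits an auth service and a database.
trace = Span("GET /orders", 0, 250, [
    Span("auth-service: verify token", 5, 35),
    Span("orders-db: SELECT orders", 40, 230),
])

slow = slowest_step(trace)
print(f"{slow.name} took {slow.self_time_ms} ms")
```

In this made-up trace the database query accounts for most of the request time, which is the kind of answer a tracing UI gives you at a glance.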

When I started looking into this, I kept coming across mentions of OpenTelemetry and Jaeger. They seemed important, so I decided to learn more about how they work together to make tracing possible.

Why Does Observability Matter?

Observability is about collecting information from your system to understand what’s going on inside it. It involves three main types of data:

  1. Metrics: Numbers that tell you how your system is performing, like how many requests it’s handling.
  2. Logs: Text messages that record what’s happening, like errors or warnings.
  3. Traces: A detailed map of how a request moves through your system.

I had a chat with my friend/mentor/well-wisher/brother, who is a cloud architect. During office hours we discussed observability, metrics, logs, and traces. He also told me about telemetry, which is the practice of collecting data from your application and sending it to tools that can analyze it. OpenTelemetry is one project built for this. There are also many observability platforms out there, like SigNoz, New Relic, Datadog, and more.

What Are OpenTelemetry and Jaeger?

OpenTelemetry is an open-source framework that helps you collect data from your application in a standard, vendor-neutral way. It works with logs, metrics, and traces. Jaeger is a separate tool: a tracing backend that stores the trace data collected by OpenTelemetry and lets you search and visualize it.

  • OpenTelemetry collects data from your application.
  • Jaeger takes that data and shows you a clear picture of how requests are moving through your system.

How Does OpenTelemetry Collect Logs and Traces?

One of my first questions was: How does OpenTelemetry get logs and traces from my application? Here’s what I learned:

  1. Instrumentation: You add OpenTelemetry to your application, so it knows what to collect.
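  2. Context Propagation: It carries the trace context (such as the trace ID and span ID) along with the request, across function calls and service boundaries, so spans from the same request can be linked and logs can be connected to traces.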
  3. Exporting: OpenTelemetry sends the collected data to a tool like Jaeger, where you can analyze it.

I found an OpenTelemetry SDK for Python that makes this process simple. There are also 2 YouTube videos I watched that showed how to set it up for my app. They really helped me see how everything fits together.

How Logs and Traces Work Together

Another thing I learned is that traces and logs are not the same, but they work well together. Traces help you see the big picture of how requests move through your system, while logs give you detailed information about what’s happening inside each part. When you use OpenTelemetry's logging integration, it can add trace IDs to your logs so you can easily match them up.
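
For the single-service case I started with (a trace_id shared across classes), the same idea can be tried without any tracing library. The sketch below is one possible approach using the standard library's contextvars and logging modules. The class names, the "orders" logger, and the log format are hypothetical, and this is not how OpenTelemetry itself is wired.

```python
import contextvars
import io
import logging
import uuid

# Holds the ID of the request being handled ("-" outside a request).
current_trace_id = contextvars.ContextVar("current_trace_id", default="-")


class TraceIdFilter(logging.Filter):
    """Attach the current trace ID to every log record."""

    def filter(self, record):
        record.trace_id = current_trace_id.get()
        return True  # never drop the record, only annotate it


# Log to an in-memory buffer so the example is easy to inspect.
buffer = io.StringIO()
handler = logging.StreamHandler(buffer)
handler.setFormatter(logging.Formatter("%(trace_id)s %(name)s %(message)s"))
handler.addFilter(TraceIdFilter())

logger = logging.getLogger("orders")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False


class OrderRepository:
    def find(self, order_id):
        # No trace_id parameter needed: the filter reads it from the context.
        logger.info("loading order %s", order_id)
        return {"id": order_id}


class OrderService:
    def __init__(self):
        self.repo = OrderRepository()

    def get_order(self, order_id):
        logger.info("get_order called")
        return self.repo.find(order_id)


def handle_request(order_id):
    # Give each request its own ID, and clear it when the request ends.
    token = current_trace_id.set(uuid.uuid4().hex)
    try:
        return OrderService().get_order(order_id)
    finally:
        current_trace_id.reset(token)


handle_request(1)
handle_request(2)
print(buffer.getvalue())
```

Every log line from one request starts with the same ID, so filtering the logs by that ID isolates a single request even though no class passes the ID around explicitly.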

What I Learned

  1. Tracing Makes Complex Systems Easier to Understand – It’s like having a map of your application that shows where things are slow or broken. For smaller apps, the extra setup may not be worth it.
  2. OpenTelemetry Simplifies Data Collection – It provides tools to collect logs and traces without much effort.
  3. Jaeger Helps Visualize Traces – Jaeger shows the collected data in a way that’s easy to read and understand.
  4. Logs and Traces Work Better Together – Combining these gives you a complete view of what’s happening in your system.

What’s Next for Me?

I’m planning to keep learning about observability and applying it to my projects. I want to see how these tools can make debugging and improving my applications easier. If you’re curious about understanding how your applications work or want to make them better, I think distributed tracing and tools like OpenTelemetry and Jaeger are worth exploring. I will be exploring more on this front, as it’s good for a future tech architect to know the available tools.