Addressing tracing in processes that fork. #257

kellegous · 2020-05-19T17:38:06Z

We're currently using opencensus-php to trace a codebase that makes considerable use of pcntl_fork in the CLI SAPI as a means of doing concurrent batch work. As you might expect, a few issues arise when tracing a process that forks itself into child processes. I'm going to enumerate those issues and propose a solution to each. Before I get into the details, though, I want to pose a critical question.

Is opencensus-php interested in addressing this use case?

For us (us here is https://mailchimp.com/), we'll have to address this in some form, since much of our batch infrastructure currently depends on process forking and, as much as that causes us headaches, we're stuck with it for the foreseeable future. I don't know how prevalent forking is in production PHP codebases, so I can't really claim how useful this would be to the community in general.

With that out of the way, let's talk about some details of tracing within processes that fork.

What happens right now when you fork while tracing.

Let's consider a simple example. We have a parent process with an enabled SpanContext. The parent creates a few spans and then forks itself into a single child process. At the point of forking, the child inherits all of the spans that were created prior to the fork. After the fork, the parent and the child can continue to create spans, and those spans accumulate independently in each process (there is one additional wrinkle that prevents these spans from being fully independent, which I'll discuss below). As the parent and child continue to execute, both will eventually close spans that were created prior to the fork. Finally, as each process terminates, each exports its full trace tree (as a list of spans), so two lists of span data are exported. The shared spans (those created prior to the fork) will have conflicting end times and possibly conflicting attributes. Most tracing backends store spans under (trace_id, span_id), so the last one to be exported will typically win.
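To make the "last one wins" effect concrete, here is a minimal sketch. The upsertSpans function and the array-shaped spans are hypothetical stand-ins for a backend keyed on (trace_id, span_id); none of it is opencensus-php API, and the IDs and timestamps are made up.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical backend: upserts spans keyed by "traceId:spanId",
 * so a later export silently replaces an earlier one.
 */
function upsertSpans(array $store, array $exported): array
{
    foreach ($exported as $span) {
        $store[$span['traceId'] . ':' . $span['spanId']] = $span;
    }
    return $store;
}

// The root span exists before the fork, so both processes hold a copy.
$root = ['traceId' => 'aaaa', 'spanId' => '0001', 'name' => 'root'];

// Each process closes its copy of the root span at a different time.
$parentExport = [
    $root + ['endTime' => 10.0],
    ['traceId' => 'aaaa', 'spanId' => '0002', 'name' => 'parent-work', 'endTime' => 9.0],
];
$childExport = [
    $root + ['endTime' => 7.5],
    ['traceId' => 'aaaa', 'spanId' => '0003', 'name' => 'child-work', 'endTime' => 7.0],
];

// The parent happens to export first here, and the child second.
$store = upsertSpans([], $parentExport);
$store = upsertSpans($store, $childExport);

echo count($store), " spans stored\n";
echo "root endTime: ", $store['aaaa:0001']['endTime'], "\n";
```

In this run the stored root span ends up with the child's end time, even though the parent is the process that actually owned it. Swap the export order and the parent's value wins instead.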

I also mentioned an additional wrinkle that prevents spans from being independent in the two processes. Span IDs are currently generated with dechex(mt_rand()). When the parent process forks, the child gets a copy of the Mersenne Twister's internal state, so both processes continue from the same point in the same sequence. This means that as the parent and child create new spans, those spans are highly likely to share span IDs, since the PRNGs are effectively seeded identically in the two processes.
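A small script along these lines should reproduce the collision. It is only a sketch: it assumes the pcntl extension and the CLI SAPI, uses a fixed seed purely to make the run repeatable, and the socket pair is just a convenient way for the child to hand its ID back to the parent for comparison.

```php
<?php
declare(strict_types=1);

// Fixed seed only so the demo is repeatable. What matters is that the
// generator has already been used before the fork.
mt_srand(42);
$preForkSpanId = dechex(mt_rand()); // stands in for a span created before the fork

// Socket pair so the child can send its next ID back to the parent.
$pair = stream_socket_pair(STREAM_PF_UNIX, STREAM_SOCK_STREAM, STREAM_IPPROTO_IP);
if ($pair === false) {
    fwrite(STDERR, "could not create socket pair\n");
    exit(1);
}

$pid = pcntl_fork();
if ($pid === -1) {
    fwrite(STDERR, "fork failed\n");
    exit(1);
}

if ($pid === 0) {
    // Child: generate the first post-fork span ID and report it.
    fclose($pair[0]);
    fwrite($pair[1], dechex(mt_rand()));
    fclose($pair[1]);
    exit(0);
}

// Parent: generate its own first post-fork span ID, then read the child's.
fclose($pair[1]);
$parentSpanId = dechex(mt_rand());
$childSpanId = stream_get_contents($pair[0]);
fclose($pair[0]);
pcntl_waitpid($pid, $status);

printf("parent: %s\n", $parentSpanId);
printf("child:  %s\n", $childSpanId);
printf("same:   %s\n", var_export($parentSpanId === $childSpanId, true));
```

Both processes draw the next value from identical generator state, so the two IDs match.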

Proposals for fixing the issues that arise during fork

Fixing the issue of exporting conflicting spans

I propose that each Span have an additional field that contains the pid of the process that created that span. The pid will be populated with getmypid(). Then all implementers of TraceInterface will have their spans() method updated to only include the SpanData for spans that were created in the current process. If you return to our previous example, this would mean that the parent process would be responsible for exporting all spans that were created prior to fork and any spans created in the parent after fork. The child process would only export the spans that were created in the child after fork. This solution does have some trade-offs in that any attributes or annotations added to a span created in a different process will be lost. So, as an example, the child process would lose the ability to add attributes to the root span since it will never export that span. Since the spans have the pid, this situation is possible to detect but it's not clear how best to surface that error condition to the user.
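Roughly, the filtering could look like the sketch below. PidSpan and spansOwnedBy are hypothetical names for illustration, not the library's Span class or its spans() method, and the optional $pid arguments exist only so the demo can fake a child process without actually forking.

```php
<?php
declare(strict_types=1);

/** Hypothetical, simplified span that remembers which process created it. */
final class PidSpan
{
    public $name;
    public $pid;

    public function __construct(string $name, ?int $pid = null)
    {
        $this->name = $name;
        // In real use the pid always comes from getmypid(); the parameter
        // is an assumption made for this demo only.
        $this->pid = $pid ?? getmypid();
    }
}

/** Keep only the spans created by the given process (default: this one). */
function spansOwnedBy(array $spans, ?int $pid = null): array
{
    $pid = $pid ?? getmypid();
    return array_values(array_filter($spans, function (PidSpan $span) use ($pid) {
        return $span->pid === $pid;
    }));
}

$parentPid = getmypid();
$childPid = $parentPid + 1; // fake child pid, for illustration only

$spans = [
    new PidSpan('root'),                  // created in the parent before the fork
    new PidSpan('child-work', $childPid), // pretend this was created in the child
];

foreach (spansOwnedBy($spans) as $span) {
    echo "parent exports: {$span->name}\n";
}
foreach (spansOwnedBy($spans, $childPid) as $span) {
    echo "child exports: {$span->name}\n";
}
```

Because each span carries its creator's pid, the same comparison could also be used to detect a child adding attributes to a span it will never export.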

Fixing the issue of spans not being independent across processes

If you follow IdGeneratorTrait::generateTraceId down into ramsey/uuid, you'll find that trace ids are already generated using the equivalent of bin2hex(random_bytes(16)). If span ids were generated similarly, we would no longer depend on the internal state of a PRNG. So I'm proposing that Span::generateSpanId be implemented as bin2hex(random_bytes(4)) instead of dechex(mt_rand()). I looked through the commit history to try to determine whether it was considered important to avoid using the CSPRNG for span id creation and was unable to find anything. If avoiding the CSPRNG does turn out to be important, there is also the option of hashing the pid into the span_id. Using random_bytes, however, seems like the most straightforward approach.
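As a sketch, the proposed generator is a one-liner. It is written here as a free function named generateSpanId for illustration; where it would actually live in the library is an open question.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical replacement for the span ID generator: 4 bytes from the
 * CSPRNG, hex-encoded. No per-process generator state is involved, so a
 * fork has nothing to duplicate.
 */
function generateSpanId(): string
{
    return bin2hex(random_bytes(4));
}

$id = generateSpanId();
printf("%s (%d hex chars)\n", $id, strlen($id));
```

One side effect worth noting: dechex(mt_rand()) is not zero-padded, so its output varies in length, whereas this form always yields exactly 8 hex characters.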

Key Questions

1. Is opencensus-php interested in addressing this use case of tracing in processes that fork?
2. If so, are there any reservations about the two proposed fixes?
3. Are there other issues that I have not accounted for in this proposal?


jcchavezs · 2020-05-20T08:48:47Z

Chiming in, as this is particularly interesting to me as a maintainer of zipkin-php. Both solutions look fine, though a third one comes to mind. I think everything gets easier if you treat this as a reporting problem. The way I envision the solution is:

- Do not change spans.
- Create a custom reporter that holds the parent process ID as an attribute and does the following:
  a. If reporter.process_id != current_process_id, then instead of reporting, dump all spans into a file.
  b. If reporter.process_id == current_process_id, then along with the spans you are reporting, read all the children's dump files and merge that information.

There could be some conflicts to resolve when merging the data (not many, and mostly in metadata or tags). In that case you can annotate spans with the PID and let either the child or the parent process take precedence, depending on which one created the span. It looks hacky, but I tried to solve a similar issue in the past and this was the only way.
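A rough sketch of such a reporter, under several assumptions: ForkAwareReporter is a hypothetical class, spans are plain arrays with a spanId key, dumps are JSON files in a shared directory, and on conflict the owner's copy simply wins (one possible precedence policy among several). The pids are passed in explicitly so the demo can run without forking; in real use both would come from getmypid().

```php
<?php
declare(strict_types=1);

/** Hypothetical dump-and-merge reporter; none of this is library API. */
final class ForkAwareReporter
{
    private $dir;
    private $ownerPid;

    public function __construct(string $dir, int $ownerPid)
    {
        $this->dir = $dir;
        $this->ownerPid = $ownerPid;
    }

    /** Returns the merged spans in the owner process, or [] in a child. */
    public function report(array $spans, int $currentPid): array
    {
        if ($currentPid !== $this->ownerPid) {
            // a. Child process: dump to a file instead of reporting.
            $file = sprintf('%s/spans-%d-%d.json', $this->dir, $this->ownerPid, $currentPid);
            file_put_contents($file, json_encode($spans));
            return [];
        }

        // b. Owner process: read every child's dump file and merge it.
        $merged = [];
        $pattern = sprintf('%s/spans-%d-*.json', $this->dir, $this->ownerPid);
        foreach (glob($pattern) ?: [] as $file) {
            foreach (json_decode(file_get_contents($file), true) as $span) {
                $merged[$span['spanId']] = $span;
            }
            unlink($file);
        }
        foreach ($spans as $span) {
            // Assumed policy: the owner's copy takes precedence on conflict.
            $merged[$span['spanId']] = $span;
        }
        return array_values($merged);
    }
}

$dir = sys_get_temp_dir() . '/fork-spans-' . bin2hex(random_bytes(4));
mkdir($dir);
$reporter = new ForkAwareReporter($dir, 100); // 100 and 101 are fake pids

// "Child" (pid 101) holds the inherited root span plus its own span.
$reporter->report([
    ['spanId' => 'a1', 'name' => 'root', 'end' => 5],
    ['spanId' => 'c1', 'name' => 'child-work', 'end' => 4],
], 101);

// "Parent" (pid 100) reports last and merges the child's dump.
$all = $reporter->report([['spanId' => 'a1', 'name' => 'root', 'end' => 9]], 100);
rmdir($dir);

foreach ($all as $span) {
    printf("%s %s end=%d\n", $span['spanId'], $span['name'], $span['end']);
}
```

The parent has to report after its children have exited for the merge to see their files, so in practice it would wait on them first.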

