-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add pstats macro #6
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
src/LinuxPerf.jl
Outdated
error("unknown option: $(opt)") | ||
end | ||
end | ||
return (events = events, exclude_kernel = exclude_kernel,) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The keyword argument right now is called userspace_only
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if there is a rationale behind renaming exclude_kernel
to userspace_only
. The original name of the perf_event_attr
struct is exclude_kernel
, which seems enough natural to me. Also, we have exclude_user
, exclude_hv
(hypervisor), exclude_idle
, and so on. I'm not sure the meaning of userspace_only
in this situation.
src/LinuxPerf.jl
Outdated
@@ -334,18 +345,378 @@ const reasonable_defaults = | |||
[EventType(:cache, :L1_data, :write, :access), | |||
EventType(:cache, :L1_data, :write, :miss)]=#] | |||
|
|||
function make_bench(x) | |||
function make_bench(x; kwargs...) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We probably should follow perf here and add proper modifiers to the events https://perf.wiki.kernel.org/index.php/Tutorial#Modifiers
so allowing :u
for userspace only, instead of making it a keywordargument to the EventGroup
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found the logic behind these event modifiers: https://elixir.bootlin.com/linux/v5.7.2/source/tools/perf/util/parse-events.c#L1731. So, a better design to support these modifiers would be removing the userspace_only
keyword argument of the EventGroup
constructor and augmenting the EventType
struct with modifier(s).
Yes, we can remove |
Supporting raw events should be easy. I'd like to follow the syntax of the perf command, which specifies event numbers like
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. @bicycle1885 do you want to finish up this PR and add raw events in a separate PR?
Yes, I think adding raw events to this pull request is too much. I found a few problems in the current pr so please give me a moment. |
I think it is now in good shape! Here is a highlight of changes I added yesterday. Before those changes, count statistics except from the main thread are just ignored. This is not good because some multithreaded routines such as matrix multiplication just lose some valuable information. The current
|
This is a port of my
@pstats
macro I post on Twitter. This is still a work in progress, but I would like to get your feedback.Here is a simple test function called
tarai
.And this is an example of default behavior compared to
@measure
:The most important difference is
@pstats
scales the counter by the reciprocal of the ratio of time_running and time_enabled while@measure
shows the raw counter value. In my opinion, scaled values are easier to understand and can be easily compared to other runs.Also, some comments are added on the right hand side of the table. These are useful because you can easily notice stalled cycles and/or branch prediction misses. These values are computed on demand and not explicitly stored in a statistics object.
You can focus on specific events you're interested in by passing a string to the macro as follows:
Events in a paired parens are grouped together: they are measured as a single unit of execution and thus their values can be compared to each other. In the case above, "instructions" and "cpu-cycles" are grouped, and "task-clock" is a singleton group. Groups are also shown in the reported result using box-drawing characters (the leftmost column of the table).
The event names are taken from the perf command-line tool because they would be more familiar to users of the tool. Aliases are not supported yet.
I'm going to add more tweaks and docs to this pull request. I appreciate your feedback.