1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169
//! Inferno is a set of tools that let you to produce [flame graphs] from performance profiles of //! your application. It's a port of parts Brendan Gregg's original [flamegraph toolkit] that aims //! to improve the performance of the original flamegraph tools and provide programmatic access to //! them to facilitate integration with _other_ tools (like [not-perf]). //! //! Inferno, like the original flame graph toolkit, consists of two "stages": stack collapsing and //! plotting. In the original Perl implementations, these were represented by the `stackcollapse-*` //! binaries and `flamegraph.pl` respectively. In Inferno, collapsing is available through the //! [`collapse`] module and the `inferno-collapse-*` binaries, and plotting can be found in the //! [`flamegraph`] module and the `inferno-flamegraph` binary. //! //! # Command-line use //! //! ## Collapsing stacks //! //! Most sampling profilers (as opposed to [tracing profilers]) work by repeatedly recording the //! state of the [call stack]. The stack can be sampled based on a fixed sampling interval, based //! on [hardware or software events], or some combination of the two. In the end, you get a series //! of [stack traces], each of which represents a snapshot of where the program was at different //! points in time. //! //! Given enough of these snapshots, you can get a pretty good idea of where your program is //! spending its time by looking at which functions appear in many of the traces. To ease this //! analysis, we want to "collapse" the stack traces so if a particular trace occurs more than //! once, we instead just keep it _once_ along with a count of how many times we've seen it. This //! is what the various collapsing tools do! You'll sometimes see the resulting tuples of stack + //! count called a "folded stack trace". //! //! Since profiling tools produce stack traces in a myriad of different formats, and the flame //! graph plotter expects input in a particular folded stack trace format, each profiler needs a //! separate collapse implementation. While the original Perl implementation supports _lots_ of //! profilers, Inferno currently only supports four: the widely used [`perf`] tool (specifically //! the output from `perf script`), [DTrace], [sample], and [VTune]. Support for xdebug is //! [hopefully coming soon], and [`bpftrace`] should get [native support] before too long. //! //! Inferno supports profiles from applications written in any language, but we'll walk through an //! example with a Rust program. To profile a Rust application, you would first set //! //! ```toml //! [profile.release] //! debug = true //! ``` //! //! in your `Cargo.toml` so that your profile will have useful function names and such included. //! Then, compile with `--release`, and then run your favorite performance profiler: //! //! ### perf (Linux) //! //! ```console //! # perf record --call-graph dwarf target/release/mybin //! $ perf script | inferno-collapse-perf > stacks.folded //! ``` //! //! For more advanced uses, see Brendan Gregg's excellent [perf examples] page. //! //! ### DTrace (macOS) //! //! ```console //! $ target/release/mybin & //! $ pid=$! //! # dtrace -x ustackframes=100 -n "profile-97 /pid == $pid/ { @[ustack()] = count(); } tick-60s { exit(0); }" -o out.user_stacks //! $ cat out.user_stacks | inferno-collapse-dtrace > stacks.folded //! ``` //! //! For more advanced uses, see also upstream FlameGraph's [DTrace examples]. //! You may also be interested in something like [NodeJS's ustack helper]. //! //! ### sample (macOS) //! //! ```console //! $ target/release/mybin & //! $ pid=$! //! $ sample $pid 30 -file sample.txt //! $ inferno-collapse-sample sample.txt > stacks.folded //! ``` //! //! ### VTune (Windows and Linux) //! //! ```console //! $ amplxe-cl -collect hotspots -r resultdir -- target/release/mybin //! $ amplxe-cl -R top-down -call-stack-mode all -column=\"CPU Time:Self\",\"Module\" -report-out result.csv -filter \"Function Stack\" -format csv -csv-delimiter comma -r resultdir //! $ inferno-collapse-vtune result.csv > stacks.folded //! ``` //! //! ## Producing a flame graph //! //! Once you have a folded stack file, you're ready to produce the flame graph SVG image. To do so, //! simply provide the folded stack file to `inferno-flamegraph`, and it will print the resulting //! SVG. Following on from the example above: //! //! ```console //! $ cat stacks.folded | inferno-flamegraph > profile.svg //! ``` //! //! And then open `profile.svg` in your viewer of choice. //! //! ## Differential flame graphs //! //! You can debug CPU performance regressions with the help of differential flame graphs. //! They let you easily visualize the differences between two profiles performed before and //! after a code change. See Brendan Gregg's [differential flame graphs] blog post for a great //! writeup. To create one you must first pass the two folded stack files to `inferno-diff-folded`, //! then send the output to `inferno-flamegraph`. Example: //! //! ```console //! $ inferno-diff-folded folded1 folded2 | inferno-flamegraph > diff2.svg //! ``` //! //! The flamegraph will be colored based on higher samples (red) and smaller samples (blue). The //! frame widths will be based on the 2nd folded profile. This might be confusing if stack frames //! disappear entirely; it will make the most sense to ALSO create a differential based on the 1st //! profile widths, while switching the hues. To do this, reverse the order of the input files //! and pass the `--negate` flag to `inferno-flamegraph` like this: //! //! ```console //! $ inferno-diff-folded folded2 folded1 | inferno-flamegraph --negate > diff1.svg //! ``` //! //! # Development //! //! This crate was initially developed through [a series of live coding sessions]. If you want to //! contribute to the code, that may be a good way to learn why it's all designed the way it is! //! //! [flame graphs]: http://www.brendangregg.com/flamegraphs.html //! [flamegraph toolkit]: https://github.com/brendangregg/FlameGraph //! [not-perf]: https://github.com/nokia/not-perf //! [tracing profilers]: https://danluu.com/perf-tracing/ //! [call stack]: https://en.wikipedia.org/wiki/Call_stack //! [hardware or software events]: https://perf.wiki.kernel.org/index.php/Tutorial#Events //! [stack traces]: https://en.wikipedia.org/wiki/Stack_trace //! [`perf`]: https://perf.wiki.kernel.org/index.php/Main_Page //! [DTrace]: https://www.joyent.com/dtrace //! [hopefully coming soon]: https://twitter.com/DanielLockyer/status/1094605231155900416 //! [native support]: https://github.com/jonhoo/inferno/issues/51#issuecomment-466732304 //! [`bpftrace`]: https://github.com/iovisor/bpftrace //! [perf examples]: http://www.brendangregg.com/perf.html //! [DTrace examples]: http://www.brendangregg.com/FlameGraphs/cpuflamegraphs.html#DTrace //! [NodeJS's ustack helper]: http://dtrace.org/blogs/dap/2012/01/05/where-does-your-node-program-spend-its-time/ //! [a series of live coding sessions]: https://www.youtube.com/watch?v=jTpK-bNZiA4&list=PLqbS7AVVErFimAvMW-kIJUwxpPvcPBCsz //! [differential flame graphs]: http://www.brendangregg.com/blog/2014-11-09/differential-flame-graphs.html //! [sample]: https://gist.github.com/loderunner/36724cc9ee8db66db305#profiling-with-sample //! [VTune]: https://software.intel.com/en-us/vtune-amplifier-help-command-line-interface #![deny(missing_docs)] #![cfg_attr(all(test, feature = "nightly"), feature(test))] #[cfg(all(test, feature = "nightly"))] extern crate test; /// Stack collapsing for various input formats. /// /// See the [crate-level documentation] for details. /// /// [crate-level documentation]: ../index.html pub mod collapse; /// Tool for creating an output required to generate differential flame graphs. /// /// See the [crate-level documentation] for details. /// /// [crate-level documentation]: ../index.html pub mod differential; /// Tools for producing flame graphs from folded stack traces. /// /// See the [crate-level documentation] for details. /// /// [crate-level documentation]: ../index.html pub mod flamegraph;