OverOps' Error Analysis Screen provides a powerful mechanism to get to the root cause of errors and exceptions in production and staging environments. The screen is divided into a number of sections, each providing information about the error to create a complete picture of the cause and impact of the error.
The Error Analysis screen
The analytics pane is located on the top left of the screen, and provides important details relating to the impact of the error in the application(s).
You can find the type of error, when it was first seen, the server/application it happened in, and how many times it has occurred.
The Error Graph is located at the top of the page and was built to provide context for the error by showing its trend across servers/applications in a specific timeframe. Learn more about the Error Analysis Graph.
You can click the JVM or Server labels to see the error volume for that specific machine or application. You can also hover over the occurrences label to see how many times this error occurred, out of the total number of calls into the method containing it.
You can use the button in the top right of the screen to open and select this error in the context of OverOps' main dashboard. You can click "Go to snapshot" on any point in the graph to jump and see the code and variable state at that moment in time.
The analytics panel on the left and the chart on the right showing the volume of the error.
The Call Stack Pane shows you the chain of methods within the JVM leading to the error. The topmost method denotes the last method in non-3rd-party code within your application leading to the error. If a method is marked with an icon, variable state has been captured for it by the JVM micro-agent.
If an exception is caught and re-thrown once or multiple times within the context of a thread, you can see the error analysis for these exceptions using the Related Errors drop-down (this drop-down is only shown if such related exceptions exist).
At the bottom of the stack you can see the machine name and the JVM thread name for the thread in which this error occurred. 3rd party code is hidden by default and can be expanded by toggling "Show 3rd party methods" at the bottom of the stack. You can also use the "COPY STACK" button to copy the full stack to the clipboard.
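As an illustration of the catch-and-re-throw pattern described above, here is a minimal sketch in plain Java. The class and method names (`RethrowDemo`, `OrderException`, `loadOrder`, `parseQuantity`) are hypothetical and not part of OverOps; the sketch only shows how one failure can surface as two linked exceptions on the same thread, not how OverOps groups them.

```java
import java.util.ArrayList;
import java.util.List;

class RethrowDemo {
    /** Hypothetical application-level exception that wraps a lower-level cause. */
    static class OrderException extends RuntimeException {
        OrderException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** Throws NumberFormatException when the input is not a valid integer. */
    static int parseQuantity(String raw) {
        return Integer.parseInt(raw);
    }

    /** Catches the low-level exception and re-throws it wrapped in OrderException. */
    static int loadOrder(String raw) {
        try {
            return parseQuantity(raw);
        } catch (NumberFormatException e) {
            throw new OrderException("invalid quantity: " + raw, e);
        }
    }

    /** Walks the cause chain and returns the simple class name of each exception. */
    static List<String> causeChain(Throwable t) {
        List<String> names = new ArrayList<>();
        for (Throwable current = t; current != null; current = current.getCause()) {
            names.add(current.getClass().getSimpleName());
        }
        return names;
    }

    public static void main(String[] args) {
        try {
            loadOrder("abc");
        } catch (OrderException e) {
            // Two exceptions were thrown on this thread for a single failure.
            System.out.println(causeChain(e));
        }
    }
}
```

In code shaped like this, the original `NumberFormatException` and the wrapping `OrderException` are the kind of pair you would expect to navigate between as related errors.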
The call stack pane
The Source Code View shows by default a decompiled Java version of the bytecode which was executing within the JVM at the moment of error. You can hover over any highlighted variable to see its value and jump to see its full contents within the variable grid.
The line in which the error occurred is also highlighted, as depicted below. Above the code pane, you can see the full error message and the time at which it occurred.
You can also easily configure OverOps to use your own source code instead of decompiling it from the JVM.
You can search for any variable name or value in the source code or variable grid using the search box. Click here to learn more about variable search.
The source code and variable state pane
The Variable State Grid shows the variable values and objects accessible from the method. Objects can be explored up to 5 levels deep into the heap. You can click the "..." ellipsis button next to every object to see its entire contents as a JSON object, which you can easily copy to the clipboard.
The variable grid contains all local variables and parameters (including "this" in non-static methods). The first method will also contain thread-local variables defined for this thread as well as SLF4J and Log4J Mapped Diagnostics Context (MDC) values.
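To make the categories above concrete, here is a small sketch in plain Java of a method whose state falls into each of them: a parameter, a local variable, and a thread-local value set earlier on the same thread. All names (`RequestContextDemo`, `REQUEST_ID`, `handle`, `process`) are hypothetical, and the sketch uses `ThreadLocal` only; MDC values from SLF4J or Log4J would play a similar role but need those libraries on the classpath.

```java
class RequestContextDemo {
    // Thread-local value: tied to the current thread rather than to any one method.
    static final ThreadLocal<String> REQUEST_ID = new ThreadLocal<>();

    /** Entry point: stores the request id for this thread, then does the work. */
    static int handle(String requestId, String payload) {
        REQUEST_ID.set(requestId);
        try {
            return process(payload);
        } finally {
            // Clear the value so it does not leak to the next task on this thread.
            REQUEST_ID.remove();
        }
    }

    /** 'payload' is a parameter and 'length' is a local variable of this method. */
    static int process(String payload) {
        int length = payload.length(); // throws NullPointerException if payload is null
        return length;
    }

    public static void main(String[] args) {
        System.out.println(handle("req-1", "hello"));
    }
}
```

If `process` failed on a null `payload`, the parameter and local would be the method-level state, while `REQUEST_ID` is the kind of thread-local value the text says is shown with the first method.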
The micro-agent decides which variables are collected and at what depth (for example, how many variables to collect, how many items to take from a collection, and how much of a string to capture). It uses an adaptive machine learning algorithm to collect the most relevant variables within an allocated timeframe.
Click here to learn more about object and variable state.
(1) The variable grid displaying the variable state within the current method as well as thread-local variables.
(2) The JSON representation of an object, available through the "..." ellipsis button.
The Log View shows the last 250 log statements leading up to the error. As these statements are collected directly from JVM memory, you can see any DEBUG, TRACE or INFO statements regardless of whether or not they were logged to file.
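The situation this addresses can be sketched with `java.util.logging` from the JDK. In the example below, a logger configured at INFO silently drops a lower-level statement before it reaches the handler that stands in for a log file. The names `LogLevelDemo` and `CollectingHandler` are hypothetical, and the use of `java.util.logging` is an assumption made to keep the sketch self-contained; the source only names SLF4J and Log4J.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

class LogLevelDemo {
    /** Stands in for a log file: keeps every record that actually reaches it. */
    static final class CollectingHandler extends Handler {
        final List<String> lines = new ArrayList<>();

        @Override
        public void publish(LogRecord record) {
            if (isLoggable(record)) {
                lines.add(record.getLevel() + " " + record.getMessage());
            }
        }

        @Override
        public void flush() {
        }

        @Override
        public void close() {
        }
    }

    /** Logs one FINE and one INFO statement and returns what reached the handler. */
    static List<String> run() {
        Logger logger = Logger.getLogger("demo.orders");
        logger.setUseParentHandlers(false); // keep output away from the console handler
        logger.setLevel(Level.INFO);        // typical production threshold

        CollectingHandler file = new CollectingHandler();
        logger.addHandler(file);
        try {
            logger.fine("cart contents loaded"); // below INFO: dropped by the logger
            logger.info("placing order");        // at INFO: reaches the handler
        } finally {
            logger.removeHandler(file);
        }
        return file.lines;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Only the INFO statement ends up in the handler here, which is why low-level statements that never reach a file can be useful to see next to an error.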
You can reach this view by clicking the button to switch between code and log view.
Click here to learn more about the Log View.
The Log View pane showing the last 250 log statements leading to this error.
For each error and exception captured, OverOps will also display a JVM view, which shows the internal JVM state at the moment of error, including memory usage (both heap and non-heap), basic system information, CPU usage and much more. This provides an easy way to work with OverOps' code (“classic”), log and JVM data without needing to leave the application.
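For a sense of what this kind of data looks like, the sketch below reads heap usage, non-heap usage, and basic system information through the standard `java.lang.management` API. It is an illustration only, under the assumption that similar figures appear in the JVM view; it is not a description of how OverOps collects them, and the class name `JvmStateDemo` is hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;
import java.lang.management.OperatingSystemMXBean;

class JvmStateDemo {
    /** Returns the heap bytes currently in use by this JVM. */
    static long heapUsedBytes() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        return memory.getHeapMemoryUsage().getUsed();
    }

    /** Builds a one-line summary of memory and system state at the time of the call. */
    static String snapshot() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();
        MemoryUsage nonHeap = memory.getNonHeapMemoryUsage();
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();

        // getSystemLoadAverage() returns a negative value where it is unavailable.
        return String.format(
                "heapUsed=%d nonHeapUsed=%d processors=%d loadAvg=%.2f",
                heap.getUsed(),
                nonHeap.getUsed(),
                os.getAvailableProcessors(),
                os.getSystemLoadAverage());
    }

    public static void main(String[] args) {
        System.out.println(snapshot());
    }
}
```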
Click here to learn more about the JVM View.
The Actions Toolbar provides a set of capabilities to share, mark and search through the error analysis contents.
- Send to JIRA - Enables you to create new JIRA issues for an error linking directly to the source, stack, state and statistics. Click here to learn more about JIRA integration.
- Hide - Marks an error as "Hidden", meaning it will no longer appear in the dashboard's event list and chart. Furthermore, the micro-agents will no longer capture error analysis snapshots for it. The error will appear under the "Archive" label in the dashboard, where it can be unhidden. Click here to learn more about hiding errors.
- Resolve - Marks an error as "Resolved", meaning it has been fixed and will be removed from the dashboard's event list and chart. However, should this error occur after a new code deployment, it will be tagged as "Resurfaced", you will receive an email notification, and it will return to the event list and chart. Click here to learn more about resolving errors.
- Label - Adds a label to an error. Labels are a great tool for classifying errors, with tags such as "Critical" or "Low" to assign priority, "John" or "QA" to assign responsibility, or "V1 RC2" to denote a version. Learn more about creating and assigning labels here.
- Edit Note - Enables you to attach a note to an error and share it with your teammates, alerting them by email. Click here to learn more about sharing with teammates.