Add Java Source context to stack frames #46497

adinauer · 2023-03-29T09:39:48Z

Adds a new processor to the JavaPlugin called JavaSourceLookupStacktraceProcessor, which looks up stack frames in the available source bundles and tries to find the source code line plus some context lines.

JavaSourceLookupStacktraceProcessor wraps the existing JavaStacktraceProcessor because only the first processor that handles a given frame is called, but we need both of them to run. For obfuscated sources, JavaStacktraceProcessor deobfuscates the frame first (possibly expanding a single frame into multiple), and JavaSourceLookupStacktraceProcessor then looks up the sources.
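The delegation described above can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: `DeobfuscatingProcessor`, `SourceLookupProcessor`, the `mapping` dictionary, and the `lookup` callable are all hypothetical stand-ins, and the real processors have a richer interface.

```python
from typing import Callable, Dict, List, Optional

Frame = Dict[str, object]


class DeobfuscatingProcessor:
    """Hypothetical stand-in for JavaStacktraceProcessor."""

    def __init__(self, mapping: Dict[str, List[Frame]]) -> None:
        # Assumption: maps an obfuscated module name to its original frame(s).
        self.mapping = mapping

    def process_frame(self, frame: Frame) -> List[Frame]:
        originals = self.mapping.get(str(frame.get("module")))
        if originals is None:
            # Nothing to remap, pass a copy of the frame through.
            return [dict(frame)]
        # One obfuscated frame may expand into several original frames.
        return [dict(original) for original in originals]


class SourceLookupProcessor:
    """Hypothetical stand-in for JavaSourceLookupStacktraceProcessor."""

    def __init__(
        self,
        inner: DeobfuscatingProcessor,
        lookup: Callable[[Frame], Optional[str]],
    ) -> None:
        self.inner = inner
        self.lookup = lookup

    def process_frame(self, frame: Frame) -> List[Frame]:
        # Delegate first, so the lookup sees the remapped frames.
        new_frames = self.inner.process_frame(frame)
        for new_frame in new_frames:
            context_line = self.lookup(new_frame)
            if context_line is not None:
                new_frame["context_line"] = context_line
        return new_frames
```

Because the wrapper calls the inner processor itself, both steps run even though the pipeline only invokes the first processor that claims a frame.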

We're using a new debug file type jvm (see getsentry/relay#2002 and getsentry/sentry-cli#1551). This causes tests to fail because other PRs haven't been merged and released yet.

getsentry/sentry-java#633

codecov · 2023-04-13T07:08:44Z

Codecov Report

Merging #46497 (f09223f) into master (64b8c20) will increase coverage by 0.00%.
The diff coverage is 98.05%.

Additional details and impacted files

@@           Coverage Diff           @@
##           master   #46497   +/-   ##
=======================================
  Coverage   80.73%   80.74%           
=======================================
  Files        4761     4761           
  Lines      201288   201386   +98     
  Branches    11626    11626           
=======================================
+ Hits       162512   162608   +96     
- Misses      38518    38520    +2     
  Partials      258      258

Impacted Files	Coverage Δ
src/sentry/lang/java/plugin.py	`98.31% <97.87%> (+2.91%)`	⬆️
src/sentry/lang/java/utils.py	`89.47% <100.00%> (+1.06%)`	⬆️

... and 5 files with indirect coverage changes

iker-barriocanal · 2023-04-13T10:22:25Z

This seems like stack trace processing so requesting a review from the processing team, as they own this part.

Swatinem

tests are not really happy :-(

Swatinem · 2023-04-13T13:46:05Z

src/sentry/lang/java/plugin.py

+        # TODO unable to use dif cache as file can't be recognized as ZIP by ArtifactBundleArchive(file)
+        difs = ProjectDebugFile.objects.find_by_debug_ids(self.project, self.images)
+
+        for new_frame in new_frames:
+            lineno = new_frame.get("lineno")
+            if not lineno:
+                continue
+
+            source_file_name = self._build_source_file_name(new_frame)
+
+            for key, dif in difs.items():
+                file = dif.file.getfile(prefetch=True)
+                archive = ArtifactBundleArchive(file)


We do a database query for every frame, and then open the ArtifactBundleArchive at least once per frame (plus once for every inlined frame and every debug-id).

We should definitely do this in the constructor of this class, or lazily on-demand.
Are we guaranteed to only have a single debug-id by event? What would happen if we have more than one?

> We should definitely do this in the constructor of this class, or lazily on-demand.

Moved it to init

> Are we guaranteed to only have a single debug-id by event? What would happen if we have more than one?

It should try all of them until it's able to find the source file. Can add a test for this case.

> Are we guaranteed to only have a single debug-id by event?

No, there could be multiple per event.
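As a rough sketch of what this thread converges on (open each bundle once, then try every bundle until the source file is found), assuming for illustration that a source bundle can be read as a plain ZIP via the standard `zipfile` module. The helper names `open_archives`, `find_source`, and `make_bundle`, as well as the member paths, are hypothetical; the PR itself uses `ArtifactBundleArchive`.

```python
import io
import zipfile
from typing import Dict, Iterable, List, Optional


def open_archives(blobs: Iterable[bytes]) -> List[zipfile.ZipFile]:
    """Open every bundle once, up front, instead of once per frame."""
    return [zipfile.ZipFile(io.BytesIO(blob)) for blob in blobs]


def find_source(
    archives: List[zipfile.ZipFile], source_file_name: str
) -> Optional[str]:
    """Return the first match across all bundles, or None if none has it."""
    for archive in archives:
        try:
            return archive.read(source_file_name).decode("utf-8")
        except KeyError:
            # ZipFile.read raises KeyError for a missing member: try the next bundle.
            continue
    return None


def make_bundle(files: Dict[str, str]) -> bytes:
    """Build an in-memory ZIP, only used to demonstrate the lookup."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, text in files.items():
            archive.writestr(name, text)
    return buffer.getvalue()
```

With two bundles opened through `open_archives`, `find_source` returns the file from whichever bundle contains it, which covers the case of multiple debug IDs per event.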

Swatinem · 2023-04-13T13:47:46Z

src/sentry/lang/java/plugin.py

+
+        platform = frame.get("platform") or self.data.get("platform")
+        self._handles_frame = (
+            platform == "java" and self.available and "module" in frame and "lineno" in frame


You are looking at the lineno of the raw frame, before proguard mapping. Which means this is out of sync with whatever frames you will actually process in process_frame.

I assumed that even with proguard if there's no line number in the first place we'd be unable to produce one via proguard mapping. No lineno means no source lookup possible. Maybe my assumption is wrong. I can remove the lineno check. Though the check here doesn't mean we use the "raw" lineno for lookup. In case JavaStacktraceProcessor (doing proguard mapping) remaps it, we take the remapped lineno. This behaviour is also tested by the last test.
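To make the lineno handling concrete, here is a minimal sketch of extracting a context line plus surrounding lines from an already loaded source file. The function name `extract_context` and the default of five context lines are assumptions for illustration, not taken from the PR. The lineno passed in is meant to be the one on the processed (possibly remapped) frame, and a missing or zero lineno yields no context.

```python
from typing import List, Optional, Tuple


def extract_context(
    source_lines: List[str], lineno: Optional[int], context: int = 5
) -> Optional[Tuple[List[str], str, List[str]]]:
    """Return (pre_context, context_line, post_context) for a 1-based lineno."""
    # No lineno, lineno 0, or an out-of-range lineno: no lookup is possible.
    if not lineno or lineno < 1 or lineno > len(source_lines):
        return None
    index = lineno - 1
    pre_context = source_lines[max(0, index - context):index]
    post_context = source_lines[index + 1:index + 1 + context]
    return pre_context, source_lines[index], post_context
```

Running the check on the processed frame rather than the raw one keeps the lookup in sync with whatever the proguard mapping produced.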

Swatinem · 2023-04-14T08:56:13Z

src/sentry/lang/java/plugin.py

@@ -138,16 +138,23 @@ def __init__(self, *args, **kwargs):
        self._handles_frame = None
        self.images = get_jvm_images(self.data)
        self.available = len(self.images) > 0
+        difs = ProjectDebugFile.objects.find_by_debug_ids(self.project, self.images)
+        self._archives = {}
+        for key, dif in difs.items():


you never use the key anywhere, so I think you can just remove it and use a list instead of a dict.

Changed it to a list

antonpirker

I think this is good and can be merged IF:

- the query ProjectDebugFile.objects.find_by_debug_ids() is OK. (I read somewhere that because of hybrid cloud there is a new API to use for querying, but I am not sure whether this is a concern here. Someone with more knowledge of how all this works should be consulted about this.)
- it is OK to load one (or maybe multiple) ArtifactBundleArchive(file) into memory on start of the JavaSourceLookupStacktraceProcessor. Again, someone with more knowledge about how this is run should be asked whether there is a concern about using too much memory here.

Swatinem · 2023-04-14T11:24:40Z

> it is OK to load one (or maybe multiple) ArtifactBundleArchive(file) into memory on start of the JavaSourceLookupStacktraceProcessor. Again, someone with more knowledge about how this is run should be asked whether there is a concern about using too much memory here.

I believe that is fine. The JS processor and lookup API are also loading a couple of bundles at the same time.

Add Java Source context to stack frames

18de972

github-actions bot added the Scope: Backend label Mar 29, 2023

vercel bot deployed to Preview March 29, 2023 09:40 View deployment

adinauer mentioned this pull request Mar 29, 2023

Java Source Context getsentry/sentry-java#633

Closed

remove commented out code

445362a

vercel bot deployed to Preview March 29, 2023 09:48 View deployment

adinauer added 4 commits April 3, 2023 10:10

Do not provide source context for line no 0

c646645

Delegate to JavaStacktraceProcessor as only the first one gets to process the frame

5ab6cb0

Merge branch 'master' into feat/java-source-context

3e540e3

TODO replace sourcemap with jvm; add tests

ba9229c

vercel bot deployed to Preview April 13, 2023 06:41 View deployment

adinauer added 2 commits April 13, 2023 08:47

Merge branch 'master' into feat/java-source-context

64e614b

change import

ad0602f

vercel bot deployed to Preview April 13, 2023 06:54 View deployment

Restore jvm debug file type

d09ca78

vercel bot deployed to Preview April 13, 2023 07:26 View deployment

adinauer marked this pull request as ready for review April 13, 2023 07:30

adinauer requested a review from a team April 13, 2023 07:30

iker-barriocanal requested a review from a team April 13, 2023 10:21

loewenheim approved these changes Apr 13, 2023

Swatinem reviewed Apr 13, 2023

adinauer added 4 commits April 14, 2023 08:38

Test using multiple source bundles

0eba657

Load DIFs and open archives in init

3f3fcb8

Remove lineno check from handles_frame

99df832

TODO revert; Change back to sourcemap debug file type so reviewers can see tests passing

74a8925

vercel bot deployed to Preview April 14, 2023 07:24 View deployment

CR

12b13ca

Swatinem reviewed Apr 14, 2023

antonpirker approved these changes Apr 14, 2023

Swatinem approved these changes Apr 14, 2023

Move loading of archives from init to preprocess; convert _archives to list; other cleanup

c0e8fe1

vercel bot deployed to Preview April 17, 2023 06:22 View deployment

Merge branch 'master' into feat/java-source-context

e632cac

vercel bot deployed to Preview April 17, 2023 06:26 View deployment

Use jvm debug file type

3bb96a6

vercel bot deployed to Preview April 17, 2023 06:54 View deployment

Swatinem approved these changes Apr 17, 2023

Merge branch 'master' into feat/java-source-context

f09223f

vercel bot deployed to Preview April 18, 2023 05:11 View deployment

adinauer merged commit edd5420 into master Apr 18, 2023
54 checks passed

adinauer deleted the feat/java-source-context branch April 18, 2023 08:42

github-actions bot locked and limited conversation to collaborators May 3, 2023
