Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stacktrace gripes #33065

Open
antoine-levitt opened this issue Aug 24, 2019 · 61 comments
Open

Stacktrace gripes #33065

antoine-levitt opened this issue Aug 24, 2019 · 61 comments
Labels
domain:display and printing Aesthetics and correctness of printed representations of objects. domain:error handling Handling of exceptions by Julia or the user

Comments

@antoine-levitt
Copy link
Contributor

Stacktraces are in general pretty hard to read. Here's a list of gripes. I'm not posting this as separate issues because I don't know what is bug and what is feature here. All tests are on 1.3alpha from an official release. This is related but not the same as #25375, which is about formatting.

Let's start small:

julia> sqrt(-2)
ERROR: DomainError with -2.0:
sqrt will only return a complex result if called with a complex argument. Try sqrt(Complex(x)).
Stacktrace:
 [1] throw_complex_domainerror(::Symbol, ::Float64) at ./math.jl:32
 [2] sqrt at ./math.jl:492 [inlined]
 [3] sqrt(::Int64) at ./math.jl:518
 [4] top-level scope at REPL[4]:1
  • Where is ./math.jl?
  • I don't care that [2] is inlined
  • The last frame is unnecessary (the :1 definitely is, and I don't know what the [4] is)

Now a stdlib:

julia> lu([0 0;0 0])
ERROR: SingularException(1)
Stacktrace:
 [1] checknonsingular at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.3/LinearAlgebra/src/factorization.jl:19 [inlined]
 [2] #lu!#128(::Bool, ::typeof(lu!), ::Array{Float64,2}, ::Val{true}) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.3/LinearAlgebra/src/lu.jl:85
 [3] #lu! at ./none:0 [inlined]
 [4] #lu#134(::Bool, ::typeof(lu), ::Array{Int64,2}) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.3/LinearAlgebra/src/lu.jl:295
 [5] lu(::Array{Int64,2}) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.3/LinearAlgebra/src/lu.jl:286
 [6] top-level scope at REPL[2]:1
  • The paths are bad; that means that nice things like ctrl-q will not work, and neither will the functionality in some editors that go to the path.
  • The keyword arguments lowering, which should really be an implementation detail, is exposed here, and confusing (wtf is [3]?)

included user files:

julia> include("scratch.jl")
ERROR: LoadError: UndefVarError: qsd not defined
Stacktrace:
 [1] f() at /home/antoine/scratch.jl:2
 [2] top-level scope at /home/antoine/scratch.jl:4
 [3] include at ./boot.jl:328 [inlined]
 [4] include_relative(::Module, ::String) at ./loading.jl:1094
 [5] include(::Module, ::String) at ./Base.jl:31
 [6] include(::String) at ./client.jl:432
 [7] top-level scope at REPL[3]:1
in expression starting at /home/antoine/scratch.jl:4
  • Five frames for the include are hard to read
  • Having the information at the bottom is confusing: where am I supposed to find the source of the error? In the stacktrace or at the bottom?
  • The LoadError: UndefVarError is annoying (it gets worse for more complex cases, ie multiple includes)

Failing tests:

ERROR: LoadError: LoadError: LoadError: DomainError with -2.0:
sqrt will only return a complex result if called with a complex argument. Try sqrt(Complex(x)).
Stacktrace:
 [1] throw_complex_domainerror(::Symbol, ::Float64) at ./math.jl:32
 [2] sqrt at ./math.jl:492 [inlined]
 [3] sqrt(::Int64) at ./math.jl:518
 [4] @SArray(::LineNumberNode, ::Module, ::Any) at /home/antoine/.julia/packages/StaticArrays/3KEjZ/src/SArray.jl:83
 [5] include at ./boot.jl:328 [inlined]
 [6] include_relative(::Module, ::String) at ./loading.jl:1094
 [7] include(::Module, ::String) at ./Base.jl:31
 [8] include(::String) at ./client.jl:432
 [9] top-level scope at /home/antoine/.julia/packages/StaticArrays/3KEjZ/test/runtests.jl:19
 [10] include at ./boot.jl:328 [inlined]
 [11] include_relative(::Module, ::String) at ./loading.jl:1094
 [12] include(::Module, ::String) at ./Base.jl:31
 [13] include(::String) at ./client.jl:432
 [14] top-level scope at none:5
in expression starting at /home/antoine/.julia/packages/StaticArrays/3KEjZ/test/SArray.jl:30
in expression starting at /home/antoine/.julia/packages/StaticArrays/3KEjZ/test/SArray.jl:1
in expression starting at /home/antoine/.julia/packages/StaticArrays/3KEjZ/test/runtests.jl:19
ERROR: Package StaticArrays errored during testing

I'm a bit confused about what's going on here to be honest...

I did not try with the other ways of including code, macros, etc. So, to sum up, in subjective increasing order of seriousness

  • There are minor printing things that could be nicer: REPL frames, LoadError nesting, inline, ./ path for base
  • The keyword argument lowering is a bit nasty
  • The paths on official releases should be corrected
  • The stacktraces with include are very hard to parse. This is possibly the most controversial issue because enhancing this would mean "lying" to the user, ie treating include as a special function
@KristofferC
Copy link
Sponsor Member

It is a bit hard to understand what is rhetorical questions and what are actual questions so just throwing some answers out there.

Where is ./math.jl?

Everything in Base is written like this. Writing the full path would be kinda annoying but perhaps there could be a better way to write it.

The last frame is unnecessary (the :1 definitely is, and I don't know what the [4] is)

It isn't, see #25467. Why would the :1 be unnecessary, we need it in e.g. the debugger. The [4] is the execution count number of the REPL.

The paths are bad; that means that nice things like ctrl-q will not work, and neither will the functionality in some editors that go to the path.

This is #26314.

There is always a contention between showing the truth and what most users would find useful. Personally, I think that hiding stuff like keyword argument wrapper functions by default is the right thing to do.

@antoine-levitt
Copy link
Contributor Author

It isn't, see #25467. Why would the :1 be unnecessary, we need it in e.g. the debugger. The [4] is the execution count number of the REPL.

Ah, didn't think of multiline input, thanks. Makes sense to print it along with the frame then. The [4] could maybe go?

Everything in Base is written like this. Writing the full path would be kinda annoying but perhaps there could be a better way to write it.

I agree that it would be annoying to have full paths. Why not use ./ as the julia path (ie the top level in the source distribution)? Then base could be ./base/whatever.jl and stdlib ./stdlib/Lib/src/, fixing #26314 also.

@timholy
Copy link
Sponsor Member

timholy commented Aug 24, 2019

I don't care that [2] is inlined

It says that because inlining breaks the ability to show the type information. (Compare the output for the non-inlined calls, there you see the argument types.)

Five frames for the include are hard to read

Julia's dispatch sometimes goes pretty deep. It's both a strength (it's the foundation of abstraction) and a curse (makes things complicated). Not showing the truth has obvious downsides.

There do seem to be a couple of actionable items:

@KristofferC
Copy link
Sponsor Member

#32763 doesn't actually help with stacktraces. But it could probably be extended to do that.

@StefanKarpinski
Copy link
Sponsor Member

  • tricky to decide about showing the internals of kw dispatch. I'd actually lean towards suppressing it by default, subject to an ENV variable. It's potentially breaking, though, for packages that test the returned stacktrace.

Can't we keep keyword calls in the stacktrace object but just not print them?

@KristofferC
Copy link
Sponsor Member

Can't we keep keyword calls in the stacktrace object but just not print them?

Yes, we are for example already scrubbing the backtrace before printing it:

julia/base/client.jl

Lines 82 to 101 in cef655a

function ip_matches_func(ip, func::Symbol)
for fr in StackTraces.lookup(ip)
if fr === StackTraces.UNKNOWN || fr.from_c
return false
end
fr.func === func && return true
end
return false
end
function scrub_repl_backtrace(bt)
if bt !== nothing
# remove REPL-related frames from interactive printing
eval_ind = findlast(addr->ip_matches_func(addr, :eval), bt)
if eval_ind !== nothing
return bt[1:eval_ind-1]
end
end
return bt
end
.

@antoine-levitt
Copy link
Contributor Author

antoine-levitt commented Aug 25, 2019

It says that because inlining breaks the ability to show the type information. (Compare the output for the non-inlined calls, there you see the argument types.)

I see, thanks! How about [2] sqrt(inlined) at ./math.jl:492 to make that more explicit?

Julia's dispatch sometimes goes pretty deep. It's both a strength (it's the foundation of abstraction) and a curse (makes things complicated). Not showing the truth has obvious downsides.

Which practical downsides would that be in this particular example, if we, say, replaced

julia> include("scratch.jl")
ERROR: LoadError: UndefVarError: qsd not defined
Stacktrace:
 [1] f() at /home/antoine/scratch.jl:2
 [2] top-level scope at /home/antoine/scratch.jl:4
 [3] include at ./boot.jl:328 [inlined]
 [4] include_relative(::Module, ::String) at ./loading.jl:1094
 [5] include(::Module, ::String) at ./Base.jl:31
 [6] include(::String) at ./client.jl:432
 [7] top-level scope at REPL[3]:1
in expression starting at /home/antoine/scratch.jl:4

by

julia> include("scratch.jl")
ERROR: LoadError: UndefVarError: qsd not defined
Stacktrace:
 [1] f() at /home/antoine/scratch.jl:2
 [2] top-level scope at /home/antoine/scratch.jl:4
     included from REPL[3]:1
 [3] top-level scope at REPL[3]:1

or similar ?

Sure, it's nice to know that include is just a plain julia function, but do I really need to be reminded of that every time I look at a stacktrace? Given that we already scrub the stacktraces for other purposes, why not do it here? That means accepting that the stacktraces print differently from their internal representation (which we already do), more annoying hacking on include, and having to maintain some code that relies on the internal representation of include. That code is modified very infrequently, so that seems like a small to pay for massively improving stacktraces for everyone.

someone should fix #26314, which seems to be done by #32763. @antoine-levitt, would you care to help out by tacking @which? That seems to be the holdup to merging it.

I think the proper fix here is to use relative paths, which would also fix the consistency with base, see comment above.

To be clear, this thread is not entirely about me complaining idly. If there's a relative consensus (or at least an indication that a PR might be merged) on the easy items proposed above (inline, REPL execution counts), I can do a PR. I can try to do relative paths also, with no promises on timeframe. Kwargs and include are above my paygrade.

@timholy
Copy link
Sponsor Member

timholy commented Aug 25, 2019

All good points. I have definitely seen folks be turned off by the complexity of our stacktraces, this seems to be a necessary discussion.

How about [2] sqrt(inlined) at ./math.jl:492 to make that more explicit?

An even more radical proposal: suppress type information for all calls. That would solve problems like this:

julia> @noinline foo(A) = error("fake")
foo (generic function with 1 method)

julia> foo(img)
ERROR: fake
Stacktrace:
 [1] error(::String) at ./error.jl:33
 [2] foo(::ImageMetadata.ImageMeta{FixedPointNumbers.Normed{UInt16,14},3,AxisArrays.AxisArray{FixedPointNumbers.Normed{UInt16,14},3,SharedArrays.SharedArray{FixedPointNumbers.Normed{UInt16,14},3},Tuple{AxisArrays.Axis{:x,StepRangeLen{Quantity{Float64,𝐋,Unitful.FreeUnits{(μm,),𝐋,nothing}},Base.TwicePrecision{Quantity{Float64,𝐋,Unitful.FreeUnits{(μm,),𝐋,nothing}}},Base.TwicePrecision{Quantity{Float64,𝐋,Unitful.FreeUnits{(μm,),𝐋,nothing}}}}},AxisArrays.Axis{:l,StepRangeLen{Quantity{Float64,𝐋,Unitful.FreeUnits{(μm,),𝐋,nothing}},Base.TwicePrecision{Quantity{Float64,𝐋,Unitful.FreeUnits{(μm,),𝐋,nothing}}},Base.TwicePrecision{Quantity{Float64,𝐋,Unitful.FreeUnits{(μm,),𝐋,nothing}}}}},AxisArrays.Axis{:time,StepRangeLen{Quantity{Float64,𝐓,Unitful.FreeUnits{(s,),𝐓,nothing}},Base.TwicePrecision{Quantity{Float64,𝐓,Unitful.FreeUnits{(s,),𝐓,nothing}}},Base.TwicePrecision{Quantity{Float64,𝐓,Unitful.FreeUnits{(s,),𝐓,nothing}}}}}}},Dict{String,Any}}) at ./REPL[32]:1
 [3] top-level scope at REPL[33]:1

I'm going to start trying to notice whether I find the type information in the stack trace useful in my own debugging. We do have other tools for this available now, it may be less important than it once might have been.

@antoine-levitt
Copy link
Contributor Author

That's a very nice idea, which would reduce (but not completely solve) the problem of huge type names, aka the StridedArray of doom (cf #32470). Should we just print the argument name instead? ie foo(A) here (although possibly that information is not available?)

@pfitzseb
Copy link
Member

I'm going to start trying to notice whether I find the type information in the stack trace useful in my own debugging. We do have other tools for this available now, it may be less important than it once might have been.

FWIW, Juno has been suppressing type info in stacktrace for a long time now, and I don't think anyone ever complained about it.

I think everyone will agree that
grafik
is much clearer than

julia> loss(x, w, cdims)
ERROR: AssertionError: DimensionMismatch("Kernel output channel count (10 vs. 1)")
Stacktrace:
 [1] check_dims(::NTuple{5,Int64}, ::NTuple{5,Int64}, ::NTuple{5,Int64}, ::DenseConvDims{3,(3, 3, 1),3,1,(1, 1, 1),(0, 0, 0, 0, 0, 0),(1, 1, 1),false}) at C:\Users\Basti\.julia\packages\NNlib\mxWRT\src\dim_helpers\DenseConvDims.jl:68
 [2] #conv_im2col!#231(::Array{Float64,2}, ::Float64, ::Float64, ::Function, ::Array{Float64,5}, ::Array{Float64,5}, ::Array{Float64,5}, ::DenseConvDims{3,(3, 3, 1),3,1,(1, 1, 1),(0, 0, 0, 0,0, 0),(1, 1, 1),false}) at C:\Users\Basti\.julia\packages\NNlib\mxWRT\src\impl\conv_im2col.jl:30
 [3] conv_im2col! at C:\Users\Basti\.julia\packages\TimerOutputs\7zSea\src\TimerOutput.jl:198 [inlined]
 [4] macro expansion at C:\Users\Basti\.julia\packages\NNlib\mxWRT\src\conv.jl:51 [inlined]
 [5] #conv!#37 at C:\Users\Basti\.julia\packages\TimerOutputs\7zSea\src\TimerOutput.jl:190 [inlined]
 [6] conv!(::Array{Float64,5}, ::Array{Float64,5}, ::Array{Float64,5}, ::DenseConvDims{3,(3, 3, 1),3,1,(1, 1, 1),(0, 0, 0, 0, 0, 0),(1, 1, 1),false}) at C:\Users\Basti\.julia\packages\TimerOutputs\7zSea\src\TimerOutput.jl:198
 [7] #conv!#56(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::Function, ::Array{Float64,4}, ::Array{Float64,4}, ::Array{Float64,4}, ::DenseConvDims{2,(3, 3),3,1,(1,1),(0, 0, 0, 0),(1, 1),false}) at C:\Users\Basti\.julia\packages\NNlib\mxWRT\src\conv.jl:68
 [8] conv!(::Array{Float64,4}, ::Array{Float64,4}, ::Array{Float64,4}, ::DenseConvDims{2,(3, 3),3,1,(1, 1),(0, 0, 0, 0),(1, 1),false}) at C:\Users\Basti\.julia\packages\NNlib\mxWRT\src\conv.jl:68
 [9] #conv#97(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::Function, ::Array{Float64,4}, ::Array{Float64,4}, ::DenseConvDims{2,(3, 3),3,1,(1, 1),(0, 0, 0, 0),(1, 1),false}) at C:\Users\Basti\.julia\packages\NNlib\mxWRT\src\conv.jl:114
 [10] conv at C:\Users\Basti\.julia\packages\TimerOutputs\7zSea\src\TimerOutput.jl:198 [inlined]
 [11] loss(::Array{Float64,4}, ::Array{Float64,4}, ::DenseConvDims{2,(3, 3),3,1,(1, 1),(0, 0, 0, 0),(1, 1),false}) at .\untitled-dbb830fbb211cb3810c5451763348f52:4
 [12] top-level scope at none:0

and you can get the type info quickly by opening the file (which should be easy both in the REPL and in Juno).

Juno also shows everything in packages/base/stdlibs with less contrast than user code, which does add a little to clarity as well.

@vchuravy
Copy link
Sponsor Member

I find type-information incredibly useful and would surely miss them. They answer the question how/why did I get there and a MethodError without type-information is useless.

@c42f
Copy link
Member

c42f commented Aug 26, 2019

The LoadError annoyance would be addressed by #31882. I plan to come back to that once I've addressed the related but difficult problem of making sure that errors in any top level expression get into the backtrace. There are various cases where this doesn't work right now.

@c42f
Copy link
Member

c42f commented Aug 26, 2019

Sure, it's nice to know that include is just a plain julia function, but do I really need to be reminded of that every time I look at a stacktrace?

This is an excellent point. I've often wished that we would scrub include from backtraces.

In general we won't be able to please everybody though with a single default; some parts of Base a user may find "interesting" and other parts are "non-interesting". Some users may even be annoyed to see package code in their backtrace. On the other hand @vchuravy wants to see types, and I'm sure he's wanted to see C frames in there on occasion — I know I have.

I think a good approach would be an interactive user interface for the REPL which allows the last stack trace to be re-printed with increasing or decreasing verbosity based on some keyboard commands.

Then if we have a set of heuristics which label frames with a verbosity, something like from most to least verbose:

C > kw frames / other lowering? > REPL/loading/... > rest of Base > Packages > non-package code

the REPL and other interfaces like Juno can use a consistent shared ordering in their interfaces — much the same way that all editors agree on the set of julia LaTeX completions because the REPL defines a standard.

@JeffBezanson
Copy link
Sponsor Member

There are definitely several good points here.

Ah, didn't think of multiline input, thanks. Makes sense to print it along with the frame then. The [4] could maybe go?

I think the problem is more on the opposite end --- that repl inputs aren't labeled so you can't easily tell which input it's referring to. Maybe we should change the default prompt to be more like the numbered prompts in IJulia?

We could improve the include situation very quickly by just changing the implementation not to do so many layers of dispatch. That might seem a bit undisciplined but I think it's actually easier to maintain than trying to patch it up during stack trace display.

Agreed on keyword arguments; I'll think about it.

I'm also sympathetic to not showing argument types by default; they can indeed be way too verbose. I think I would be ok with it if there were something like ^Q to easily show the types when wanted.

@JeffBezanson JeffBezanson added domain:display and printing Aesthetics and correctness of printed representations of objects. domain:error handling Handling of exceptions by Julia or the user labels Aug 26, 2019
@c42f
Copy link
Member

c42f commented Aug 27, 2019

@antoine-levitt to add a minimal version of your last example:

a.jl:

include("b.jl")

b.jl:

error("Boom")

Currently in 1.2 we have

jl12 --color=yes -e 'include("a.jl")'

stacktrace_orig

Here is a mockup showing a re-imagined version of this stack trace:

stacktrace_cleanup

Here I've done the following:

  • Reformatted into a tabular format. Tabular alignment greatly helps with readability.
  • Added a file name and line summary columns separately from the full path. Readability is more important that denormalized data.
  • Put the path and line number information on the right — this is very useful for copying and pasting into an editor, but paths can be long. Arguably the maximum path length shown could respect the terminal width by default when running in the REPL.
  • Used the full paths for Base
  • Removed the type information. This could become an optional column.
  • Removed the LoadError printing which should be redundant with the backtrace file and line info
  • Added a visual indicator (the _ in the index column) showing where the next level of "excessively verbose" frames have been folded away. (Making verbosity labeling and folding actually work could be a challenge).
  • Highlighted rows of the table based on module. I think the heuristic here could be that code running in Main is most likely to be in error, followed by packages, followed by Base/stdlib.

This is just a hand-edited interface mockup. If anyone wants to pursue this with code I'd suggest trying the PrettyTables package for prototyping.

@c42f
Copy link
Member

c42f commented Aug 27, 2019

Speculation: We could have a REPL mode for interactively exploring past program events, including stack traces. That might be overkill for stack traces alone, but I've also wanted such a UI for exploring log events. These seem a bit similar because they retroactively replay some captured state from a previous execution. It's a kind of debugging experience which is different from interacting with the program live but has some advantages.

@antoine-levitt
Copy link
Contributor Author

antoine-levitt commented Aug 27, 2019

We could improve the include situation very quickly by just changing the implementation not to do so many layers of dispatch. That might seem a bit undisciplined but I think it's actually easier to maintain than trying to patch it up during stack trace display.

That's a great idea! Just manually inlining the function,

function myinclude(fname::AbstractString)
    mod = Main
    path, prev = Base._include_dependency(mod, fname)
    for callback in Base.include_callbacks # to preserve order, must come before Core.include
        Base.invokelatest(callback, mod, path)
    end
    tls = Base.task_local_storage()
    tls[:SOURCE_PATH] = path
    local result
    try
        result = ccall(:jl_load_, Any, (Any, Any), mod, fname)
    finally
        if prev === nothing
            delete!(tls, :SOURCE_PATH)
        else
            tls[:SOURCE_PATH] = prev
        end
    end
    return result
end

I get

ERROR: LoadError: UndefVarError: STOP not defined
Stacktrace:
 [1] top-level scope
 [2] myinclude(::String) at /home/antoine/scratch.jl:11
 [3] top-level scope at none:0
in expression starting at scratch2.jl:2

which is a huge win! Perhaps we can do top-level scope at /home/antoine/scratch2.jl:2 as @c42f suggests? That seems to interact with #31882 though, so might be better to do that there.

edit: hm, the "top-level scope at ..." already happens with master julia. Can we remove "in expression starting at" then?

I think the problem is more on the opposite end --- that repl inputs aren't labeled so you can't easily tell which input it's referring to. Maybe we should change the default prompt to be more like the numbered prompts in IJulia?

That always felt very '70s to me... maybe that's just me, but I don't type that many things at the REPL, and it's always pretty easy to know what it refers to.

@timholy
Copy link
Sponsor Member

timholy commented Aug 27, 2019

@vchuravy,

a MethodError without type-information is useless

agreed, but I don't think this proposal changes that:

julia> foo(::Float64) = 1
foo (generic function with 1 method)

julia> foo(::Int16) = 2
foo (generic function with 2 methods)

julia> @noinline bar(x) = foo(x)
bar (generic function with 1 method)

julia> bar(1.0f0)
ERROR: MethodError: no method matching foo(::Float32)
Closest candidates are:
  foo(::Int16) at REPL[2]:1
  foo(::Float64) at REPL[1]:1
Stacktrace:
 [1] bar(::Float32) at ./REPL[5]:1
 [2] top-level scope at REPL[6]:1

While here you might say, "well, it was nice knowing that bar was passed a Float32," I think it's fair to say that all the necessary information is printed in the MethodError header. We don't lose that if we scrub it from the stacktrace display.

@StefanKarpinski
Copy link
Sponsor Member

@c42f, I like the tabular display a lot. It would be good to have package name column as well. See #33047 wherein it's noted that we might want to install packages in locations where the package name is not part of the path; I don't actually think we should do that, but displaying the package name prominently rather than hidden away in the path name would be good.

@antoine-levitt
Copy link
Contributor Author

I think one thing that needs to be decided is whether or not we print always want full paths. If we do, then printing the package is less important. If we don't, we could do something like Juno, or have a table display with package name and path relative to the package root. That meshes well with the discussion in #25375 to use different colors for different levels of (user, package, stdlib, base). Things like (invented example)

Stacktrace:
[1] throw_complex_domainerror   Base          math.jl:32
[2] sqrt                        Base          math.jl:492
[3] cholesky                    LinearAlgebra factorizations/cholesky.jl:XX
[4] top-level                                 /home/antoine/dev/my_script.jl:XX
[5] include                     Base          loading.jl:XX
[6] REPL                                      [5]:2

with different colors for the base, linalg and user code. It does make it harder to get to the source, but we have ctrl-Q for that.

We could also have a clean but conservative choice in base (full paths and no tables), and more fancy display (table, packages and relative paths) in a package.

@StefanKarpinski
Copy link
Sponsor Member

Full paths are the most useful thing in any stack trace as far as I'm concerned: I click on them to open up the location where an error happened all the time, I never use ctrl-Q. It would be fine to print paths on the next line though, which might improve readability.

@antoine-levitt
Copy link
Contributor Author

When you say click, do you mean in an IDE? Or do you have a terminal that calls $EDITOR when you click on it? How do you deal with Base, which prints as ./file.jl right now?

@StefanKarpinski
Copy link
Sponsor Member

Terminal that opens files in my editor. Guess what directory I’m usually in when running Julia?

@KristofferC
Copy link
Sponsor Member

KristofferC commented Aug 28, 2019

While it is tempting to try to overhaul the stack trace printing, I want to point out that it has worked quite well for some time and that I think we can make significant improvements with targeted incremental changes.

For example, #33087 is one step. Hiding keyword wrapping functions is another.

@antoine-levitt
Copy link
Contributor Author

Agreed: include, kwargs and fixing the paths will fix the most important issues. Doesn't mean we can't dream of the ideal stacktrace :-p

@KristofferC
Copy link
Sponsor Member

It is also a bit interesting to note that we do a lot of these things (hiding the truth for a better experience) in the debugger already. Automatically stepping through keyword wrapper functions, rewriting function names from the lowered versions, automatically unwrapping Core.Box in evaluation etc etc. And for those that really need it, we have an advanced mode where you can step through the lowered code statement by statement.

I am heavily in favor of optimizing for the "99%" and then having "escape hatches" for those that need to see the ugly truth.

@vchuravy
Copy link
Sponsor Member

I am heavily in favor of optimizing for the "99%" and then having "escape hatches" for those that need to see the ugly truth.

The only word of caution I have here is that sometimes the only information you get from a user is a stacktrace and
a MWE might be not trivial. So yes I am all for improving readability of stacktraces, but removing information also comes at a cost.

While here you might say, "well, it was nice knowing that bar was passed a Float32," I think it's fair to say that all the necessary information is printed in the MethodError header. We don't lose that if we scrub it from the stacktrace display.

I wasn't meaning the type information in the MethodError per-se. For me that is that "what happened", but the type-information in the stack-trace is the "how did it happen". As an example, for GPU code it is the case that the bug is several stack-frames higher and involved mixing CuArray and Array and the MethodError doesn't tell me much. Of course I can throw the program into Cthulhu or the Debugger, but that seems overkill.

@pfitzseb
Copy link
Member

pfitzseb commented Sep 9, 2019

The list of terminal emulators supporting hyperlinks is fairly short though: https://gist.github.com/egmontkob/eb114294efbcd5adb1944c9f3cb5feda.

@StefanKarpinski
Copy link
Sponsor Member

It does include my favorite terminal, though, so let's do it! 😁

But seriously, maybe we could print the link/file on the next line but detect terminals that support hyperlinks and emit hyperlinks in terminals that support it?

@lechner
Copy link

lechner commented Sep 10, 2019

Plus, let's tell @egmontkob that julia is excited about his feature.

@c42f
Copy link
Member

c42f commented Sep 11, 2019

FWIW I've just enabled ls --hyperlink=auto by default (I use gnome-terminal) and I think this may actually be a game changer for TUIs. It achieves just enough of what has been long desired by people trying to improve terminal workflow, without breaking significantly from the "everything is text" model which makes the terminal so powerful. We should totally support it in julia.

@StefanKarpinski
Copy link
Sponsor Member

StefanKarpinski commented Sep 11, 2019

I just tried it too... it's quite nice. I guess the next question is can we enable this conditionally in the terminals that support it? Or is it harmless in terminals that don't support it (I doubt it).

@pfitzseb
Copy link
Member

As per this, sane terminals should just print the link text and not show the URL at all (which is true for the couple of terminals I tried). We'll definitely need to figure out a good fallback.

@egmontkob
Copy link

I'm aware of the following terminals that don't at least silently ignore these escape sequences, but rather result in visual corruption:

  • VTE versions prior to 0.48.2 (2½ years old). Pretty much every distribution has already upgraded to newer versions; the catch is the ancient version 0.28 that goes with GTK2. Debian and Ubuntu have mostly gotten rid of it. Fedora / Red Hat (a side thread within that bug) still has a few emulators linked against it, including guake (up to F30 only), lxterminal, lilyterm, pacmanager. There's also an upstream project called asbru-cm (fork of pacmanager) using this ancient VTE.

  • Konsole works incorrectly only if the terminator character is the official standard ST. It works correctly if the terminator is the commonly used but nonstandard BEL. I talked to the developers off-bugzilla about a month ago and they assured me it's at the top of their TODO list, but maybe you could ping them again :)

@c42f
Copy link
Member

c42f commented Sep 11, 2019

Perhaps we can heuristically autodetect this feature by attempting to print a hyperlink and checking the cursor position afterward? (Eg cursor position can be read using tput u7 or printf "\033[6n" though I don't know enough about the terminal ecosystem to know how portable that is.)

@egmontkob
Copy link

Querying the cursor position is an asynchronous operation. I loathe those, it's impossible to reliably code against them.

You issue the sequence that queries the position. And then what? You wait for the response, with a timeout I presume, since probably not all terminals respond (I really don't know). How long should that timeout be? If you ssh to a server on the opposite side of the Earth, the fastest theoretical roundtrip time (going around the Earth with the speed of light – assuming you can't go straight across the Earth :)) is about 0.13 seconds. Plus the path is longer, and is full of devices that add latency.

What if you receive other data before the response, within the timeout? I mean keypresses from the user, either typed ahead before your app started up, or pressed within an interactive app just when it was about to query the cursor position? Will your application properly act on them while waiting for the response?

What if you reach the timeout? You need to go on. What if the response arrives later? Will all the parts of your interactive application, all the way from simple cases like a read statement in a shell script, to every possible screen and state of interactive apps, be able to silently ignore (or retroactively properly act upon) this response?

Mind you, unfortunately I don't know anything about Julia, nor the exact app or utility that you're adding this feature to (I know I should take a look at it :)) so my questions don't necessarily apply exactly as I phrased them, but some things along these lines (including user reports about heuristic behavior, unreproducible breakages) would sure cause lots of headaches for you.


IMHO go for a much simpler approach.

Maybe a config option, maybe tri-state with an "autodetect" option. You can whitelist VTE by looking at $VTE_VERSION >= 4802, iTerm2 by perhaps $TERM_APPLICATION or $TERMINAL_APP or something like this (I can't recall off the top of my head), terminology by $TERMINOLOGY and so on. These can err in both directions, e.g. if you start up the terminal emulator from another one (the old one's value remains set), or across ssh (the value isn't forwarded). Or you can blacklist broken ones, e.g. $KONSOLE_*, although I don't know how to detect old VTEs.

What I'd probably do: Just go on, enable by default, and give a config option for users to disable. Use BEL as terminator in order to please Konsole users (and change to the proper ST a few years after Konsole fixes their bug). The rest is pretty much VTE 0.28. Those users could fix it for themselves by disabling the option, and in the same time you'd do a favor by raising awareness of them using an ancient emulator and push them towards upgrading to a new one. By VTE 0.28 and other possible broken terminals we're really talking about the long tail, not the mainstream ones. (YMMV. I know that not everyone agrees with this pushy appraoch, it's absolutely reasonable for an app developer to try to stay safer and care about the long tail.)

@lechner
Copy link

lechner commented Sep 11, 2019

Is that logic available in a library, or are there sample implementations anywhere?

@egmontkob
Copy link

(Which one do you mean? The unreliable asynchronous one, or guessing based on the environment variables? Anyway, for both of them, my answer is:)

Not that I'm aware of.

@lechner
Copy link

lechner commented Sep 11, 2019

I just thought a library might aid implementation. Others are getting interested in the feature, too.

@c42f
Copy link
Member

c42f commented Sep 12, 2019

@egmontkob thanks; I think a whitelist sounds simple and sane. The "correct" solution would seem to be negotiating terminal capabilities when the connection is intiated. But good luck getting that deployed in the terminal ecosystem which has 50 years of history or so. I also found the discussion of autodetection over at kovidgoyal/kitty#68 interesting.

@vtjnash
Copy link
Sponsor Member

vtjnash commented Sep 13, 2019

For this set of (slightly modified) frames above:

 [10] include(::Module, ::String) at ./boot.jl:328 [inlined]
 [11] include(::Module, ::String) at ./Base.jl:31
 [12] include(::String) at ./client.jl:432

The @code_* debuginfo=:source printing would simplify this onto one line similar to:

 [10] include at ./boot.jl:328 [inlined] at ./Base.jl:31 at ./client.jl:432

@vtjnash
Copy link
Sponsor Member

vtjnash commented Sep 17, 2019

Another improvement I think someone could attempt is to remove common frames, especially those above the point where the exception was caught. This may be easier to exemplify with the new catch-stack, but could apply to anything. (there's already a hard-coded filter doing this here to cut it off for the REPL, so that could be removed, and the effect may be more dramatic for code in files):

Currently:

julia> f(x) = error(x)
f (generic function with 1 method)

julia> g() = try; f(1); catch; f(2); end
g (generic function with 1 method)

julia> g()
ERROR: 2
Stacktrace:
 [1] error(::Int64) at ./error.jl:42
 [2] f(::Int64) at ./REPL[8]:1
 [3] g() at ./REPL[6]:1
 [4] top-level scope at REPL[9]:1
caused by [exception 1]
1
Stacktrace:
 [1] error(::Int64) at ./error.jl:42
 [2] f(::Int64) at ./REPL[8]:1
 [3] g() at ./REPL[6]:1
 [4] top-level scope at REPL[9]:1

Could try to remove the common frames, for something like:

ERROR: 2
Stacktrace:
 [1] error(::Int64) at ./error.jl:42
 [2] f(::Int64) at ./REPL[8]:1
 [3] g() at ./REPL[6]:1
 [4] top-level scope at REPL[9]:1
caused by [exception 1]
1
Stacktrace:
 [1] error(::Int64) at ./error.jl:42
 [2] f(::Int64) at ./REPL[8]:1
 [3] g() at ./REPL[6]:1
 [4] caught here at prior [3]

Implementer's note: in the past, I've posited that we could handle this by only unwinding the stack until we hit the stack frame that contains the try/catch block.

@rafaqz
Copy link
Contributor

rafaqz commented Sep 18, 2019

It would be good to use color more intelligently in error messages. Descriptive text should be a different colour to types so you can easily parse ambiguity errors and similar. Line numbers and paths should have different colors again so you can easily find them.

@K-Kent
Copy link

K-Kent commented Dec 23, 2019

In version 1.3.0 on Win_64 platform, I get this:

ERROR: MethodError: no method matching randstring(::Int64, ::String)
Closest candidates are:
  randstring(::AbstractRNG, ::Any) at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.3\Random\src\misc.jl:74

It must be a hard-coded path since this machine does not even contain a D drive.
The actual path is AppData\Local\

@c42f
Copy link
Member

c42f commented Feb 6, 2020

To record where I got to with @hide_in_backtrace and the frame formatting heuristics mentioned above — I hit a roadblock making this all work properly because there's not much information available for inlined frames. In particular, only the file name and line (no module or other code / method metadata).

I think this boils down to implementing the following TODO but that's about as far as I got:

frame->linfo = NULL; // TODO: if (new_frames[n_frames - 1].linfo) frame->linfo = lookup(func_name in linfo)?

@aminya
Copy link

aminya commented Feb 8, 2020

@jkrumbiegel made some cool modifications to error printing in the slack group. I hope this ends up on Master.

To improve upon his work, we should remove the full file path and give a hyperlink.
There is this repository (U++ widget):
https://github.com/ismail-yilmaz/upp-components/tree/master/CtrlLib/Terminal
There are other terminals that have hyperlink but may not be cross platform:
https://gist.github.com/egmontkob/eb114294efbcd5adb1944c9f3cb5feda

  • Is there a way to make a hyperlink in REPL or Juno REPL (no full file path)?

If not we may be able to decrease the characters font so they take up less space

  • Is it possible to change the font of certain characters in REPL?

image

https://julialang.slack.com/archives/C67910KEH/p1581114459340300?thread_ts=1580974353.268600&cid=C67910KEH

@jkrumbiegel
Copy link
Contributor

Here's the experimental package mentioned above that I made to enhance the visual clarity of stacktraces https://github.com/jkrumbiegel/ClearStacktrace.jl

@BioTurboNick
Copy link
Contributor

To record where I got to with @hide_in_backtrace and the frame formatting heuristics mentioned above — I hit a roadblock making this all work properly because there's not much information available for inlined frames. In particular, only the file name and line (no module or other code / method metadata).

I think this boils down to implementing the following TODO but that's about as far as I got:

frame->linfo = NULL; // TODO: if (new_frames[n_frames - 1].linfo) frame->linfo = lookup(func_name in linfo)?

Could this (module info for inlined frames) be raised to its own issue? I'm willing to try implementing it, but I haven't poked around the C++ side of Julia yet.

@c42f
Copy link
Member

c42f commented May 31, 2021

@BioTurboNick ok, see #41031

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
domain:display and printing Aesthetics and correctness of printed representations of objects. domain:error handling Handling of exceptions by Julia or the user
Projects
None yet
Development

No branches or pull requests