Regression between 0.33.2 and 0.33.1 #235
Comments
Could this be related to issue "crosstool-NG: math & printf related errors (IDFGH-4643)"?
I don't think this is in any way related to
I'm actually using an ESP32-C3 (RISC-V). Anyway, I have to test the new release of
But the argument remains - how can
I totally agree with you. I just observed it happen with a specific
Yes, it is not the right repo, so I closed the bug. But since the conversation had started:
How did you manage to take this coredump? It does not look like the standard backtrace that ESP-IDF generates.
We are using this IDF option: https://docs.espressif.com/projects/esp-idf/en/latest/esp32/api-guides/core_dump.html#core-dump-to-uart
Well, other than increasing the stack of the task/thread doing the computation, I don't have other easy-to-try solutions. If increasing the stack size does not help, you need to fork the upstream crate and play with it by commenting out bits and pieces of the computation until you figure out why it is crashing.
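The stack-size suggestion above can be tried with a small sketch like the following. It assumes the computation can be moved onto its own thread; `heavy_computation` is a hypothetical stand-in for the real float-heavy code, and the stack size passed in is an arbitrary value to experiment with, not a recommendation.

```rust
use std::thread;

/// Hypothetical stand-in for the float-heavy computation under test.
fn heavy_computation() -> f64 {
    let mut acc = 0.0_f64;
    for i in 0..1000 {
        acc += f64::from(i).to_radians().sin();
    }
    acc
}

/// Runs `heavy_computation` on a dedicated thread whose stack size is
/// chosen explicitly, so the stack can be enlarged step by step.
fn run_with_stack(stack_bytes: usize) -> f64 {
    thread::Builder::new()
        .stack_size(stack_bytes)
        .spawn(heavy_computation)
        .expect("failed to spawn thread")
        .join()
        .expect("computation thread panicked")
}
```

Calling `run_with_stack(64 * 1024)` from `main` and then varying the size shows whether the crash depends on the stack at all. If the computation has to stay on the main task, the equivalent knob should be `CONFIG_ESP_MAIN_TASK_STACK_SIZE` in the sdkconfig.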
I'm trying to debug this issue as we can't update Now I was trying to identify which commit between
Is there a guide I can follow to use a local copy of
If you want to test a local git branch, you need to use the [patch.crates-io] feature and point it to the local or remote git version you want to use, so that all dependencies that use esp-idf-sys use the same version.
You are supposed to use the What you are doing instead would mean that you'll be ending up with TWO
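A minimal sketch of the [patch.crates-io] approach described above, placed in the Cargo.toml of the binary crate. The relative path is hypothetical, and `<commit>` is a placeholder for whichever revision is under test; the existing esp-idf-sys entry under [dependencies] stays unchanged.

```toml
[patch.crates-io]
# Hypothetical path to a local checkout of esp-idf-sys.
esp-idf-sys = { path = "../esp-idf-sys" }

# Alternative: pin a git revision instead of a local path.
# esp-idf-sys = { git = "https://github.com/esp-rs/esp-idf-sys", rev = "<commit>" }
```

Because the patch applies to the whole dependency graph, every crate that depends on esp-idf-sys resolves to the same copy. Running `cargo tree -i esp-idf-sys` afterwards should list a single version if the patch took effect.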
Thank you both! The exception is caused by commit 6102063, which makes changes to the linker. I'm exploring
I'm not really sure what the root problem is. Out of stack/heap memory? Something else? The code you are looking at only matters for ESP IDF 4.4.X. If you are on ESP IDF 5.X you can't be affected by it. As for what it does - look at the comments. The issue at hand is thoroughly documented (for once).
If I had to guess, it's a heap problem. And yeah, we are using
Can't you assign more heap to confirm/deny your hypothesis? Or as I said a few months ago - play with the crate until you isolate the problem?
Also, it is a bit strange if you think the culprit is the heap, as this issue started with "it crashes on a bunch of math operations". Can't you take these math operations and execute them on the MCU without the sunrise crate?
How can I assign more heap? I've already played with the crate to isolate the problem but it led me nowhere. Let me explain: I have a main function with only calls

This works (i.e. it doesn't raise an exception):

pub fn sunrise_sunset(
    latitude: f64,
    longitude: f64,
    year: i32,
    month: u32,
    day: u32,
) -> (i64, i64) {
    let day: f64 = mean_solar_noon(longitude, year, month, day);
    let solar_anomaly: f64 = solar_mean_anomaly(day);
    let equation_of_center: f64 = equation_of_center(solar_anomaly);
    // let ecliptic_longitude: f64 = ecliptic_longitude(solar_anomaly, equation_of_center, day);
    // let solar_transit: f64 = solar_transit(day, solar_anomaly, ecliptic_longitude);
    // let declination: f64 = declination(ecliptic_longitude);
    // let hour_angle: f64 = hour_angle(latitude, declination);
    // let frac: f64 = hour_angle / 360.;
    // (
    //     julian_to_unix(solar_transit - frac),
    //     julian_to_unix(solar_transit + frac),
    // )
    (0, 0)
}

But this crashes:

pub fn sunrise_sunset(
    latitude: f64,
    longitude: f64,
    year: i32,
    month: u32,
    day: u32,
) -> (i64, i64) {
    let day: f64 = mean_solar_noon(longitude, year, month, day);
    let solar_anomaly: f64 = solar_mean_anomaly(day);
    let equation_of_center: f64 = equation_of_center(solar_anomaly);
    let ecliptic_longitude: f64 = ecliptic_longitude(solar_anomaly, equation_of_center, day);
    // let solar_transit: f64 = solar_transit(day, solar_anomaly, ecliptic_longitude);
    // let declination: f64 = declination(ecliptic_longitude);
    // let hour_angle: f64 = hour_angle(latitude, declination);
    // let frac: f64 = hour_angle / 360.;
    // (
    //     julian_to_unix(solar_transit - frac),
    //     julian_to_unix(solar_transit + frac),
    // )
    (0, 0)
}

The crate contains a bunch of mathematical operations, so that's why I think it has to do with memory.
This has nothing to do with the heap. It looks more like an ISA problem. Can you - as a start - just copy the above code - plus whatever other functions are called - directly into a minimal binary crate? Also, what exactly does the "crash" look like? Rust panic? Invalid instruction? Invalid pointer? Can you also remove your coredump-to-uart option so that we can see the standard, default ESP IDF panic handler, and what exactly it displays?
Do you use hardware with external PSRAM, or only internal SRAM? Note that the C3 has no hardware floating-point support, so all calculations are done in pure software, and f64 tends to be expensive here. Just a general remark.
You both have made some good points. Check out this minimal example. I managed to reproduce it without even using the external crate but just a for loop with some basic float operations. Full output: error-235.log
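For readers without access to the linked example, a reproduction of this shape might look like the sketch below. It is not the code from the linked example: the function name and the constants are made up for illustration, and only the structure (a loop of dependent f64 operations, no external crate) follows the description above.

```rust
use std::hint::black_box;

/// Chains a few dependent f64 operations per iteration. On a target
/// without an FPU, each operation becomes a soft-float library call.
fn float_loop(iterations: u32) -> f64 {
    // `black_box` keeps the compiler from folding the whole loop
    // into a constant at build time.
    let mut x = black_box(0.5_f64);
    for i in 0..iterations {
        let angle = (x + f64::from(i) * 0.1).to_radians();
        x = (angle.sin() * 1.9148 + angle.cos() * 0.02 + x) % 360.0;
    }
    x
}
```

Calling `float_loop` from `main` with increasing iteration counts, and printing the result, helps tell a crash that needs many repetitions apart from one that occurs on the first floating-point call.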
I don't know when I would be able to test this, so in the meantime some ideas and suggestions: The very fact that you have to execute the same statement in a loop to achieve a crash looks strange. Does it also crash if you manually unroll the loop? My point is, it might be flaky hardware too, especially if it turns out that executing the same thing a number of times makes it crash. Therefore:
The original ESP IDF issue referred to by @punkto at the beginning of the thread is precisely a power issue, not a software one, btw.
It doesn't crash when unrolling the loop. To give more context, when creating the example I was just trying to emulate what the external crate does, which is a bunch of interconnected float operations. We have a custom board (powered through both battery and USB), but I've replicated the behavior on an ESP32-C3-DevKitM-1U. Besides, the exception arises after a software change (6102063); doesn't that rule out a hardware issue, or at least make it very unlikely?
@igrr Any ideas? It seems we are facing runtime crashes when utilizing FP on riscv targets when linked with a newer GCC toolchain (as in, the toolchain which is available from ESP IDF 5.1 onwards). (As to why we are using a newer GCC toolchain for linking even with ESP IDF 4.4.X - long story - but the crux is the Zicsr changes in the RISC-V ISA 2.1 spec, where GCC and LLVM diverged: GCC follows the new (backward-incompatible) 2.1 spec, where Zicsr and Zifencei are not considered part of "riscv32imc", while LLVM continues to follow the older 2.0 version, where they are included. This creates linking errors when using an older GCC toolchain.)
Maybe related to esp-rs/rust#176? What happens if you build esp-idf with clang, instead of GCC?
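The loop-versus-unrolled comparison discussed above can be sketched as follows. The `step` function is hypothetical and stands in for one iteration of the real computation; both variants perform the same four steps in the same order, so any difference in behavior on the device would come from the generated code rather than from the arithmetic.

```rust
/// One step of a hypothetical float computation.
fn step(x: f64, i: u32) -> f64 {
    let angle = (x + f64::from(i)).to_radians();
    (x + angle.sin() * 1.9148) % 360.0
}

/// Four steps expressed as a loop.
fn looped(seed: f64) -> f64 {
    let mut x = seed;
    for i in 0..4 {
        x = step(x, i);
    }
    x
}

/// The same four steps, unrolled by hand.
fn unrolled(seed: f64) -> f64 {
    let x = step(seed, 0);
    let x = step(x, 1);
    let x = step(x, 2);
    step(x, 3)
}
```

Running both with the same seed on the device, one after the other, is a cheap way to check whether the loop form alone is enough to trigger the exception.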
I'm using the sunrise crate to do some computation. Everything works fine with esp-idf-sys="0.33.1", but the program crashes here (it's just a bunch of float math operations) with version 0.33.2.
Here's the coredump of the thread that crashes:
I didn't manage to track the bug deeper. Any idea of what could be going on with esp-idf-sys?