Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Should we improve how HRA's RECLASS_RISK_Ecosystem.tif is calculated? #1457

Open
phargogh opened this issue Dec 1, 2023 · 3 comments
Open
Assignees
Labels
question Further information is requested

Comments

@phargogh
Copy link
Member

phargogh commented Dec 1, 2023

I got an interesting message from someone on the forums, which, among other things, was asking why we reclassify TOTAL_RISK_Ecosystem.tif --> RECLASS_RISK_Ecosystem.tif in the way that we do, as it can easily lead to pixels being "low" risk because it doesn't take into account anything other than the sum of risk on a pixel relative to the maximum possible risk. So unless you have a pixel that is high risk for all habitats and all stressors, you're going to end up with low reclassified risk, which seems strange.

@jade-md do you have any opinions about how we are currently calculating the classification breaks in RECLASS_RISK_Ecosystem.tif? Should it be changed? Here's the user's guide definition of it, for reference (see step 4b).

Here's the full message:

I come back to this subproject of my PhD, and as I explain in the previous message, my objective is to access the increase in Risk across my study area, in different scenarios of human activities and also with the addiction of planned changes in both the distribution of those activities and the increase in others.

For context, it is a coastal region with 12 habitats, and between 15 and 18 stressors, using a scale of [0, 3]. And I am more interested in evaluating changes across the entire system, than to look at specific habitats risk (Ecosystem Risk).
I have used the new 3.14 InVEST version for the runs below, but also the development build that you sent me last time (InVEST 3.12.1.post651+gcc91a81ac Workbench).

When I look at the normalized total risk raster of ecosystem (outputs/TOTAL_RISK_Ecosystem.tif), I can see the increase/decrease in the max risk scores between scenarios and the changes throughout the area.

However when looking at the reclassified risk (outputs/RECLASS_RISK_Ecosystem.tif), in all scenarios I am working, this reclassified ecosystem risk is always either 0 or 1, even with increased number of stressors. I had in mind to use the increase in areas classified as higher risk as a mean to compare the management scenarios and options.

As the manual says, for the Reclassification of Ecosystem total Risk raster in Equation 41, the cut off values are based on the the maximum cumulative risk score, given by ql=mjkl⋅nj.

As I used Euclidean distance the mjkl is equal to 2.828, and in my case ql is set as 2.828 * 12 (number of habitats) = 33.941. Meaning this is the max total risk I can get on a grid cell, and used in the equation for the cut off values, as a cut off value reference.

Using this, the cut off value to a grid cell to be classified at risk 1 is <11.31367, risk 2 is >= 11.31367 and 22.62733, and 3 is >= 22.627, as per equation 41.

If I compute the TOTAL_RISK_ecosystem.tif using the version you send me or the 3.14 version, the max value for the normalized TOTAL_RISK of the ecosystem that I obtained in my case is 9.344 (I have 4 overlapping habitats in a just a few cells, the other is always 1 or two max), which would always be lower than the cut off value of 11.3137, for classifying a grid cell as risk 2 for example.

Even after some tests, increasing the table values to 3, increasing the number of stressors, this reclassification is given me always 1.
If the cut of values for the reclassification of risk is performed after the normalization of the total risk scores, shouldn’t the ql=mjkl⋅nj be based not on the total number of habitats, but on the total habitats overlapping at a specific grid cell?

In the model github repo (github.com/natcap/invest/blob/main/src/natcap/invest/hra.py) there is a reference (line 948) that beginning at version 3.10.2, the total risk of the ecosystem is computed as the mean risk per habitat (also clearly mentioned in the Model outputs manual section), where the sum of habitat risk scores is normalised by the number of habitats occurring in each cell, which makes sense to use a normalized risk value.

If this normalization by the number of habitats in the grid cell is used, and the value of the ql=mjkl⋅nj remains at the total value of habitats present in the entire aoi, would not all the results of the reclassification of TOTAL_RISK_ecosystem.tif into RECLASS_RISK_Ecosystem.tif risk bins using equation 41 give a reclassified risk of 0 or 1?

On the other hand, I tried the method as in InVEST 3.3.3, mentioned in the hra.py script (in this case, I summed the TOTAL_RISK_habitat.tif layers model outputs above, without dividing by the number of habitats present in each cell), the max risk value obtained in the TOTAL_RISK_ecosystem.tif is 27.809.
Which would be more inline with the range cut values described above, and results in other risk classes in the RECLASS_RISK_Ecosystem.tif, or perhaps the cut off values should be based on the max number of just overlapping habitats.

Is the model working as intended or am I missing or totally misinterpreted something within these model steps? Could you provide me with some indications?

Thank you very much for your help and attention, and keep up the good work.

@phargogh phargogh added the question Further information is requested label Dec 1, 2023
@davemfish
Copy link
Contributor

So unless you have a pixel that is high risk for all habitats and all stressors, you're going to end up with low reclassified risk, which seems strange.

I agree this does not make much sense. Just because a pixel has fewer different habitats present doesn't mean its "ecosystem" should be deemed at lower risk than a pixel with more habitats present (everything equal with stressors). Maybe this is why we changed "TOTAL" ecosystem risk to the "mean" risk across habitats instead of the "sum"? But then this normalization is kind of lost again during the reclassification if the max risk value is based on max_pairwise_risk * number_of_habitats. It should just be max_pairwise_risk alone at that point?

@jade-md
Copy link

jade-md commented Dec 2, 2023

This reminds of the same issue we had for the number of overlapping stressors... which we ended up adding as a parameter in the model because using the max number of stressors was leading in under-estimating risk for areas with less stressors... with max_pariwise_risk, could that results in more high risk values? We could try on the Belize analysis to see the effect on the results. I also wonder if there is a way to multiply it by the number of overlapping habitat at the pixel level rather the site level?

@phargogh
Copy link
Member Author

Yeah, I don't recall finding any documentation on why we added a classified version of the "TOTAL" ecosystem risk, and that was in the bitbucket days so all of that information is gone or in some JSON object somewhere in a backup if we even had it written down in the first place. I think we added this classified ecosystem risk in the 3.8 version of the model.

@davemfish It should just be max_pairwise_risk alone at that point?

I think I like that solution. So if the cumulative risk for the Ecosystem on a pixel is equal to or greater than 66% of the maximum possible pairwise risk, it should be high risk. That would be consistent with how the other habitat/stressor combinations are calculated in the model, just considering all of the factors across the whole ecosystem.

@jade-md This reminds of the same issue we had for the number of overlapping stressors... which we ended up adding as a parameter in the model because using the max number of stressors was leading in under-estimating risk for areas with less stressors... with max_pariwise_risk, could that results in more high risk values?

If we based the classification of TOTAL_RISK_Ecosystem.tif on the value of max_pairwise_risk, then yes, I think we'd have many more of the TOTAL_RISK_Ecosystem.tif values being classified as high risk.

@jade-md I also wonder if there is a way to multiply it by the number of overlapping habitat at the pixel level rather the site level?

Yes, we can certainly do something like this if we want the classification to consider the number of overlapping habitats or the number of overlapping stressors on a pixel. This is not currently accounted for in the way the reclassification happens, which feels like an oversight.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Further information is requested
Projects
None yet
Development

No branches or pull requests

3 participants