
Analysis results grouped by hour #1298

Open
FlorianWeissDev opened this issue Jan 11, 2024 · 4 comments

Comments

@FlorianWeissDev
Contributor

FlorianWeissDev commented Jan 11, 2024

Hello everyone! 👋

The function App.getCostAndEstimates() in the App package currently accepts a groupBy parameter with the values 'day', 'week', 'month', 'quarter', and 'year'. The AWS Cost and Usage Report, at least, also offers the option to export data at an hourly granularity.

Feature Suggestion

Results are grouped by service and hour.
Having this as a feature of the library alone would already be helpful, but I'm sure people would also benefit if the CLI and other tooling offered access to it.

Describe the solution you'd like
The possibility to pass 'hour' to the App.getCostAndEstimates() function, resulting in a data set that contains results on an hourly basis.
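
For illustration, this is roughly how I would imagine calling it. The exact request fields and the package import below are assumptions on my part based on how the current API looks, and 'hour' is of course the proposed, not yet supported, value:

```ts
import { App } from '@cloud-carbon-footprint/app'

async function getHourlyEstimates() {
  const app = new App()

  // Hypothetical request: 'hour' is the proposed groupBy value,
  // alongside the existing 'day' | 'week' | 'month' | 'quarter' | 'year'.
  return app.getCostAndEstimates({
    startDate: new Date('2024-01-10'),
    endDate: new Date('2024-01-11'),
    ignoreCache: false,
    groupBy: 'hour',
  })
}
```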

Describe the alternatives you've considered
There is no alternative but to accept that the finest granularity is a grouping by day, as the hourly information is lost.

Additional context
I could not find any information about the technical reasons why the groupBy stops at day.

@felixrichardt

+1

@4upz
Member

4upz commented Jan 26, 2024

Hi @FlorianWeissDev thanks for suggesting this, and sorry for the delay in responding!

This is a feature that we've been considering for quite some time because, as you mentioned, most (if not all) cloud providers' billing data, including AWS's, does allow for hourly granularity and updates. The reason we haven't prioritized this is that there has been a question of how valuable it would be and what challenges would arise in implementing it. I see that this issue has some support, so that answers the first part. Allow us to share thoughts on the second part in terms of the potential challenges:

  1. Scalability - Hourly data would involve many more rows of data per service. For example, if a single day of estimates grouped at a daily level has 100 rows of data, then the same set of services estimated at an hourly level for that day would easily have 24x that amount. For a single day that's not much, but it could lead to scalability issues as the date range and the number of services per account increase.
  2. Accuracy - Billing and usage data for certain cloud providers are usually not final until the day is completed. Therefore, if you are querying at an hourly level for the current day, there is a chance that the usage amounts you see would differ when querying for that same hour after the day has concluded. This is because cloud providers usually adjust for cost savings, discrepancies, and other potential changes. So, there could be consistency issues when tracking real-time data.
  3. Effort Required - The amount of effort required to implement this was also a concern, since it would affect almost all aspects of the app, especially when weighing that against how valuable it would be in light of the first two points above. We like to maintain feature parity across all cloud providers, so this would include implementing that granularity level in each CSP's core logic, adding support for the parameter via the API and CLI, adding additional support to the caching methods, and offering the option to display it within the UI with any additional front-end components.

We have been considering introducing limits on the request range for each grouping method to help encourage best practices around monitoring usage. For example, when requesting estimate data at a daily granularity, we could limit request ranges to 90 days, since anything beyond that gives diminishing returns on the value of that granularity -- thus suggesting a coarser monthly granularity as a better alternative and view for the larger range.
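
To make that concrete, here's a rough sketch of what such a per-granularity limit could look like. The limit values and the helper name are purely hypothetical placeholders for discussion, not anything we've implemented:

```ts
// Hypothetical per-granularity range limits, in days (values are placeholders).
const MAX_RANGE_DAYS: Record<string, number> = {
  hour: 7,
  day: 90,
  week: 365,
  month: 3 * 365,
}

const MS_PER_DAY = 1000 * 60 * 60 * 24

// Reject requests whose date range exceeds the limit for the chosen groupBy value.
function validateRequestRange(startDate: Date, endDate: Date, groupBy: string): void {
  const rangeDays = (endDate.getTime() - startDate.getTime()) / MS_PER_DAY
  const maxDays = MAX_RANGE_DAYS[groupBy]
  if (maxDays !== undefined && rangeDays > maxDays) {
    throw new Error(
      `Requests grouped by '${groupBy}' are limited to ${maxDays} days; ` +
        `consider a coarser granularity for a ${Math.ceil(rangeDays)}-day range.`,
    )
  }
}
```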

If we were to go the route of limiting request ranges per grouping method, it could help with the first point about scalability. As for accuracy, perhaps that could just be a trade-off and documented risk of using this grouping method.

Please let me know your thoughts on some of those points. We're still open to the idea, so we'd love to continue the conversation around this to see if it makes sense and to potentially develop a plan for it.

@FlorianWeissDev
Contributor Author

Hey @4upz thank you for the detailed response!

Regarding the accuracy issue, I would see it as a trade-off that users have to evaluate for their respective use case. But that is not the use case I had in mind: I want to calculate the data once a day and have a breakdown of the data at hourly granularity. For example, to make it easier to connect a specific deployment to a spike in emissions, or to make visible that emissions are not constant but change with the time of day. This assumes that the CCF calculation logic is able to produce these numbers in the first place (considering the energy mix and so on).
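
To illustrate, the kind of once-a-day job I have in mind would look roughly like this. The result shape is an assumption based on the current daily output (entries with a timestamp and serviceEstimates carrying co2e), and 'hour' is again the proposed value:

```ts
import { App } from '@cloud-carbon-footprint/app'

// Hypothetical daily job: estimate yesterday's footprint at hourly granularity
// and log the hour with the highest emissions.
async function reportYesterdaysPeakHour(): Promise<void> {
  const app = new App()

  const endDate = new Date()
  endDate.setUTCHours(0, 0, 0, 0) // start of today (UTC)
  const startDate = new Date(endDate.getTime() - 24 * 60 * 60 * 1000) // start of yesterday

  const results = await app.getCostAndEstimates({
    startDate,
    endDate,
    ignoreCache: false,
    groupBy: 'hour', // proposed value, not yet supported
  })

  // Assumes one result entry per hour, each with serviceEstimates carrying co2e.
  const peak = results.reduce(
    (max, result) => {
      const co2e = result.serviceEstimates.reduce((sum, s) => sum + s.co2e, 0)
      return co2e > max.co2e ? { timestamp: result.timestamp, co2e } : max
    },
    { timestamp: startDate, co2e: 0 },
  )

  console.log(`Peak emissions at ${peak.timestamp.toISOString()}: ${peak.co2e} metric tons CO2e`)
}
```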

I'm using the CCF library only to calculate the data, so I don't think I can contribute much to the issues that may arise with the CLI or web-server approaches, as I don't have much experience with them.

@4upz
Member

4upz commented Mar 20, 2024

@FlorianWeissDev That makes sense, thanks for sharing your thoughts and more details about your use case.

I think this is worth exploring then! I'll add it to our backlog so that we can discuss prioritization and implementation details. I agree that the trade-offs should be up to the users and that we'll just have to document them properly.

Labels
None yet
Projects
Status: In Analysis
Development

No branches or pull requests

3 participants