More detailed documentation around the limitations for Azure and scaling the solution #1331
Comments
Hi @MartinDawson! Thanks for all of the feedback. These limitations definitely create challenges when scaling up, especially in an Azure environment, and CCF is being worked on with these enhancements in mind. There are two open issues tracking the concerns you described, and both should improve CCF in exactly the areas you're talking about. Please feel free to add any ideas you may have to those issues, or, if you would like to, please contribute! We really do appreciate your feedback. We also rely on and encourage our community members to open PRs for any general enhancements or fixes made on their own CCF instances/forks.
Thanks for the great repo. I should have said in the original comment that it's a very good starting point, and it's amazing that it's open source. We looked at the Cost Details API; it still isn't sufficient for large data volumes, IIRC. Unfortunately, companies are going through huge cuts right now, just like my previous one that used this repo, so there's less appetite from management to spend time contributing :/
@MartinDawson All very good points, and thanks for sharing. We totally understand if your organization doesn't have a budget for contributing. Even mentioning your workaround of using Azure Blob Storage and CSV exports is extremely helpful! This is an alternative that we're considering building alongside the Cost Details upgrade mentioned in #1175, and it may end up being the recommended route for enterprise users with a similar scale of data, ideally with a built-in method for kicking off or regularly automating exports. Thanks again for your feedback; these types of issues help move work up our priority list by showing us its impact directly 😄. I'll close this issue since we already have actionable tasks for most of these suggestions, but don't hesitate to leave additional feedback or questions on the relevant issues if any new ones come up.
Hi,
After having used this for a while at our company for AWS and Azure emissions, I think it would be useful to list the following issues with the current architecture and how we solved them. A LOT of changes are needed to bring this code up to a scalable solution, and the performance recommendations (https://www.cloudcarbonfootprint.org/docs/performance-considerations) don't really go far enough:
Using an API for this, even with caching, is not going to work; it's simply too much data to transfer. We solved this for Azure by having Microsoft export all of our company's cost data as CSVs daily into Azure Blob Storage containers. We then set up an ETL process with Azure Data Lake to process the exports and trigger a daily Azure Function that seeds our own TimescaleDB database.
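To make the ETL step above concrete, here is a minimal sketch of the transform logic in TypeScript. It assumes a simplified subset of the Azure cost export CSV schema (the real exports have many more columns and require a proper CSV parser for quoted fields); the function and type names are hypothetical, not from the CCF codebase.

```typescript
// One usage record parsed out of a daily cost export CSV.
interface UsageRow {
  date: string;       // usage date, e.g. "2023-05-01"
  resourceId: string; // fully qualified Azure resource ID
  costUsd: number;    // quantity * effective price
}

// Parse a cost export CSV into usage rows. Naive comma split for
// illustration only: real exports need a quoting-aware CSV parser.
export function parseCostCsv(csv: string): UsageRow[] {
  const [header, ...lines] = csv.trim().split("\n");
  const cols = header.split(",");
  const idx = (name: string) => cols.indexOf(name);
  return lines.map((line) => {
    const f = line.split(",");
    return {
      date: f[idx("Date")],
      resourceId: f[idx("ResourceId")],
      costUsd: Number(f[idx("Quantity")]) * Number(f[idx("EffectivePrice")]),
    };
  });
}

// Roll usage up to one row per (date, resourceId) — the granularity a
// TimescaleDB hypertable keyed on (time, resource_id) would store.
export function aggregateDaily(rows: UsageRow[]): Map<string, number> {
  const out = new Map<string, number>();
  for (const r of rows) {
    const key = `${r.date}|${r.resourceId}`;
    out.set(key, (out.get(key) ?? 0) + r.costUsd);
  }
  return out;
}
```

The aggregated map is what the daily Azure Function would batch-insert into the database; keeping the transform pure like this makes it easy to unit test outside of Azure.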
We also have a Node.js script that fetches the historical CSV data and seeds the database on initial startup with many months of data.
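The core of such a backfill script is just enumerating the daily export dates for the historical range so each day's blob can be fetched and loaded in turn. A sketch (the idea that exports are addressable per day is an assumption about the blob layout, not the actual Azure export naming scheme):

```typescript
// Enumerate every date (inclusive) between two ISO dates, in UTC,
// so the backfill script can fetch and load one daily export at a time.
export function exportDates(start: string, end: string): string[] {
  const dates: string[] = [];
  const d = new Date(start + "T00:00:00Z");
  const last = new Date(end + "T00:00:00Z");
  while (d <= last) {
    dates.push(d.toISOString().slice(0, 10)); // "YYYY-MM-DD"
    d.setUTCDate(d.getUTCDate() + 1);
  }
  return dates;
}
```

Processing one day at a time also keeps memory bounded when backfilling many months of data.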
This is the only way to get resource-level granularity, i.e. the granularity that Azure's Carbon optimization has.
This obviously requires a lot of changes to the Thoughtworks code (the Azure, CLI, App, and API packages) to add CSV support, an API layer, an ETL process, a Node.js script, frontend modifications, Azure Functions, etc.
Wanted to post this for any other devs who are thinking of using this repository at scale: it's a good base, but it will take a lot of changes, many months of effort, and a lot of code.