Possibilities for automating large scale Sentinel imagery dataset building? #7098
-
Has anybody worked on problems at this scale? For example, this, writ large: https://examples.dask.org/applications/satellite-imagery-geotiff.html
There may of course be many suggestions for improving this one-scene example (10000*10000 pixels), and there are a couple of cluster trials on AWS: https://github.com/RichardScottOZ/dask-era5/blob/main/notebook/pangeoFargateTrialSA%20(1).ipynb
Is this feasible, or feasible in combination with some other methods?
Thanks,
Richard
-
Hi Richard,
Yes, people do operate at this scale with Dask, but there are pain points.
Generally, people find that initial processing of the graph takes a while
(minutes), which can feel awkward because you're not getting any feedback.
You might also find that the graph itself is large, requiring the
scheduler to have substantial memory, and that if your cluster is very
large, the scheduler eventually becomes a bottleneck.
These problems are all being worked on currently (you're not alone in
wanting to increase scale) so things should steadily improve over the next
few months. However, things should be doable today if you're careful. If
you have specific questions I encourage you to ask them.
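To get a feel for the graph-size concern above, here is a rough back-of-the-envelope sketch in plain Python. Only the 10000*10000 pixel scene size comes from the example in the question; the number of time steps, the band count, the chunk size, the 2-byte pixel width, and the per-task overhead figure are all assumptions chosen for illustration, not measurements.

```python
import math


def estimate_chunks(height, width, n_times, n_bands, chunk_y, chunk_x):
    """Count chunks when each chunk covers one (time, band) slice
    of at most chunk_y x chunk_x pixels."""
    tiles_y = math.ceil(height / chunk_y)
    tiles_x = math.ceil(width / chunk_x)
    return tiles_y * tiles_x * n_times * n_bands


def estimate_bytes(height, width, n_times, n_bands, bytes_per_pixel):
    """Uncompressed size of the full (time, band, y, x) stack."""
    return height * width * n_times * n_bands * bytes_per_pixel


# Scene size from the question; every other number is a hypothetical choice.
n_chunks = estimate_chunks(
    10_000, 10_000, n_times=70, n_bands=4, chunk_y=2_048, chunk_x=2_048
)
total_bytes = estimate_bytes(10_000, 10_000, n_times=70, n_bands=4, bytes_per_pixel=2)

# Assumed order-of-magnitude scheduler cost per task (illustrative only).
ASSUMED_OVERHEAD_S_PER_TASK = 0.001
min_overhead_s = n_chunks * ASSUMED_OVERHEAD_S_PER_TASK

print(f"chunks (lower bound on tasks): {n_chunks}")
print(f"uncompressed size: {total_bytes / 1e9:.1f} GB")
print(f"scheduler overhead at the assumed rate: {min_overhead_s:.1f} s")
```

The chunk count is only a lower bound on the number of tasks, since each chunk needs at least a read task plus whatever the reduction adds. Scaling the same arithmetic from one scene to a whole state shows why the graph and the scheduler become the things to watch.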
…On Fri, Jan 22, 2021 at 2:06 AM RichardScottOZ ***@***.***> wrote:
[image: image]
<https://user-images.githubusercontent.com/72196131/105476267-57cea900-5cf0-11eb-9118-d39ab026da18.png>
I posted here
https://discourse.pangeo.io/t/best-practices-for-automating-large-scale-sentinel-dataset-building-and-machine-learning/1161/3
Has anybody worked on problems at this scale?
Basically trying to automatically make a median mosaic of a time series
over a year of Sentinel data - so a 100TB+ problem for a state of
Australia. Petabytes for a large country.
e.g. this, writ large
https://examples.dask.org/applications/satellite-imagery-geotiff.html
There may of course be many suggestions for improvements of this one scene
example - 10000*10000 pixel example
https://github.com/RichardScottOZ/dask-era5/blob/main/notebook/PANGEO-Sentinel-Intake-SA-OZLocation.ipynb
and a couple of cluster trials on AWS
https://github.com/RichardScottOZ/dask-era5/blob/main/notebook/pangeoFargateTrialSA%20(1).ipynb
Is this feasible, or feasible in combination with some other methods?
Thanks,
Richard
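The per-pixel operation described above (a median over a time series of scenes) can be sketched in plain Python as follows. This only illustrates the reduction itself on tiny in-memory grids, assuming cloudy or missing observations are marked as None; the function name is hypothetical, and a real workflow at this scale would apply the same reduction chunk by chunk rather than on nested lists.

```python
from statistics import median


def median_mosaic(stack):
    """Per-pixel median over a time stack.

    stack: list of 2D grids (lists of rows), all the same shape.
    None marks a missing observation (an assumption for this sketch).
    Pixels with no valid observation stay None in the output.
    """
    n_rows = len(stack[0])
    n_cols = len(stack[0][0])
    mosaic = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            # Collect the valid observations for this pixel across time.
            values = [scene[r][c] for scene in stack if scene[r][c] is not None]
            row.append(median(values) if values else None)
        mosaic.append(row)
    return mosaic


# Three tiny 2x2 "scenes" with made-up values.
scenes = [
    [[10, 20], [30, None]],
    [[12, None], [31, None]],
    [[50, 22], [29, None]],
]
result = median_mosaic(scenes)
print(result)
```

Because each output pixel depends only on that pixel's own time series, the work splits cleanly by spatial tile, which is what makes this kind of mosaic a natural fit for chunked, parallel processing.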
-
This did get worked out in the end, using Open Data Cube techniques.