https://data.openei.org/s3_viewer?bucket=nrel-pds-nsrdb&prefix=v3%2F
I’m still getting my NAS set up but this is apparently urgent, so we can split this up
comment below to sign up for a range and i’ll check da boxes
- 1.6 TiB v3/nsrdb_1998.h5
- 1.6 TiB v3/nsrdb_1999.h5
- 1.6 TiB v3/nsrdb_2000.h5
- 1.6 TiB v3/nsrdb_2001.h5
- 1.6 TiB v3/nsrdb_2002.h5
- 1.6 TiB v3/nsrdb_2003.h5
- 1.6 TiB v3/nsrdb_2004.h5
- 1.6 TiB v3/nsrdb_2005.h5
- 1.6 TiB v3/nsrdb_2006.h5
- 1.6 TiB v3/nsrdb_2007.h5
- 1.6 TiB v3/nsrdb_2008.h5
- 1.6 TiB v3/nsrdb_2009.h5
- 1.6 TiB v3/nsrdb_2010.h5
- 1.6 TiB v3/nsrdb_2011.h5
- 1.6 TiB v3/nsrdb_2012.h5
- 1.6 TiB v3/nsrdb_2013.h5
- 1.6 TiB v3/nsrdb_2014.h5
- 1.6 TiB v3/nsrdb_2015.h5
- 1.6 TiB v3/nsrdb_2016.h5
- 1.6 TiB v3/nsrdb_2017.h5
- 1.5 TiB v3/nsrdb_2018.h5
- 1.5 TiB v3/nsrdb_2019.h5
- 1.5 TiB v3/nsrdb_2020.h5
- 5.7 GiB v3/puerto_rico/nsrdb_puerto_rico_1998.h5
- 5.7 GiB v3/puerto_rico/nsrdb_puerto_rico_1999.h5
- 5.7 GiB v3/puerto_rico/nsrdb_puerto_rico_2000.h5
- 5.7 GiB v3/puerto_rico/nsrdb_puerto_rico_2001.h5
- 5.7 GiB v3/puerto_rico/nsrdb_puerto_rico_2002.h5
- 5.7 GiB v3/puerto_rico/nsrdb_puerto_rico_2003.h5
- 5.7 GiB v3/puerto_rico/nsrdb_puerto_rico_2004.h5
- 5.7 GiB v3/puerto_rico/nsrdb_puerto_rico_2005.h5
- 5.7 GiB v3/puerto_rico/nsrdb_puerto_rico_2006.h5
- 5.7 GiB v3/puerto_rico/nsrdb_puerto_rico_2007.h5
- 5.7 GiB v3/puerto_rico/nsrdb_puerto_rico_2008.h5
- 5.7 GiB v3/puerto_rico/nsrdb_puerto_rico_2009.h5
- 5.7 GiB v3/puerto_rico/nsrdb_puerto_rico_2010.h5
- 5.7 GiB v3/puerto_rico/nsrdb_puerto_rico_2011.h5
- 5.7 GiB v3/puerto_rico/nsrdb_puerto_rico_2012.h5
- 5.7 GiB v3/puerto_rico/nsrdb_puerto_rico_2013.h5
- 5.7 GiB v3/puerto_rico/nsrdb_puerto_rico_2014.h5
- 5.7 GiB v3/puerto_rico/nsrdb_puerto_rico_2015.h5
- 5.7 GiB v3/puerto_rico/nsrdb_puerto_rico_2016.h5
- 5.7 GiB v3/puerto_rico/nsrdb_puerto_rico_2017.h5
- 831.6 GiB v3/tdy/nsrdb_tdy-2016.h5
- 831.6 GiB v3/tdy/nsrdb_tdy-2017.h5
- 796.3 GiB v3/tdy/nsrdb_tdy-2018.h5
- 796.3 GiB v3/tdy/nsrdb_tdy-2019.h5
- 796.3 GiB v3/tdy/nsrdb_tdy-2020.h5
- 831.6 GiB v3/tgy/nsrdb_tgy-2016.h5
- 831.6 GiB v3/tgy/nsrdb_tgy-2017.h5
- 796.3 GiB v3/tgy/nsrdb_tgy-2018.h5
- 796.3 GiB v3/tgy/nsrdb_tgy-2019.h5
- 796.3 GiB v3/tgy/nsrdb_tgy-2020.h5
- 831.6 GiB v3/tmy/nsrdb_tmy-2016.h5
- 831.6 GiB v3/tmy/nsrdb_tmy-2017.h5
- 796.3 GiB v3/tmy/nsrdb_tmy-2018.h5
- 796.3 GiB v3/tmy/nsrdb_tmy-2019.h5
- 796.3 GiB v3/tmy/nsrdb_tmy-2020.h5
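For anyone claiming a range who hasn't used the bucket before: it's public, so the aws cli can fetch files anonymously with `--no-sign-request`. A minimal sketch (the year range is just an example; the `RUN=echo` guard prints the commands for review instead of starting multi-TiB transfers):

```shell
#!/bin/sh
# Queue a claimed range of files, one aws command per file.
# RUN=echo prints each command for review; set RUN= (empty) to transfer for real.
RUN=echo
BUCKET="s3://nrel-pds-nsrdb"
for year in 1998 1999 2000; do
  $RUN aws s3 cp --no-sign-request "$BUCKET/v3/nsrdb_$year.h5" .
done
```

`aws s3 ls --no-sign-request s3://nrel-pds-nsrdb/v3/` should show the same listing as the viewer link at the top.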
Grabbing this one but DL going kinda slow.
I am on v3/nsrdb_2020.h5
Fair speed for it.
Have the space as well.
Will update as I go
“I have 1998-2002 downloading now of the 5.7gb.”
this is of the puerto rico datasets yes? the 5.7gb files?
yeah - will update the list (I can click the boxes on my own)
ok since the s3 downloader downloads all the files in parallel rather than sequentially, i’m going to go one by one so we at least have some useful data if the link gets shut off, going backwards in tdy from 2020. unchecking the rest and will check off as i go
Asta
I can do all of
831.6 GiB v3/tgy/nsrdb_tgy-2016.h5
831.6 GiB v3/tgy/nsrdb_tgy-2017.h5
796.3 GiB v3/tgy/nsrdb_tgy-2018.h5
796.3 GiB v3/tgy/nsrdb_tgy-2019.h5
796.3 GiB v3/tgy/nsrdb_tgy-2020.h5
if that works
I’ll start at the most recent (2020) and go backwards so if others want to take some just ping and I’ll report my progress
EDIT: estimate is probably around 12 hours for the tgy set
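Rough arithmetic behind that estimate, using the sizes from the list above and assuming a sustained ~100 MB/s (about the per-machine ceiling people are seeing):

```shell
# tgy set: 831.6 + 831.6 + 796.3*3 ≈ 4052 GiB; 1 GiB ≈ 1074 MB.
# Hours for the whole set at a sustained 100 MB/s:
echo $(( 4052 * 1074 / 100 / 3600 ))   # prints 12
```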
Asta
Friend of mine is doing v3/nsrdb_2019.h5 right now.
highest priority is apparently the tmy directory - so while someone is already on this, if anyone else has spare bandwidth and space, duplicates of that would be welcome
zyyygz
Unless there is something more important, I will grab a copy of tmy and:
1.6 TiB v3/nsrdb_2012.h5
1.6 TiB v3/nsrdb_2013.h5
1.6 TiB v3/nsrdb_2014.h5
1.6 TiB v3/nsrdb_2015.h5
1.6 TiB v3/nsrdb_2016.h5
1.6 TiB v3/nsrdb_2017.h5
1.5 TiB v3/nsrdb_2018.h5
1.5 TiB v3/nsrdb_2019.h5
1.5 TiB v3/nsrdb_2020.h5
(assuming more recent data is more important)
Asta
Given the importance of tmy, I’ll grab v3/tmy/nsrdb_tmy-2018.h5 after nsrdb_tgy-2020.h5 finishes
Since it seems like people are generally going most recent to backwards.
EDIT since I’m limited to 3 replies lmao
Friend has switched from the above named one to v3/tmy/nsrdb_tmy-2017.h5 since that’s the critical set
EDIT AGAIN
I’ve finished tgy/nsrdb_tgy-2020.h5 and have moved on to tmy/nsrdb_tmy-2018.h5
irenes
I have people doing nsrdb_2018.h5, nsrdb_2017.h5, and tdy/nsrdb_tdy-2018.h5.
irenes
I also have someone doing nsrdb_2013.h5 but it’s slow
as we get close to the theoretical deadline (“end of day” may mean midnight eastern time), if everyone could take a second to refresh what they have and what they are grabbing, that would be good:
for me:
Current status
| file | completion | percent | rate | eta |
| --- | --- | --- | --- | --- |
| tdy/nsrdb_tdy-2018.h5 | 431/796GB | 54% | 40MB/s | 2h |
| tdy/nsrdb_tdy-2019.h5 | 494/796GB | 62% | 100MB/s | 1h |
| tdy/nsrdb_tdy-2020.h5 | 645/796GB | 81% | 45MB/s | 1h |
Enqueued
I spent the day getting my NAS set up and had a bit of data loss from a canceled transfer, so i am currently only copying these single files. after this round I will queue these up overnight, in case the files stay up, going backwards from where @zyyygz is leaving off:
1.6 TiB v3/nsrdb_2008.h5
1.6 TiB v3/nsrdb_2009.h5
1.6 TiB v3/nsrdb_2010.h5
1.6 TiB v3/nsrdb_2011.h5
and that’s probably all i’ll be able to grab overnight so i won’t claim more.
Question
Has anyone been able to exceed ~100MB/s on a single machine? I haven't been able to with the aws cli, rclone, or plain HTTPS downloads, across single and multiple files and with a bunch of different threading and cache params. I have confirmed my download bandwidth is >500MB/s and my write speed is 1GB/s, and I can't find the bottleneck unless it's being imposed by AWS.
irenes
from some testing a friend of mine did, it appears that Amazon rate-limits each downloading machine, across all download methods in use at the time
I do have to advise not trying to subvert this, it’s the kind of thing they could ban people for
Asta
Just finished tmy-2018; as I do not believe anyone has stated they have tmy-2019, I will download that before moving back to tgy.
zyyygz
tmy – 1.5TiB/4.0TiB – ETA 14h
nsrdb_2020.h5 1.3TiB/1.5TiB – ETA 1h
nsrdb_2018.h5 1.3TiB/1.5TiB – ETA 1h
skipping nsrdb_2019.h5 since it’s already checked off (?) and continuing with nsrdb_2017.h5 once 2018 is done.
does “checked” state mean “done”?
Has anyone been able to exceed ~100MB/s on a single machine?
the above is on a single machine, and it appears to be 150MiB/s – three aws processes running in parallel (one sync and two cp), no tuning. peak was 300MiB/s before it got throttled down to this number. running a single aws process only gives me 90 MiB/s max.
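a sketch of that multi-process setup for anyone wanting to replicate it (file names are examples; the `RUN=echo` guard prints the commands instead of starting real transfers):

```shell
#!/bin/sh
# One aws process per file, backgrounded with '&': each process appears to
# be rate-limited separately, so aggregate throughput rises until AWS
# throttles the machine as a whole (~150MiB/s as reported above).
RUN=echo   # set RUN= (empty) to actually transfer
BUCKET="s3://nrel-pds-nsrdb"
for f in v3/nsrdb_1998.h5 v3/nsrdb_1999.h5 v3/nsrdb_2000.h5; do
  $RUN aws s3 cp --no-sign-request "$BUCKET/$f" . &
done
wait   # block until every background transfer has finished
```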
I’ve started on
1.6 TiB v3/nsrdb_1998.h5
1.6 TiB v3/nsrdb_1999.h5
1.6 TiB v3/nsrdb_2000.h5
Will post back once they’re done.