wget
I’ve been using the following commands to scrape the sitemap(s) and download the HTML:
wget -qO- "https://www.arts.gov/sitemap.xml?page=1" | grep -Po "<loc>\K.+?(?=</loc>)" > urls1.txt
wget --convert-links --no-parent --wait=2 -i urls1.txt -x 2>&1 | tee arts.gov-sitemap1.log
There are better ways for sure, but that’s what I have been using.
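If the sitemap is split over several pages, the first command could be wrapped in a small loop. The following is only a rough sketch: the function names extract_locs and fetch_sitemap_pages are made up for illustration, and the number of pages is something you would have to check yourself. It assumes GNU grep (for -P) and that each page is reachable as "?page=N", as in the command above.

```shell
# Print the contents of every <loc>...</loc> element read from stdin,
# one URL per line (same grep pattern as the one-liner above).
extract_locs() {
  grep -Po '<loc>\K.+?(?=</loc>)'
}

# Fetch sitemap pages 1..last_page and write the URLs of page N to urlsN.txt.
# base_url and last_page are supplied by the caller; the page count is an
# assumption, not something the script discovers on its own.
fetch_sitemap_pages() {
  local base_url=$1 last_page=$2 page
  for page in $(seq 1 "$last_page"); do
    wget -qO- "${base_url}?page=${page}" | extract_locs > "urls${page}.txt"
  done
}
```

For example, fetch_sitemap_pages "https://www.arts.gov/sitemap.xml" 3 would write urls1.txt, urls2.txt and urls3.txt (3 is just a placeholder here), and each file can then be passed to the second wget command with -i.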