Sitemap URL Extractor
Use cases
Dual-strategy XML parsing: ElementTree with namespace handling, regex fallback (<loc>.*?</loc>).
Recursive queue-based processing of sitemap indexes.
Gzip decompression for .xml.gz files.
Configurable request delay (0-5 seconds slider).
Duplicate prevention via processed_sitemaps set tracking.
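As a rough illustration of the dual-strategy parsing and gzip handling listed above, here is a minimal Python sketch. It is not the tool's actual source: the function names `maybe_gunzip` and `extract_locs` are hypothetical, and detecting gzip by its magic bytes (rather than by the `.xml.gz` extension) is an assumption.

```python
import gzip
import re
import xml.etree.ElementTree as ET
from html import unescape

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
# Same pattern as the regex fallback named above; DOTALL tolerates line breaks.
LOC_PATTERN = re.compile(r"<loc>(.*?)</loc>", re.DOTALL)


def maybe_gunzip(raw: bytes) -> bytes:
    """Decompress gzip payloads, detected by the two-byte gzip magic number."""
    if raw[:2] == b"\x1f\x8b":
        return gzip.decompress(raw)
    return raw


def extract_locs(raw: bytes) -> list[str]:
    """Return <loc> values: ElementTree first, regex if the XML is malformed."""
    data = maybe_gunzip(raw)
    try:
        root = ET.fromstring(data)
        locs = [el.text.strip() for el in root.findall(".//sm:loc", SITEMAP_NS) if el.text]
        if not locs:
            # Some sitemaps omit the namespace, so match on the local tag name.
            locs = [
                el.text.strip()
                for el in root.iter()
                if el.tag.rsplit("}", 1)[-1] == "loc" and el.text
            ]
        if locs:
            return locs
    except ET.ParseError:
        pass  # Malformed XML: fall through to the regex strategy.
    text = data.decode("utf-8", errors="replace")
    # The regex sees raw markup, so entities such as &amp; are unescaped by hand.
    return [unescape(match.strip()) for match in LOC_PATTERN.findall(text)]
```

The regex path is deliberately forgiving: it recovers URLs from truncated or otherwise invalid files that a strict XML parser rejects outright.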
Platform
Browser-based (no installation required)
Input
Sitemap URL (index or individual .xml/.xml.gz)
Custom user agent string
Request delay: 0-5 seconds
Output
URL list with sitemap metadata (CSV/Excel)
Features
- Recursive sitemap index processing via queue
- Dual parsing: ElementTree + regex fallback
- Gzip decompression (.xml.gz support)
- Duplicate sitemap prevention
- Configurable delay slider (0-5 seconds)
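The queue-based recursion, duplicate prevention, and request delay can be sketched together as follows. This is an assumption-laden outline, not the tool's code: `crawl_sitemaps`, `local_name`, and the injected `fetch` callable are hypothetical names, and only the `processed_sitemaps` set is taken from the description above. Passing `fetch` in keeps the sketch free of any particular HTTP library.

```python
import time
import xml.etree.ElementTree as ET
from collections import deque


def local_name(tag: str) -> str:
    """Strip an ElementTree '{namespace}' prefix from a tag name."""
    return tag.rsplit("}", 1)[-1]


def crawl_sitemaps(start_url, fetch, delay=0.0):
    """Walk a sitemap (or sitemap index) breadth-first.

    fetch: hypothetical callable taking a URL and returning XML bytes.
    delay: seconds to wait between requests (the tool exposes 0-5).
    Returns (page_url, source_sitemap_url) tuples.
    """
    queue = deque([start_url])
    processed_sitemaps = set()  # sitemaps already fetched
    rows = []
    while queue:
        sitemap_url = queue.popleft()
        if sitemap_url in processed_sitemaps:
            continue  # skip duplicates and self-referencing indexes
        processed_sitemaps.add(sitemap_url)
        root = ET.fromstring(fetch(sitemap_url))
        # <loc> sits directly under each <sitemap> or <url> entry.
        locs = [
            child.text.strip()
            for entry in root
            for child in entry
            if local_name(child.tag) == "loc" and child.text
        ]
        if local_name(root.tag) == "sitemapindex":
            queue.extend(locs)  # child sitemaps are processed later
        else:
            rows.extend((loc, sitemap_url) for loc in locs)
        if queue and delay > 0:
            time.sleep(delay)  # be polite before the next request
    return rows
```

Using a queue instead of recursive function calls avoids recursion-depth limits on deeply nested indexes, and the set guarantees each sitemap is fetched at most once even when indexes reference each other.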
How to use
1. Enter the sitemap URL in the sidebar
2. Set a custom user agent if needed
3. Configure the request delay (recommended for large sites)
4. Review the extracted URLs with metadata
5. Download as CSV, TXT, or Excel (timestamped filenames)
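For readers who want to reproduce the export step themselves, a timestamped CSV export might look like the sketch below. The filename pattern, the column names `url` and `source_sitemap`, and both function names are assumptions for illustration; the tool's actual metadata columns are not documented here.

```python
import csv
import io
from datetime import datetime


def timestamped_filename(prefix, ext, now=None):
    """Build a name like 'prefix_YYYYMMDD_HHMMSS.ext' (pattern is assumed)."""
    now = now or datetime.now()
    return f"{prefix}_{now:%Y%m%d_%H%M%S}.{ext}"


def rows_to_csv(rows):
    """Serialize (url, source_sitemap) tuples to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["url", "source_sitemap"])  # hypothetical column names
    writer.writerows(rows)
    return buffer.getvalue()
```

Putting the timestamp in the filename keeps repeated exports of the same site from overwriting each other.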
Want me to run this for you?
I offer this as a managed service. You get the insights without touching the tool.
Related Tools
Template Fingerprinting
Technical SEO: Identify page templates by analyzing HTML structure patterns.
Archive.org Broken Link Mapper
Technical SEO: Find lost URLs via Archive.org and auto-map redirects using fuzzy matching.
LLM Sitemap Creator
Technical SEO: Use GPT to generate hierarchical sitemap structures from keywords.
Let's work together
Monthly retainers or one-off projects. No lengthy reports that sit in a drawer.
Let's Talk