MujRozhlas.cz audiostream downloader.
MujRozhlas Radiobook Downloader
A Python script that downloads all episodes of a radiobook series from mujrozhlas.cz, using Playwright to handle JavaScript-rendered content.
Prerequisites
1. Install Playwright
Check if python3-playwright is available in Debian repos:
apt-cache search python3-playwright
If it is not available, either build a Debian package yourself or install Playwright via pip:
pip3 install playwright
playwright install firefox
2. Install Python dependencies
sudo apt install python3-requests
Usage
Basic usage:
./download_radiobook.py "https://www.mujrozhlas.cz/radiokniha/zbynek-fiser-egon-bondy-statni-bezpecnost-pohled-do-zakulisi-sledovani-donaseni-vydirani"
With custom output directory:
./download_radiobook.py "https://www.mujrozhlas.cz/radiokniha/..." episodes/
How It Works
- Scraping: Uses Playwright to load the page with full JavaScript execution
- Detection: Tries multiple strategies to find audio URLs:
  - Looks for <audio> elements
  - Intercepts network requests for MP3 files
  - Extracts JSON data from <script> tags
  - Simulates clicking play buttons to trigger audio loading
- Debug Output: Saves HTML and JSON data for manual inspection if auto-detection fails
- Download: Downloads all found episodes with progress tracking
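Two of the detection strategies above (reading <audio> elements and scanning <script> contents) can be illustrated without a browser. The following is a minimal sketch using only the standard library, not the code in download_radiobook.py; the names AudioURLFinder and find_audio_urls are hypothetical, and it assumes the rendered HTML is already available as a string (for example from page_debug.html).

```python
import re
from html.parser import HTMLParser

# Assumption: audio files are served as absolute URLs ending in ".mp3".
MP3_URL = re.compile(r"""https?://[^\s"'<>]+?\.mp3""")


class AudioURLFinder(HTMLParser):
    """Collects candidate audio URLs from <audio>/<source> tags and <script> bodies."""

    def __init__(self):
        super().__init__()
        self.urls = []
        self._in_script = False

    def handle_starttag(self, tag, attrs):
        if tag in ("audio", "source"):
            src = dict(attrs).get("src")
            if src:
                self._add(src)
        elif tag == "script":
            self._in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        if self._in_script:
            # JSON embedded in scripts often escapes slashes as "\/".
            for url in MP3_URL.findall(data.replace("\\/", "/")):
                self._add(url)

    def _add(self, url):
        # Keep first-seen order and drop duplicates.
        if url not in self.urls:
            self.urls.append(url)


def find_audio_urls(html):
    """Return the unique candidate audio URLs found in an HTML string."""
    finder = AudioURLFinder()
    finder.feed(html)
    finder.close()
    return finder.urls
```

Calling find_audio_urls() on the saved debug HTML returns a list of URL strings in the order they appear. URLs that only show up in network traffic after a play button is clicked would not be found this way.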
Expected Output
- Episode MP3 files named as: 01_Episode_Title.mp3, 02_Next_Episode.mp3, etc.
- Debug files:
  - page_debug.html - Full rendered HTML
  - script_data_*.json - Extracted JSON data from page
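The naming pattern above can be reproduced with a small helper. This is a sketch of one way to build such names; episode_filename is a hypothetical function, and the exact sanitising rules used by the script may differ.

```python
import re


def episode_filename(number, title):
    """Build a name like 01_Episode_Title.mp3 from an episode number and title."""
    # Collapse every run of characters that are not letters, digits,
    # underscores or hyphens into a single underscore.
    safe = re.sub(r"[^\w\-]+", "_", title).strip("_")
    return f"{number:02d}_{safe}.mp3"
```

For example, episode_filename(1, "Episode Title") gives 01_Episode_Title.mp3. Accented Czech letters are kept, because \w matches Unicode word characters in Python 3.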
Troubleshooting
Cloudflare Protection
If the script is blocked by Cloudflare, you may need to:
- Run with a visible browser (set headless=False in the script)
- Add delays between requests
- Use a residential proxy
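Delays between requests could be added with a helper along these lines. It is only a sketch: polite_delay and its default values are assumptions, not part of the script, and a pause alone may not be enough to get past Cloudflare.

```python
import random
import time


def polite_delay(base=2.0, jitter=1.0, sleep=time.sleep):
    """Pause for base seconds plus a random extra of up to jitter seconds.

    The sleep function is injectable so the helper can be tested without waiting.
    Returns the number of seconds requested.
    """
    delay = base + random.uniform(0, jitter)
    sleep(delay)
    return delay
```

Calling polite_delay() before each episode download spaces the requests out by two to three seconds with the defaults shown.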
No Episodes Found
Check the debug files in the downloads directory:
- Inspect page_debug.html to see what was loaded
- Check script_data_*.json for episode data structure
- Manually extract audio URLs and modify the script
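When checking the JSON debug files, a recursive walk can list every value that looks like an MP3 URL regardless of how the data is nested. The sketch below assumes the files are valid JSON and that audio URLs end in .mp3 (optionally followed by a query string); iter_audio_urls and audio_urls_in_debug_files are hypothetical helpers.

```python
import json
from pathlib import Path


def iter_audio_urls(node):
    """Yield every string in parsed JSON whose path part ends in .mp3."""
    if isinstance(node, dict):
        for value in node.values():
            yield from iter_audio_urls(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_audio_urls(item)
    elif isinstance(node, str) and node.split("?", 1)[0].lower().endswith(".mp3"):
        yield node


def audio_urls_in_debug_files(directory):
    """Collect candidate URLs from all script_data_*.json files in a directory."""
    urls = []
    for path in sorted(Path(directory).glob("script_data_*.json")):
        try:
            data = json.loads(path.read_text(encoding="utf-8"))
        except json.JSONDecodeError:
            continue  # skip files that are not valid JSON
        urls.extend(iter_audio_urls(data))
    return urls
```

Running audio_urls_in_debug_files() on the downloads directory gives a flat list of candidates to paste into the script.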
Manual Extraction
If automatic detection fails, you can add episodes by hand: modify the scrape_episodes() method so that it populates the self.episodes list directly:
self.episodes = [
    {'url': 'https://...episode1.mp3', 'title': 'Episode 1', 'number': 1},
    {'url': 'https://...episode2.mp3', 'title': 'Episode 2', 'number': 2},
    # ...
]
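With many episodes, typing each entry is tedious. A list in the same shape can be generated from plain URLs, as in this sketch; build_episodes is a hypothetical helper, and the generic "Episode N" titles are an assumption to be replaced with real titles where known.

```python
def build_episodes(urls, title_prefix="Episode"):
    """Turn an ordered list of audio URLs into episode dicts numbered from 1."""
    return [
        {"url": url, "title": f"{title_prefix} {number}", "number": number}
        for number, url in enumerate(urls, start=1)
    ]
```

Inside scrape_episodes(), the result could be assigned with self.episodes = build_episodes(urls), where urls is the manually collected list in playback order.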
Target Series
Current target: Zbyněk Fišer / Egon Bondy a Státní bezpečnost
- 15 episodes total
- Released daily from January 21, 2026
- ~25-27 minutes per episode