Why Automate
The WhoisExtractor API refreshes data once per day at 6:00 AM IST (UTC+5:30). If your research or sales process depends on fresh domain, WHOIS, or website data, manually downloading files each morning is error-prone and easy to forget. A scheduled automated job handles it reliably while your team focuses on using the data.
This article shows how to schedule automated downloads with cron (Linux/macOS) and Task Scheduler (Windows), and how to script them in Python and PHP.
Before setting up automation, read Getting Started with the API to confirm your API key and product IDs.
When to Schedule the Job
Data becomes available after 6:00 AM IST each day. To avoid running the job before the data is ready, schedule it for 7:00 AM IST or later, which is 01:30 UTC.
If you are in a different timezone:
| Your timezone | Schedule at |
|---|---|
| UTC | 01:30 |
| US Eastern (EST) | 8:30 PM previous day |
| US Pacific (PST) | 5:30 PM previous day |
| UK (GMT) | 01:30 |
| Australia Eastern (AEST) | 11:30 AM |
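If your timezone is not in the table, the conversion can be computed with Python's standard `zoneinfo` module. A quick sketch (the IANA zone names are assumptions; note that zones observing daylight saving time shift the local run time by an hour for part of the year):

```python
from datetime import datetime
from zoneinfo import ZoneInfo

# 7:00 AM IST on an arbitrary date (IST observes no daylight saving time)
run_at = datetime(2024, 1, 15, 7, 0, tzinfo=ZoneInfo("Asia/Kolkata"))

for tz in ["UTC", "America/New_York", "America/Los_Angeles", "Australia/Brisbane"]:
    local = run_at.astimezone(ZoneInfo(tz))
    print(f"{tz}: {local.strftime('%a %H:%M')}")
```

Substitute your own IANA zone name in the list to get the matching local schedule time.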
Linux / macOS - Cron
Open your crontab:
crontab -e
Add a line to run the download script every day at 02:00, comfortably after the data is ready if your server clock is set to UTC. Note that cron interprets schedules in the server's local timezone, so adjust the hour if your server runs in another zone:
0 2 * * * /bin/bash /home/user/scripts/download-whois.sh >> /home/user/logs/whois-download.log 2>&1
Example Shell Script
#!/usr/bin/env bash
set -euo pipefail
API_KEY="YOUR_API_KEY"
PID="3"
DATE=$(date -u +%Y-%m-%d)
OUTPUT_DIR="/data/whois"
OUTPUT_FILE="${OUTPUT_DIR}/whois-${DATE}.zip"
mkdir -p "$OUTPUT_DIR"
echo "[$(date -u)] Downloading WHOIS data for ${DATE}..."
curl --silent --show-error --fail \
"https://get.whoisextractor.com/?api=${API_KEY}&pid=${PID}&date=${DATE}" \
--output "$OUTPUT_FILE"
echo "[$(date -u)] Saved to ${OUTPUT_FILE} ($(du -sh "$OUTPUT_FILE" | cut -f1))"
# Optional: unzip immediately
unzip -o "$OUTPUT_FILE" -d "${OUTPUT_DIR}/csv/${DATE}/"
echo "[$(date -u)] Done."
Make it executable:
chmod +x /home/user/scripts/download-whois.sh
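A truncated transfer or an error body saved with a `.zip` name will break the later `unzip` step. A small guard can catch that before unzipping; this is a sketch (the helper name is made up for this example, and the two-byte magic check only detects content that is not a ZIP at all):

```shell
# Return success only if the file begins with the ZIP magic bytes "PK".
is_zip() {
    [ "$(head -c 2 "$1" 2>/dev/null)" = "PK" ]
}

# Example usage inside the download script, before the unzip step:
# is_zip "$OUTPUT_FILE" || { echo "Download is not a ZIP archive" >&2; exit 1; }
```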
Windows - Task Scheduler
PowerShell Script
Save this as C:\Data\WhoisDownload\download-whois.ps1:
$ApiKey = "YOUR_API_KEY"
# Avoid naming this $Pid: $PID is a reserved automatic variable in PowerShell
$ProductId = "3"
$Date = (Get-Date).ToString("yyyy-MM-dd")
$OutDir = "C:\Data\WhoisDownload"
$OutFile = "$OutDir\whois-$Date.zip"
if (-not (Test-Path $OutDir)) { New-Item -ItemType Directory -Path $OutDir | Out-Null }
$Url = "https://get.whoisextractor.com/?api=$ApiKey&pid=$ProductId&date=$Date"
Write-Host "Downloading WHOIS data for $Date..."
Invoke-WebRequest -Uri $Url -OutFile $OutFile
Write-Host "Saved to $OutFile"
# Unzip
Expand-Archive -Path $OutFile -DestinationPath "$OutDir\csv\$Date" -Force
Write-Host "Done."
Creating the Task Scheduler Job
- Open Task Scheduler (search in Start menu)
- Click Create Basic Task
- Set the trigger to Daily at 07:30 AM IST (or converted to your local time)
- Set the action to Start a program, with powershell.exe as the program
- Add the arguments: -NonInteractive -File "C:\Data\WhoisDownload\download-whois.ps1" (if script execution is blocked by policy, add -ExecutionPolicy Bypass before -File)
- Finish and enable the task
Python Automation Script
This script downloads multiple product IDs in one run, useful if you subscribe to more than one database:
import os
import requests
from datetime import date
API_KEY = "YOUR_API_KEY"
OUTPUT_DIR = "/data/whois"
PRODUCTS = {
"whois-global": 3,
"domains-daily": 10,
"website-daily": 11,
}
today = date.today().strftime("%Y-%m-%d")
os.makedirs(OUTPUT_DIR, exist_ok=True)
for name, pid in PRODUCTS.items():
    url = f"https://get.whoisextractor.com/?api={API_KEY}&pid={pid}&date={today}"
    out_path = os.path.join(OUTPUT_DIR, f"{name}-{today}.zip")
    print(f"Downloading {name} for {today}...")
    response = requests.get(url, stream=True, timeout=300)
    if response.status_code == 200:
        with open(out_path, "wb") as f:
            for chunk in response.iter_content(chunk_size=8192):
                f.write(chunk)
        size_mb = os.path.getsize(out_path) / (1024 * 1024)
        print(f"  Saved {out_path} ({size_mb:.1f} MB)")
    else:
        print(f"  ERROR {response.status_code}: {response.text}")
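A failed or truncated download can also leave a corrupt archive on disk with a 200 status. It can be worth validating the ZIP before anything downstream consumes it; a minimal sketch using only the standard library (the helper name is invented for this example):

```python
import zipfile

def is_valid_zip(path):
    """Return True if path is a well-formed ZIP with no corrupt members."""
    if not zipfile.is_zipfile(path):
        return False
    with zipfile.ZipFile(path) as zf:
        # testzip() returns the first bad member's name, or None if all pass
        return zf.testzip() is None
```

Call it right after saving; if it returns False, delete the file so the next run (or a retry) fetches a fresh copy.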
Handling Errors and Retries
The API occasionally returns a 404 if called before that day's data generation is complete. Add a retry:
import time
MAX_RETRIES = 3
RETRY_DELAY_SECONDS = 600 # wait 10 minutes before retrying
for attempt in range(1, MAX_RETRIES + 1):
    response = requests.get(url, stream=True, timeout=300)
    if response.status_code == 200:
        # save file...
        break
    elif response.status_code == 404 and attempt < MAX_RETRIES:
        print(f"  Data not ready yet - retrying in {RETRY_DELAY_SECONDS // 60} minutes...")
        time.sleep(RETRY_DELAY_SECONDS)
    else:
        print(f"  Failed after {attempt} attempt(s): {response.status_code}")
        break
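If you download several products, the same loop can be factored into one reusable helper so every download shares a single retry policy. A sketch (the helper name is invented here; `fetch` is any zero-argument callable you supply that returns a response object):

```python
import time

def get_with_retry(fetch, max_retries=3, delay_seconds=600):
    """Call fetch() until it returns a response with status_code 200.

    Retries on 404 (that day's data not generated yet); any other
    status, or exhausting the retries, raises RuntimeError.
    """
    for attempt in range(1, max_retries + 1):
        response = fetch()
        if response.status_code == 200:
            return response
        if response.status_code == 404 and attempt < max_retries:
            time.sleep(delay_seconds)
        else:
            raise RuntimeError(
                f"failed after {attempt} attempt(s): {response.status_code}"
            )
```

Usage with the earlier script would look like `response = get_with_retry(lambda: requests.get(url, stream=True, timeout=300))`.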
Keeping Only Recent Files
To avoid filling up disk space, clean up files older than 7 days:
# Linux/macOS
find /data/whois -name "*.zip" -mtime +7 -delete
# Windows PowerShell
Get-ChildItem "C:\Data\WhoisDownload\*.zip" |
Where-Object { $_.LastWriteTime -lt (Get-Date).AddDays(-7) } |
Remove-Item
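If your downloads already flow through the Python script above, the cleanup can live there too, giving you one cross-platform job instead of two OS-specific one-liners. A sketch (the function name is invented; the 7-day window mirrors the commands above):

```python
import time
from pathlib import Path

def prune_old_archives(directory, days=7):
    """Delete *.zip files older than `days` days; return the deleted names."""
    cutoff = time.time() - days * 86400
    removed = []
    for path in Path(directory).glob("*.zip"):
        if path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path.name)
    return sorted(removed)
```

Call it at the end of the daily run, e.g. `prune_old_archives(OUTPUT_DIR)`.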