AI Automation Workflow: Python + WordPress REST API Migration Engine | A Square Solutions

  • 748: Pages fetched via REST API
  • 718: Successful content updates
  • 25: Posts per batch
  • 1.2s: Sleep between batches
  • 0: API failures / HTTP errors
  • 30: Posts spot-checked post-run

The Problem: 718 Pages to Update, Zero Margin for Error

A Square Solutions needed to migrate its internal URL architecture from /services/ to /services/ across every post, page, and structured data block on the site. With 748 pages in total and no reliable count of which contained the old URL pattern, manual editing was not viable.

The requirements were strict: no broken links at any point, no double-replacement on re-run, no blind faith in a single execution completing without interruption. The automation needed to be robust enough to be killed mid-run and restarted safely.

The core engineering challenge was not the migration logic itself — a simple string replace. The challenge was making it safe to run multiple times on a live production WordPress site without risk of corrupting already-processed content. This is the idempotency problem, and it shaped every design decision.

This case study documents the engineering choices, the real failures encountered (stdout buffering, timeout handling, the 68-post second-run requirement), and the lessons that apply to any batch automation targeting a live WordPress REST API.

Idempotent Batch Architecture

Core Design Principles

  1. Read before write. Every post was fetched with context=edit to get raw (unrendered) content before any update was attempted. This prevented updating already-migrated posts and gave accurate before/after comparison.
  2. Skip if target string absent. If /services/ was not in the raw content, the post was logged as SKIP and no API write was made. On a second run, all previously updated posts hit this branch, so a re-run costs only the read requests and cannot double-replace.
  3. Replace all three string variants. The old URL appeared in three forms: absolute URL with domain, root-relative href with double quotes, and root-relative href with single quotes. All three patterns were replaced in a single pass to avoid partial migration states.
  4. Batch with conservative rate limiting. Posts were fetched in pages of 25 via ?per_page=25&page=N. A 1.2-second sleep between batches was enforced based on observed server response times (~5.5s per POST), not on any API rate limit enforcement.
  5. Log everything to stdout. Every post ID, action (SKIP/UPDATE/ERR), and HTTP status was printed immediately — not buffered. This allowed progress monitoring and post-run audit without a separate logging system.

Core Script Pattern

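import os
import time

import requests
from requests.auth import HTTPBasicAuth

D = "https://asquaresolution.com"  # site domain, used for the absolute-URL variant
BASE = D + "/wp-json/wp/v2"
# Application password read from the environment (the variable name is an assumption)
APPLICATION_PASSWORD = os.environ["WP_APPLICATION_PASSWORD"]
auth = HTTPBasicAuth("cantact", APPLICATION_PASSWORD)
OLD = "/services/"
NEW = "/services/"

page = 1
while True:
    r = requests.get(
        BASE + f"/posts?per_page=25&page={page}&context=edit",
        auth=auth,
    )
    # WordPress answers a page number past the last page with HTTP 400 and an
    # error object, so stop on any non-200 response as well as on an empty list.
    if r.status_code != 200:
        break
    posts = r.json()
    if not posts:
        break  # exhausted pagination

    for post in posts:
        pid = post["id"]
        raw = post["content"]["raw"]

        if OLD not in raw:
            print(f"  SKIP ID={pid}", flush=True)
            continue  # idempotent: already migrated

        # Replace all three variants in a single pass
        updated = (raw
            .replace(D + OLD, D + NEW)                    # absolute URL with domain
            .replace(f'href="{OLD}"', f'href="{NEW}"')    # double-quoted href
            .replace(f"href='{OLD}'", f"href='{NEW}'"))   # single-quoted href

        r2 = requests.post(
            BASE + f"/posts/{pid}",
            auth=auth,
            json={"content": updated},
        )
        status = r2.status_code
        print(f"  {'OK ' if status == 200 else 'ERR'} ID={pid} HTTP={status}", flush=True)

    page += 1
    time.sleep(1.2)  # between batches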
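
The replacement step can also be isolated as a pure function, which makes the idempotency claim easy to check without touching the API. This is a minimal sketch, not the production script: the function name migrate_content, the example.com domain, and the placeholder paths /old-path/ and /new-path/ are all assumptions chosen for illustration.

```python
from typing import Tuple

DOMAIN = "https://example.com"   # hypothetical domain for the example
OLD = "/old-path/"               # placeholder, not the real old path
NEW = "/new-path/"               # placeholder, not the real new path


def migrate_content(raw: str) -> Tuple[str, bool]:
    """Return (content, changed) after replacing the three URL variants."""
    if OLD not in raw:
        return raw, False  # already migrated or never affected: skip
    updated = (
        raw.replace(DOMAIN + OLD, DOMAIN + NEW)       # absolute URL
        .replace(f'href="{OLD}"', f'href="{NEW}"')    # double-quoted href
        .replace(f"href='{OLD}'", f"href='{NEW}'")    # single-quoted href
    )
    # Report a change only if one of the three variants actually matched,
    # so the caller never issues a write that would be a no-op.
    return updated, updated != raw


sample = (
    '<a href="/old-path/">A</a> '
    "<a href='/old-path/'>B</a> "
    '<a href="https://example.com/old-path/">C</a>'
)
first, changed_first = migrate_content(sample)
second, changed_second = migrate_content(first)
print(changed_first, changed_second)
```

Applying the function a second time leaves the content untouched, which is the property that makes a restart safe. The extra updated != raw check is a small hardening beyond the script above: it avoids a pointless POST when the old path appears only in a form that none of the three patterns cover.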

Why No Database Direct Access?

Direct MySQL UPDATE on wp_posts.post_content would have been faster (seconds vs 70 minutes), but it bypasses WordPress’s content pipeline: no hook fires, no cache invalidation, no revision history. The REST API approach was chosen deliberately — it ensures WordPress processes each update through its normal stack, maintaining consistency with how the CMS expects content to be modified.

Real Failures and How We Handled Them

stdout Buffering Made the Script Appear Stalled

On first run, the terminal showed no output for the first ~45 seconds. The script appeared to have hung. The cause: Python’s stdout was being line-buffered when run in a terminal but block-buffered when run via certain shell invocations. The print statements were queued, not flushed.

Recovery: Added sys.stdout.flush() after each batch’s print block. An alternative on any platform is to run with python -u script.py (unbuffered mode) or to set the PYTHONUNBUFFERED environment variable. The script had not stalled; it was silently processing the first batch.
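
One way to make the flush automatic is to route every progress line through a small helper. The sketch below is illustrative only; the helper name log and its stream parameter are assumptions, and the original script may simply have called sys.stdout.flush() directly.

```python
import io
import sys


def log(action: str, post_id: int, status=None, stream=None) -> None:
    """Print one progress line and flush it immediately."""
    stream = stream if stream is not None else sys.stdout
    suffix = f" HTTP={status}" if status is not None else ""
    # flush=True pushes the line out even when stdout is block-buffered,
    # for example when output is piped or redirected to a file.
    print(f"  {action:<4} ID={post_id}{suffix}", file=stream, flush=True)


# Write to an in-memory stream so the sketch can be checked offline.
buffer = io.StringIO()
log("SKIP", 101, stream=buffer)
log("OK", 102, status=200, stream=buffer)
lines = buffer.getvalue().splitlines()
```

Flushing per line rather than per batch costs almost nothing at this volume and means a killed run still leaves a complete record up to the last processed post.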

Single Slow POST Blocking Progress

Within batch 12 (~posts 275–300), one POST request took 47 seconds to return a 200 response. The script appeared frozen. This was a server-side issue: the specific post had complex nested shortcode content that WordPress’s update hooks processed slowly. No timeout had been set on the requests calls, and the library’s default is to wait indefinitely, so there was no upper bound on how long the script could block.

Recovery: Added timeout=90 to all requests calls. If a single POST exceeds 90 seconds, it raises requests.exceptions.Timeout which the except block catches, logs as ERR, and continues to the next post. The slow post was manually re-run in a targeted single-post script after the main batch completed.
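
The catch-log-continue behaviour described above can be sketched as follows. This is an assumption-laden illustration rather than the original except block: the function name update_post and the injectable send parameter are hypothetical, added so the timeout path can be exercised without a live server.

```python
import requests

TIMEOUT_SECONDS = 90  # value quoted in the text


def update_post(url, content, auth=None, send=requests.post):
    """POST new content; return (label, http_status_or_None)."""
    try:
        resp = send(url, auth=auth, json={"content": content},
                    timeout=TIMEOUT_SECONDS)
    except requests.exceptions.Timeout:
        # Log-and-continue: the caller moves on to the next post.
        return "ERR", None
    return ("OK" if resp.status_code == 200 else "ERR"), resp.status_code


def fake_slow_send(url, **kwargs):
    """Stand-in for requests.post that simulates a server that never answers."""
    raise requests.exceptions.Timeout("simulated timeout")


result = update_post("https://example.com/wp-json/wp/v2/posts/1",
                     "<p>body</p>", send=fake_slow_send)
print(result)
```

Because the skip check makes re-runs safe, a post that times out does not need special bookkeeping: it still contains the old string and will be picked up by the next pass or by a targeted single-post run.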

Terminal Session Timeout at ~650 Posts

The first run processed approximately 650 posts (26 batches of 25) before the terminal session was interrupted. No error was thrown — the process was simply killed when the session closed. At this point, 68 posts remained unprocessed.

Recovery: Because the script was idempotent, a second run was started with no modifications. All 650 previously processed posts hit the SKIP branch (no /services/ in raw content). The 68 remaining posts were processed normally. Total second-run time: approximately 8 minutes. This is the intended behaviour of the idempotent design.

Pages Endpoint Required Separate Script

The WordPress REST API separates posts (/wp/v2/posts) from pages (/wp/v2/pages). The initial script only targeted /posts. Seven pages — including the /services/ page itself — still contained the old URL after the first full post run.

Recovery: Ran a separate pass against the /pages endpoint using identical logic. 7 pages were found; all 7 were updated. The /services/ page (ID 8) had a nav link that also required a specific fix in its navigation block structure.
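
A single loop over both resource types avoids forgetting one of them. The sketch below assumes a hypothetical fetch_page(endpoint, page) callable that returns one page of decoded JSON, or None once pagination is exhausted; an in-memory fake stands in for the REST API so the example runs offline.

```python
from typing import Callable, Iterator, List, Optional, Tuple

ENDPOINTS = ("posts", "pages")  # custom post types would be added here


def iter_items(
    fetch_page: Callable[[str, int], Optional[List[dict]]]
) -> Iterator[Tuple[str, dict]]:
    """Yield (endpoint, item) for every item on every endpoint."""
    for endpoint in ENDPOINTS:
        page = 1
        while True:
            items = fetch_page(endpoint, page)
            if not items:
                break  # exhausted this endpoint, move to the next
            for item in items:
                yield endpoint, item
            page += 1


# In-memory stand-in for the REST API (hypothetical data).
FAKE_SITE = {
    "posts": [[{"id": 1}, {"id": 2}], [{"id": 3}]],
    "pages": [[{"id": 8}]],
}


def fake_fetch(endpoint, page):
    pages = FAKE_SITE[endpoint]
    return pages[page - 1] if page <= len(pages) else None


seen = [(endpoint, item["id"]) for endpoint, item in iter_items(fake_fetch)]
print(seen)
```

In a real run, fetch_page would wrap a requests.get call against the endpoint with per_page=25 and context=edit, and return None on any non-200 response, since WordPress rejects a page number past the last page with an HTTP 400 error rather than an empty list.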

Runtime and Throughput Data

| Metric | Observed Value | Notes |
| --- | --- | --- |
| Average GET response time | ~1.8s per batch of 25 | Fetching 25 posts with context=edit in a single request |
| Average POST response time | ~5.5s per update | WordPress processes hooks, invalidates cache, saves revision |
| Rate limit sleep | 1.2s between batches | Conservative; no API rate limit encountered |
| Effective throughput | ~10.5 posts/minute | 25 posts ÷ ((25 × 5.5s POST) + 1.8s GET + 1.2s sleep) × 60 ≈ 10.7; observed 650 posts ÷ 62 minutes ≈ 10.5 |
| Run 1: posts processed | ~650 | Approx. 26 batches before timeout |
| Run 1: elapsed time | ~62 minutes | 650 posts ÷ ~10.5 posts/minute |
| Run 2: posts processed | 68 | 650 SKIP + 68 UPDATE |
| Run 2: elapsed time | ~8 minutes | 68 posts + 650 SKIP checks (fast) |
| HTTP errors | 0 | No 4xx or 5xx responses across either run |

The 5.5s average POST time is higher than typical because each WordPress REST API update triggers: content sanitisation, hook firing (Rank Math, Astra, cache plugins), revision creation, and LiteSpeed Cache invalidation. This is the cost of using the proper API rather than direct DB access — and it’s worth it for production data integrity.
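
The runtime figures can be sanity-checked from the averages in the table. The sketch below is a back-of-envelope estimate, not a measurement; it assumes every post in a Run 1 batch needed an update and that skipped posts cost only their share of the batch GET and sleep. The function names are illustrative.

```python
BATCH_SIZE = 25
POST_SECONDS = 5.5   # average per update, from the table
GET_SECONDS = 1.8    # average per batch fetch, from the table
SLEEP_SECONDS = 1.2  # pause between batches


def posts_per_minute(batch_size=BATCH_SIZE):
    """Estimated throughput when every post in a batch needs an update."""
    batch_seconds = batch_size * POST_SECONDS + GET_SECONDS + SLEEP_SECONDS
    return batch_size / batch_seconds * 60


def skip_only_minutes(total_posts, batch_size=BATCH_SIZE):
    """Estimated fetch-and-sleep overhead for paging through total_posts."""
    batches = -(-total_posts // batch_size)  # ceiling division
    return batches * (GET_SECONDS + SLEEP_SECONDS) / 60


rate = posts_per_minute()
run1_minutes = 650 / rate
run2_minutes = 68 * POST_SECONDS / 60 + skip_only_minutes(718)
print(f"{rate:.1f} posts/min, run 1 ~{run1_minutes:.0f} min, run 2 ~{run2_minutes:.1f} min")
```

Under these assumptions the estimate comes out near 10.7 posts per minute, about 61 minutes for 650 posts, and just under 8 minutes for the second run, which is in line with the observed ~62 and ~8 minutes.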

How We Confirmed It Worked

  1. Spot-check 30 posts. After Run 2, 30 post IDs were selected systematically (every 24th post from the full list). Each was fetched with context=edit and inspected for presence of /services/ and absence of /services/ in raw content. All 30 passed.
  2. Run 3 (verification only). A read-only pass fetched all posts and counted occurrences of /services/ without making any writes. Result: 0 occurrences across all 748 posts. Migration confirmed complete.
  3. Manual browser check. Navigated to 10 posts in a browser, inspected rendered HTML source for /services/ references. Zero found. LiteSpeed Cache had been purged before browser checks.
  4. Schema validation. Ran the Schema Markup Validator on 5 pages after migration. No new schema errors introduced by the content updates. The URL replacements in structured data JSON-LD blocks were correctly handled by the string replacement pass.
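
The first two checks reduce to two small operations: picking a systematic sample and counting leftover occurrences without writing anything. The sketch below is an assumption about how such checks could be written, with hypothetical function names, stand-in IDs, and placeholder paths; the source specifies only "every 24th post" and a sample of 30.

```python
def systematic_sample(post_ids, step=24, limit=30):
    """Pick every `step`-th ID (the 24th, 48th, ...), capped at `limit`."""
    return post_ids[step - 1::step][:limit]


def count_occurrences(raw_contents, needle):
    """Read-only check: total occurrences of `needle` across all contents."""
    return sum(raw.count(needle) for raw in raw_contents)


all_ids = list(range(1, 749))            # stand-in IDs for 748 items
sample_ids = systematic_sample(all_ids)
remaining = count_occurrences(
    ['<a href="/new-path/">ok</a>', "<p>no links</p>"],  # stand-in raw content
    "/old-path/",                                         # placeholder old path
)
print(len(sample_ids), remaining)
```

In a real verification pass the raw contents would come from the same context=edit fetch loop used for the migration, with the write step removed entirely.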

Automation Patterns That Apply Elsewhere

The engineering patterns from this migration are not specific to URL migration. They apply to any bulk WordPress content operation — schema injection, SEO meta updates, content block insertion, or structured data patching.

| Pattern | Implementation | Use Case |
| --- | --- | --- |
| Idempotent skip | Check for target string before write | Any bulk update that may be interrupted and restarted |
| context=edit fetch | Add ?context=edit to all GET requests | Required for raw content; rendered content strips code blocks |
| Conservative rate limit | 1.2s sleep + timeout=90 per request | Any live production API; prevents server overload |
| Pagination exhaustion | Loop until empty response, not fixed page count | Correct for any API where total count may change during run |
| Unbuffered stdout | python -u or sys.stdout.flush() | Long-running scripts where progress must be monitored |
| Separate posts/pages endpoints | Run independent loops for /posts and /pages | WordPress always; posts and pages are different REST resources |

Experiments at the AI Execution Lab

The batch architecture developed for this migration is documented and tested further at the AI Execution Lab — A Square Solutions’ operational testing environment for automation scripts, LLM integrations, and WordPress workflow tooling.

The Lab is where we stress-test batch sizes, observe real response time distributions, and develop the rate limiting heuristics that feed back into production scripts. The 1.2s/batch and 25 posts/page figures came from Lab observations before the production run.

Current Lab experiments in progress related to this case study:

  • Adaptive rate limiting: adjusting batch sleep based on observed response time variance
  • Error recovery patterns: automatic retry with exponential backoff for HTTP 5xx responses
  • Schema injection automation: inserting structured data blocks into existing Gutenberg pages without breaking block markup
  • Multi-site batch tooling: extending the pattern to manage multiple WordPress installations from a single script

Tools Used

Python 3.x
requests library
HTTPBasicAuth
WordPress REST API v2
LiteSpeed Cache (purge)
Schema Markup Validator
Google Search Console
AI Execution Lab

Engineering Lessons

Lesson 1

Idempotency is the most important property of any batch automation. The ability to kill and restart the script without fear made the timeout incident a non-event. Without idempotency, a mid-run failure would have required manual audit of which posts had been updated and which had not — potentially hours of work.

Lesson 2

Always set request timeouts in production automation. The default requests library timeout is None — it will wait indefinitely. One slow POST holding the entire batch for 47 seconds was the result. timeout=90 is a reasonable default for WordPress REST API calls on shared hosting.

Lesson 3

Test stdout buffering before a long run. A script that appears stalled is psychologically difficult to trust. Confirm flush behaviour before starting any run longer than 5 minutes. python -u is the lowest-friction fix.

Lesson 4

WordPress posts and pages are different REST endpoints. The /wp/v2/posts endpoint does not include pages. Any operation targeting the full site must loop both /posts and /pages separately. Custom post types require additional endpoint loops.

Lesson 5

The REST API’s cost is the guarantee. Direct MySQL access would have been orders of magnitude faster (seconds rather than roughly 70 minutes), but would have bypassed cache invalidation, hook firing, and revision history. For production content on a live site, the API overhead is the correct tradeoff: it keeps WordPress’s internal state consistent.

Lesson 6

A verification-only run is underrated. Run 3 (read-only, count occurrences of old string) cost ~3 minutes and gave definitive confirmation that the migration was complete. This is cheaper than checking GSC, safer than trusting the spot-check sample, and produces a clear audit record.

Methodology & Execution Context

This case study documents work performed on asquaresolution.com (WordPress, Astra, Elementor, Rank Math, LiteSpeed). Changes were applied programmatically through the WordPress REST API in staged, idempotent batches and verified live after each step. Reported figures and timelines reflect this specific site architecture and the crawl and indexing conditions at the time of implementation.

Limitations

Results reflect this site’s architecture and the crawl and indexing conditions during the engagement. AI citation behavior and search algorithms change frequently, and not all implementations generalize identically. Timelines depend on crawl frequency, domain authority, and content depth.
