Planning
For any migration, it’s important to know what we’re getting ourselves into so we can assess the time/resourcing needed and also get out in front of any potential issues. The following questionnaire should be used whenever we receive a new migration request (this can be used internally and also sent to the partner where appropriate).
Import Questionnaire (should be used for all imports/migrations)
- How many posts, pages, images, and comments are in this migration?
- What’s the combined file size of all your media assets? This helps us estimate approximate import time.
- How many users are in this migration? How many of them require “real” WordPress accounts with login access, and how many of them require “guest” accounts with no login access?
- Please ensure that guest author slugs do not match the WordPress.com username (only applies to WordPress.com imports, not Pressable, etc.), because this will cause problems upon import.
- Are there any custom post types that we should be aware of?
- Are there any custom taxonomies we should be aware of?
- Are there any special considerations or requirements for redirects that we should keep in mind?
- Are there any special considerations for content segmentation (e.g. different languages)?
- Is the site content ready to go live, or do you plan on doing edits and cleanup post-import?
- Will the content be imported into a production/live site that is already receiving traffic?
- Will there be any content merges after launch that we should know about (a merge would be importing content from a different source/site)?
- By what date will you be able to get the full export to us?
- By what date do you hope to have the full import completed?
The above questions apply to new migrations, deltas (imports containing new content prior to launch to “catch up” the staging site), and also imports into a production/live site (i.e. a merge of another web property).
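When the partner has already sent a WXR export, we can sanity-check their answers to the first question ourselves. The following is a minimal sketch, not an official tool: the function names are made up for illustration, and it assumes the export puts each `<wp:post_type>` and `<wp:comment>` element on its own line, because `grep -c` counts matching lines rather than occurrences.

```shell
# Count items of one post type in a WXR file.
# Matches both the CDATA-wrapped and the plain form of <wp:post_type>.
wxr_count_type() {
  grep -cE '<wp:post_type>(<!\[CDATA\[)?'"$2"'(\]\]>)?</wp:post_type>' "$1"
}

# Count comments in a WXR file (one opening <wp:comment> tag per comment).
wxr_count_comments() {
  grep -c '<wp:comment>' "$1"
}

# Usage (export.xml is a placeholder filename):
#   wxr_count_type export.xml post
#   wxr_count_type export.xml page
#   wxr_count_type export.xml attachment
#   wxr_count_comments export.xml
#
# Combined media size, run on the source server if we have access:
#   du -sh wp-content/uploads
```

Custom post types can be counted the same way by passing their slug as the second argument.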
Supplemental Questionnaire
(should be used for imports/migrations into production sites where there is already traffic and data present)
The following questionnaire is supplemental and specifically used for imports into a current production/live site.
- Have you thoroughly tested and QA’d this import? Fixing issues on production after the import has completed can be very difficult.
- Have you completed a dry run of any CLI scripts that need to be run? Because we are manipulating content on production, a dry run should always be done first.
- Are there any high traffic events we need to know about during this import? Running an import on production can impact the performance of your site.
- This may be obvious, but to be extra clear: while we run the import, the editorial team should not attempt anything unusual, like switching themes.
Scripting Migrations
When running scripts that create or update posts, it’s essential to set the WP_IMPORTING constant to true. Doing so limits unnecessary actions, reducing the risk of problematic outcomes.
It’s particularly critical on live production environments since imports can trigger emails and other actions, potentially leading to serious implications for our partners’ visitors, customers, and reputation.
Before running scripts, consider if the actions could lead to outcomes like partners and their users receiving unintended emails. If unsure, it’s best to check with the other devs on the team before initiating any data migration.
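One way to make sure the constant is always set is to wrap the WP-CLI call. This is a sketch under assumptions, not an established team tool: it assumes the migration script is a PHP file run with `wp eval-file`, and it relies on WP-CLI's global `--exec` parameter, which should run the given PHP before the command itself. `run_import_script` and `WP_BIN` are hypothetical names introduced here.

```shell
# Run a PHP migration script through WP-CLI with WP_IMPORTING defined up front.
# WP_BIN is a hypothetical override: set WP_BIN=echo to print the arguments
# instead of executing anything.
run_import_script() {
  "${WP_BIN:-wp}" --exec='define( "WP_IMPORTING", true );' eval-file "$1"
}

# Usage (fix-post-meta.php is a placeholder script name):
#   WP_BIN=echo run_import_script fix-post-meta.php   # preview the command only
#   run_import_script fix-post-meta.php               # real run
```

Scripts that define the constant themselves do not need the wrapper. Either way, it is worth confirming on a staging site that no emails go out before touching production.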
Tools
We have an import fixer plugin here that provides a number of WP CLI subcommands for fixing and cleaning up imports on (non-WordPress.com) sites.
The WordPress.com codebase also has a number of import/export tools for fixing up various kinds of situations. The plugin linked above is an attempt to formalize and improve upon those, as many of them are not suitable for production use in other environments. However, they may still be helpful as a reference point for new tools.
Platform Quirks/Considerations
Pressable
- For simple WordPress-to-WordPress site migrations, use Pressable’s Automated Migration plugin (be sure to deactivate any plugins that are connected to third-party APIs until the migrated site is live).
- View this page for a comprehensive checklist to complete a migration to Pressable
- Image src backfill needs to be throttled here in the importer to avoid database replication lag (import tool coming for this soon).
- Also consider how large the import is in terms of tags, images, and other content. Imports with hundreds or thousands of media assets won’t complete using the web importer. In addition, any database tables that are not prefixed with the site’s table prefix will be skipped, so data that a site stores in its own custom tables gets missed by the importer. Note that VaultPress misses these tables as well.
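To find out ahead of time whether a source site has tables the importer would skip, we can compare its table list against its prefix. The sketch below assumes WP-CLI is available on the source site and that `wp db tables --all-tables` prints one table name per line. `unprefixed_tables` is a hypothetical helper name.

```shell
# Print table names read from stdin that do NOT start with the given prefix.
# Uses a literal prefix comparison, so regex characters in the prefix are safe.
unprefixed_tables() {
  awk -v p="$1" 'length($0) && index($0, p) != 1'
}

# Usage on the source site:
#   wp db tables --all-tables | unprefixed_tables "$(wp db prefix)"
```

Any table this prints would need to be exported and imported separately.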
WordPress.com
- The WordPress.com web importer is much more robust since it uses the jobs system to import and therefore can run much longer without failing. It is still, however, a better choice to use the CLI importer on a WordPress.com sandbox for anything that’s not a small import.
- User mapping/account creation will need extra care since accounts cannot be created by the importer automatically due to WordPress.com’s shared user table (you have one username across all of WPCOM). It’s generally a better idea to create all user accounts prior to import and create an import mapping file (a CSV) to assign users to their posts correctly during initial import.
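A starting point for the mapping file can be generated from the export itself. The sketch below pulls the unique `<dc:creator>` logins out of a WXR and writes a CSV skeleton with the second column left blank for us to fill in. The column names are an assumption modeled on the mapping file that WP-CLI's `wp import --authors=<file>` reads. The WordPress.com CLI importer may expect a different layout, so check before relying on it. `wxr_author_mapping_skeleton` is a hypothetical helper name.

```shell
# Print a CSV skeleton of author logins found in a WXR file.
# Handles both CDATA-wrapped and plain <dc:creator> values.
wxr_author_mapping_skeleton() {
  printf 'old_user_login,new_user_login\n'
  sed -n 's/.*<dc:creator>\(<!\[CDATA\[\)\{0,1\}\([^]<]*\).*/\2/p' "$1" \
    | sort -u \
    | sed 's/$/,/'
}

# Usage (export.xml and mapping.csv are placeholder filenames):
#   wxr_author_mapping_skeleton export.xml > mapping.csv
```

Once every login in the first column has a matching, already-created account in the second column, the file is ready to hand to the importer.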
WordPress.com VIP
- We’ve run into character-encoding issues on migrations from WordPress.com VIP class to Pressable.
- We’ve run into issues where references to images have needed special care.
- Akismet keys should be transferred to maintain ███████████████.
Import Tips & Tricks
Want to list post dates from a WXR in ascending chronological order? Try:
$ grep "<wp:post_date>" example.wordpress.2018-11-27.xml | awk -F "\[" '{print substr($3,1,19)}' | sort
