Published on July 19, 2022
Folks, we have an important question. Are you considering migrating your content from Drupal to Contentful? Then this post will show you how. Carpe diem!
Drupal 7 is officially reaching end of life on November 1, 2023. After that point, all support for this legacy content management system will come to a close.
So how should you plan for the change? Sure, you could start using the latest version of Drupal. Or — and maybe this is a wild and crazy idea, but hear us out — you could move everything over to a new CMS.
After all, if you’ve taken this long to move on from Drupal 7, there’s likely a good reason why you were taking your time.
Let’s hazard a guess: Was it a feeling of dissatisfaction with using a monolithic CMS? Is it too complex to maintain? Do you desire a conscious uncoupling? To upgrade to something more versatile for your content needs?
It’s time to use Contentful! We have a lot of resources on this site to outline the benefits of adopting a headless CMS. But rather than dive into a discussion about content strategy and data migration, this post explains how to migrate content from Drupal 7 to Contentful.
We’ll be outlining the workflow of a migration project step by step, together with examples, migration tools, and command line prompts.
Before we dive in, there are two prerequisites for this guide to be useful. First, you have worked with Drupal 7. Second, you have some knowledge of Contentful's command line interface (CLI).
Ready? Let’s go!
A content audit is an essential stage of any migration process. Don’t embark on a migration without one, unless you’re working purely on a “lift and shift” effort.
First, get a clear understanding of the content you want to migrate from Drupal 7. Extensive discussions with stakeholders will guide the audit and establish the parameters of the migration.
In this sample checklist, we cover questions like content types, taxonomies, user profiles, and more. The answers are unique to your requirements and your content, and should probably be tracked in an Excel spreadsheet or similar.
Content (types)
Create a full list of content types and their fields.
Once you know whether all of the content or only some of it needs to be migrated, decide which content to exclude within each content type.
For example, do you really need to migrate old content published before a certain date?
Do you need to migrate archived content from the old website too?
Consult your SEO data and Google Analytics to check visits against the path of each node, and ensure you only migrate valuable content. Look for indicators like:
Overall traffic
Inbound links
Time spent on pages
For removed content, it would make sense to start thinking about collecting the URL paths for 301 redirects.
Taxonomies
Which tags can be imported?
Will content with unique tags be imported too?
User profiles
These can’t be imported into Contentful’s user profiles, but rather into a new content type, e.g., Blog Authors.
Views that are used
Outdated views may hint at data that is maintained but never used on the website.
We recommend that you use a site crawler to identify all URLs on your Drupal 7 website. Some pages may already have their redirects in place. Once you have migrated all of the content and are ready for QA, checking the collected URL paths against the new site will reveal any broken links or missing redirects.
Now that the audit is done and you have an overview of your content inventory, you can create a new content model with your stakeholders to come up with a simple structure.
In this example, we have a content model representing the structure of a blog:
This is the part that will help you move content from Drupal nodes into Contentful entries. In theory, you should be able to map each of the Drupal content type fields to an equivalent in Contentful content type fields.
Continuing with our example of the blog content model, this table represents a very simplified use case and mapping from the old CMS to the new system.
Contentful Content Type | Contentful Field | Drupal Content Type | Drupal Field | Comment |
---|---|---|---|---|
Blog | Internal Title | | | Format: [Blog] - [Author name] - [Drupal Title] |
Blog | Title | Blog | Title | |
Blog | Slug | Blog | Path | |
Blog | Body | Blog | Body | for the example we will just import the text |
Blog | Author (reference) | Blog | Author (uid) | user id |
Blog | Hero Image (asset) | Blog | Blog Main Image | |
Blog | Published on | Blog | Published On | |
Blog (Tags) | Tags | Taxonomy | Tags | export taxonomies |
Author | Internal Title | | | Format: [Author] - [First + Last Name] |
Author | First Name | user | Name (first part) | |
Author | Name | user | Name (second part) | |
Author | | user | | |
Author | Photo (Asset) | | | no data to import |
Author | Short Bio | n/a | Biography | |
Author | Phone | | | no data to import |
Author | | | | no data to import |
Author | | | | no data to import |
Author | Github | | | no data to import |
In order to import taxonomies into Contentful, you must decide whether to map your Drupal taxonomy terms to Contentful's built-in tags or to a separate content type.
Here’s a brief overview of the two approaches:
Tags | Content Type |
---|---|
Already built into Contentful | Hierarchy is possible |
Only simplified hierarchy possible | Cross content type queries are more complex |
You can query content across content types | Nesting of entries will have an impact on GraphQL complexity |
Content types do not need to be updated as the reference happens inside the Tags entry editor | Additional content type |
For this example, we’ll only be looking at tags within Contentful.
You can create all tags using the CLI and then map the Contentful tag IDs with the tags you need to import.
If you’d like to create tags using the CLI, you can create them as follows:
The important elements to pay attention to are:
The name as it appears in the web interface
The ID which is used in each entry (see below) to refer to the tags
Visibility (either public or private)
For the import, we will need to generate the following construct for each tag within an entry:
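As a minimal sketch in PHP, each tag reference is a link object inside the entry's metadata. The helper name `tagLink` and the tag IDs used here are illustrative assumptions; the IDs must match the tags you created in your space.

```php
<?php
// Build the link object that points an entry at one Contentful tag.
// The tag ID must already exist in the target environment.
function tagLink(string $tagId): array
{
    return [
        'sys' => [
            'type' => 'Link',
            'linkType' => 'Tag',
            'id' => $tagId,
        ],
    ];
}

// Hypothetical tag IDs, mapped beforehand from the Drupal taxonomy terms.
$tagIds = ['drupal', 'migration'];

// The metadata block that sits next to "sys" and "fields" in each entry.
$metadata = ['tags' => array_map('tagLink', $tagIds)];

echo json_encode($metadata, JSON_PRETTY_PRINT), PHP_EOL;
```

Keeping the Drupal-term-to-tag-ID mapping in one array makes it easy to review with stakeholders before the import.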
To create the JSON export functionality (and to create dummy content for test purposes) you need to install the following modules:
Views & Views UI
If you don’t have views enabled, then we’re not sure why you’re working with Drupal in the first place! =D
Ctools
Required module by devel and others.
UUID
Highly recommended for the export of data to follow Contentful’s guide on entry IDs.
Views data export
Allows for data export with some available options.
Views data export JSON (not covered by Drupal’s security advisory policy) or Views Datasource
Enhances the Views Data Export module by providing a JSON option.
Devel generate
Generate dummy users, nodes, and taxonomy terms.
Realistic Dummy Content
Used to generate dummy content for test purposes.
To create an export view, simply create a new view with the content types you want to export, or start with a single one to work through the process.
You should certainly make use of the filters to exclude old content (in line with the content audit you conducted earlier). For example:
Old, outdated content published before a certain date
Content that’s no longer being used or not published
Excluded via certain criteria (e.g. within a category, written by a certain author, and so on)
For our purposes, we’re only interested in:
Title
User UID
Body
Node UUID
Path
Content type
Updated date (used to display and order the blog post)
This screenshot illustrates how we’ll be migrating content in our example of a blog post:
Note that the JSON format we select for the export is only available because of the modules we installed earlier.
With the above module and Drupal View, you’ll generate a very flat JSON object for the blog post:
We’ll also do the same for the user data:
Please note: Contentful doesn't apply content types to a user the way Drupal 7 does. We'll need to create a custom content type, person, which we can use later for importing.
In this example, we’ll get all assets without filtering.
This generates the following JSON:
Finally, combine all of the data into one JSON object:
The data that we’re going to export will need to match exactly what the CLI importer can handle.
Here is an example of what will be required for our blog post (with unnecessary data removed for the import).
Sample for assets:
You can generate the above by exporting sample content from your space by using the CLI. This will give you the general object structure you require for the import.
A useful tool for creating PHP objects from JSON is Convert JSON Object to PHP Array Online.
For each node that we want to migrate, we will need to create a content-type specific object, and need to adjust the following aspects from the schema above:
Space ID
Environment ID
Content-type ID
Metadata (if applicable)
Fields
These will be specific for each content type
May require localized content
Include assets
Include references to other entries
Unfortunately, an import will assign the user of the CMA key that was used during the migration as the author. Therefore, the following items are not needed in the import.
First, let’s move the export file into our working directory, then create a simple PHP file. Within our new PHP file, let’s get the JSON:
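A minimal sketch of this step could look as follows. The file name, the top-level `nodes` key, and the sample values are assumptions for illustration; the field labels follow the ones selected in the view. The first few lines only create a stand-in file so the snippet runs on its own.

```php
<?php
// Stand-in for the Drupal Views export; in a real run this file already exists.
$exportFile = sys_get_temp_dir() . '/drupal-export.json';
file_put_contents($exportFile, json_encode([
    'nodes' => [
        [
            'Title' => 'Hello world',
            'User UID' => '1',
            'Node UUID' => 'a1b2c3',
            'Path' => 'blog/hello-world',
        ],
    ],
]));

// Read the export and decode it into nested associative arrays.
$raw = file_get_contents($exportFile);
$export = json_decode($raw, true);

// Fail early on malformed JSON instead of looping over null later.
if ($export === null && json_last_error() !== JSON_ERROR_NONE) {
    throw new RuntimeException('Invalid export JSON: ' . json_last_error_msg());
}

echo count($export['nodes']), ' node(s) loaded', PHP_EOL;
```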
Let’s set some variables to reuse:
And now the fun part! Loop through the PHP object and create the structure we need for importing assets or entries:
createEntry generates the skeleton of an entry and calls another function to create the fields.
createFields just calls the specific functions for the different content types and returns the results for each.
Creates the fields for the blogPost content type. Note that for this example we are moving the title into the internal title.
Creates the fields for the person content type. Same as above, we’re just moving the name into the first name and internal name fields. If there are fields that you don't have data for, you can remove them from your code (see the image in this example). Phone, Facebook, etc. could also have been removed.
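The functions described above could be sketched roughly like this. Treat it as an illustration under assumptions, not the original code: the field IDs (`internalTitle`, `slug`, and so on), the `en-US` locale, the placeholder space and environment IDs, the `user-` prefix for author entry IDs, and the helper names `makeLink`, `createBlogPostFields`, and `createPersonFields` are all hypothetical and need to match your own content model.

```php
<?php
const LOCALE = 'en-US';          // assumed default locale
const SPACE_ID = 'yourSpaceId';  // placeholder
const ENVIRONMENT_ID = 'master'; // placeholder

// Generic Contentful link object (Space, Environment, ContentType, Entry, ...).
function makeLink(string $linkType, string $id): array
{
    return ['sys' => ['type' => 'Link', 'linkType' => $linkType, 'id' => $id]];
}

// Fields for the blogPost content type; the title doubles as internal title.
function createBlogPostFields(array $node): array
{
    return [
        'internalTitle' => [LOCALE => $node['Title']],
        'title' => [LOCALE => $node['Title']],
        'slug' => [LOCALE => $node['Path']],
        'body' => [LOCALE => $node['Body']],
        'author' => [LOCALE => makeLink('Entry', 'user-' . $node['User UID'])],
    ];
}

// Fields for the person content type; the name fills both fields.
function createPersonFields(array $user): array
{
    return [
        'internalName' => [LOCALE => $user['Name']],
        'firstName' => [LOCALE => $user['Name']],
    ];
}

// Dispatch to the content-type specific field builder.
function createFields(string $contentType, array $row): array
{
    switch ($contentType) {
        case 'blogPost':
            return createBlogPostFields($row);
        case 'person':
            return createPersonFields($row);
        default:
            throw new InvalidArgumentException("Unknown content type: $contentType");
    }
}

// Skeleton of one entry: metadata, sys, and the generated fields.
function createEntry(string $id, string $contentType, array $row): array
{
    return [
        'metadata' => ['tags' => []],
        'sys' => [
            'id' => $id,
            'type' => 'Entry',
            'space' => makeLink('Space', SPACE_ID),
            'environment' => makeLink('Environment', ENVIRONMENT_ID),
            'contentType' => makeLink('ContentType', $contentType),
        ],
        'fields' => createFields($contentType, $row),
    ];
}

$entry = createEntry('a1b2c3', 'blogPost', [
    'Title' => 'Hello world',
    'Path' => 'blog/hello-world',
    'Body' => 'First post.',
    'User UID' => '1',
]);

echo json_encode($entry, JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES), PHP_EOL;
```

Splitting the skeleton from the per-type field builders means a new content type only needs one extra function and one extra `case`.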
And back to the assets. With the export, we have the desired structure (less the non-required fields), and can call createAsset.
This fills the PHP object with the necessary data. The URL requests the data from a specific folder within the sites/default/files folder. Since we do not have a title for the image, we will be using the image name, but please note that it’s not required to have a title.
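Here is one way createAsset might look. The parameter list, the small MIME type map, the example.com base URL, and the `en-US` locale are assumptions for this sketch; compare the resulting structure with an asset exported from your own space before relying on it.

```php
<?php
// Build one asset object that points at a file still hosted on the Drupal site.
function createAsset(string $id, string $fileName, string $baseUrl, string $locale = 'en-US'): array
{
    // Very small extension-to-MIME map; extend it for your own file types.
    $mimeTypes = [
        'jpg' => 'image/jpeg',
        'jpeg' => 'image/jpeg',
        'png' => 'image/png',
        'gif' => 'image/gif',
    ];
    $extension = strtolower(pathinfo($fileName, PATHINFO_EXTENSION));

    return [
        'sys' => ['id' => $id, 'type' => 'Asset'],
        'fields' => [
            // No title in Drupal, so the file name is used instead.
            'title' => [$locale => $fileName],
            'file' => [
                $locale => [
                    'url' => rtrim($baseUrl, '/') . '/' . rawurlencode($fileName),
                    'fileName' => $fileName,
                    'contentType' => $mimeTypes[$extension] ?? 'application/octet-stream',
                ],
            ],
        ],
    ];
}

$asset = createAsset(
    'img-1',
    'hero image.png',
    'https://example.com/sites/default/files/blog/'
);

echo json_encode($asset, JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES), PHP_EOL;
```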
In the end, let’s just spit the JSON out on the screen or put it into a file:
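A short sketch of that last step, assuming the generated entries and assets were collected in two arrays. The sample data and the output file name are made up for illustration.

```php
<?php
// Hypothetical results of the loops above, reduced to the bare minimum.
$entries = [['sys' => ['id' => 'a1b2c3', 'type' => 'Entry']]];
$assets = [[
    'sys' => ['id' => 'img-1', 'type' => 'Asset'],
    'fields' => ['file' => ['en-US' => ['url' => 'https://example.com/a.png']]],
]];

// Combine everything into one import document.
$import = ['entries' => $entries, 'assets' => $assets];

// JSON_UNESCAPED_SLASHES keeps URLs readable (no "\/" sequences).
$json = json_encode($import, JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES);
if ($json === false) {
    throw new RuntimeException('Encoding failed: ' . json_last_error_msg());
}

// Either print it or write it to a file for the CLI import.
$outputFile = sys_get_temp_dir() . '/contentful-import.json';
file_put_contents($outputFile, $json);
echo $json, PHP_EOL;
```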
By now, we should have a nicely formatted JSON that’s ready for import. We can use Contentful’s CLI for the actual import.
Because you can also use the Drupal node IDs as the unique identifier in the JSON import file, a new entry is created for each node and there should not be any conflicts. You can also reimport the data if needed, since another import based on the same IDs overwrites the existing content.
The content migration in this example was only concerned with text from a blog. But what happens when you’re working with more than just text?
The Drupal Body and Summary fields, plus many other Drupal fields, may contain HTML and possibly even CSS. Unfortunately, migrating content from free-form text fields to Contentful is not straightforward since Contentful uses a Rich Text Editor that stores the data in a JSON object.
But not to worry! A viable solution is to migrate your HTML to Contentful by using Turndown. This migration tool converts HTML to markdown, and then the next step would be to convert the markdown to Rich Text using a markdown converter.
And there you have it! A successful content migration from Drupal 7 to Contentful. Doesn’t that feel so much better now? It's practically a whole new website!
Hopefully along the way you’ll have learned a few new things. Perhaps this exercise would be a good template for another content migration project you have to perform down the line, or you’ve picked up some tips and tricks for web content management in general.
Here’s a summary of some of the lessons we learned while conducting this exercise for ourselves:
PHP date formats are not the same as on Contentful.
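Datetime format in PHP for export: Y-m-d\TH:m:s.ms\Z, which returns 4 digits for milliseconds, while Contentful requires 3 digits and fires a validation error for the import. In PHP's date syntax, m is the month and s is the seconds, so the trailing ms prints four digits (and H:m prints the month where the minutes should be); i stands for minutes and, from PHP 7.0 on, v for milliseconds.
Quick fix: substr_replace($array["Updated date"],"",23,1). Where v is available, a cleaner alternative is the format Y-m-d\TH:i:s.v\Z.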
Drupal node IDs are nice but do not conform to the standards of Contentful. It’s highly recommended to install the UUID module to be able to export them (fewer replacement functions if you do it in the code).
Carefully consider validations and regular expression validation when setting up your content model. Your migrated content will need to pass the validation for the import.
Helpful online tools:
Online JSON to PHP object converter: Convert JSON Object to PHP Array Online
Online PHP date formatter: NSDateFormatter.com - Live Date Formatting Playground for Swift
json_encode: Be sure to set the JSON_UNESCAPED_SLASHES flag, or else it adds backslashes in front of every forward slash.
The file import requires you to have an SSL connection to your remote server (with the test server, we didn’t set this up initially).
Be sure to aggregate data in Drupal views to avoid duplicates in the export, which would otherwise cause 409 errors during the import.
CLI: Ensure you have the latest package installed.
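To illustrate the date format lesson from the list above, here is a small PHP sketch comparing the format string that produced four "millisecond" digits with one that uses `i` for minutes and `v` for milliseconds (PHP 7.0 or later). The sample timestamp is made up.

```php
<?php
// A fixed sample timestamp in UTC, with microseconds.
$updated = new DateTimeImmutable('2022-07-19 14:05:09.123456', new DateTimeZone('UTC'));

// "m" is the month and "s" the seconds, so ".ms" prints month + seconds.
$broken = $updated->format('Y-m-d\TH:m:s.ms\Z');

// "i" is minutes and "v" is milliseconds (three digits).
$fixed = $updated->format('Y-m-d\TH:i:s.v\Z');

echo $broken, PHP_EOL; // 2022-07-19T14:07:09.0709Z
echo $fixed, PHP_EOL;  // 2022-07-19T14:05:09.123Z
```

Note that the quick fix with substr_replace only trims the fourth digit; it does not restore the minutes, so correcting the format string is the safer route when your PHP version allows it.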
The Contentful Professional Services team provides a Content Migration offering as part of a suite of training and consultation products. If you would like some help, don’t hesitate to drop us a line.
Contentful’s API-first content platform is purpose-built for creating omnichannel digital experiences. The platform helps digital teams innovate, iterate, and go to market faster with an agile, modern tech stack that integrates seamlessly with ecommerce tools. Visit the Contentful Marketplace to see these integrations, and read our customer use cases to learn more about how Contentful can help your organization grow its digital footprint.