Drupal to Arc XP migration guide
Why do media companies move away from the largest open-source project in the world?
Talk to us about your Drupal instance, and we’ll help you get rid of it.
Why this guide?
TL;DR: In this long read, you’ll discover:
- What Drupal is, and why so many publishers migrated to it over the past decade
- Why Drupal is no longer a good option for them
- How Arc XP, a SaaS CMS for publishers created by The Washington Post, works and why it’s a better choice than Drupal
- How long a migration from Drupal to Arc XP takes, how much it costs, and how you should organize the project
What is Drupal?
Drupal is an open-source CMS or, more precisely, a framework, created in 2001 by Dries Buytaert. It started as a small message board and, over the past 20 years, became one of the largest and most widely used CMSes in the world.
Drupal is popular because it’s modular
Drupal’s popularity comes from its very modular architecture, with a lightweight core and thousands of modules covering most of the features you might need. It’s also very easy to build your own modules.
The main difference between Drupal and, let’s say, WordPress is that modules in Drupal can communicate with each other through APIs.
In WordPress (and in many other platforms), on the other hand, modules or plug-ins can only communicate with the core, aka the ‘official’ API.
Drupal has structured content storage
Another innovation Drupal introduced back in 2005 is its ability to manipulate structured content.
While almost all CMSes at that time were ‘page builders’, Drupal introduced the CCK (Content Construction Kit) module, which was later integrated into Drupal’s core as the Field API.
Drupal basically got the point very early: content may be presented to the end user in many shapes and colors. Even if it wasn’t crystal clear in 2005, structured content paved the road to Drupal’s success: the launch of the iPhone and the explosion of mobile apps confirmed that the Drupal community had made the right choice.
Drupal has an active and vibrant community
The last, but most important, reason for Drupal’s global success lies in its community.
With the support of the Drupal Association, the community has grown into the largest open-source project in the world, with over a million developers, thousands of contributors, and hundreds of companies working with and contributing to the Drupal codebase.
Why did so many publishers turn to Drupal in the past decade?
During my years running Adyax, one of the leading Drupal agencies in the world, we migrated many publishers to Drupal.
Drupal was selected for the following reasons:
- Features, offered by thousands of modules
- Flexibility in the back-end and front-end
- The total cost of ownership
But the real reason was the lack of competition from SaaS vendors. Publishers were originally heavy users of print-oriented platforms like Eidos or OpenText.
Those vendors’ architecture was print-oriented, and when everything suddenly became web-first, there was a real void in the publishing CMS space. The old platforms weren’t ready, and the new ones hadn’t arrived yet.
Publishers felt tremendous, industry-wide changes in the way their audiences consumed content and news. A decade of testing, failing, and learning started with the launch of the first iPhone.
That’s why, between 2007 and 2017, so many publishers decided to build their own platforms, often using Drupal.
It offered full flexibility to test and experiment with features, came without licensing costs, and was quite cheap to host and run.
Drupal wasn’t the best software for publishers; it was simply the only right choice for them at that moment.
What are the main problems with Drupal?
Drupal is not really a CMS: it’s a framework, a toolbox you can use to build almost anything.
Flexibility is also Drupal’s main problem.
Drupal doesn’t provide out-of-the-box features needed by the publishing industry.
It means that the overall product quality strongly depends on the maturity of the publisher using Drupal for its site. Publishers that were good and invested a few hundred thousand dollars in their platform got a decent publishing CMS (as The Economist did).
But if you didn’t have the maturity or the budget to build something great, you were stuck with a clunky, slow Drupal that cost you more and more each year.
The Drupal community tried to tackle this problem with distributions.
A distribution is a set of pre-installed and pre-configured modules created for a specific vertical.
For example, Drupal Commerce is a distribution for… e-commerce. There is a distribution for publishers too: Thunder.
Thunder is a web-based open-source Content Management System setting new standards for publishers’ CMS. It is based on Drupal, enabling its users to benefit from the Drupal community’s continuous development efforts, as well as specific modules contributed by Hubert Burda Media, other publishers and industry partners — the Thunder Coalition.
But just take a look at the modules installed with Thunder.
All those modules will be added to your code base, and you’ll have to maintain, update, and support them. Each new feature of your website will have to be integrated with all of them.
The complexity of the front-end management
Another big problem we’ve noticed with Drupal while building publisher sites is the complexity of updating front-end templates.
Controlling what goes on a landing page, and how, is key to optimizing engagement with your audience.
In large publications, there is an entire team dedicated to home page management. Every few minutes, something may change depending on headline A/B tests or on what happens in the world.
But because Drupal has to interoperate with every possible use case (remember, Drupal is not a CMS for a single vertical, but a toolbox that might be used for a surprisingly large number of very different sites), setting up and using its layout tools is a complex and painful process.
From my point of view, there are several red flags when it comes to Drupal’s Layout Builder:
- While adding new features to the website (new entity types and embeddable elements), the loading time of all available elements increases drastically.
- Adding new layouts is a complex process. Currently, we need to create .yml files and Twig templates in code.
- With a large number of elements in the layout, dragging elements between sections becomes difficult.
- Sections are hard to access in Twig: their names are suffixed with a UUID, which makes it difficult to render a selected section.
- Headless mode is totally unsupported
You have to care about things you shouldn’t care about
Finally, the most important issue with Drupal is that publishers need to care about things that are not important to their business. A publisher running their site on Drupal will need to care about:
- Developing back-end features to simply create their stories
- Updating and managing hundreds of modules
- Hosting and scaling Drupal (which might be really complicated)
A publisher is not a software company. Its core business is to tell stories, not to scale Drupal on AWS.
What are the most important CMS features for publishers?
So let’s see what actually makes a perfect CMS for publishers.
The main user stories from different roles inside a publishing company are:
- As a journalist, I want to write an article focusing on the content. I want a clean and simple interface; I like Microsoft Word. I want to enrich my article with content from any social media platform, photos, videos, or links. It should be as simple as copy-pasting a URL. I want to be able to pitch my story to different publications and platforms: home page, weekly newsletter, section home page, mobile app, etc. I want to be notified when my story makes progress in the workflow funnel towards publication.
- As an editor-in-chief, I need to be able to supervise my newsroom of 50+ people. I want to know what articles are being written and how late we are. I need pitches from journalists about the different topics we need to cover. I need a constant, near real-time overview of everything that happens in my newsroom.
- As a web manager, I want to manage every single page of my site with blocks, like Lego bricks. I want to build new ones. I need to be able to define any kind of rule to pull content into the blocks. Some of the blocks should be fully automatic, while the most important stories should be manually curated. I need full freedom over the front end, including the ability to integrate external APIs (sports results, OEMs, advertisement, etc.).
- As a developer working for a publisher, I want to easily connect to the CMS API and pull the content I need into one of the distribution channels I’m working on: main website, mobile app, print XML stream, or Alexa chatbot. I want to work on a modern technological stack like ReactJS or AngularJS. I want a headless architecture to avoid strong coupling with my back-end.
- As a sales/growth manager, I want to be able to manage subscribers and paywalls, easily building plans and offers. I want to manage advertisement spaces on each page, change their position or add new ones without coding.
We start to see a clear scope of what a modern publishing CMS should offer:
- Headless by design
- Out-of-the-box features to create great stories
- Out-of-the-box features to run and manage the newsroom’s daily life through stats and workflows
- Total freedom in the front end, ideally using ReactJS or AngularJS, with easy maintenance and integration of external APIs
- The ability to create and modify front-end pages without coding, with drag-and-drop capabilities and a large set of building blocks
- Fully integrated Digital Assets Management system (Photos & Videos)
- Subscriptions and Paywall management integrated
What is Arc XP?
When Amazon founder Jeff Bezos bought The Washington Post in 2013, he quickly became aware of a longtime problem hobbling the entire news industry: The technology that news organizations employed to publish and make money from their content online was wildly inefficient and inadequate.
Bezos also found a chief information officer at the Post, Shailesh Prakash, with ambitions larger than his budget. Bezos solved Prakash’s budget problems, and the Post built what has over time become a best-in-class platform, conveniently hosted on Amazon’s own cloud computing servers. The Post started licensing its technology to other news organizations in 2016, and its digital publishing division, Arc XP, is now a booming business employing a staff of 300 that is continuously rolling out new functionalities. It powers more than 2,000 sites for media organizations and non-media brands.
In its seventh year in the market, Arc XP serves an even greater variety of clients and industries, expanding its services to major enterprise brands, publishers, and broadcasters. This tremendous growth is the result of strategic evolution and continuous innovation.
Why is Arc XP the best alternative to Drupal for publishers?
Arc XP was built inside a newsroom for and by journalists.
Its genesis is the quintessence of agility: put developers and users in the same room and something great will be created.
Arc XP was designed to solve specific problems and pain points each publisher faces:
- Content creation should be as simple as possible for journalists.
- The role of a publisher is to create great stories, not manage the security and scalability of a CMS back-end
- How can we get rid of WhatsApp groups, endless meetings, and manually crafted reports used in every single newsroom of more than 10 journalists?
- How can we gain more subscribers and revenue?
- How can we reduce our dependence on tech giants for monetization?
Arc XP Features Overview
Let’s dig into Arc XP’s main features and compare them to their Drupal equivalents.
Composer: The Stories Factory
Composer is the main interface most of your team will use to craft stories. It’s the simplest and smartest WYSIWYG editor I’ve seen in many CMSes, packed with features designed for journalists.
You can easily decide in which sections your story should appear (circulation). Drupal’s equivalent is “taxonomies”. Both CMSes work in a very similar way here.
While writing your story, you can define different headlines for each distribution channel (print, web, mobile, etc.). You can also do that in Drupal by adding more fields to your Article content type.
One of the most interesting features is “embeds”. You can copy-paste any social media link (YouTube, Instagram, Facebook, Twitter, etc.) and it will be automatically integrated into your story. More importantly, you can add any custom embed with about half an hour of coding.
To replicate equivalent features in Drupal, you would need to install and configure the Paragraphs module, and then create a paragraph type for each social network. It would take weeks of development effort and even more maintenance (each time one of the social media APIs changes, you’ll need to update your modules).
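To make the embed idea more concrete, here is a minimal Python sketch of the first step any such feature needs: recognizing which provider a pasted URL belongs to. It is not Arc XP’s or Drupal’s actual implementation; the `PROVIDER_PATTERNS` table and the `resolve_embed` function are hypothetical names, and the patterns only cover the simplest URL shapes.

```python
import re
from typing import Optional

# Hypothetical provider patterns (assumption): a real integration would
# handle many more URL shapes (short links, mobile domains, etc.).
PROVIDER_PATTERNS = {
    "youtube": re.compile(r"^https?://(?:www\.)?youtube\.com/watch\?v=([\w-]+)"),
    "instagram": re.compile(r"^https?://(?:www\.)?instagram\.com/p/([\w-]+)"),
    "twitter": re.compile(r"^https?://(?:www\.)?twitter\.com/\w+/status/(\d+)"),
}


def resolve_embed(url: str) -> Optional[dict]:
    """Return a small embed descriptor for a pasted URL, or None if unknown."""
    cleaned = url.strip()
    for provider, pattern in PROVIDER_PATTERNS.items():
        match = pattern.match(cleaned)
        if match:
            return {"provider": provider, "id": match.group(1), "url": cleaned}
    return None


if __name__ == "__main__":
    print(resolve_embed("https://www.youtube.com/watch?v=abc123"))
    print(resolve_embed("https://example.com/some-page"))
```

Adding a custom embed in this sketch means adding one entry to the table, which is why a per-provider entity type, as in the Paragraphs approach, is so much heavier to maintain.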
From Composer, journalists may also pitch their stories to publications or platforms. That means that when journalists think their story is worth including in a landing page, newsletter, or weekly print publication, they may pitch it (we’ll get back to pitching and publications in the WebSked module overview). There is no equivalent in Drupal, so you will basically have to write a bunch of custom code on top of Drupal’s Workflow module.
Photo Center: All your photos in one place
Photo Center is where all your photos and images are stored. It offers a complete set of features to manage photos, including:
- Cropping and image manipulation
- Photo-related workflows
- Rights & Licensing
- Thumbnails and automatic resizing
- Galleries & Lightboxes
Video Center: Your own YouTube
Video Center is an extremely powerful platform to create, transcode, convert, and manage videos, live event streams, and much more. Basically, it gives you capabilities equivalent to running your own YouTube. The big advantage is that you own your content, and you are free to inject advertisements and create channels and playlists.
There are no equivalent features in Drupal. It would be amazingly complex and costly to try to replicate them, with no guarantee of success.
WebSked: Plan and manage your newsroom
WebSked is, from my point of view, the most important and unique feature of Arc XP.
WebSked is the planning and task management tool with the Arc Publishing suite, allowing you to see all content being created in the Arc XP authoring apps (Composer, Photo Center, and Video Center), to create and monitor tasks and notifications for these items, to budget for your print product and newsletters with the Publications feature (if applicable), and manage and curate feeds that will appear on the front end of your site with the Collections feature.
When you launch WebSked, you’ll arrive at your dashboard, which will provide you with some high-level statistics — Key Stats — about your newsroom’s publishing stats as well as saved searches, notifications, publishing times compared to site traffic, story count, pitch activity, tasks (for yourself or for groups), and plans for different websites or individual sections of your publication, including platforms. Please note that these statistics are exclusively publishing stats and are not tied to any outside analytics.
There are no equivalent features in Drupal. You could rebuild them from scratch, but it would take multiple modules and thousands of man-hours.
PageBuilder: Create your pages & templates like Legos
With PageBuilder, you can compose and update any front-end page of your site. When you build something unique (like a home page or a mini-site), you create Pages; when you define how your story, section, or tag pages will look, you create Templates. Both Pages and Templates are composed of layouts into which you can insert blocks.
Arc XP comes with a ton of ready-to-use blocks into which you can pull any kind of content based on any rules:
- Single-story promotions in different shapes (XL — L — M — S)
- Multiple stories
- Paginated lists
- Manually or query-based curated stories
- Image / text / link blocks
- Custom blocks
You can also chain blocks together, load blocks based on URL rules, inject custom variables, and create your own blocks.
Bandito: Blocks A/B Testing
The goal of Bandito is to assist editors in maximizing the performance of the content on their site.
Performance in Arc is measured with Click Through Rate, which is a measure of how often an audience member clicks on a link out of how many times it is served to the audience. Bandito works by allowing editors to add variants of their content — often headlines, promotional images, or description text — which then get presented to the audience as part of the experiment.
Over time, Bandito learns which variants are outperforming others and begins to route traffic to the better variants. Eventually, the experiment will converge on a winner that can be selected as the new default for the feature. Bandito implements a multi-armed bandit test.
There are no equivalent features in Drupal.
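For readers who have never seen a multi-armed bandit, here is a minimal Python sketch of the idea described above, using the epsilon-greedy strategy. This is an assumption for illustration: which bandit algorithm Bandito actually uses is not stated here, and the `HeadlineTest` class and `simulate` helper are hypothetical names.

```python
import random


class HeadlineTest:
    """Epsilon-greedy bandit over headline variants (illustrative only)."""

    def __init__(self, variants, epsilon=0.1, seed=None):
        self.epsilon = epsilon  # share of traffic kept for exploration
        self.rng = random.Random(seed)
        self.served = {v: 0 for v in variants}
        self.clicks = {v: 0 for v in variants}

    def ctr(self, variant):
        """Click-through rate: clicks divided by times served."""
        served = self.served[variant]
        return self.clicks[variant] / served if served else 0.0

    def choose(self):
        """Pick the variant to show to the next visitor."""
        unserved = [v for v, n in self.served.items() if n == 0]
        if unserved:
            variant = unserved[0]  # show every variant at least once
        elif self.rng.random() < self.epsilon:
            variant = self.rng.choice(list(self.served))  # explore
        else:
            variant = max(self.served, key=self.ctr)  # exploit best CTR so far
        self.served[variant] += 1
        return variant

    def record_click(self, variant):
        self.clicks[variant] += 1


def simulate(true_ctr, visitors=5000, seed=1):
    """Run a fake experiment where each variant has a hidden true CTR."""
    test = HeadlineTest(list(true_ctr), seed=seed)
    visitor_rng = random.Random(seed + 1)
    for _ in range(visitors):
        variant = test.choose()
        if visitor_rng.random() < true_ctr[variant]:
            test.record_click(variant)
    return test


if __name__ == "__main__":
    result = simulate({"Headline A": 0.02, "Headline B": 0.08})
    for name in result.served:
        print(name, result.served[name], round(result.ctr(name), 3))
```

Unlike a classic A/B test that splits traffic evenly until the end, this kind of test shifts most impressions to the variant with the best observed click-through rate while keeping a small share of traffic for exploration.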
How to migrate from Drupal to Arc XP?
Once you decide to migrate from Drupal to Arc XP you must follow a few very simple steps. Your target architecture will be quite different from the Drupal installation you have today.
Drupal sites are usually built as large monolithic installs with all features managed by Drupal, with some external synchronization through API.
Unlike Drupal, Arc XP will only be responsible for the CMS part of your platform, so you’ll need to find a best-of-breed replacement for every other feature you might have. The MACH (Microservices, API-first, Cloud-native, Headless) architecture you’ll be building with Arc XP is much healthier and more future-proof.
Basically, the migration path follows 6 steps; we’ll dig into some of them:
Step 1: Gap analysis
During this step, we suggest going through every single template of your Drupal site and listing it in a spreadsheet. Then for each template, you must describe all the features you’ll find. The idea is to understand the gaps between the features on your current website and Arc XP.
The next step is to do the same work on the back-end side. We suggest going through your list of Drupal modules. For each module, you’ll need to decide whether:
- You don’t keep it
- You have an equivalent feature in Arc XP
- You’ll need to find a SaaS vendor for it (for example, Mailchimp for Simplenews, Algolia for Search API and Apache Solr, etc.)
- You’ll need to write some custom code (for example, user zone features)
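If you prefer to keep this inventory in code rather than in a spreadsheet, the four decisions above can be sketched as a small Python structure. The entries below are illustrative assumptions, not a recommendation for your site: `user_zone_custom` is a hypothetical custom module, and the decision attached to each module is only an example.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    DROP = "do not keep"
    ARC_NATIVE = "equivalent feature in Arc XP"
    SAAS = "replace with a SaaS vendor"
    CUSTOM = "write custom code"


@dataclass
class ModuleReview:
    module: str          # Drupal module machine name
    decision: Decision
    note: str = ""       # replacement tool or short justification


# Example inventory (assumption): adapt it to your own module list.
reviews = [
    ModuleReview("simplenews", Decision.SAAS, "Mailchimp"),
    ModuleReview("search_api", Decision.SAAS, "Algolia"),
    ModuleReview("paragraphs", Decision.ARC_NATIVE, "Composer embeds"),
    ModuleReview("devel", Decision.DROP),
    ModuleReview("user_zone_custom", Decision.CUSTOM, "user zone features"),
]


def summarize(items):
    """Count how many modules fall into each decision bucket."""
    return Counter(item.decision for item in items)


if __name__ == "__main__":
    for decision, count in summarize(reviews).items():
        print(f"{decision.value}: {count}")
```

The counts per bucket give a first, rough picture of where the migration effort will go: the more modules land in the custom code bucket, the larger the project.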
Step 2: Entities mapping
The second step is to map existing content types to Arc XP types. In Arc XP, you basically have 3 types of content:
- Stories (everything is a story, but you might have sub-types and custom fields)
- Photos (Arc XP’s Photo Center)
- Videos (Arc XP’s Video Center)
For each content type and each field, you need to decide whether it should be migrated and, if so, where it should go in ANS (Arc Native Specification), Arc XP’s content format.
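A field mapping like this can be sketched in a few lines of Python. Everything here is an assumption to be checked against your own data and the ANS schema: the shape of the exported Drupal node (`nid`, `title`, `body`) is hypothetical, and the target keys (`headlines.basic`, `content_elements`, `source`) are only modeled loosely on ANS.

```python
# Fields we have explicitly decided to migrate (hypothetical export shape).
MAPPED_FIELDS = {"nid", "title", "body"}


def node_to_story(node: dict) -> dict:
    """Convert one exported Drupal node into an ANS-like story dict."""
    # Split the body on blank lines so each paragraph becomes one element.
    paragraphs = [p.strip() for p in node.get("body", "").split("\n\n") if p.strip()]
    return {
        "type": "story",
        "headlines": {"basic": node["title"]},
        "content_elements": [{"type": "text", "content": p} for p in paragraphs],
        # Keep a pointer to the original node for redirects and debugging.
        "source": {"system": "drupal", "source_id": str(node["nid"])},
    }


def unmapped_fields(node: dict) -> list:
    """List the fields for which no migration decision has been made yet."""
    return sorted(set(node) - MAPPED_FIELDS)


if __name__ == "__main__":
    node = {
        "nid": 42,
        "title": "Hello",
        "body": "First paragraph.\n\nSecond paragraph.",
        "field_legacy_flag": "1",
    }
    print(node_to_story(node))
    print(unmapped_fields(node))
```

Running `unmapped_fields` over a sample of nodes per content type is a cheap way to spot fields that were forgotten in the mapping spreadsheet before the full import starts.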
Step 3: Front-end Migration
When building the front end of an Arc XP site, you basically have 2 choices: starting from scratch with any framework you want, pulling content and pages through the Arc XP API, or starting from Arc XP’s ReactJS ‘Fusion’ theme.
We strongly recommend you opt for the second. If you start from scratch, there is a lot of Arc XP business logic that you will need to replicate in order to create a fully functional site.
Moreover, Arc’s Fusion theme is maintained, so you’ll get new features without major re-work from your side.
To migrate from Drupal, you’ll need to replicate the HTML structure and CSS of your current site in Arc’s Fusion React theme. It will be the major part of your project, but any ReactJS developer can do the job.
Take each Drupal Twig template one by one and replicate it in Arc. Strip off all non-CMS features (forms, user zone, etc.) and create custom blocks and pages for those, integrating SaaS APIs or your own back-end.
What are the main issues migrating from Drupal to Arc XP?
Features that are not supported by Arc XP
Arc XP is a CMS with a lot of cool features, but just a CMS. Unlike Drupal, it will not manage your newsletters, forums, comments, polls, forms, etc. For each feature outside Arc XP’s capabilities, you’ll need to find a replacement when migrating from Drupal. From my point of view, it’s even better that way: you can select the best-of-breed tool for each of your features, for example:
- Newsletters: Mailchimp
- Forums: Vanilla Forums or Hivebrite
- Comments: Disqus
- Polls: SurveyMonkey
- Forms: Typeform or Typebot
Surprisingly, pagination might be quite complex to build with Arc and might require extra work. If your landing pages are full of paginated lists, you’ll need to add some extra days of work specifically for that feature.
Content deduplication on landings
One of the things that pisses me off with Arc XP is that it does not support content deduplication on landing pages. Imagine you’re building a home page with different blocks and sections, pulling content automatically based on some criteria. If the same story lands in different parts of the home page, there is nothing you can do about it.
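To show what deduplication would involve, here is a minimal Python sketch of the logic, independent of any CMS. It is not an Arc XP feature or API: the `deduplicate` function and the block names are hypothetical, and the sketch assumes each block can fetch more candidate stories than it has slots.

```python
from typing import Dict, List, Tuple

# One block = (block name, ranked candidate story ids, number of slots).
Block = Tuple[str, List[str], int]


def deduplicate(blocks: List[Block]) -> Dict[str, List[str]]:
    """Fill blocks in page order, skipping stories already shown higher up."""
    seen = set()
    page = {}
    for name, candidates, slots in blocks:
        picked = []
        for story_id in candidates:
            if len(picked) == slots:
                break
            if story_id not in seen:
                seen.add(story_id)
                picked.append(story_id)
        page[name] = picked
    return page


if __name__ == "__main__":
    home_page = [
        ("top-stories", ["a", "b", "c"], 2),
        ("politics", ["b", "c", "d"], 2),
    ]
    print(deduplicate(home_page))
```

The key design point is that blocks must be resolved together and in page order, which is exactly what becomes difficult when each block pulls its content independently.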
Collections of articles
Arc XP supports collections of articles. You can create a collection of stories that is fed manually or automatically. But if you want to promote this collection in a block on a landing page, you’ll need to create a workaround that links your articles and the collection through tags, because a collection in Arc XP has no image or description and cannot easily be loaded inside a block.
How long does it take to migrate from Drupal to Arc XP?
Obviously, it will depend on the number of templates and custom features your website has, but as a rule of thumb, for a standard CMS installation, count 4 months for a full switch for a site with the following specs:
- 10 templates (HP, Section, Article, Search Results, Static Pages, Tag / Author list pages, Collection of Articles page, some specific landings)
- 10–20 Sections
- Existing mobile app to connect to the new CMS
- Subscriptions / paywall
- Around 150K nodes in your Drupal site
- 20–50 authors
How much does it cost to migrate from Drupal to Arc XP?
Again, prices will strongly depend on the kind of partner you work with (local prices, offshore teams, internal team) and the engagement you want (a few people working in your office or a fixed-budget external agency).
So I’ll share some budgets from my company, code.store, which is the main Arc XP partner for the EMEA region and works on a fixed-budget model:
- Migration of a large publisher with 50 templates, 30 years of archives, hundreds of thousands of articles, 100 journalists, and 2 mobile apps. They also have subscriptions, around 10 custom embeds, and a weekly print publication. Budget: $400K
- Migration of a WordPress publication with 10 journalists, 5 templates, and 1,500 articles. Budget: under $50K
- Migration of 5 Drupal sites with 10 templates, 450K articles, 2 languages (including Arabic), 5 mobile apps, and 30 journalists. Budget: $200K
How should your migration team be organized?
We strongly suggest that, for your first project, you go with one of the official Arc XP integration partners to avoid architectural mistakes and migration problems. There are partners all around the world. code.store, for example, is one of them for the EMEA region and would be happy to help.
The team should be cross-functional, mixing external expertise on Arc with internal people who have deep knowledge of the existing version of the site. As soon as you sign up with Arc XP, you’ll get a dedicated account manager with immediate access to Arc XP expertise, who will follow the project and help it roll out.