Importer Tool: Setting Up Your Import Files
Before getting started with your import, you need to make sure all of the data is there.
Whether this is your first CMS or you're switching from another system, importing is a way to bring your existing data over to RebelMouse.
Our importer tool is flexible and helps you transform your data from its current format into one that RebelMouse accepts.
To do this successfully, we'll ask you to follow a few steps, and the first is to make sure your data is prepared for the import.
Import files are like data spreadsheets that help you arrange your website's content and setup. Before you begin, make sure to check out RebelMouse's rules and requirements for your import files below.
This guide covers the process's basic steps:
- Exporting Your Data From the Current System
- RebelMouse File Requirements and Technical Limitations
- Currently Supported Objects
- Object Association/Suggested Order of Import
Object: This refers to a type of content or taxonomy that your business uses, such as users, sections, images, and posts. When you're importing data, an object is the specific dataset you're bringing into RebelMouse.
Record: This is an individual instance of an object (e.g., “Tom Smith” is an author record). In a single object import file, each row of your file represents one object record.
Property: This is a field created to store information about your records. In an import, properties will match up with your file’s column headers (or in a JSON or XML file, to each property or tag of a record).
For example, in a user import file, the object being imported is a user. A user in RebelMouse is the object used for both content authors and editorial team members. Each row represents one user record, and each column represents a user property (e.g., first name, last name, email address).
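As a rough illustration of these three terms, the sketch below parses a small, made-up user CSV with Python's standard csv module. The column names (first_name, last_name, email) and the sample rows are assumptions chosen for the example, not RebelMouse's required property names.

```python
import csv
import io

# Hypothetical user export: the object is "user", each row is one record,
# and each column header is a property.
SAMPLE_USERS_CSV = """first_name,last_name,email
Tom,Smith,tom.smith@example.com
Ana,Lopez,ana.lopez@example.com
"""


def read_records(csv_text):
    """Return (properties, records) for a single-object CSV file."""
    reader = csv.DictReader(io.StringIO(csv_text))
    records = list(reader)  # one dict per row, keyed by column header
    return reader.fieldnames, records


properties, records = read_records(SAMPLE_USERS_CSV)
print("Properties:", properties)
print("Record count:", len(records))
```

Here the two data rows are two user records, and the three headers are the properties you would later map to RebelMouse's properties.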
Exporting Your Data From the Current System
Before importing your data, you will need to export the data from your current CMS.
There are various methods to perform the export and they will vary depending on the CMS, but here are some of our recommendations for common platforms:
For WordPress, we recommend using this tool: https://www.wpallimport.com
It helps you export by object type (e.g., posts, taxonomy, users, images).
We recommend the installation of certain modules in your Drupal instance to facilitate the export of large datasets from views. The necessary modules include:
- Services Views
- Views Data Export
Each CMS behaves differently and usually exports its data in a different format. Ideally, your export should:
- Use one of the accepted formats defined below.
- Contain all of the data that matters to you (e.g., post custom fields, metadata that's being rendered on the frontend).
- Be split by object rather than delivered all in one file. A full database dump makes importing very hard. Try to get the content in separate chunks divided by object. For example, users in one or more files, posts in one or more files, images in one or more files, and so on.
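If your current system can only produce one combined export, one option is to regroup it by object before importing. The sketch below is a minimal example that assumes the export is already loaded as a list of dictionaries and that each one carries a "type" key naming its object. Both the key name and the sample records are hypothetical.

```python
from collections import defaultdict


def split_by_object(records, type_key="type"):
    """Group a mixed list of records into one list per object type."""
    groups = defaultdict(list)
    for record in records:
        groups[record[type_key]].append(record)
    return dict(groups)


# Hypothetical combined export with users and posts mixed together.
combined_export = [
    {"type": "users", "email": "tom.smith@example.com"},
    {"type": "posts", "headline": "Hello world"},
    {"type": "users", "email": "ana.lopez@example.com"},
]

for object_type, items in split_by_object(combined_export).items():
    print(object_type, len(items))
```

Each resulting group can then be written to its own file and imported separately.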
RebelMouse File Requirements and Technical Limitations
All files being imported into RebelMouse must:
- Be a CSV, JSON, or XML file.
- Have only one file:
- You can run multiple imports, but only one object (e.g., images, users, sections, posts) at a time.
- We recommend that you don't use files larger than 50 MB. If a file is larger, we suggest splitting it into smaller files to avoid potential issues.
- The property names don't need to match RebelMouse's since you will be able to map each of the original CMS properties to RebelMouse's properties.
- Be UTF-8 encoded if foreign language characters are included.
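The file requirements above can be checked locally before you upload anything. The following is a minimal sketch, not part of the importer tool: the function name check_import_file is hypothetical, and it assumes 50 MB means 50 * 1024 * 1024 bytes.

```python
import os

ACCEPTED_EXTENSIONS = {".csv", ".json", ".xml"}
RECOMMENDED_MAX_BYTES = 50 * 1024 * 1024  # assumption: 1 MB = 1024 * 1024 bytes


def check_import_file(path):
    """Return a list of human-readable problems found for one import file."""
    problems = []

    extension = os.path.splitext(path)[1].lower()
    if extension not in ACCEPTED_EXTENSIONS:
        problems.append(f"unsupported extension: {extension or '(none)'}")

    if os.path.getsize(path) > RECOMMENDED_MAX_BYTES:
        problems.append("larger than the recommended 50 MB; consider splitting")

    try:
        with open(path, encoding="utf-8") as handle:
            for _ in handle:  # decoding happens as lines are read
                pass
    except UnicodeDecodeError:
        problems.append("not valid UTF-8")

    return problems
```

An empty list means the file passed these three checks. It does not verify that the file holds only one object type, which still needs a manual look.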
All feeds being imported into RebelMouse must:
- Be in a JSON format.
- Have pagination as an offset or page.
- Be either public or protected by basic authorization. The feed cannot be under any other authorization method.
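To make the feed requirements more concrete, the sketch below builds the request URLs and header that a paginated JSON feed would typically receive. It assumes the accepted authorization is standard HTTP Basic authentication, and the query parameter names (page, offset, limit) and the feed URL are placeholders. Your feed may use different names.

```python
import base64
from urllib.parse import urlencode


def basic_auth_header(username, password):
    """Build an HTTP Basic authorization header."""
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


def page_urls(base_url, pages, param="page"):
    """Page-style pagination: ?page=1, ?page=2, ..."""
    return [f"{base_url}?{urlencode({param: n})}" for n in range(1, pages + 1)]


def offset_urls(base_url, total, limit, offset_param="offset", limit_param="limit"):
    """Offset-style pagination: ?offset=0&limit=100, ?offset=100&limit=100, ..."""
    return [
        f"{base_url}?{urlencode({offset_param: start, limit_param: limit})}"
        for start in range(0, total, limit)
    ]


FEED_URL = "https://example.com/feed.json"  # hypothetical feed address
print(page_urls(FEED_URL, 2))
print(offset_urls(FEED_URL, total=250, limit=100))
```

Either pagination style satisfies the requirement. What matters is that each request returns one slice of the feed as JSON.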
Currently Supported Objects
Object Properties Documentation
Please Note: The importer tool also allows for content updates, which will be described in another article.
The goal of our importer tool is to help users map the data they have into RebelMouse's format. The instructions provided above are from RebelMouse's API database. The importer tool is based on our public API, but it has some differences in order to improve the importing process for our clients.
Here are those differences:
Differences Between API Fields and Importer Fields
- No difference.
- Instead of asking you to map what would be the image_id, we ask you to add image_url instead.
- This is a field that can have the value of null, 301, or 302.
- If it's equal to 301 or 302, it will add that kind of redirect to our redirects dashboard from the original path to the path of the new image being created.
- The reason for this is that when you migrate images, their paths will change, and we don't want them to start returning 404s to the crawlers used by search engines such as Google or Bing. If you choose a 301, we will create a redirect that prevents that.
- Example: If an image exists on the original WordPress CMS under the path "yourdomain.com/wp/assets/image.jpg", once it's uploaded to RebelMouse it might change to be under "/assets/image.jpg", so we create a 301 or 302. This redirect will only happen when your domain is pointed to RebelMouse's servers.
- Update Existing Post
- The id field is only for updating posts, and shouldn't be used for the creation of new articles. If you intend to import the original post id, you can do so by using the provider_post_id field. When updating an article, this field will expect the RebelMouse id of an existing post.
- Be careful when using the importer tool to update posts, because if you don't add the existing post id you will create new posts instead.
- Associate Post With Author(s)
- The authors_association_list field expects an array of strings that identify the authors. Each string can be an email, id, and/or name. If you use this field, you must also use the authors_association_field_name field, which names the value that determines the association.
- The authors_association_field_name field expects a string with the name of the user field that will be used to associate the post with the correct author(s).
- Image Fields
- image_id becomes image_url
- teaser_image_id becomes teaser_image_url
- social_teaser_image_id becomes social_teaser_image_url
- image_id becomes image_url
- Since listicles can be a more complex migration, we've prepared a separate guide for you here.
- Update Existing Post
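The post field differences above can be sketched as a small helper that prepares one post record. This is only an illustration under stated assumptions: the shape of the source post, the sample values, and the choice of "email" as the association field are hypothetical, while id, provider_post_id, image_url, authors_association_list, and authors_association_field_name are the field names described above.

```python
def build_post_record(source_post, rebelmouse_post_id=None):
    """Prepare one post record for import.

    source_post is a hypothetical dict exported from the original CMS.
    Pass rebelmouse_post_id only when updating an existing RebelMouse post.
    """
    record = {
        # The original CMS id goes here, never in "id".
        "provider_post_id": source_post["id"],
        # The importer asks for an image URL instead of an image_id.
        "image_url": source_post["image"],
        # Authors are identified by strings; here we assume emails.
        "authors_association_list": list(source_post["author_emails"]),
        "authors_association_field_name": "email",
    }
    if rebelmouse_post_id is not None:
        # "id" must be the RebelMouse id of an existing post (update only).
        record["id"] = rebelmouse_post_id
    return record


source = {
    "id": "wp-101",
    "image": "https://yourdomain.com/wp/assets/image.jpg",
    "author_emails": ["tom.smith@example.com"],
}
print(build_post_record(source))                          # creates a new post
print(build_post_record(source, rebelmouse_post_id=555))  # updates post 555
```

Leaving id out for new posts and setting it only for updates mirrors the warning above: an update without the existing post id creates a new post instead.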
Additional properties are not required, but can also be imported into RebelMouse to add or update data in bulk. You can import the following additional properties:
- RebelMouse allows you to create additional custom properties for sections, posts, and users/authors, and then import them.
- Those fields need to first be created and then they can be imported using specific custom data fields:
- roar_specific_data (posts)
- extras (sections)
- specific_data (users/authors)
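As a rough sketch of how custom properties might be grouped before import, the helper below places them under the custom data field that matches the object type. It assumes each custom data field holds a simple key/value object, which this guide does not confirm. The custom property name (sponsor_name) is made up for the example.

```python
# Custom data field per object type, as listed above.
CUSTOM_DATA_FIELD = {
    "posts": "roar_specific_data",
    "sections": "extras",
    "users": "specific_data",
}


def attach_custom_properties(record, object_type, custom_properties):
    """Return a copy of record with custom properties under the matching field."""
    field = CUSTOM_DATA_FIELD[object_type]
    merged = dict(record)
    # Assumption: the custom data field accepts a key/value object.
    merged[field] = dict(custom_properties)
    return merged


post = attach_custom_properties(
    {"headline": "Hello world"},
    "posts",
    {"sponsor_name": "Acme"},  # hypothetical custom property
)
print(post)
```

Remember that the custom properties must already exist in RebelMouse before they can be imported.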
Object Association/Suggested Order of Import
When importing data, keep in mind that the different object types are related to one another. Posts are usually the most connected object, and we typically see the following relationships:
- Sections contain images
- Authors contain images
- Posts contain sections, authors, and images
To help the importer tool connect and associate these objects correctly, we suggest importing them in a certain order.
The suggested order is:
- Section Images
- User Images
- Post Images
P.S. You can swap sections and users completely without a problem.
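If you have many export files, a small script can line them up in an import-friendly order. The sketch below is one reading of the suggestion in this guide: images first, then sections and users (which can be swapped), and posts last. The file names and the object-type labels are hypothetical.

```python
# Assumed order: images first, posts last; sections and users may be swapped.
IMPORT_ORDER = ["images", "sections", "users", "posts"]


def sort_for_import(files):
    """Sort (object_type, filename) pairs into the suggested import order."""
    return sorted(files, key=lambda item: IMPORT_ORDER.index(item[0]))


export_files = [
    ("posts", "posts_1.json"),
    ("users", "users.csv"),
    ("images", "images.csv"),
    ("sections", "sections.csv"),
    ("posts", "posts_2.json"),
]

for object_type, filename in sort_for_import(export_files):
    print(object_type, filename)
```

Because Python's sort is stable, files of the same object type keep their original relative order.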
Why Images First?
We suggest that you always start with images, and import them separately from the other objects, because there is a lot of SEO value in your images' metadata. Importing images first ensures they come in with their alt tag, caption, photo credit, etc.
The importer tool is built so that once you reference an existing image URL, we pull the existing image with its original metadata. This applies to user, section, and post images.
Why Posts Last?
We suggest you end with posts because posts are the most connected entity. We have certain fields that will help you connect and associate them with authors and sections:
This way, when you import posts, we will already be able to associate a post with the pre-existing sections and authors.
Can I Do It in a Different Order?
Sure, but there are some limitations you may encounter, such as:
- Images can be imported during the user creation process, but they won't have any metadata.
- Images cannot be imported during section creation or post creation, so they will be broken.
- If you import posts ahead of time, you won't be able to establish any relationships with authors or sections (unless you already have the RebelMouse section ID or RebelMouse author ID). You can always update the post later on to create the necessary relationship(s).
Once your files are ready to go, you can learn how to import each object into RebelMouse here.