Build an RSS News Aggregator in WordPress with WP-Drudge

 

A big question I’m asked on a regular basis is whether the WP-Drudge theme can use RSS feeds to display content for a news aggregator in WordPress. Right out of the box, it can and does, using the built-in RSS Feed widget (which has more options than the core widget). This takes an RSS feed and displays the headlines just like the rest of your headline links on the site.

Some people want to take this a step further or combine these links with their own manually curated ones. Also, for folks using the interruption page (a page that site visitors get to first before seeing the link posted), this widget skips that and goes straight to the article. In order to have RSS feed links act like a regular Posted Link, we need to create posts out of these items.

To make that happen, there are a number of plugins you can use. I took some time recently to evaluate the main ones and found one that works well with the theme (after installing a simple and free add-on I created): the WP RSS Multi Importer. This plugin accepts feeds, pulls in items, and creates blog posts simply and effectively. I have a few qualms about the code quality I saw but, overall, it does what it says it should and includes many different options to control how it happens. The other plugins out there either charge to create posts or create enormous overhead and performance problems.

A few words about importing RSS feed content for your news aggregator in WordPress

There are 2 basic problems with importing and aggregating content via feeds. If you understand what you are getting into (or don’t care), you’re welcome to skip this section.


First, depending on the number of feeds you’re processing, you can create a serious performance issue for your site. This plugin (like other RSS importers) runs using something called “wp-cron.” This is an automated process that’s set in motion at regular intervals and started by a site visit. If you have, say, 500 feeds coming in, then this process will take a while to run and cause serious slow-downs for a handful of your visitors.
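To make the visit-triggered behavior concrete, here is a toy model in Python. It is not WordPress code and does not claim to mirror how wp-cron works internally; the class name, the hourly interval, and the visit timestamps are all made up for illustration. It only shows the general idea: a scheduled task does nothing until a visit arrives after its due time, and that visit is the one that pays for the work.

```python
from dataclasses import dataclass, field


@dataclass
class VisitTriggeredScheduler:
    """Toy scheduler that only checks for due work when a visit arrives."""

    interval: int                             # seconds between scheduled runs
    next_run: int = 0                         # timestamp when the task is next due
    runs: list = field(default_factory=list)  # visit times that triggered a run

    def visit(self, now: int) -> bool:
        """Handle one page visit; run the task if it is due. True if it ran."""
        if now >= self.next_run:
            self.runs.append(now)
            self.next_run = now + self.interval
            return True
        return False


scheduler = VisitTriggeredScheduler(interval=3600)  # hypothetical hourly task
visits = [10, 500, 3000, 9000, 9100]                # hypothetical visit times (seconds)

# Only the visits that arrive after the task is due actually trigger it.
triggered = [t for t in visits if scheduler.visit(t)]
print(triggered)  # [10, 9000]
```

Notice that the task was due at 3610 seconds but did not run until the visit at 9000. With no visits at all, the list would stay empty.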

Second, unless you make an effort to point out your sources and link out (which the settings covered below do), then you’re stealing content for your own purposes. Please follow the guidelines below and make sure your feed items are marked properly as coming from another source. If you’re only posting aggregated content and not making it clear that the content came from another source, then you could easily be excluded from major search engines and denied advertising opportunities. Use this tool at your own risk and do the right thing.

Ok, rant over. Let’s get this thing configured.

Step 1: Installation

We’re actually going to be installing 3 plugins: the WordPress repo plugin mentioned above, another repo plugin to help us keep an eye on our wp-cron processes, and a helper plugin I wrote to make it play nicely with this theme.

  1. Download the WP-Drudge RSS Multi-Importer Add On from this page.
  2. In your WP-Drudge site, go to wp-admin > Plugins > Add New
  3. Near the top, click Upload
  4. Follow the instructions to upload, install, and activate this plugin. When you’re done, you should see it marked as active when you go to wp-admin > Plugins
  5. Again, go to wp-admin > Plugins > Add New
  6. In the search box, type “crontrol,” then install and activate the “WP Crontrol” plugin. When you’re done, you should see an option in the wp-admin > Tools menu for “Crontrol”
  7. Once again, go to wp-admin > Plugins > Add New
  8. Search for “WP RSS Multi Importer,” then install and activate the “WP RSS Multi Importer” plugin. You should see a “Multi Importer” option appear in your wp-admin menu.

Important note: Before you start adding content and working with the RSS Multi-Importer plugin, you will definitely want a viable backup for your database and media content. It’s easy to mis-configure this system and you don’t want to lose any of the work you’ve already put into your site.

Now that we’ve got all the plugins installed and activated, let’s move on to configuring your news aggregator in WordPress.

Step 2: Configuration

The Multi Importer plugin has many different options to set so we’ll start by going to wp-admin > Multi Importer > Global Settings. Make the following changes:

  1. “How often feeds will be updated” should be set as long as possible. The shorter this is, the more often your feeds will update (that’s obvious) but it means that more users might be affected by slow-down.
  2. “Maximum number of items to import for each feed” has the same problems. More items might mean more incoming posts but it also means longer processing time. Once the feed first updates, there should not be many posts to import each time though so you can be a little less careful here.
  3. “Remove older feed item after how much time” sets the expiration time for feed items. For this tutorial, we’ll be creating posts from feed items so this can (and should) be set fairly low. The default “7 days” should be fine. This does not delete the posts that come in, just feed items (a little confusing but we’ll cover that later on).

The other settings are not too critical but look them over to see if they apply to you.

Next, we’re going to set up categories for the feeds. When you create a feed source (RSS link), this plugin will ask you to assign it to a feed category. These categories will be mapped to post categories in the next section of this tutorial. Confused yet?

Basically, each feed can be added to a specific category, then multiple feed categories can be added to post categories (the ones that control the widgets displaying headlines on your site). The easiest way to handle this is just to create a feed category for each of your post categories and map them 1-to-1 (covered below). You could also create just a handful of feed categories and map each one to multiple post categories. Or you can create several different feed categories and map several of them to a single post category.

To clarify the process even further:

  1. Feed items are pulled from the RSS feed at regular intervals (set above)
  2. These items are given the category assigned to the feed and stored in the database
  3. At regular intervals (set below), these feed items are used to create posts with the same name and link. These posts are assigned to the post categories (set below) that are mapped to their feed categories
  4. The post categories are used by your WP-Drudge Posted Link widgets to output the links on the news aggregator homepage and anywhere else they’re located
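If it helps to see that flow as data, here is a rough Python sketch of the four steps. It is only a model of the idea, not the plugin’s code: the feed names, category names, URLs, and function names are all hypothetical.

```python
# Hypothetical feeds, each assigned to one feed category.
feeds = [
    {"name": "Example Tech", "feed_category": "Tech"},
    {"name": "Example World", "feed_category": "World"},
]

# Feed category -> one or more post categories (the mapping made on the Auto Post page).
category_map = {
    "Tech": ["Technology"],
    "World": ["World News", "Top Stories"],
}


def import_feed_items(feed, raw_items):
    """Steps 1-2: store fetched items, tagged with the feed's category."""
    return [
        {"title": title, "link": link, "feed_category": feed["feed_category"]}
        for title, link in raw_items
    ]


def feed_items_to_posts(items, mapping, status="draft"):
    """Step 3: create one post per item, with post categories looked up from the map."""
    return [
        {
            "title": item["title"],
            "outbound_link": item["link"],
            "post_categories": mapping.get(item["feed_category"], []),
            "status": status,
        }
        for item in items
    ]


def posts_for_widget(posts, post_category):
    """Step 4: a widget shows published posts from the category it is set to."""
    return [
        p["title"]
        for p in posts
        if p["status"] == "publish" and post_category in p["post_categories"]
    ]


items = import_feed_items(feeds[1], [("Headline A", "https://example.com/a")])
posts = feed_items_to_posts(items, category_map, status="publish")
print(posts[0]["post_categories"])            # ['World News', 'Top Stories']
print(posts_for_widget(posts, "Top Stories"))  # ['Headline A']
```

The two separate functions in the middle are the point: importing feed items and turning them into posts are distinct stages, which matters later when checking the scheduled processes.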

OK, I think that’s about as clear as I can make it (feel free to post questions in the comments section below). So we want to set up some feed categories before we configure the post creation and start adding feeds.

  1. Go to wp-admin > Multi Importer > Categories
  2. Click the Add A New Category button and enter the category name (it auto-capitalizes the name, I don’t know why)
  3. Keep clicking the Add a New Category button to add all the different feed categories. When you’re done, click the first blue Save Settings button and the categories will appear in the list below
  4. The “Default Feed Category Image” will assign a default image to the incoming feed items. It requires a full URL to an image but I have not tested this out so I’m not sure how it works
  5. “Post Category Tags” … no idea, to be honest
  6. The “Include Filter Words” will look through the feed content for each item and only import items that include those words. Click the checkbox to the right to exclude items with these words instead
  7. Click Save Settings when complete
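For the filter words in item 6, the include/exclude logic can be pictured with a short Python sketch. I have not checked how the plugin actually matches words, so the case-insensitive substring matching here is an assumption, and the function name and sample headlines are made up.

```python
def passes_filter(content, words, exclude=False):
    """Return True if an item with this content should be imported.

    Assumption: matching is a case-insensitive substring test.
    With no filter words set, everything passes.
    """
    if not words:
        return True
    text = content.lower()
    has_match = any(word.lower() in text for word in words)
    # Include mode keeps matching items; exclude mode drops them.
    return not has_match if exclude else has_match


headline = "Senate passes budget bill"
print(passes_filter(headline, ["budget"]))                # True
print(passes_filter(headline, ["budget"], exclude=True))  # False
print(passes_filter("Local team wins", ["budget"]))       # False
```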

Now we have categories that we’ll be adding our feeds to when we’re ready for that. Next step is to activate and configure the Auto Post capability.

Step 3: Auto Post Creation from RSS Feeds

This settings page is the longest by far and there are a lot of important options that I want to cover. If I skip an option below, just leave it as the default.

  1. Go to wp-admin > Multi Importer > Auto Post
  2. “Check to Activate this Feature” turns on the Auto Post creation so go ahead and check that
  3. “How often to import feeds” actually means “how often to turn feed items into posts.” This is the process that turns the imported feed items into posts. Set this to the same value you chose for “How often feeds will be updated” in the global settings
  4. “Default status of posts” is how the posts will be imported and whether they will show up automatically on your site. If you plan on moderating the incoming feeds (I recommend this so your site is better curated), set this to “draft” or “pending.” You’ll need to log in at regular intervals and publish the posts you like by switching their status to “publish.” If instead you want the feed items to go live immediately, set this to “publish.”
  5. The “Number of Items per Feed to Fetch” and “Total Number of Entries per Fetch” work together to provide a limit to the number of posts incoming. If you set the “Number of items…” to 10 and you have 50 feeds, you’ll want to set the “Total Number…” to 500 if you want all the items. Just like the other processing limit settings, the higher this is, the greater the chance you have of a performance problem. Also, if you have 500 items potentially coming in every hour, that’s a huge number of new things saved to the database every day.
  6. The sections “Link Settings” through “No Index, No Follow, Canonical” are only germane if you’re using the post content anywhere on the site. The individual post pages are the ones you see when you use the interruption page setting or when clicking “View” in the Posts > All Posts list. Not all WP-Drudge sites use these pages but if yours does, read through these settings to control how that page looks. A warning here: import the least amount of content possible and make sure to include attribution links everywhere you can. Simply importing full articles from RSS feeds is a bad practice and can lead to being removed from major search engines (Google being the most aggressive here). I recommend a short “Excerpt length,” include “Author Name” and “Feed source,” leave “Set links as No Follow” off, and turn on “Add canonical URL to page…” to make sure it’s clear that this content is not yours.
  7. “Auto Remove Posts” is a good idea if you’re pulling in a lot of posts and don’t need to retain incoming links to your site. If you’re auto-publishing posts from many RSS feeds, you’re going to be creating a potential DB size issue. Removing posts automatically will break incoming links but avoid this issue.
  8. The final step is one of the important ones. The last plugin option maps feed categories (created above) with post categories (explained here). The post categories are what the theme uses to distribute links to your Posted Link widgets. Select a “Plugin Category” (should probably be called “Feed Category”) on the left and select one or more “Blog Post Category” items on the right to pipe posts into a certain category. If you have multiple feed categories, click the link below the first association to make multiple connections.
  9. Click the blue Save Settings button at the bottom and you’re ready to go.
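The limits in item 5 come down to simple arithmetic, sketched here in Python using the numbers from that example (10 items per feed, 50 feeds, a total of 500) plus an assumed hourly schedule. The function names are mine, and the daily figure is a worst-case ceiling, not a prediction; most feeds won’t have 10 new items every hour.

```python
def items_per_fetch(items_per_feed, feed_count, total_cap):
    """Items one run can bring in: the per-feed limit times feeds, capped by the total."""
    return min(items_per_feed * feed_count, total_cap)


def max_new_posts_per_day(per_fetch, fetches_per_day):
    """Upper bound on new posts per day if every run hits its limit."""
    return per_fetch * fetches_per_day


per_fetch = items_per_fetch(items_per_feed=10, feed_count=50, total_cap=500)
per_day = max_new_posts_per_day(per_fetch, fetches_per_day=24)  # assumes hourly runs
print(per_fetch, per_day)  # 500 12000

# A lower total acts as a ceiling regardless of the per-feed setting.
print(items_per_fetch(items_per_feed=10, feed_count=50, total_cap=200))  # 200
```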

Once these settings are complete, you’re ready to start adding feeds to the system.

Step 4: Adding RSS Feeds to Your News Aggregator in WordPress

If your site is live and posts are set to auto-publish, this step will cause auto-created posts to begin flowing into the system. If you’re not ready for this to happen and want to just see RSS items coming in without showing, uncheck the “Check to Activate this Feature” setting on the Auto Post page first.

  1. Go to wp-admin > Multi Importer > Add a Feed
  2. In the first field, add the name of the feed. This may or may not appear on the site, depending on how you set up the autopost.
  3. Paste a direct URL to an RSS feed in the “Feed URL” field. You can find these on sites by looking for the RSS icon somewhere in the header or sidebar. Not all sites have an RSS feed but most major news sites will.
  4. Next, choose a “Feed Category.” This will pipe posts into the post categories set up on the Auto Post page.
  5. Click the blue Save Feed button on the right to add the feed to the system, then click the green Fetch Feed Items Now button near the bottom of the screen.
  6. Now, click on wp-admin > Multi Importer > Feed Items to make sure they’re coming in as expected.

That’s it! Just rinse and repeat for all the feeds you want to add and your news aggregator in WordPress is ready.

One thing to note: the feed import process and the feed-items-to-posts process are two different system actions. Once the feed items have been imported, they need to be converted to blog posts separately (still automatic). Both can also be run manually using the WP Crontrol plugin we added earlier.

Step 5: Checking WP-Cron

The WP Crontrol plugin keeps an eye on automated processes in WordPress and allows you to run them manually if you’d like to check that the process is working. If you’re not seeing posts being created in the system, this is a good place to start troubleshooting.


Go to wp-admin > Tools > Crontrol. In the list that appears, you’re looking for cron processes that start with “wp_rss_multi_event” to identify the automation put in place by this plugin. The process that imports feed items into your site is “wp_rss_multi_event_importfeeds” and the one that turns those feeds into posts is “wp_rss_multi_event_feedtopost.”

The “Next Run” column tells you when that process is scheduled to run next and the “Recurrence” column shows you how often that process is scheduled to run. The recurrence was set up in the steps above, so don’t alter it here; make any changes in the plugin settings instead. If you just added a feed or two and want that content to show up immediately, click Run Now for “wp_rss_multi_event_importfeeds” first, then “wp_rss_multi_event_feedtopost” to create the latest posts.

It’s important to know that wp-cron relies on two things to run:

  • Your site is publicly and directly viewable. This means that if you’re using this on a development site with an access password of any kind, these will not run automatically. If this is the case, you’ll need to click Run Now as directed above to get content into the system. This also means that sites behind CloudFlare or another firewall may have issues.
  • You have visitors coming to your site regularly. If you have wp-cron processes in place but no visitors, then these will never run. You, however, can trigger them by visiting the site or the wp-admin screens yourself or running them manually.

If you’re having problems getting these processes to run automatically, there might be another problem preventing these actions from occurring. Google is your friend here, though the fixes might be technical and require either a developer or your hosting company to assist.

If everything is running as expected, we’re ready to see the posts that are coming in.

Step 6: Displaying Imported Posts

We want to make sure that posts are coming in as expected so go to wp-admin > Posts > All Posts to see the list. This will include all posts: imported ones and manually created ones, both with links and without.

Screenshot 2014-08-31 10.20.08

You’ll see the Outbound Link in the last column and the Featured Image, if one was imported, as well. You should also see the category it was assigned to (hidden on the screenshot above). This category is what will determine what widget it will appear in.

Last step: set up the site to display this imported content on your news aggregator in WordPress.

Step 7: Use WP-Drudge Posted Link Widgets to Display the Content

The RSS Multi-Importer comes with a widget that can be used to display feed items but it will not use the WP-Drudge settings you might currently have set up. In order to do that, you need to use the WP-Drudge Posted Link widget to display posts in the categories they are being added to. Detailed steps to do that are here.

Stand Back and Watch!

With your feeds loaded to the right category, auto post set up and running, and your Posted Link widgets in place, you should now see imported posts appearing on the site with the same behavior as the rest of the content. You’re still able to create your own posts, both with and without links, and add them to the same categories or different ones. This allows you to combine automatic aggregation with manual curation for increased victory and success.

If you have any questions about integrating this with WP-Drudge or what’s possible using these two together, ask in the comments below; I’m happy to help.

Happy Curating!

6 responses to Build an RSS News Aggregator in WordPress with WP-Drudge

  1. 2014 User Survey Responses - WP-Drudge

    October 8th, 2014 at 11:53 am

    […] There is a lot that would go into an RSS parser so I recommend using the WP RSS Multi-Importer plugin. I wrote a thorough tutorial here. […]

  2. Jean

    November 25th, 2014 at 12:27 am

    Have you taken a look at WP RSS Aggregator? If you believe in good quality then I think you’ll find it a worthy alternative. It’s a premium plugin which helps us sustain development by professional devs and top notch support.

    • Josh

      November 25th, 2014 at 8:02 am

      Jean: thanks for stopping in. I took a look at WP RSS Aggregator and thought it looked great but didn’t dig deep enough to understand everything it can do. I’ll take a closer look and post a tutorial.

  3. WordPress RSS Aggregator Theme

    December 29th, 2014 at 8:45 am

    […] In a previous post, we showed how to connect WP-Drudge to the free RSS aggregation plugin RSS Multi-Importer. This system works great but I was curious to see how it compares with the other major RSS aggregation plugin out there, WP RSS Aggregator. I contacted the fine folks who produce and maintain it and was hooked up with a review copy to try. […]

  4. cr7

    February 11th, 2015 at 7:55 am

    Hi! Recently updated RSS Multi Importer to V 3.13 and I have noticed that my new posts no longer assign an author. Unfortunately I can’t set author in Edit Feed Source… Is there a way to correct this?

  5. Josh

    February 12th, 2015 at 10:22 am

    Hi cr7 … unfortunately, I don’t do any support for that plugin since I’ve only had minor involvement with it. If you’re sure the settings are correct, you can try the support forum for the plugin:

    https://wordpress.org/support/plugin/wp-rss-multi-importer
