Filter to avoid duplicate content

I use a filter on my classifieds site, Pisos Alquiler, that compares each new ad against the ads already published in order to avoid duplicate content. The way it works is very simple: it compares the title of the ad, the advertiser's email and user name, and the ad content with those of the published ads (the script shown here matches on the ad content and the email custom field). When the filter detects that an ad is a duplicate, it removes the ad and sends it straight to the trash. If the filter finds that the ad is genuine, it publishes the ad automatically.
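The comparison described above can be sketched in plain PHP, outside WordPress. This is only an illustration under stated assumptions: the function name `is_duplicate_ad` and the array keys `title`, `email`, `author`, and `content` are hypothetical, and matching is exact string equality on every field, which may be stricter than a real site needs.

```php
<?php
/**
 * Return true when some published ad matches the candidate on every field.
 * Hypothetical helper: the name and the array keys are assumptions made
 * for this sketch, not part of WordPress.
 */
function is_duplicate_ad(array $candidate, array $published): bool
{
    $fields = ['title', 'email', 'author', 'content'];
    foreach ($published as $ad) {
        $same = true;
        foreach ($fields as $field) {
            // Treat a missing field as an empty string so incomplete ads still compare.
            if (($candidate[$field] ?? '') !== ($ad[$field] ?? '')) {
                $same = false;
                break;
            }
        }
        if ($same) {
            return true;
        }
    }
    return false;
}

// Sample data, invented for the example.
$published = [
    [
        'title'   => 'Flat in Madrid',
        'email'   => 'ana@example.com',
        'author'  => 'Ana',
        'content' => '2 bedrooms, city centre',
    ],
];
$draft = [
    'title'   => 'Flat in Madrid',
    'email'   => 'ana@example.com',
    'author'  => 'Ana',
    'content' => '2 bedrooms, city centre',
];

// A duplicate would go to the trash; anything else would be published.
echo is_duplicate_ad($draft, $published) ? "trash\n" : "publish\n"; // prints "trash"
```

Exact equality keeps the check cheap and predictable, but an ad that differs by a single space or capital letter would pass as genuine.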

Needless to say, any ad awaiting review must be saved as a Draft, because the filter only processes drafts.

With a cron job, this filter can run every 5 minutes, for example, so the whole process is automated.

In my case, I created this cron job in my HostGator cPanel; you can take it as an example. You only need to replace “username” with your own username.

Minute: */5; Hour: *; Day: *; Month: *; Day of week: *

cd /home/username/public_html/ && php filter.php

I think this may be useful to those of you who manage classified ad sites with WordPress, or sites with auto-generated content, commonly called “AutoBlogging”.

Instructions:

Copy the following code, save it as filter.php, and upload it to the root directory alongside the rest of the WordPress installation files. Then save several articles as Draft, including several duplicates, and run the filter. You'll see the result, and you'll love it!

<?php
// Load WordPress so its API is available to this standalone script.
require_once("wp-load.php");
global $wpdb;
// Fetch the draft posts, newest first.
query_posts("post_status=draft&orderby=date&order=DESC");
// The Loop
while ( have_posts() ) : the_post();
	// Process each draft post here
	global $post;
	if ($post->post_status == "draft") {
		$current_ID = get_the_ID();
		$current_title = $post->post_title;
		$current_content = $post->post_content;
		$current_email = get_post_meta($current_ID, 'email', true);
		$current_postname = $post->post_name;
		echo "Processing draft post: ".$current_ID." - ";
		the_title();
		echo "...";
		// Search for a published post with the same content and the same email.
		// $wpdb->prepare() escapes both values; the old mysql_* functions
		// were removed in PHP 7 and no longer work.
		$sql = $wpdb->prepare(
			"SELECT posts.ID FROM {$wpdb->posts} AS posts INNER JOIN {$wpdb->postmeta} AS postmeta ON posts.ID = postmeta.post_id WHERE posts.post_content = %s AND posts.post_status = 'publish' AND postmeta.meta_key = 'email' AND postmeta.meta_value = %s LIMIT 1",
			$current_content,
			$current_email
		);
		$duplicate_ID = $wpdb->get_var($sql);
		if ($duplicate_ID) {
			echo " found a similar post content and email custom field. Deleting ...";
			// Without the force flag, a regular post goes to the trash when trash is enabled.
			wp_delete_post( $current_ID );
			echo " done";
		} else {
			echo " not found any similar one. Publishing ...";
			// Drafts may have no slug yet, so derive one from the title.
			if ($current_postname == "") {
				$current_postname = sanitize_title($current_title);
			}
			$postUpdate = array();
			$postUpdate['ID'] = $current_ID;
			$postUpdate['post_name'] = $current_postname;
			wp_update_post( $postUpdate );
			wp_publish_post( $current_ID );
			echo " done";
		}
	}
	echo "";
	// Push the progress messages out immediately; ob_flush() needs an active buffer.
	if (ob_get_level() > 0) {
		ob_flush();
	}
	flush();
endwhile;

// Reset Query
wp_reset_query();