Perl script to get search results from Google Search

Hi, it's been a while since my last post. This time I'll share a simple Perl script that queries Google and reads back the search results.

Before we look at the code, let me explain what the script is and why you might need it.
At its most basic, the script runs a search and reads the result back from Google. In this case we use the Google Custom Search API to connect and search for a specific keyword. So who needs a script like this? Lots of people still search by hand and compare the results against their own database. Here we let Google be the engine that mines the data, and the script does the comparison.
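To make the API call a little more concrete, here is a minimal sketch of how such a request URL could be put together. The endpoint and the key, cx, q and alt parameters are the ones used in the script in this post; the helper names escape_value and build_search_url and the placeholder credentials are made up for illustration. The escaping is hand-rolled only to keep the sketch dependency-free; uri_escape from URI::Escape does the same job.

```perl
use strict;
use warnings;

# Percent-encode everything except RFC 3986 unreserved characters.
# (hypothetical helper, not part of the original script)
sub escape_value {
    my ($value) = @_;
    utf8::encode($value) if utf8::is_utf8($value);
    $value =~ s/([^A-Za-z0-9\-._~])/sprintf('%%%02X', ord($1))/ge;
    return $value;
}

# Build a Custom Search request URL for a "site:" query.
# (hypothetical helper; the parameter names come from the script in this post)
sub build_search_url {
    my ($api_key, $engine_id, $site) = @_;
    my @params = (
        key => $api_key,
        cx  => $engine_id,
        q   => "site:$site",
        alt => 'json',
    );
    my @pairs;
    while (my ($name, $value) = splice(@params, 0, 2)) {
        push @pairs, $name . '=' . escape_value($value);
    }
    return 'https://www.googleapis.com/customsearch/v1?' . join('&', @pairs);
}

# Placeholder credentials, replace with your own key and engine ID.
print build_search_url('YOUR_API_KEY', 'YOUR_ENGINE_ID', 'http://example.com'), "\n";
```

Escaping the query value matters here because the stored URLs contain ":" and "/", which should not be pasted raw into a query string.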

Here is the Perl script that checks URLs with Google Custom Search:

use strict;
use warnings;
use WWW::Mechanize;
use JSON -support_by_pp;
use DBI ();

my $browser = WWW::Mechanize->new();
my $dbh = DBI->connect("DBI:mysql:database=db_example;host=localhost",
    "user_example", "pw123zze2",
    {'RaiseError' => 1});

# limit the total number of requests
my $limit = 100;

# keep only the scheme and host part of each stored URL
my $sth = $dbh->prepare("SELECT SUBSTRING_INDEX(url, '/', 3) AS url FROM PLD_LINK LIMIT $limit");

$sth->execute();
while (my $ref = $sth->fetchrow_hashref()) {
    print "URL = $ref->{'url'} ";

    # key is the unique ID issued by Google. Sign up in the Google API console to get one; it is free but limited to 100 requests/day.
    # cx is the ID of the custom search engine to query.
    my $json_url = "https://www.googleapis.com/customsearch/v1?key=AIzaSyD0r3sassa599PT29wlRpT0Iop8us6rGP-ecJEY&cx=013036536707430787589:_pqjad5hr1a&q=site:$ref->{'url'}&alt=json";

    $browser->get($json_url);
    my $content = $browser->content();
    my $json = JSON->new;

    # these are some nice JSON options to relax restrictions a bit:
    my $json_text = $json->allow_nonref->utf8->relaxed->escape_slash->loose->allow_singlequote->allow_barekey->decode($content);

    # iterate over each request entry in the JSON structure:
    foreach my $request (@{$json_text->{queries}->{request}}) {
        my $total = $request->{totalResults};

        # print the search result information and act on it:
        if ($total == 0) {
            print "Number of URLs found = $total --> Delete\n";
            # a placeholder keeps the URL out of the SQL string itself
            my $delete_url = $dbh->prepare("DELETE FROM PLD_LINK WHERE SUBSTRING_INDEX(url, '/', 3) = ?");
            $delete_url->execute($ref->{'url'});
        } else {
            print "Number of URLs found = $total --> Keep\n";
        }
        print "\n";
    }
}

# The script above checks whether each URL is indexed, using the Google Custom Search API: it processes the JSON response and then decides what to do with the link (delete it or keep it).
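
The keep-or-delete decision can also be pulled out into a small function and tried without touching the network or the database. The sketch below is only an illustration: decide_action is a hypothetical name, the two response bodies are hand-written, heavily trimmed stand-ins rather than real API output, and it uses the core JSON::PP module instead of the relaxed JSON options from the script.

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json);

# Decide what to do with a link from a Custom Search response body.
# Returns 'delete', 'keep', or 'unknown' (hypothetical helper).
sub decide_action {
    my ($response_body) = @_;
    my $data    = decode_json($response_body);
    my $request = $data->{queries}{request}[0] || {};
    my $total   = $request->{totalResults};

    # Never delete on a response that carries no usable count.
    return 'unknown' unless defined $total && $total =~ /^\d+$/;
    return $total == 0 ? 'delete' : 'keep';
}

# Hand-written sample bodies (assumed shape, trimmed to the fields used).
my $not_indexed = '{"queries":{"request":[{"totalResults":"0"}]}}';
my $indexed     = '{"queries":{"request":[{"totalResults":"42"}]}}';

print decide_action($not_indexed), "\n";
print decide_action($indexed), "\n";
```

Returning 'unknown' for a malformed response is a deliberate choice in this sketch: an error page or an exhausted quota should not be mistaken for "zero results" and lead to deleted links.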
