Programming

Histogram Stretching to Increase Contrast Example in Java

I would like to show you a Java example that stretches the histogram of an image to improve its contrast.

This Java program can read a photo in common formats such as JPG, GIF and PNG, and can write the stretched image in any of those formats as well.

The prime advantages of Java are that it is platform independent and that it comes with a huge number of libraries that let you do just about anything.

How an image is laid out internally

To refresh the basics, go through this to understand how raster images such as GIF/JPEG are composed.

So we'll be working on each plane/channel/color or band separately, without thinking about the other two.

Input

To show histogram stretching we've used this JPEG image (2592 pixels wide and 1944 pixels high), which is poorly lit because of low light. Also see the histogram of this photo as shown in Photoshop:

Original JPEG photo before the histogram stretch (I've removed the person):


Histogram of the original image. This is the combined RGB histogram in Photoshop. The individual channel histograms look about the same:


Histogram

We now know that each pixel of a plane/channel/color occupies 8 bits (1 byte) and contains a value from 0 to 255. So we can lay out the values of all the pixels in a graph where the x-axis is the gray value (0 to 255) and the y-axis is the total number of pixels having that value.

So if exactly one pixel has the value 25, we put a single dot at height 1 on the y-axis above position 25 on the x-axis.

You can view the histogram of each channel/color individually or the combined RGB one. For analysis it is better to look at the histogram of each channel/color.
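The counting described above can be sketched in a few lines of Java. This is only an illustration: the class name ChannelHistogram and the method histogram are hypothetical names, and the sample values are made up.

```java
class ChannelHistogram {
    // Counts how many pixels have each 8-bit value (0..255) in one channel.
    static int[] histogram(int[] channelValues) {
        int[] counts = new int[256];
        for (int v : channelValues) {
            counts[v & 0xff]++; // mask keeps the index inside 0..255
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up red channel values for five pixels
        int[] red = {25, 25, 30, 255, 0};
        int[] counts = histogram(red);
        System.out.println("pixels with value 25: " + counts[25]);
    }
}
```

Each entry of the returned array is the height of the histogram at that x-axis position.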

Stretching the histogram Linearly

If your pixel values are bunched in a small span rather than spread out from 0 to 255, you can stretch them: the lowest value in the span is moved to 0 on the x-axis, the highest is moved to 255, and all the values in between are increased proportionately. Pixels falling outside the chosen span are clamped to 0 (below it) or 255 (above it). That's the loss we need to bear. The fewer pixels we lose, the better the quality of the stretched image.

You can follow the linear stretch algorithm here.
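The mapping can be sketched as a small helper. This is a minimal sketch under assumptions, not part of the original program: the class name LinearStretch and the helpers lowest, highest and stretch are hypothetical. Taking the first and last non-empty histogram bins is just one possible way to choose min and max instead of typing them in by hand.

```java
class LinearStretch {
    // Index of the first non-empty bin, or -1 if the histogram is empty.
    static int lowest(int[] hist) {
        for (int i = 0; i < hist.length; i++) {
            if (hist[i] > 0) return i;
        }
        return -1;
    }

    // Index of the last non-empty bin, or -1 if the histogram is empty.
    static int highest(int[] hist) {
        for (int i = hist.length - 1; i >= 0; i--) {
            if (hist[i] > 0) return i;
        }
        return -1;
    }

    // Maps value from the span [min, max] onto [0, 255] and clamps
    // anything outside the span. Requires max > min.
    static int stretch(int value, int min, int max) {
        int v = (int) ((value - min) * 255.0 / (max - min));
        return Math.max(0, Math.min(255, v));
    }

    public static void main(String[] args) {
        int[] hist = new int[256];
        hist[4] = 10; // made-up counts
        hist[20] = 3;
        int min = lowest(hist);
        int max = highest(hist);
        System.out.println("span " + min + ".." + max
                + ", 12 maps to " + stretch(12, min, max));
    }
}
```

With min = 0 and max = 31, as used in the program further down, a value of 31 maps to 255 and anything above 31 is clamped to 255.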

Running the Java code

The input is read at the ImageIO.read line and the output is written at the ImageIO.write line. Just change the names of the two images, change the min and max levels in the code, and run it. Your output image is created.

Output

The output image looks like this:

The output stretched image, now showing the darkly lit areas:


Histogram of the image after stretching the 0 to 31 pixel value span (of each channel/color/band):


The Java Program

import java.awt.image.BufferedImage;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import javax.imageio.ImageIO;

class StretchHistogram {

    public static void main(String[] args) {
        try {
            BufferedImage img = ImageIO.read(new File("IMG_3955.JPG")); // input image
            if (img == null) {
                System.err.println("Error: could not decode the input image");
                return;
            }
            writeColorImageValueToFile(img);
        } catch (IOException e) {
            // report the failure instead of silently swallowing it
            System.err.println("Error: " + e);
            e.printStackTrace();
        }
    }

    public static void writeColorImageValueToFile(BufferedImage in) throws IOException {
        int width = in.getWidth();
        int height = in.getHeight();

        int min = 0;  // stretch min level
        int max = 31; // stretch max level

        System.out.println("width=" + width + " height=" + height);

        // one packed ARGB int per pixel, row by row
        int[] data = new int[width * height];
        in.getRGB(0, 0, width, height, data, 0, width);

        int[] oldHistogramR = new int[256];
        int[] oldHistogramG = new int[256];
        int[] oldHistogramB = new int[256];

        int[] newHistogramR = new int[256];
        int[] newHistogramG = new int[256];
        int[] newHistogramB = new int[256];

        int[] stretched = new int[width * height];

        for (int i = 0; i < data.length; i++) {
            int r = (data[i] >> 16) & 0xff; // shift 3rd byte to first byte location
            int g = (data[i] >> 8) & 0xff;  // shift 2nd byte to first byte location
            int b = data[i] & 0xff;         // it is already at first byte location

            oldHistogramR[r]++;
            oldHistogramG[g]++;
            oldHistogramB[b]++;

            // stretch [min, max] to [0, 255], clamping values outside the span
            r = stretch(r, min, max);
            g = stretch(g, min, max);
            b = stretch(b, min, max);

            newHistogramR[r]++;
            newHistogramG[g]++;
            newHistogramB[b]++;

            // pack the channels back, keeping the original alpha byte
            stretched[i] = (data[i] & 0xff000000) | (r << 16) | (g << 8) | b;
        }

        // write the stretched pixels back and save them, say as jpg
        in.setRGB(0, 0, width, height, stretched, 0, width);
        ImageIO.write(in, "jpeg" /* "png" "jpeg" ... format desired */,
                new File("newout.jpg") /* target */);

        printHistogram(oldHistogramR, "hist_before_r.txt"); // before stretching, i.e. original
        printHistogram(oldHistogramG, "hist_before_g.txt");
        printHistogram(oldHistogramB, "hist_before_b.txt");
        printHistogram(newHistogramR, "new_histogram_r.txt"); // after stretching, i.e. modified
        printHistogram(newHistogramG, "new_histogram_g.txt");
        printHistogram(newHistogramB, "new_histogram_b.txt");
    }

    // Maps value from [min, max] onto [0, 255] and clamps the result.
    static int stretch(int value, int min, int max) {
        int v = (int) (1.0 * (value - min) / (max - min) * 255);
        return Math.max(0, Math.min(255, v));
    }

    static void printHistogram(int[] hist, String file) throws IOException {
        // change this directory to one that exists on your machine
        try (FileWriter op = new FileWriter("F:/tmp/JavaApplication1/" + file)) {
            for (int i = 0; i < hist.length; ++i) {
                op.write("[" + i + "]=" + hist[i] + "\n");
            }
        }
    }
}

My Article Is Not Ranking Well in Blogger

If your Blogger articles are not ranking well, check whether they are optimized for SEO. The first thing to do is open your blog and check its title in the browser title bar. Here is an example of how the title of your article looks in the top-left of your browser window:

Browser title bar

SEO Requirement

To make the best of any article on the internet you must do this:

  1. Your title must appear first in the browser title bar (i.e., within the <title></title> text in your web page HTML)
  2. Use strong/em HTML tags for the most important keywords at least once
  3. Use the same (or similar) keywords in the title or alt text of any image

Don't use too many keywords and you'll be fine. Write for readers; just keep those 3 small points in mind. Point 1 is the most important: if you've missed it, it's better to delete your article than to publish it on the internet.

Blogger article poorly ranking

Because the Blogger/Blogspot templates are highly unoptimized with regard to point 1, you must correct this for your articles. Just follow these steps.

Correctly positioning your title tag

Go to your blog and click on 'Template' (circled in red) on the left side and then click "HTML" (circled in pink) below:

Go to your blog

Click Proceed as shown below:

Click Proceed; Ranking well your blog

Locate the line within the red rectangle. You need to replace it. This line looks like:

 <title><data:blog.pageTitle/></title>

Remember, you want to replace the whole line from <title> to </title>; this is the HTML title tag that shows your article title in the browser.

Your original template (you'll replace the text within the red rectangle)

Replace the single line shown above with the text below.

<b:if cond='data:blog.pageType == &quot;index&quot;'>
  <title><data:blog.pageTitle/></title>
  <b:else/>
  <title><data:blog.pageName/> | <data:blog.title/></title>
</b:if>

Then save your template and you're done!

Replaced text (within red rectangle):


Perl code to find yesterday's date

Here is Perl sample code showing how to find yesterday's day of the month. By changing the format in strftime you can get yesterday's full date too.

Because the code sets the TZ environment variable, it calls the tzset function so that the change takes effect; don't forget to set your own timezone. I'm calculating yesterday's day of the month, but you can get the full date by passing a format string such as %d-%m-%y to strftime (note the lowercase %m for the month; %M means minutes).

The trick is to pass the day of the month minus 1, or minus as many days as you want to subtract (or plus as many as you want to add). strftime automatically normalizes the date, so subtracting 1 on the first day of a month gives the last day of the previous month. You can similarly add to or subtract from the other parameters of strftime as well.

use strict;
use warnings;


use POSIX qw(strftime tzset);

$ENV{TZ} = 'Asia/Kolkata'; # Set your timezone here
tzset();

my $time  = time;
my $today = strftime "%d", localtime $time;
my ($sec, $min, $hour, $mday, $mon, $year) = localtime($time);
# strftime normalizes out-of-range fields, so $mday - 1 also works on the 1st
my $yesterday = strftime "%d", $sec, $min, $hour, $mday - 1, $mon, $year;
print "today=$today yesterday=$yesterday\n";
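For comparison, here is a minimal Java sketch of the same idea using java.time, which also rolls over month boundaries when a day is subtracted. It is not a translation of the Perl script: the class name Yesterday and the method dayBefore are hypothetical, and only the Asia/Kolkata zone is taken from the code above.

```java
import java.time.LocalDate;
import java.time.ZoneId;

class Yesterday {
    // Returns the calendar day before the given date; month and year
    // boundaries are handled by java.time.
    static LocalDate dayBefore(LocalDate date) {
        return date.minusDays(1);
    }

    public static void main(String[] args) {
        // Same zone as the Perl script; set your own timezone here
        LocalDate today = LocalDate.now(ZoneId.of("Asia/Kolkata"));
        System.out.println("today=" + today + " yesterday=" + dayBefore(today));

        // Going back from the 1st lands on the last day of the previous month
        System.out.println(dayBefore(LocalDate.of(2013, 3, 1))); // 2013-02-28
    }
}
```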

Send a File as an Attachment in PHP Using SMTP Email

You are on Windows, Linux or any other system and you want to run a PHP file with a command-line argument (a file) so that the PHP program emails you that file as an attachment using SMTP. This backup script calls it to send each backup file one by one.

I've installed it on my web-hosting account with Hostgator so that daily backups of databases are emailed to my email addresses on a rotating basis.

How to Run it

php email.php <filename>

Settings

  • $from: Set it to your "from" email address
  • $host: Your SMTP email server
  • $username: SMTP login name
  • $password: SMTP password
  • $all_emails: Enter one email address per line. Each day the file is sent to one address, the next day to the next address, and so on. When it reaches the last address, it starts again from the very first one. Note that I've inserted paidemail [at] onlinefilefolder [dot] com on every other line so that my paid email account receives the file on every alternate day. Take care to log in to your free email accounts once every few months, otherwise you may lose those accounts.
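The rotation described in the last setting (day of the year modulo the number of addresses) can be sketched in Java as follows. This is only an illustration of the arithmetic: the class name RecipientRotation, the method pick and the example.com addresses are placeholders, not part of the script.

```java
import java.time.LocalDate;
import java.util.Arrays;
import java.util.List;

class RecipientRotation {
    // Picks one address per day: day index modulo the number of addresses.
    static String pick(List<String> addresses, int dayIndex) {
        return addresses.get(dayIndex % addresses.size());
    }

    public static void main(String[] args) {
        // Placeholder addresses for illustration only
        List<String> addresses = Arrays.asList(
                "first@example.com", "second@example.com", "third@example.com");

        // PHP's getdate()['yday'] starts at 0, Java's getDayOfYear() starts
        // at 1, so subtract 1 to get the same index.
        int dayIndex = LocalDate.now().getDayOfYear() - 1;
        System.out.println("today's recipient: " + pick(addresses, dayIndex));
    }
}
```

Because the index wraps around, the list starts again from the first address once the last one has been used.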
<?php
require_once "Mail.php";
require_once "Mail/mime.php";

if ($argc != 2) {
    print "Usage: php email.php <filename>\n";
    exit(1);
}

$file = $argv[1];

ini_set('memory_limit', '180M');

$from = "contactus [at] mysite [dot] com";

// One address per line; the file goes to a different one each day
$all_emails = array(
    "jjkkk [at] gmail [dot] com",
    "paidemail [at] onlinefilefolder [dot] com",
    "sdssd [at] gmail [dot] com",
    "paidemail [at] onlinefilefolder [dot] com",
    "ssdds [at] gmail [dot] com",
    "paidemail [at] onlinefilefolder [dot] com",
    "sdfsfd [at] hotmail [dot] com",
    "paidemail [at] onlinefilefolder [dot] com",
    "fddsfsdf [at] hotmail [dot] com",
    "paidemail [at] onlinefilefolder [dot] com",
    "sdfsdf [at] gmail [dot] com",
    "paidemail [at] onlinefilefolder [dot] com",
    "fsdfsdfsdf [at] gmail [dot] com",
    "paidemail [at] onlinefilefolder [dot] com",
    "sdfsdfsdf [at] gmail [dot] com",
    "paidemail [at] onlinefilefolder [dot] com",
    "fdsfsdfsdf [at] yahoo [dot] com",
    "paidemail [at] onlinefilefolder [dot] com",
    "fsdfsdfsdf [at] hotmail [dot] com",
    "paidemail [at] onlinefilefolder [dot] com",
);

// Pick today's recipient: day of the year modulo the number of addresses
$dateinfo = getdate();
$to = $all_emails[$dateinfo['yday'] % count($all_emails)];

$subject = "Backup of file";
$body = "backup of file $file";

$host = "mail.mysite.com";
$username = "contactus+mysite.com";
$password = "mypassword";

$headers = array('From' => $from, 'To' => $to, 'Subject' => $subject);

$crlf = "\n";

$mime = new Mail_mime($crlf);
$mime->setTXTBody($body); // plain-text part of the message
$mime->addAttachment($file, 'application/octet-stream');
//do not ever try to call these lines in reverse order
$body = $mime->get();
$headers = $mime->headers($headers);

$smtp = Mail::factory('smtp',
    array(
        'host' => $host,
        'auth' => true,
        'username' => $username,
        'password' => $password));

$mail = $smtp->send($to, $headers, $body);

if (PEAR::isError($mail)) {
    echo $mail->getMessage() . "\n";
} else {
    echo "Message sent successfully\n";
}

?>

Simple Java Working Example to Upload File in Amazon Glacier or S3

It's been 2 months since I started using this very easy and nice Perl module made for Glacier backups from the command line: https://github.com/vsespb/mt-aws-glacier . It has worked amazingly well. I've even extracted the backups and compared them. It is in fact the simplest and most effective way to take backups on Amazon Glacier.

(Please see the bottom for more.)

For simple or impatient programmers, the Amazon AWS API is indeed complicated. To tell the truth, I could not find even a single working example anywhere that you can simply copy-paste and run.

Update: There is better Java code here: https://github.com/MoriTanosuke/glacieruploader and it works nicely.

Here is one I've written myself on Linux using the Java API. .NET is not my cup of tea. I then tried PHP, but its documentation looks quite limited and I could find hardly any resources on the internet to make a PHP program run. So I went with the Java API and managed to get it running, at least to upload an archive. Use this simple working example for uploading to Amazon Glacier/S3. Here are the steps.

Folder Structure of Amazon AWS Glacier Installation

  1. In the top folder create bin, jar, sdk and src folders
  2. Download the Java SDK and install it in the sdk folder (see my installation in the picture above)
  3. Now take the two files sdk/lib/aws-java-sdk-1.3.26.jar and sdk/lib/aws-java-sdk-1.3.26-sources.jar, which contain the com package, and extract them into the jar folder
  4. Copy this code and place it at src/ArchiveUploadHighLevel.java. Also enter the full path of the file you want to upload at line 14 (mine looks like: public static String archiveToUpload = "/home/raws/aws-java-sdk.zip";) and fill in the vault name on line 13 (mine looks like: public static String vaultName = "test";).
  5. In the src folder, copy the AwsCredentials.properties file from the samples folder and then fill in your access key and secret key. It looks like:
    # Fill in your AWS Access Key ID and Secret Access Key
    # http://aws.amazon.com/security-credentials
    accessKey =AKIai5AG4P664HGPZUVA
    secretKey =cs8zWyZ034gIKbBEZqJNyosSPiLLoQ9IALeR6PYH
    
    
  6. In the top folder create run.sh and put the following code in it:
     javac  -cp jar src/ArchiveUploadHighLevel.java -d bin
    
    java -Dcom.amazonaws.sdk.disableCertChecking=true -cp "bin:sdk/lib/*:sdk/third-party/commons-logging-1.1.1/*:sdk/third-party/httpcomponents-client-4.1.1/*:sdk/third-party/jackson-core-1.8/*" ArchiveUploadHighLevel
    
    

    Now if you're on Windows you can place it in a batch file or simply run these two lines one by one at the command prompt. Also, on Windows the class path folders in -cp must be separated by a semicolon (;) and not a colon.

    Mind you, I'm using -Dcom.amazonaws.sdk.disableCertChecking to disable SSL certificate checking. If you want to keep certificate checking on and you get an "SSL peer not authenticated" exception, have a look here.

This tool (mt-aws-glacier) easily maintains the file name and the ID assigned by Glacier for restore purposes. Here is how I work: I maintain a backup folder, place any files in it and simply run sync.sh. It looks into a log file to see whether each file has already been backed up. If it hasn't, it backs it up as per the details in the glacier.cfg file.

Here is the content of my sync.sh:

  echo "Input secret"

  read secret
 /home/john/aws/perl/mt-aws-glacier-master/mtglacier  --config=glacier.cfg --secret=$secret create-vault oct-13
  nohup /home/john/aws/perl/mt-aws-glacier-master/mtglacier  sync --config=glacier.cfg --dir ~/backups/oct-13/  --journal=oct-13.log --concurrency=3 --secret=$secret


Here is the content of my glacier.cfg:

key=A***999AG4PG54***ZUVA

  # region: eu-west-1, us-east-1 etc

  region=us-east-1

  # protocol=http (default) or https

   protocol=http

  vault=oct-13



My backups are monthly. So every month I create a new folder in the backup directory. This month I created oct-13. I placed whatever files I wanted in it and just replaced sep-13 with oct-13 in all my files (e.g., in glacier.cfg, sync.sh etc.).
Then I simply run "sh sync.sh" and it does the rest.
This tool maintains a log file storing the IDs of all files stored in Glacier. I back it up on another computer as well in case I ever lose it.

Crawling a Website with VIEWSTATE & EVENTVALIDATION using PHP

It was a really tough job when I tried to make automated requests to a site that uses __VIEWSTATE and __EVENTVALIDATION, which I think means it runs on ASP.NET. To make a request to any page, we need to send the __VIEWSTATE and __EVENTVALIDATION hidden values along with the request. These values change with every page you fetch.

In short, although HTTP is a stateless protocol, these two variables let the server-side script know the previous state the client must have been in to fetch the data of the current state.

You should also send all the other hidden POST variables, otherwise the request could fail.

There is one useful function, 'exfield', which extracts the value of any hidden field you pass, and there is a 'sendpost' function which sends all the arguments in the passed array to the URL using the POST method and returns the result.
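As a rough illustration of what a helper like 'exfield' does, here is a Java sketch that pulls a hidden field's value out of a page with a regular expression. It assumes the hidden inputs are written with type first, then name, then value, as in the pattern used by the PHP code below; the class name HiddenField, the method extract and the sample HTML are hypothetical.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HiddenField {
    // Returns the value attribute of <input type="hidden" name="FIELD" ...>,
    // or an empty Optional when the field is not present in the page.
    static Optional<String> extract(String field, String html) {
        Pattern p = Pattern.compile(
                "<input\\s+type=\"hidden\"\\s+name=\"" + Pattern.quote(field)
                        + "\".*?value=\"([^\"]*)\"",
                Pattern.DOTALL);
        Matcher m = p.matcher(html);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Made-up fragment of an ASP.NET page
        String html = "<form><input type=\"hidden\" name=\"__VIEWSTATE\" "
                + "id=\"__VIEWSTATE\" value=\"dDwtMTA4\" /></form>";
        System.out.println(extract("__VIEWSTATE", html).orElse("(not found)"));
        System.out.println(extract("__EVENTVALIDATION", html).orElse("(not found)"));
    }
}
```

The extracted values are then sent back unchanged as POST fields with the next request.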

I hope it is of some help to you!

<?php

define('URL', 'http://www.example.com');  //The website using .Net ASP
define('PAT_RESULTS_FOUND', '/Search result:([0-9]+) Results found/');
define('TOTAL_officeofficeS_IN_PAGE', 10); //max Select01 like links in a page
//................................................................
$total_internet_requests = 0;
assert_options(ASSERT_CALLBACK, 'my_assert_handler');



$pppcodearr = array('1', '2');
foreach ($pppcodearr as $pppcode) {
    $not_found_file = $pppcode . "not-found";

    if (file_exists($not_found_file)) {
        continue; //skip this ppp-code
    }

    $content = sendget(); //the very first page
    $fields = array(
        'ddl_dist' => '341',
        'ddl_state' => '1',
        'hdn_tabchoice' => '1',
        'search_on' => 'Search',
        'txt_dist_on' => '',
        'txt_offname' => $pppcode,
        //'__EVENTARGUMENT' => 'Page$3',
        '__EVENTTARGET' => 'ggg',
        'txt_stateon' => '',
    );

    $exflds = array('__VIEWSTATE', '__EVENTVALIDATION');
    foreach ($exflds as $val) {
        $fields[$val] = exfield($val, $content);
    }
    $fields['__VIEWSTATEENCRYPTED'] = '';

    $content = sendpost($fields);   //this is Page$1

    assert(checkvalidpage($content));
    $total_recs = get_total_results_found($content);

    if ($total_recs == 0) {
        //indicate no records for this pppcode
        //(writing an empty file returns 0 bytes, so compare against false,
        //and write outside assert() so it runs even with assertions disabled)
        $written = file_put_contents($not_found_file, '');
        assert($written !== false);
        continue;
    }

    $total_pages = ceil($total_recs / TOTAL_officeofficeS_IN_PAGE);

    $page_no = 1;

    //check if all records have already been downloaded
    $total_recs_ctr = $total_recs;
    $not_exists = false;
    for ($pg = 1; $pg <= $total_pages; ++$pg) {
        $sel = -1;
        do {
            ++$sel;
            --$total_recs_ctr;

            $file = coin_ppprecord_filename($pppcode, $pg, $sel);

            if (file_exists($file) && checkvalidpage(file_get_contents($file))) {
                //skip it
                if (dbg()) {
                    print "$file already exists ... skipping\n";
                }
            } else {
                $not_exists = true;
                break 2;
            }
        } while ($total_recs_ctr && $sel < TOTAL_officeofficeS_IN_PAGE - 1);
    }

    if ($not_exists) { //enter this loop only if at least 1 record does not exist
        do {
            //$content is the current results page, starting with Page$1
            wrt($content);

            if (!checkvalidpage($content)) {
                break;
            }

            //count the Select links again for every page; a counter carried
            //over from the previous page would never reach zero
            $post_offices_in_page = officeofficesinpage($content);

            $fields = array(
                'ddl_dist' => '0',
                'ddl_state' => '1',
                'hdn_tabchoice' => '1',
                'txt_dist_on' => '',
                'txt_offname' => $pppcode,
                '__EVENTARGUMENT' => 'Select$0',
                '__EVENTTARGET' => 'ggg',
                'txt_stateon' => '',
                '__VIEWSTATEENCRYPTED' => '',
            );

            foreach ($exflds as $val) {
                $fields[$val] = exfield($val, $content);
                //print "$val= $fields[$val] \n";
            }

            for ($sel = 0; $sel < $post_offices_in_page; ++$sel) {
                $fields['__EVENTARGUMENT'] = 'Select$' . $sel;

                $file = coin_ppprecord_filename($pppcode, $page_no, $sel);

                if (file_exists($file) && checkvalidpage(file_get_contents($file))) {
                    //skip it
                    if (dbg()) {
                        print "$file already exists ... skipping\n";
                    }
                } else {
                    $result = sendpost($fields);
                    if (checkvalidpage($result)) {
                        $written = file_put_contents($file, $result);
                        assert($written !== false);
                    } else {
                        //report it and carry on with the next record
                        print "Is not valid page found for $file\n";
                        print " $sel < $post_offices_in_page $page_no\n";
                    }
                }
            }

            //go over to the next page
            ++$page_no;
            if ($page_no > $total_pages) {
                break; //no more pages to request
            }

            $fields = array(
                'ddl_dist' => '0',
                'ddl_state' => '1',
                'hdn_tabchoice' => '1',
                'txt_dist_on' => '',
                'txt_offname' => $pppcode,
                '__EVENTARGUMENT' => getpageno($page_no),
                '__EVENTTARGET' => 'ggg',
                'txt_stateon' => '',
                '__VIEWSTATEENCRYPTED' => '',
            );

            foreach ($exflds as $val) {
                $fields[$val] = exfield($val, $content);
            }
            $content = sendpost($fields);
        } while ($page_no <= $total_pages);
    }
}
//for each


print "Total internet page requests = $total_internet_requests\n";

function dbg() {
    return 1;
}

function my_assert_handler($file, $line, $code) {
    echo "<hr>Assertion Failed:
        File '$file'<br />
        Line '$line'<br />
        Code '$code'<br /><hr />";

    var_dump(debug_backtrace());
    exit(1);
}

function get_total_results_found($content) {
    if (strstr($content, 'No Matched Post offices found')) {
        return 0;
    } else if (preg_match(PAT_RESULTS_FOUND, $content, $matches)) {
        if (dbg()) {
            print "total pppcode results=$matches[1]\n";
        }
        return $matches[1];
    } else {
        assert(false); //can't reach here
        return 0;
    }
}

//count number of officeoffice links in the page
function officeofficesinpage($content) {
    //They look like javascript:__doPostBack(&#39;ggg&#39;,&#39;Select$[0-9]{1,2}
    $pat = '/javascript:__doPostBack\(&#39;ggg&#39;,&#39;Select\$[0-9]{1,2}/';

    wrt($content);

    $ret = preg_match_all($pat, $content, $matches);

    assert($ret !== FALSE);

    return $ret;
}

function getpageno($page) {
    return 'Page$' . $page;
}

function getselno($sel) {
    return 'Select$' . $sel;
}

function checkvalidpage($content) {
    if (strlen($content) < 65000 || strstr($content, 'Sorry this site has encountered a serious problem, please try reloading the page')) {
        return false;
    } else {
        return true;
    }
}

//extract value of a hidden field
function exfield($field, $content) {
    $pat = '{<input\s+type="hidden"\s+name="' . $field . '".*?value="([^"]+)"}';

    if (preg_match($pat, $content, $match)) {
        return $match[1];
    } else {
        print("Unable to extract $field\n");
    }
}

function wrt($content) {
    file_put_contents("F:/tmp/a.htm", $content);
}

function sendget() {
    global $total_internet_requests;
    $ch = curl_init(URL);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_HEADER, 0);
    $txResult = curl_exec($ch);
    $statuscode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    ++$total_internet_requests;

    if (dbg() >= 2) {
        print "statuscode=$statuscode\n";
        print "Result=$txResult\n";
    }
    //write outside assert() so it still runs when assertions are disabled
    $written = file_put_contents("F:/tmp/abc.htm", $txResult);
    assert($written !== FALSE);
    curl_close($ch);
    return $txResult;
}

function sendpost($postarr) {
    global $total_internet_requests;
    $data = '';
    foreach ($postarr as $key => $val) {
        $unit = "$key=" . urlencode($val);
        if (strlen($data) == 0) {
            $amp = '';
        } else {
            $amp = '&';
        }

        $data .= "$amp$unit";
    }

    $custom_headers = array();
    $custom_headers[] = "Accept: text/html, application/xhtml+xml, application/xml;q=0.9, */*;q=0.8";
    $custom_headers[] = "Pragma: no-cache";
    $custom_headers[] = "Cache-Control: no-cache";
    $custom_headers[] = "Accept-Language: en-us;q=0.7,en;q=0.3";
    $custom_headers[] = "Accept-Charset: utf-8,windows-1251;q=0.7,*;q=0.7";
    $ch = curl_init();
    $useragent = "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:9.0.1) Gecko/20100101 Firefox/9.0.1";
    curl_setopt($ch, CURLOPT_USERAGENT, $useragent); // set user agent
    curl_setopt($ch, CURLOPT_URL, URL);

    if (strlen($data)) {
        curl_setopt($ch, CURLOPT_POSTFIELDS, $data);
        curl_setopt($ch, CURLOPT_POST, 1);
    }
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
    curl_setopt($ch, CURLOPT_HEADER, false);
    curl_setopt($ch, CURLOPT_HTTPHEADER, $custom_headers);

    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 20);
    curl_setopt($ch, CURLOPT_TIMEOUT, 40); //timeout in seconds

    $txResult = curl_exec($ch);
    ++$total_internet_requests;

    $statuscode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);
    if (dbg() >= 2) {
        print "statuscode=$statuscode\n";
        print "Result=$txResult\n";
    }
    if (dbg()) {
        //keep a copy of every response for debugging
        $written = file_put_contents(tempnam(get_temp_dir() . "pppcode", "post_req"), $txResult);
        assert($written !== FALSE);
    }
    return $txResult;
}

function get_temp_dir() {
    return "f:/tmp/";
}

function coin_ppprecord_filename($pppcode, $page_no, $sel) {
    return get_temp_dir() . "pppcode/" . "$pppcode-" . getpageno($page_no) . '-' . getselno($sel) . ".htm";
}
?>

API to find Alexa rank in bulk in PHP

You have a bunch of sites and you want an API to find their Alexa ranks. Here is a PHP function which does exactly that. Mind you, if you don't know PHP, you can simply download PHP and run this code on the command line. It is simple.

Whenever it fetches the Alexa rank of a domain, it also stores it in a serialized file, so that if you ever pass the same domain again it simply reads the rank from disk and returns it.

If the .sr file gets corrupted, just delete it and start over.
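The caching idea (look on disk first, fetch only on a miss, then save) can be sketched in Java like this. It is a minimal sketch under assumptions, not the PHP function itself: the class name DiskCache, its methods and the fake fetcher are hypothetical, and it stores the entries in a java.util.Properties file rather than a PHP serialized file.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;
import java.util.function.Function;

class DiskCache {
    private final Path file;
    private final Properties entries = new Properties();

    // Loads previously stored entries if the cache file already exists.
    DiskCache(Path file) {
        this.file = file;
        if (Files.exists(file)) {
            try (InputStream in = Files.newInputStream(file)) {
                entries.load(in);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    // Returns the cached value, or calls fetcher once and saves its result.
    String get(String key, Function<String, String> fetcher) {
        String cached = entries.getProperty(key);
        if (cached != null) {
            return cached;
        }
        String value = fetcher.apply(key);
        entries.setProperty(key, value);
        try (OutputStream out = Files.newOutputStream(file)) {
            entries.store(out, null);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return value;
    }

    public static void main(String[] args) {
        DiskCache cache = new DiskCache(Paths.get("ranks.properties"));
        // The lambda stands in for the real web request
        System.out.println(cache.get("example.com", domain -> "12345"));
    }
}
```

As with the .sr file, deleting the cache file simply forces every value to be fetched again.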

Input

Just pass the domain name like this: domain.com, all in lowercase

For Wikipedia.org: just pass wikipedia.org

<?php

//cache file; defined once here, because calling define() inside the
//function would try to redefine the constant on every call
define("ALEXA_FILE", "alexa.sr");

$arr = array(
    "facebook.com",
    "google.com",
);

foreach ($arr as $item) {
    $alexa = get_alexa($item);
    print "$item $alexa\n";
}

function get_alexa($url)
{
    //print "Need alexa for $url\n";

    static $alexa_arr;

    if (!$alexa_arr) {
        $alexa_arr = array();
        if (file_exists(ALEXA_FILE)) {
            $alexa_arr = unserialize(file_get_contents(ALEXA_FILE));
            assert(is_array($alexa_arr));
        }
    }

    $url = trim($url);

    //fetch it from the server only if the site is not in the array yet
    if (!isset($alexa_arr[$url])) {
        $response = file_get_contents("http://data.alexa.com/data?cli=10&url=$url");
        assert($response !== false);
        $xml = new SimpleXMLElement($response);
        $rank = -1;
        if (isset($xml->SD->POPULARITY["TEXT"])) {
            $rank = $xml->SD->POPULARITY["TEXT"];
        }
        //store it in the array
        $alexa_arr[$url] = (string)$rank;

        print "Got from Web Alexa for url=$url => " . (string)$rank . "\n";

        //save outside assert() so it still runs when assertions are disabled
        $written = file_put_contents(ALEXA_FILE, serialize($alexa_arr));
        assert($written !== false);
    }
    //print "Alexa[$url]=".$alexa_arr[$url]."\n";
    return $alexa_arr[$url];
}

?>