Tuesday, October 8

Website Crawler with fork and Join Framework

Here are the classes needed for this exercise. The code can be copied and run directly on Java 7 or later, since the Fork/Join library is part of the JDK only from version 1.7 onwards.

Along with these classes you need the HTMLParser jar file, which is used to retrieve the links available on the page behind a given URL.

Please download htmlparser-1.6.jar and include it in the classpath to run the code below.


====================================================================

The WebsiteCrawler class starts the crawl. It creates a ForkJoinPool, whose worker threads pick up and run the submitted tasks using a work-stealing algorithm. The total work is divided among these threads and executed in parallel, so overall processing is faster and multi-processor/multi-core hardware is used effectively.






import java.util.Collection;
import java.util.Collections;
import java.util.HashSet;
import java.util.concurrent.ForkJoinPool;

/**
 *
 * @author Manoj

  */
public class WebsiteCrawler implements LinkTracker {

    // Thread-safe set of URLs that have already been crawled
    private final Collection<String> linksCrawled = Collections.synchronizedSet(new HashSet<String>());
    private String inputUrl;
    private ForkJoinPool pool;

    public WebsiteCrawler(String inputUrl, int maxThreadCount) {
        this.inputUrl = inputUrl;
        pool = new ForkJoinPool(maxThreadCount);
    }

    private void init() {
        pool.invoke(new LinkSearcher(this.inputUrl, this));
    }

 
   

    public void addVisited(String s) {
        linksCrawled.add(s);
    }


    public boolean visited(String s) {
        return linksCrawled.contains(s);
    }

    public static void main(String[] args) throws Exception {
        new WebsiteCrawler("http://efectivejava.blogspot.in", 50).init();
    }
}
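
To see the divide-and-run-in-parallel idea in isolation, here is a minimal sketch that is independent of the crawler. It sums an array with a RecursiveTask; the class names ForkJoinSumDemo and SumTask and the THRESHOLD value are assumptions made only for this illustration.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

class ForkJoinSumDemo {

    // Hypothetical task: sums data[from, to) by splitting the range in half
    static class SumTask extends RecursiveTask<Long> {
        private static final int THRESHOLD = 1000; // assumed cut-off, not tuned
        private final int[] data;
        private final int from;
        private final int to;

        SumTask(int[] data, int from, int to) {
            this.data = data;
            this.from = from;
            this.to = to;
        }

        @Override
        protected Long compute() {
            // Small enough: do the work directly in this thread
            if (to - from <= THRESHOLD) {
                long sum = 0;
                for (int i = from; i < to; i++) {
                    sum += data[i];
                }
                return sum;
            }
            // Otherwise split; the forked half can be stolen by an idle worker
            int mid = (from + to) >>> 1;
            SumTask left = new SumTask(data, from, mid);
            SumTask right = new SumTask(data, mid, to);
            left.fork();
            long rightSum = right.compute();
            return left.join() + rightSum;
        }
    }

    static long parallelSum(int[] data, int threads) {
        ForkJoinPool pool = new ForkJoinPool(threads);
        try {
            return pool.invoke(new SumTask(data, 0, data.length));
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] data = new int[10000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i + 1;
        }
        System.out.println(parallelSum(data, 4)); // prints 50005000
    }
}
```

The crawler uses RecursiveAction instead of RecursiveTask because it does not need to return a value from each task.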






===================================================================

The LinkTracker interface provides the basic methods required by the link search logic.

/**
 *
 * @author Manoj
 */
public interface LinkTracker {

  
    boolean visited(String link);

    void addVisited(String link);
}

==================================================

LinkSearcher is the class where the core recursive logic is executed. To divide, assign and execute the work recursively, it extends RecursiveAction and overrides the compute() method. compute() runs once for every link: it parses the page, marks the URL as visited, and adds a new LinkSearcher for every child URL found on the page to a list. The list is then passed to invokeAll(), which runs compute() for each child and waits for them to finish.

To understand the code further, run it in debug mode and walk through the flow.

import java.net.URL;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.RecursiveAction;

import org.htmlparser.Parser;
import org.htmlparser.filters.NodeClassFilter;
import org.htmlparser.tags.LinkTag;
import org.htmlparser.util.NodeList;

 public class LinkSearcher extends RecursiveAction {

    private String url;
    private LinkTracker tracker;

  
    public LinkSearcher(String url, LinkTracker tracker) {
        this.url = url;
        this.tracker = tracker;
    }

    @Override
    public void compute() {
        if (!tracker.visited(url)) {
            try {
                // Child tasks, one per link found on this page
                List<LinkSearcher> actions = new ArrayList<LinkSearcher>();
                URL uriLink = new URL(url);
                Parser parser = new Parser(uriLink.openConnection());
                NodeList list = parser.extractAllNodesThatMatch(new NodeClassFilter(LinkTag.class));

                for (int i = 0; i < list.size(); i++) {
                    LinkTag extracted = (LinkTag) list.elementAt(i);

                    if (!extracted.extractLink().isEmpty() && !tracker.visited(extracted.extractLink())) {

                        actions.add(new LinkSearcher(extracted.extractLink(), tracker));
                    }
                }
                tracker.addVisited(url);
                System.out.println(url);

                invokeAll(actions);
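            } catch (Exception e) {
                // Report the failure instead of silently skipping the page
                System.err.println("Failed to crawl " + url + ": " + e);
            }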
        }
    }
}
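
One thing to watch in compute(): the check with visited(url) and the later call to addVisited(url) are two separate steps, so two tasks can both pass the check for the same URL and crawl it twice. A possible way around this is to check and mark in a single atomic step. The sketch below assumes a hypothetical markVisited method, which is not part of the LinkTracker interface shown above, and relies on Set.add returning true only for the first insertion.

```java
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class AtomicLinkTracker {

    // Concurrent set backed by ConcurrentHashMap (available on Java 7 as well)
    private final Set<String> visited =
            Collections.newSetFromMap(new ConcurrentHashMap<String, Boolean>());

    // Hypothetical method: returns true only for the first caller of a given link
    boolean markVisited(String link) {
        return visited.add(link);
    }

    boolean visited(String link) {
        return visited.contains(link);
    }

    public static void main(String[] args) {
        AtomicLinkTracker tracker = new AtomicLinkTracker();
        System.out.println(tracker.markVisited("http://example.com/a")); // prints true
        System.out.println(tracker.markVisited("http://example.com/a")); // prints false
    }
}
```

With such a method, compute() could start with `if (tracker.markVisited(url))` and crawl the page only when the call returns true.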









Why should we override the hashCode method when overriding the equals method


How do we compare two instances of a class in Java?

Let's say there is a class:

public class Employee {

    int age;

    Employee(int age) {

        this.age = age;

    }

    public static void main(String args[]) {

        Employee emp1 = new Employee(10);
        Employee emp2 = new Employee(10);
      
        System.out.println("Are two instances equal?  :" +emp1.equals(emp2));

    }

}


On executing the above program, it prints:
Are two instances equal?  :false

Why does the equals method say that the two instances are different although they are instances of the same class and have the same age?

Reason: Employee does not override equals(), so the call goes to Object.equals(), which returns true only when both references point to the same object (this == obj). emp1 and emp2 are two separate objects, so the result is false. Note that equals() does not call hashCode(). The two methods are linked only by a contract that hash-based collections such as HashMap and HashSet rely on: objects that are equal must return the same hashcode value.

So are the default hashcode values for the above two instances different?

Let's see:

public static void main(String args[]) {

        Employee emp1 = new Employee(10);
        Employee emp2 = new Employee(10);
       
        System.out.println("hashcode value for emp1  " + emp1.hashCode());
        System.out.println("hashcode value for emp2  "+emp2.hashCode());


    }






Executing the above method prints

hashcode value for emp1  327325694
hashcode value for emp2  1657319091


 

So the hashcode values returned for the two instances are different. The default hashCode() inherited from Object derives the value from the object's identity rather than from its fields, so two separate objects typically get different values. Executing the above program again might generate different hashcode values.

Let me execute the above program again and see what values it prints on the console.

It prints

hashcode value for emp1  1025601370
hashcode value for emp2  1578474768

 


So the default implementation generates the hashcode value with its own internal algorithm, and we have no control over it unless we override hashCode().

So how do we make the equals method return true when comparing two instances of the Employee class?

Let's override the equals method as below

  @Override
  public boolean equals(Object obj) {
      // Objects of another type (or null) are never equal to an Employee
      if (!(obj instanceof Employee)) {
          return false;
      }
      return this.age == ((Employee) obj).age;
  }
 

Here the equals method of the Object class is overridden. equals is now customized so that it declares two instances equal if their ages are equal, so emp1.equals(emp2) returns true.







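But there is still a problem with the hashCode() method. emp1 and emp2 are now equal, yet they return different hashcode values, which breaks the contract between the two methods. Hash-based collections such as HashSet and HashMap first use the hashcode value to pick a bucket and call equals() only on entries whose hashcode matches. With different hashcode values, two equal Employee objects are treated as different: a HashSet can hold both of them, and a HashMap lookup with an equal key can fail.

So what do we do to make these collections reach the equals() method?

A hash-based collection invokes equals() only when the hashcode values of the two instances being compared are equal. So let's customize hashCode() so that equal instances return the same value:

     @Override
     public int hashCode() {
         return this.age;
     }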
 
Now printing the hashcode values gives
 
hashcode value for emp1  10
hashcode value for emp2  10

So the hashcode values are equal. The hashcode on its own can't decide whether two instances are equal, because unequal objects may share the same hashcode; to decide that, the collection invokes equals(). 
 
Thus hashCode() must be overridden whenever equals() is overridden. 
 
So we can conclude 
 
If equals() is customized and overridden, then hashCode() must be overridden consistently with it, so that equal objects return equal hashcode values. Otherwise hash-based collections will not treat equal objects as equal. 
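
The effect is easiest to see with a HashSet. The following is a small sketch; the class names EqualsOnly and EqualsAndHash are made up for this illustration and stand in for Employee without and with the hashCode() override.

```java
import java.util.HashSet;
import java.util.Set;

class HashContractDemo {

    // equals() overridden, hashCode() left as the identity-based default
    static class EqualsOnly {
        final int age;

        EqualsOnly(int age) {
            this.age = age;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof EqualsOnly && ((EqualsOnly) o).age == age;
        }
    }

    // equals() and hashCode() overridden consistently
    static class EqualsAndHash {
        final int age;

        EqualsAndHash(int age) {
            this.age = age;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof EqualsAndHash && ((EqualsAndHash) o).age == age;
        }

        @Override
        public int hashCode() {
            return age;
        }
    }

    public static void main(String[] args) {
        Set<EqualsOnly> broken = new HashSet<>();
        broken.add(new EqualsOnly(10));
        broken.add(new EqualsOnly(10));
        // Almost always 2: the two equal objects have different identity hash codes
        System.out.println("equals only: " + broken.size());

        Set<EqualsAndHash> ok = new HashSet<>();
        ok.add(new EqualsAndHash(10));
        ok.add(new EqualsAndHash(10));
        System.out.println("equals and hashCode: " + ok.size()); // prints 1
        System.out.println(ok.contains(new EqualsAndHash(10)));  // prints true
    }
}
```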

 







Sunday, October 6

How to make a website SEO friendly , Search engine optimization



Page Title
Your title should be 20 to 70 characters long; up to 120 characters can still be considered acceptable.
Make sure that the title is accurate and includes the most important keyword.
Each page should have its own unique title.

Meta Description
The meta description should be 100 to 160 characters long; up to 200 characters can still be considered acceptable.
Use the available characters to describe your page using the most important keyword.
Each page should have its own unique description.
The meta description usually appears in search engine results as the description of the page, so it should encourage people to visit your site.

Meta Keywords
Meta keywords have no influence on website positioning.
You can remove all meta keywords or limit them to a few that describe your content.

Meta Robots
The robots meta tag has a crucial impact on page indexing by search engine robots.
The value "noindex" prevents the page from being indexed. The value "nofollow" prevents robots from following the links on the page.
The most common values, which are also the default when the tag is empty, are index, follow.

Encoding
Encoding affects the correct display of special characters.
Popular encodings are UTF-8, Latin1, ISO-8859-2, etc.
Encoding has no influence on website positioning, but an incorrect encoding can cause problems with displaying special characters.

Body Content

Words and Chars
Make sure that the text on the page is not less than 250 words.
 
Text / HTML Ratio
The TEXT / HTML ratio is the ratio of the number of plain-text characters to the number of characters in the page's HTML code, expressed as a percentage.
The higher the number, the more content and the less HTML code your page has.
A value below 10% means that there is too little text or that the HTML code is cluttered or carelessly written.
Make sure that the TEXT / HTML ratio is not below 10%.
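
As a rough illustration of how this ratio can be computed, here is a small sketch. The class name TextHtmlRatio and the method ratioPercent are assumptions for this example, and the tags are stripped with a simple regular expression, which is not a real HTML parser (for example, script and style contents are counted as text). SEO tools may count whitespace and characters differently.

```java
class TextHtmlRatio {

    // Returns (plain-text characters / all HTML characters) * 100
    static double ratioPercent(String html) {
        if (html.isEmpty()) {
            return 0.0;
        }
        // Drop everything between '<' and '>' and trim surrounding whitespace
        String text = html.replaceAll("(?s)<[^>]*>", "").trim();
        return 100.0 * text.length() / html.length();
    }

    public static void main(String[] args) {
        String page = "<html><body><p>Hello</p></body></html>";
        double ratio = ratioPercent(page);
        System.out.printf("TEXT / HTML ratio: %.1f%%%n", ratio);
        if (ratio < 10.0) {
            System.out.println("Too little text compared to HTML code");
        }
    }
}
```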

Headers
H1
H2
H3
H4
H5
H6
Headers are a very important factor in on-page optimization.
A correct document should contain the most important header (H1) and second-level headers (H2).
Remember to include an important keyword in the header.


Bolds
 Bold type (the <b> and <strong> tags) indicates the important keywords on the page.
Search engines take into account emphasized keywords in the text.





Images
 Images enrich the content of the website.
A page without images may be taken as a sign of spam or of low quality.
Publish interesting pictures, photos or diagrams related to the content of the page.
Describe each picture accurately using the ALT attribute.

Frames
 Frames can cause problems for search engines and make your website slow.
Avoid using frames on websites.

Internal and External Links
External Links
Outbound links (external links) are all links leading outside your website.
It is worth linking to other sites that contain high quality content related to the theme of your site.
Do not link to sites that you do not trust, in particular spam pages.
Limit the number of outbound links to a maximum of 15 links.
Internal Links
Internal links are any links pointing to the other pages of your website.
By placing links to other pages you are reinforcing internal linking and strengthening the usability of your website.
Always link to the most important pages of your website.

Additional Files
Robots.txt
The robots.txt file instructs search engine robots by allowing or restricting access to selected folders and pages of your website.
The robots.txt file has a huge impact on the correct page indexing in search engine results.
Make sure your website has a valid robots.txt file.

Sitemap.xml
The sitemap.xml file should contain a list of all pages of your website that need to be indexed by search engines.
The sitemap.xml file informs the search engine about the relevant pages on your site, their latest update and their significance in the context of the entire site.
Make sure that sitemap.xml is correct.

Social Media Signals
Facebook Likes

The number of gained "Likes" can have a positive impact on the position of your site in search results.
Install the Facebook Like button on your website and encourage your users to like your site.
Facebook Shares

By sharing your site with friends on Facebook you have a chance to gain new visitors. A large number of "shares" may have a positive impact on the position of your site in search engines.
Google +1

The number of Google +1s gained has a positive impact on the position of your site in search results.
Install the Google +1 button on your website and encourage your users to click on +1.

HTTP Headers
HTTP headers are information returned by the server that is not visible on the website.
An SEO audit can be performed only on sites that return an HTTP 200 OK status, which indicates that the page is functioning properly.
Any other HTTP status code prevents the page from being positioned.

Domain and Server
IP Address
IP Address has no influence on website positioning.
Hundreds of different websites can run on one IP address.
Avoid sharing IP address with spam websites.
Name Servers
DNS servers have no influence on website positioning.
Like IP address, many websites can use the same DNS servers.
Avoid sharing DNS names with spam websites.
Server Geolocation
 The geographical location of the server has no direct influence on website positioning.
Use hosting companies located in the regions where you do business.







'javac' is not recognized as an internal or external command, operable program or batch file.


This error can occur when the JDK bin directory is not included in the Path system variable in the environment variables.




To set this, find out where Java is installed on your system.

Let's say you have Java 7 installed on your system; copy the bin directory path as below:

\jdk1.7.0\bin

Now access the Path variable in the environment variables from:

Control Panel\System and Security\System

then open Advanced system settings and click Environment Variables.



Set JAVA_HOME to the JDK folder (not its bin folder), since other tools expect it to point there:
\jdk1.7.0

then set the Path variable value as: %JAVA_HOME%\bin;(current value in path)

OR

set Path directly as
\jdk1.7.0\bin;(current value in path)


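If it still does not work and throws the same error, open a new command prompt (windows that were already open do not pick up environment variable changes)

and check the path with the path command.

Now see if \jdk1.7.0\bin is there in the output.

If it is not there, then set the path from the command line using the command below. There must be no space before the = sign, and the change applies only to the current command prompt window.

set path=%path%;\jdk1.7.0\bin

Now check the value of the path variable again; \jdk1.7.0\bin should be there.

Try compiling the program again; it should work now.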












Builder Design Pattern

What is the builder design pattern?

It is used to separate the logic for creating complex objects from the code that uses them.

For example

Suppose we want to create an object of a class representing a real estate residential project. We need to take a lot of factors into account to build the full-fledged object. The object will consist of features like
payment plan
layout
construction plan
builder information
land details
finance details
location details
salient features

and so on.




We would not want to embed the logic for creating this instance in the actual business logic and clutter the business logic flow unnecessarily. Instead, it is better to have a dedicated service that builds this object and returns it to the business logic once it is ready. The business logic then stays unaware of all the complexities of object creation.

So how do we achieve that in an object-oriented language?


Let us try to understand this with code. As usual, I have written a lot of System.out.println statements in the code so that the steps of the execution flow show up as print statements. This code can be copied and executed directly. All steps of the design pattern will be clearly written on the console.


-----------------------------------------------------------------------------------------------

// This is the Client class, which places the orders. The client first places an order for a
// commercial project and, after its successful delivery, places an order for a residential project.

package realEstate;

public class Client {

    public static void main(String[] args) {

        ProjectOwner owner = new ProjectOwner(new CommercialProjectBuilder());
        owner.placeOrder();
        owner.getProject();
        System.out.println("CLIENT :::: Thank you for timely delivery of commercial project");
        System.out.println("===============================================================");
        System.out.println("CLIENT :::: Now let's deal in residential");
        owner = new ProjectOwner(new ResidentialProjectBuilder());
        owner.placeOrder();
        owner.getProject();
        System.out.println("CLIENT :::: Thank you for timely delivery of Residential project.. Rocking performance");
    }

}

// This is the project owner. The client passes in the type of project it is looking for
// (commercial or residential) when creating the ProjectOwner instance. The owner then places the
// construction order with the commercial or residential department, depending on which object
// the client passed.
class ProjectOwner {
    ProjectBuilding building;

    ProjectOwner(ProjectBuilding building) {
        this.building = building;
    }

    // Runs the construction steps in a fixed order
    void placeOrder() {
        building.constructBase();
        building.constructFloors();
        building.doFinishing();
        building.decorate();
    }

    ProjectBuilding getProject() {
        return building;
    }
}

// Interface for the residential and commercial project builder classes
interface ProjectBuilding {

    void constructBase();

    void constructFloors();

    void doFinishing();

    void decorate();
}

// The entire process and logic of building a residential project is encapsulated in this class
class ResidentialProjectBuilder implements ProjectBuilding {

    ResidentialProjectBuilder() {
        System.out.println("ResidentialProjectBuilder:::Thank you for reaching us..We deal in Residential Projects..");
    }

    public void constructBase() {
        System.out.println("ResidentialProjectBuilder:::Construction is already started.. Promise to deliver on time ");
    }

    public void constructFloors() {
        System.out.println("ResidentialProjectBuilder::::Construction is on full Swing.. Pay installments timely ");
    }

    public void doFinishing() {
        System.out.println("ResidentialProjectBuilder::::About to deliver .. Have a little more Patience ");
    }

    public void decorate() {
        System.out.println("ResidentialProjectBuilder:::It is well decorated.. Ready to move");
    }
}

// The entire process and logic of building a commercial project is encapsulated in this class
class CommercialProjectBuilder implements ProjectBuilding {

    CommercialProjectBuilder() {
        System.out.println("CommercialProjectBuilder ::: Thank you for reaching us..We deal in Commercial Projects..");
    }

    public void constructBase() {
        System.out.println("CommercialProjectBuilder :::Construction is already started.. Promise to deliver on time ..");
    }

    public void constructFloors() {
        System.out.println("CommercialProjectBuilder :::Construction is on full Swing.. Pay installments timely ");
    }

    public void doFinishing() {
        System.out.println("CommercialProjectBuilder :::About to deliver .. Have a little more Patience ");
    }

    public void decorate() {
        System.out.println("CommercialProjectBuilder :::It is well decorated.. Ready to move");
    }
}


Below is the console output on program execution
-------------------------------------------------------------------------------------------

CommercialProjectBuilder ::: Thank you for reaching us..We deal in Commercial Projects..
CommercialProjectBuilder :::Construction is already started.. Promise to deliver on time ..
CommercialProjectBuilder :::Construction is on full Swing.. Pay installments timely
CommercialProjectBuilder :::About to deliver .. Have a little more Patience
CommercialProjectBuilder :::It is well decorated.. Ready to move
CLIENT :::: Thank you for timely delivery of commercial project
===============================================================
CLIENT :::: Now let's deal in residential
ResidentialProjectBuilder:::Thank you for reaching us..We deal in Residential Projects..
ResidentialProjectBuilder:::Construction is already started.. Promise to deliver on time
ResidentialProjectBuilder::::Construction is on full Swing.. Pay installments timely
ResidentialProjectBuilder::::About to deliver .. Have a little more Patience
ResidentialProjectBuilder:::It is well decorated.. Ready to move
CLIENT :::: Thank you for timely delivery of Residential project.. Rocking performance
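
The classes above run the construction steps through a common interface, but they never hand back a finished object. As a complementary sketch, here is a builder that assembles and returns a product. The ResidentialProject class, its three fields (taken from the feature list at the start of this post) and the sample values are assumptions made only for this illustration.

```java
class ResidentialProject {

    private final String location;
    private final String layout;
    private final String paymentPlan;

    // Only the builder can create the object, so it is always fully assembled
    private ResidentialProject(Builder builder) {
        this.location = builder.location;
        this.layout = builder.layout;
        this.paymentPlan = builder.paymentPlan;
    }

    @Override
    public String toString() {
        return "ResidentialProject[location=" + location
                + ", layout=" + layout
                + ", paymentPlan=" + paymentPlan + "]";
    }

    // Hypothetical builder: collects the details step by step
    static class Builder {
        private String location = "unspecified";
        private String layout = "unspecified";
        private String paymentPlan = "unspecified";

        Builder location(String value) {
            this.location = value;
            return this;
        }

        Builder layout(String value) {
            this.layout = value;
            return this;
        }

        Builder paymentPlan(String value) {
            this.paymentPlan = value;
            return this;
        }

        ResidentialProject build() {
            return new ResidentialProject(this);
        }
    }

    public static void main(String[] args) {
        ResidentialProject project = new ResidentialProject.Builder()
                .location("Sector 10")
                .layout("3 BHK")
                .paymentPlan("construction linked")
                .build();
        System.out.println(project);
    }
}
```

The business logic only calls the builder methods and build(); how the object is put together stays inside the builder.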

 

