Wednesday, July 10

Effective Java: Item 74: Implement Serializable judiciously


Enabling default serialization takes just two words: add implements Serializable to the class declaration.

But this simple change comes at a significant cost.

What is that cost?

1. Once you make a class serializable, you lose much of the flexibility to change it.

What does that mean?

Once an instance of a serializable class has been serialized, changing the class can make it impossible to deserialize that instance from the stored stream. If the class does not declare a serial version ID, the JVM generates one from the structure of the class and writes it into the serialized stream, so that at deserialization time it can check whether the class it has loaded is compatible with the one that was serialized. Changing the class changes the automatically generated ID, so the ID stored in the stream no longer matches the ID of the compiled class, and the mismatch causes an InvalidClassException.

So you need to be judicious in deciding whether implementing Serializable is right for a class, because any later change to that class can cause problems in reading previously persisted streams back into the JVM.
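
As a rough illustration of the version ID described above, the sketch below asks the JDK which serial version ID it associates with a class. ObjectStreamClass.lookup and getSerialVersionUID are standard java.io APIs; the class names AutoVersioned, PinnedVersion and SerialVersionDemo are made up for this example and are not from the book.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

// Hypothetical class with no declared ID: the runtime computes one from the class structure.
class AutoVersioned implements Serializable {
    int value;
}

// Hypothetical class that declares its ID explicitly, so the ID does not depend on the class structure.
class PinnedVersion implements Serializable {
    private static final long serialVersionUID = 1L;
    int value;
}

class SerialVersionDemo {
    // Returns the serial version ID that serialization associates with the given class.
    static long uidOf(Class<?> type) {
        return ObjectStreamClass.lookup(type).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println("computed ID: " + uidOf(AutoVersioned.class));
        System.out.println("declared ID: " + uidOf(PinnedVersion.class));
    }
}
```

Adding a field to AutoVersioned and recompiling would normally change its computed ID, which is exactly the mismatch described above. PinnedVersion keeps reporting the ID it declares.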

2. Testing any change to the class becomes painstaking.

You never know how many instances of the class have already been serialized, so every change has to stay backward compatible and every previously serialized instance must still deserialize correctly. That is never easy, because you may have to test against a lot of data depending on how widely the class is used.

So using default serialization has to be a very conscious decision; otherwise a lot of pain awaits you.

3. Security issues

An attacker can tamper with the serialized data stream and alter the state of the object, so that deserialization creates an instance with a different state from the one that was serialized. This can destabilize the system, and a carefully crafted stream can give the attacker a way into it. How serious the concern is depends on the sensitivity and criticality of the persisted data stream.

4. A class that is designed for inheritance and does not itself implement Serializable must have an accessible no-argument constructor; otherwise no subclass will be able to implement Serializable. So when you decide whether to make a class serializable, you also need to consider the structure of the classes in its inheritance hierarchy.
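
To make the constructor requirement concrete, here is a minimal sketch. The classes Shape, Circle and InheritanceDemo are hypothetical names chosen for illustration. Shape is not serializable, so its state is not written to the stream; when a Circle is deserialized, Shape's no-argument constructor runs to initialize the inherited part.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical non-serializable base class designed for inheritance.
class Shape {
    protected String label;

    // Accessible no-argument constructor: deserializing a subclass runs this.
    protected Shape() {
        this.label = "unlabeled";
    }

    protected Shape(String label) {
        this.label = label;
    }
}

// Serializable subclass; only its own fields are written to the stream.
class Circle extends Shape implements Serializable {
    private static final long serialVersionUID = 1L;
    final int radius;

    Circle(String label, int radius) {
        super(label);
        this.radius = radius;
    }
}

class InheritanceDemo {
    // Serializes the circle to a byte array and reads it back.
    static Circle roundTrip(Circle original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Circle) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Circle copy = roundTrip(new Circle("big", 5));
        System.out.println(copy.radius + " " + copy.label);
    }
}
```

After the round trip the radius is preserved, but the label comes from Shape's no-argument constructor rather than from the original object. If Shape had no accessible no-argument constructor, deserializing a Circle would fail with an InvalidClassException.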

These are some of the reasons why the decision to implement default serialization should be made judiciously. Custom serialization gives you several ways to address many of these concerns. It is discussed in detail in Items 75, 76, 77, and 78.

Please comment or ask a question to discuss it further.



Tuesday, July 9

Effective Java: Item 75: Consider using a custom serialized form






Rather than accepting the default serialized form without thought, consider customizing the serialization and deserialization process.

This helps us control the overall process of serialization and deserialization. The default process can lead to broken data invariants and security breaches.

For example, suppose you serialize an instance of a class that has non-transient data, and at deserialization time you want to perform some validation before the object is handed back to the program. The default deserialization mechanism does not give you that opportunity.

So, to get better control and to make serialization and deserialization more flexible and meaningful, it is a good idea to customize the process by providing private readObject() and writeObject() methods in the class.

If your custom serialized form is identical to the default serialized form, you may decide to use the default serialized form.

Even in this case you should implement the writeObject() and readObject() methods and call defaultWriteObject() and defaultReadObject() respectively from within them.
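
A minimal sketch of that shape, assuming a made-up Temperature class whose only invariant is a non-negative kelvin value: writeObject() and readObject() delegate to the default mechanism, and readObject() then adds the validation that default deserialization would skip. The class, field and helper names are illustrative only.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical value class with one invariant: kelvin must not be negative.
class Temperature implements Serializable {
    private static final long serialVersionUID = 1L;
    private double kelvin;

    Temperature(double kelvin) {
        if (kelvin < 0) {
            throw new IllegalArgumentException("kelvin must not be negative");
        }
        this.kelvin = kelvin;
    }

    double kelvin() {
        return kelvin;
    }

    // Same bytes as the default form, but now under our control.
    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();
    }

    // Reads the default form, then re-checks the constructor's invariant.
    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        if (kelvin < 0) {
            throw new InvalidObjectException("kelvin must not be negative");
        }
    }
}

class CustomFormDemo {
    // Serializes the value to a byte array and reads it back.
    static Temperature roundTrip(Temperature original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Temperature) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(new Temperature(300.0)).kelvin());
    }
}
```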





Please comment or ask a question to discuss it further.

Item 76: Write readObject methods defensively





When does the readObject() method come into the picture?

The method is invoked when a serialized object is deserialized.

Now, why should the readObject() method be written defensively?

Let's understand how deserialization works. That will help us understand why.

During deserialization, the serialized data stream of a Java instance is unmarshalled back into a Java object and brought back into the JVM. The readObject() method is invoked as part of creating this object. But what guarantees that after deserialization we get logically the same instance that was persisted at serialization time? The data stream can be tampered with. An attacker might modify the stream so that the object has a different state after deserialization from the one it had when it was serialized.

Let's take an example:

import java.io.Serializable;
import java.util.Date;

public final class Product implements Serializable {
    final Date date = new Date();
}

If you serialize an instance of this class, the object is persisted in a data stream that holds the date field with the date at which the instance was created. Suppose an attacker tampers with the stream and modifies that value, so the deserialized instance has a different date. Now suppose the program calls a delete method on the deserialized object to remove all entries prior to that date. The program was supposed to remove the records prior to the date stored in the Product instance at serialization time, but because the attacker has tampered with the value, a different set of records gets deleted.

This is just a simple example. An attacker can do endless malicious things with a data stream, so we need to safeguard against that.

How can we do that?

The readObject() method comes to our rescue here. This method is invoked to deserialize the instance. In it we can validate the data of the deserialized instance and perform whatever checks and operations are needed to create an instance with a valid set of data. The mutable component of the otherwise immutable Product class, the date field, is what the attacker tampers with, so readObject() should defensively copy it and validate it, and either reject an invalid value or set it back to a valid one.

So we can apply the required checks to the mutable data of the instance being deserialized. We can also throw an exception instead of letting the instance be created if we find that the data in the stream is not valid.

So the readObject() method should defensively copy and check the mutable fields of the instance being deserialized, and so prevent hazardous attacks on the program.

The reason readObject() should be written defensively is to ensure the security and validity of the instance being deserialized.

Remember: if the instance being deserialized has any inconsistency or violates an invariant, readObject() gives you the opportunity to remove the aberration and create a proper, valid instance.
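
A defensive readObject() along these lines might look like the sketch below. It uses a hypothetical DatedProduct class rather than the Product class above, because the date field has to be non-final for readObject() to replace it with a copy. The invariant it checks (the date must not lie in the future) is an assumption made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Date;

// Hypothetical class; assumed invariant: date is present and not in the future.
class DatedProduct implements Serializable {
    private static final long serialVersionUID = 1L;
    private Date date; // not final, so readObject() can swap in a defensive copy

    DatedProduct(Date date) {
        this.date = new Date(date.getTime());
        if (this.date.after(new Date())) {
            throw new IllegalArgumentException("date is in the future");
        }
    }

    // Hands out a copy so callers cannot mutate the internal Date.
    Date date() {
        return new Date(date.getTime());
    }

    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        if (date == null) {
            throw new InvalidObjectException("missing date");
        }
        // Copy first, so no reference planted in the stream can mutate the field later.
        date = new Date(date.getTime());
        // Then enforce the same invariant as the constructor.
        if (date.after(new Date())) {
            throw new InvalidObjectException("date is in the future");
        }
    }
}

class DefensiveReadDemo {
    // Serializes the product to a byte array and reads it back.
    static DatedProduct roundTrip(DatedProduct original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (DatedProduct) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(new DatedProduct(new Date(0L))).date());
    }
}
```

The copy is made before validation so that the check runs on the object the instance will actually keep.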

Please share your comments or questions to discuss it further.

Effective Java: Item 77: For instance control, prefer enum types to readResolve





What is instance control?

Instance control here basically refers to restricting a class to a single instance, that is, the singleton design pattern.

From Java 1.5 onwards we should prefer an enum to create a singleton instance. It is safe, and the JVM guarantees it. Earlier mechanisms for restricting a class to a single instance can all be broken, for example through serialization.

So in any interview you can confidently say that an enum provides the safest singleton implementation.
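
As a small sketch of that guarantee, the example below serializes an enum constant and reads it back. The names PresidentEnum and EnumSingletonDemo are made up for illustration. Enum constants are serialized by name, so deserialization resolves to the existing constant instead of building a new object.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

// Hypothetical enum-based singleton.
enum PresidentEnum {
    INSTANCE;

    String title() {
        return "President";
    }
}

class EnumSingletonDemo {
    // Serializes any object to a byte array and reads it back.
    static Object roundTrip(Object original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Object copy = roundTrip(PresidentEnum.INSTANCE);
        // Identity comparison: the deserialized value is the very same constant.
        System.out.println(copy == PresidentEnum.INSTANCE);
    }
}
```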

Now what is readResolve?

readResolve() is a method that a serializable class can declare. It is invoked when a serialized object is deserialized, and through it you can control which instance is returned from deserialization. Let's try to understand it through some code.

public class President {
    private static final President singlePresident = new President();

    private President() {
    }
}

This class creates a single instance of President. singlePresident is a static instance, and the same instance is used everywhere.

But can you see where it breaks?

Make this class implement Serializable and our strategy of restricting it to a single instance immediately fails. At deserialization time a brand new instance is created, so we are no longer restricted to a single instance.

Is there a way to control that? Let's try.

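import java.io.Serializable;

public class President implements Serializable {
    private static final President singlePresident = new President();

    private President() {
    }

    // Called by deserialization; its return value replaces the freshly read object.
    private Object readResolve() {
        return singlePresident;
    }
}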

What did we do?
We controlled the deserialization mechanism through the readResolve() method. We return the instance that was created when the President class was first initialized, rather than the freshly deserialized one. So the instance remains the same.
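
The difference can be observed with a round trip through a byte array. The sketch below uses two hypothetical classes, NaivePresident (no readResolve) and ResolvedPresident (with readResolve), so that both behaviours can be compared side by side.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical singleton without readResolve: deserialization yields a second instance.
class NaivePresident implements Serializable {
    private static final long serialVersionUID = 1L;
    static final NaivePresident INSTANCE = new NaivePresident();

    private NaivePresident() {
    }
}

// Hypothetical singleton with readResolve: deserialization yields INSTANCE again.
class ResolvedPresident implements Serializable {
    private static final long serialVersionUID = 1L;
    static final ResolvedPresident INSTANCE = new ResolvedPresident();

    private ResolvedPresident() {
    }

    private Object readResolve() {
        return INSTANCE;
    }
}

class ReadResolveDemo {
    // Serializes any object to a byte array and reads it back.
    static Object roundTrip(Object original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(NaivePresident.INSTANCE) == NaivePresident.INSTANCE);
        System.out.println(roundTrip(ResolvedPresident.INSTANCE) == ResolvedPresident.INSTANCE);
    }
}
```

The first comparison is false and the second is true: only the class with readResolve() hands back the original instance.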

So can we say that readResolve() is sufficient to control instance creation?

Let's take a deeper look:


We need to understand that before readResolve() returns singlePresident, an instance of President is created by the normal deserialization process. readResolve() does not let that instance escape; it substitutes singlePresident for it, and the deserialized instance becomes eligible for garbage collection almost immediately. Now, how can an attacker misuse this intermediate step?

The attacker takes the serialized stream and creates a Dummy class with a private instance field that refers to the President instance being deserialized.

OK, let's get into the code:

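import java.io.Serializable;

public class Dummy implements Serializable {
    private static President impersonator; // keeps the stolen instance reachable
    private President instance;            // refers back to the President being deserialized

    private Object readResolve() {
        // Capture the President before its own readResolve() replaces it.
        impersonator = instance;
        // A real attack would return a value of the type of the field Dummy stands in for.
        return this;
    }
}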

What is happening here? The attacker takes the stream that deserializes a President instance and injects an instance of the Dummy class into it. This requires President to have a non-transient object reference field into which the Dummy instance can be injected. The Dummy instance in turn holds a reference to the President instance, so we have a circularity. When deserialization happens, the Dummy instance sits inside the President instance being deserialized, so the readResolve() method of Dummy is invoked first.

What does this method do in the Dummy class, as seen in the code above?

It assigns the instance being deserialized to the static impersonator variable. So even after deserialization is complete, the Dummy class has cleverly retained the deserialized instance in the static field impersonator.

So what actually happened here?

Two instances of the President class are now available: the attacker's impersonator, and the original singlePresident returned by the readResolve() method of President. The attacker has successfully broken our single-instance control.



That's the reason Effective Java says: for instance control, prefer enum types to readResolve. An enum is the safest singleton implementation.




Please comment with any questions or to discuss it further.





Monday, July 8

Effective Java: Item 78: Serialization proxy pattern


Serialization Proxy Pattern:

Need for the pattern: In the Java serialization mechanism, it is the actual object of the class that gets serialized. This increases the security risk. An attacker can construct a real object from the serialized state and invoke methods on it.

Do we want our object exposed so openly? Obviously not. Wouldn't it be better if a proxy were serialized instead of the real object? The proxy represents the state of the real object that was going to be serialized. Through the proxy an attacker cannot get at the real object directly, and the object's safety is properly enforced.

Now, how does all that happen technically in our actual code?

Let's say we have a class Student that we have traditionally been serializing. The Student class has two fields, name and className (class is a reserved word in Java, so it cannot be used as a field name).

import java.io.Serializable;

public class Student implements Serializable {
    private String name;
    private String className;

    Student() {
    }

    Student(String name, String className) {
        this.name = name;
        this.className = className;
    }
}

So if you serialize an object of this class, the actual object will be serialized and exposed.

Let's try to create a proxy for this object.

To create the proxy, let's add a private static nested class to the Student class. Let the name of the class be StudentProxy.

The StudentProxy class has a single-argument constructor. This constructor only copies the data from the actual object to the proxy object.

private static class StudentProxy implements Serializable {
    private String name;
    private String className; // "class" is a reserved word, so the field is named className

    StudentProxy(Student student) {
        this.name = student.name;
        this.className = student.className;
    }
}

Note that the StudentProxy class needs to hold the same logical state as the original. So it declares all the fields of the Student class that were meant to be serialized.

Now, how is the StudentProxy object created?

StudentProxy is a private class inside the Student class, so we need to provide a method that creates the proxy. In the Student class we write a method named writeReplace().


// The name must be exactly writeReplace for the serialization mechanism to find it.
private Object writeReplace() {
    return new StudentProxy(this);
}

This method simply invokes the constructor, so a StudentProxy is serialized instead of the Student itself.


Now we are serializing the proxy, but an attacker might still fabricate a stream that contains an actual Student object and so succeed in attacking the code. How do we guard against that?

We simply add a readObject() method to the Student class, which is invoked if an attacker tries to reconstruct a Student directly from a stream. In readObject() we refuse to create the object.

private void readObject(ObjectInputStream stream) throws InvalidObjectException {
    throw new InvalidObjectException("Go via proxy");
}

So the attacker has no way to attack the object directly, and if they try to attack the proxy they will fail to reach the real object, because the proxy class is private and not publicly accessible. So we have safeguarded our serialization and deserialization mechanism.

Finally, in our StudentProxy class we need to provide a readResolve() method. This method creates the object at deserialization time using the Student constructor. So even if the serialized data has been tampered with, it cannot bypass that constructor: the Student is created only through it, and nothing extra is added to the deserialized object.

private Object readResolve() {
    return new Student(name, className);
}

That's how we complete the proxy pattern implementation.
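
Put together, the pieces might look like the following sketch. It is one possible assembly of the fragments above, not a definitive implementation: the field is called className because class is a reserved word, and the accessors and the ProxyDemo helper are additions made only so the round trip can be observed.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class Student implements Serializable {
    private final String name;
    private final String className;

    Student(String name, String className) {
        this.name = name;
        this.className = className;
    }

    String name() {
        return name;
    }

    String className() {
        return className;
    }

    // Serialization writes the proxy instead of this object.
    private Object writeReplace() {
        return new StudentProxy(this);
    }

    // A stream that claims to contain a raw Student is rejected.
    private void readObject(ObjectInputStream stream) throws InvalidObjectException {
        throw new InvalidObjectException("Go via proxy");
    }

    private static class StudentProxy implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String name;
        private final String className;

        StudentProxy(Student student) {
            this.name = student.name;
            this.className = student.className;
        }

        // Deserialization rebuilds the Student through its normal constructor.
        private Object readResolve() {
            return new Student(name, className);
        }
    }
}

class ProxyDemo {
    // Serializes the student to a byte array and reads it back.
    static Student roundTrip(Student original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Student) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Student copy = roundTrip(new Student("Asha", "10-B"));
        System.out.println(copy.name() + " " + copy.className());
    }
}
```

Because the Student is always rebuilt through its constructor, any validation placed in that constructor also applies to deserialized instances.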

Keep in mind that this proxy-based security does not come free: it adds some performance overhead. So one needs to be judicious and use the proxy pattern only when it is really required.

Please share your comments or questions to discuss it further.