Saturday, June 28, 2014

Printing "CAFEBABE": The Beginning Hex Characters in All Java .class Files

So I came across something interesting. Every Java .class file begins with the four bytes CA FE BA BE, which read as "CAFEBABE" in hex.

Code for a test (listed after the notes below):

Some points to note:

1. If you're wondering what ba[i] & 0xff does:
   I.   ba[i] is a byte, which in Java is signed (-128 to 127).
   II.  0xFF is an int literal written in hex (base 16). As a 32-bit value it is 0x000000FF.
   III. Before & is applied, ba[i] is promoted to an int with sign extension, so the byte 0xCA becomes 0xFFFFFFCA.
 
Suppose we have
  0x001101FF and we AND it with
  0x000000FF. Only the lowest 8 bits survive, so we get

  0x000000FF

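If we had not ANDed with 0xFF,
we would have gotten the output:

    FFFFFFCAFFFFFFFEFFFFFFBAFFFFFFBE

ANDing each promoted value with 0x000000FF:

    FFFFFFCA FFFFFFFE FFFFFFBA FFFFFFBE   ANDed with
    000000FF 000000FF 000000FF 000000FF   gives
    000000CA 000000FE 000000BA 000000BE

[Note that (Hex) F = 1111 (Binary), so any hex digit ANDed with F is unchanged, and any value between 0 and 255 ANDed with 0xFF is unchanged.]

Integer.toHexString does not print leading zeros, so you get the output:

CAFEBABE

(For the same reason a byte smaller than 0x10 would print as a single digit. None of the four magic bytes is that small, so it does not matter here.)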
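
To see the effect of the mask in isolation, here is a small sketch. The class name MaskDemo and its helper methods are made up for this illustration and are not part of the test program:

```java
class MaskDemo {
    // Hex of the byte after plain widening to int (sign-extended, no mask).
    static String unmasked(byte b) {
        return Integer.toHexString(b).toUpperCase();
    }

    // Hex of the byte after keeping only the low 8 bits.
    static String masked(byte b) {
        return Integer.toHexString(b & 0xFF).toUpperCase();
    }

    // Always two hex digits, padded with a leading zero when needed.
    static String padded(byte b) {
        return String.format("%02X", b & 0xFF);
    }

    public static void main(String[] args) {
        byte b = (byte) 0xCA;
        System.out.println(unmasked(b));            // FFFFFFCA
        System.out.println(masked(b));              // CA
        System.out.println(masked((byte) 0x0A));    // A
        System.out.println(padded((byte) 0x0A));    // 0A
    }
}
```

The padded variant is what you would probably want when dumping arbitrary bytes, since it keeps every byte at two characters.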

import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

public class Ex19 {

    public static void main(String[] args) throws IOException {
        // Path to any compiled .class file; adjust it for your machine.
        byte[] ba = read(new File("e:\\directory.class"));
        for (int i = 0; i < 4; ++i) {
            // Mask to the low 8 bits so negative bytes are not sign-extended.
            System.out.print(Integer.toHexString(ba[i] & 0xff).toUpperCase());
        }
        System.out.println();
    }

    // Reads the whole file into memory.
    // available() is only an estimate and a single read() call may return
    // fewer bytes than requested, so Files.readAllBytes is used instead.
    public static byte[] read(File bFile) throws IOException {
        return Files.readAllBytes(bFile.toPath());
    }
}
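
If the goal is to test for the magic number rather than print it, the four bytes can also be compared as a single int. This is only a sketch under my own naming (MagicCheck, CLASS_MAGIC and hasClassMagic are hypothetical), and it works on an in-memory byte array so it runs without any file:

```java
import java.nio.ByteBuffer;

class MagicCheck {
    // The four magic bytes read as one big-endian 32-bit value.
    static final int CLASS_MAGIC = 0xCAFEBABE;

    // True if the array starts with CA FE BA BE.
    // ByteBuffer is big-endian by default, which matches the byte order in the file.
    static boolean hasClassMagic(byte[] bytes) {
        return bytes.length >= 4 && ByteBuffer.wrap(bytes).getInt() == CLASS_MAGIC;
    }

    public static void main(String[] args) {
        byte[] classLike = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0};
        byte[] other = {0x50, 0x4B, 0x03, 0x04};
        System.out.println(hasClassMagic(classLike)); // true
        System.out.println(hasClassMagic(other));     // false
    }
}
```

No masking is needed here because the comparison is done on the full int, sign bit included.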

If you are wondering why "CAFEBABE" shows up:

James Gosling, the father of the Java programming language, once explained it as follows:
As far as I know, I'm the guilty party on this one. I was totally unaware of the NeXT connection. The small number of interesting HEX words is probably the source of the match. As for the derivation of the use of CAFEBABE in Java, it's somewhat circuitous:
We used to go to lunch at a place called St Michael's Alley. According to local legend, in the deep dark past, the Grateful Dead used to perform there before they made it big. It was a pretty funky place that was definitely a Grateful Dead Kinda Place. When Jerry died, they even put up a little Buddhist-esque shrine. When we used to go there, we referred to the place as Cafe Dead.
Somewhere along the line it was noticed that this was a HEX number. I was re-vamping some file format code and needed a couple of magic numbers: one for the persistent object file, and one for classes. I used CAFEDEAD for the object file format, and in grepping for 4 character hex words that fit after CAFE (it seemed to be a good theme) I hit on BABE and decided to use it.
At that time, it didn't seem terribly important or destined to go anywhere but the trash-can of history. So CAFEBABE became the class file format, and CAFEDEAD was the persistent object format. But the persistent object facility went away, and along with it went the use of CAFEDEAD - it was eventually replaced by RMI.
