Parsing, Tokenizing and Formatting


This section reviews the approaches for finding, tokenizing, and formatting data.
Let's see the first example:

As you can see, the method compile from the class Pattern takes the pattern that you want to search for, and the method matcher of the resulting Pattern object takes the source to search in and returns a Matcher.


  a b a b a b a
  0 1 2 3 4 5 6

Why didn't the little program shown above print 0 2 4?
The reason is that the regex engine does not consider index 2 as the start of a match, because that character was already consumed by the first match and cannot be reused. There are exceptions to this rule, and they will be shown shortly.

  a b a
  0 1 2
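
The behaviour described above can be reproduced with a minimal sketch. The class name FindDemo and the helper matchStarts are made up for this illustration; only Pattern.compile(), matcher(), find() and start() come from java.util.regex.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FindDemo {
    // Returns the start index of every match of regex found in source.
    static List<Integer> matchStarts(String regex, String source) {
        Matcher m = Pattern.compile(regex).matcher(source);
        List<Integer> starts = new ArrayList<>();
        while (m.find()) {          // each find() resumes after the previous match
            starts.add(m.start());
        }
        return starts;
    }

    public static void main(String[] args) {
        // The match at 0 consumes indexes 0-2, so the next search starts at 3.
        System.out.println(matchStarts("aba", "abababa")); // [0, 4]
    }
}
```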

Using Metacharacters

\d A digit
\s A whitespace character
\w A word character (letters, digits, or "_" (underscore))
 . Any character

Matcher ==> 0
5 7 16
Matcher ==> 1
6 14
Matcher ==> 2
0 1 2 3 4 5 7 8 9 10 11 12 13 15 16
Matcher ==> 3
0 1 2
Matcher ==> 4
0 1 2 3 4 8 13
Matcher ==> 5
0 1 2 3 4 8 10 13 15
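
As a small sketch of how the metacharacters behave, the following program prints the start index of each match. The source string "a 1 56 _Z" and the names MetaDemo and starts are assumptions chosen for this illustration; they are not the source used for the output above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MetaDemo {
    // Collects the start index of every match of regex in source.
    static List<Integer> starts(String regex, String source) {
        Matcher m = Pattern.compile(regex).matcher(source);
        List<Integer> result = new ArrayList<>();
        while (m.find()) {
            result.add(m.start());
        }
        return result;
    }

    public static void main(String[] args) {
        String source = "a 1 56 _Z";
        // In a Java string literal the backslash itself must be escaped.
        System.out.println(starts("\\d", source)); // [2, 4, 5]
        System.out.println(starts("\\s", source)); // [1, 3, 6]
        System.out.println(starts("\\w", source)); // [0, 2, 4, 5, 7, 8]
    }
}
```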

Using Quantifiers

+ One or more occurrences
* Zero or more occurrences
? Zero or one occurrence
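
To see quantifiers at work, here is a minimal sketch that reports each match as "start:text". The patterns, the sources and the names QuantifierDemo and findAll are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuantifierDemo {
    // Returns every match formatted as "startIndex:matchedText".
    static List<String> findAll(String regex, String source) {
        Matcher m = Pattern.compile(regex).matcher(source);
        List<String> result = new ArrayList<>();
        while (m.find()) {
            result.add(m.start() + ":" + m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        // \d+ : one or more digits in a row form a single match
        System.out.println(findAll("\\d+", "1 a12 234b")); // [0:1, 3:12, 6:234]
        // ab? : an 'a' followed by zero or one 'b'
        System.out.println(findAll("ab?", "a ab abb"));    // [0:a, 2:ab, 5:ab]
    }
}
```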

Greedy Quantifiers
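
A greedy quantifier (+, *, ?) reads as much of the source as it can and then backs off until the rest of the pattern matches; adding a ? after it (+?, *?, ??) makes it reluctant, so it stops at the first possible match. The sketch below compares both on the same source; the source string and the names GreedyDemo and groups are assumptions for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GreedyDemo {
    // Returns the text of every match of regex in source.
    static List<String> groups(String regex, String source) {
        Matcher m = Pattern.compile(regex).matcher(source);
        List<String> result = new ArrayList<>();
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        String source = "yyxxxyxx";
        // Greedy: .* grabs everything, then backs off just enough to match "xx".
        System.out.println(groups(".*xx", source));  // [yyxxxyxx]
        // Reluctant: .*? grows one character at a time until "xx" matches.
        System.out.println(groups(".*?xx", source)); // [yyxx, xyxx]
    }
}
```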


Tokenizing is the process of taking big pieces of source data, breaking them into
little pieces (tokens), and storing the little pieces in variables.

Tokenizing with Scanner
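
A minimal sketch of tokenizing with java.util.Scanner, assuming the default delimiter (whitespace). The input string and the names ScannerDemo and tokens are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class ScannerDemo {
    // Splits source on whitespace and labels each token with the type it was read as.
    static List<String> tokens(String source) {
        List<String> result = new ArrayList<>();
        try (Scanner scanner = new Scanner(source)) {
            while (scanner.hasNext()) {
                if (scanner.hasNextInt()) {
                    result.add("int:" + scanner.nextInt());
                } else if (scanner.hasNextBoolean()) {
                    result.add("boolean:" + scanner.nextBoolean());
                } else {
                    result.add("String:" + scanner.next());
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(tokens("1 true 34 hi"));
        // [int:1, boolean:true, int:34, String:hi]
    }
}
```

The hasNextXxx() methods only test the next token; it is the nextXxx() call that actually consumes it.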

Formatting with printf() and format()

Both methods behave in exactly the same way, which means that anything we say about one of them applies to the other.

Let's see how formatting works:

%[arg_index$][flags][width][.precision]conversion char

The values within [ ] are optional.

1. arg_index - An integer followed directly by a $; it indicates which argument should be printed in this position.

2. flags - While many flags are available, for the exam you'll need to know:
- "-" Left justify this argument
- "+" Include a sign (+ or -) with this argument
- "0" Pad this argument with zeroes
- "," Use locale-specific grouping separators (i.e., the comma in 123,456)
- "(" Enclose negative numbers in parentheses

3. width -  This value indicates the minimum number of characters to print. (If you
want nice even columns, you'll use this value extensively.)

4. precision - For the exam you'll only need this when formatting a floating-point
number, and in the case of floating point numbers, precision indicates the number of
digits to print after the decimal point.

5. conversion - The type of argument you'll be formatting. You'll need to know:
- b boolean
- c char
- d integer
- f floating point
- s string
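
The pieces of the format string described above can be tried with String.format(), which accepts the same format strings as printf() and format(). This is a minimal sketch: the class name FormatDemo and the helper fmt are made up, and Locale.US is passed explicitly only so that the grouping and decimal separators are predictable.

```java
import java.util.Locale;

public class FormatDemo {
    // Formats with a fixed locale so separators do not depend on the machine.
    static String fmt(String pattern, Object... args) {
        return String.format(Locale.US, pattern, args);
    }

    public static void main(String[] args) {
        System.out.println(fmt("%2$d + %1$d", 123, 456)); // 456 + 123   (arg_index)
        System.out.println(fmt("%07d", 123));              // 0000123     (flag 0, width 7)
        System.out.println(fmt("%+,d", 1234567));          // +1,234,567  (flags + and ,)
        System.out.println(fmt("%(d", -123));              // (123)       (flag "(")
        System.out.println(fmt("%-6s|", "ab"));            // ab    |     (flag -, width 6)
        System.out.println(fmt("%8.2f|", 3.14159));        //     3.14|   (width 8, precision 2)
        System.out.println(fmt("%b %c %s", true, 'x', "hi")); // true x hi
    }
}
```

If the conversion character does not fit the argument, for example %d with a floating-point value, an IllegalFormatConversionException is thrown at runtime.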

Working with Dates/Numbers and Currencies

In this section you'll learn the most common methods to format dates, numbers and currencies, as well as the relationship between the main classes responsible for that.

Let's start with the classes that you have to learn.



Wed Jul 18 10:18:42 AMT 2012
Wed Dec 31 19:16:40 AMT 1969
Wed Jul 18 10:18:42 AMT 2012
Sun Nov 21 04:58:20 AMT 2286

The class java.util.Date is hard to work with: as the preceding example shows, to set a specific date we have to use a numeric representation (a long holding the milliseconds since January 1, 1970 GMT). This class is normally used to create a date that represents "now" and to work together with java.util.Calendar.
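
A minimal sketch of the long-based API of java.util.Date. The class name DateDemo and the helper plusMillis are made up for this illustration, and the printed dates depend on the time zone of the machine, so no output is shown for them.

```java
import java.util.Date;

public class DateDemo {
    // Returns a new Date shifted by the given number of milliseconds.
    static Date plusMillis(Date date, long millis) {
        return new Date(date.getTime() + millis);
    }

    public static void main(String[] args) {
        Date now = new Date();             // the current date and time
        Date early = new Date(1_000_000L); // 1,000,000 ms after the epoch
        System.out.println(now);
        System.out.println(early);

        // Moving a date means doing arithmetic on the long value by hand.
        Date oneHourLater = plusMillis(early, 60L * 60L * 1000L);
        System.out.println(oneHourLater.getTime()); // 4600000
    }
}
```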


Sat Sep 15 00:00:00 AMT 2012
Today is Wed Jul 18 10:37:26 AMT 2012
Day of the week 4
Day of the year 200
Last day of the year 259
Remaining Days ==> 59
Fri Jul 18 10:37:26 AMT 2014
Sun May 18 10:37:26 AMT 2014

Watch out: the class Calendar is abstract, so you cannot create an instance of it with new (look at line 16); use Calendar.getInstance() instead. It also provides friendly methods to manipulate dates, letting you add hours, days, weeks, and so on.
Notice that java.util.Calendar works together with java.util.Date: instead of setting a date using a number (long), you can use a Date object.
Finally, the method roll() is similar to add(); the difference is that it does not change the larger fields: rolling days never moves into the next month, rolling months never moves into the next year, and so on.
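
The difference between add() and roll() can be seen in a minimal sketch. The names CalendarDemo, december30 and ymd are made up for this example, and Locale.US is passed to getInstance() only to make sure a Gregorian calendar is returned on any machine.

```java
import java.util.Calendar;
import java.util.Date;
import java.util.Locale;

public class CalendarDemo {
    // Calendar is abstract: a concrete instance comes from the factory method.
    static Calendar december30() {
        Calendar c = Calendar.getInstance(Locale.US);
        c.clear();                          // reset all fields, including the time of day
        c.set(2012, Calendar.DECEMBER, 30); // months are zero-based constants
        return c;
    }

    // Formats a calendar as year-month-day with a one-based month.
    static String ymd(Calendar c) {
        return c.get(Calendar.YEAR) + "-" + (c.get(Calendar.MONTH) + 1)
                + "-" + c.get(Calendar.DAY_OF_MONTH);
    }

    public static void main(String[] args) {
        Calendar added = december30();
        added.add(Calendar.DAY_OF_MONTH, 5);   // overflows into the next month and year
        System.out.println(ymd(added));        // 2013-1-4

        Calendar rolled = december30();
        rolled.roll(Calendar.DAY_OF_MONTH, 5); // wraps around inside December
        System.out.println(ymd(rolled));       // 2012-12-4

        Date asDate = added.getTime();         // Calendar -> Date
        rolled.setTime(asDate);                // Date -> Calendar
        System.out.println(ymd(rolled));       // 2013-1-4
    }
}
```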


Sexta-feira, 17 de Agosto de 2012
vendredi 17 août 2012
viernes 17 de agosto de 2012
Parse ==> Fri Aug 17 00:00:00 AMT 2012

The class java.text.DateFormat is an abstract class that gives us formatted dates, with or without predefined styles and Locales.
You can get a DateFormat instance by invoking the factory methods getInstance() and getDateInstance().
Finally, we can parse a String into a Date by using the method parse(), which throws the checked ParseException and therefore must be enclosed in a try-catch block (or the exception must be declared).
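
A minimal sketch of formatting and parsing with DateFormat. The names DateFormatDemo, august17 and roundTrip are made up for this illustration. The exact text produced for each style and locale varies between JDK versions, so no expected output is shown for the formatted strings.

```java
import java.text.DateFormat;
import java.text.ParseException;
import java.util.Calendar;
import java.util.Date;
import java.util.Locale;

public class DateFormatDemo {
    // Builds August 17, 2012 at midnight in the default time zone.
    static Date august17() {
        Calendar c = Calendar.getInstance(Locale.US);
        c.clear();
        c.set(2012, Calendar.AUGUST, 17);
        return c.getTime();
    }

    // Formats the date with the given style and locale, then parses the text back.
    static Date roundTrip(Date date, int style, Locale locale) {
        DateFormat df = DateFormat.getDateInstance(style, locale);
        String text = df.format(date);
        try {
            return df.parse(text);          // parse() throws a checked ParseException
        } catch (ParseException e) {
            throw new IllegalArgumentException("Cannot parse: " + text, e);
        }
    }

    public static void main(String[] args) {
        Date date = august17();
        System.out.println(DateFormat.getInstance().format(date)); // default short date and time
        System.out.println(DateFormat.getDateInstance(DateFormat.FULL, Locale.FRANCE).format(date));
        System.out.println(DateFormat.getDateInstance(DateFormat.MEDIUM, Locale.US).format(date));
        System.out.println(roundTrip(date, DateFormat.MEDIUM, Locale.US).equals(date)); // true
    }
}
```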


pt ->
pt -> Brasil
fr ->
en -> United States

Let's take a brief look at java.util.Locale.

The first constructor argument is the language and the second is the country.
The most important methods are getLanguage() and getDisplayCountry().
There are no setters for language and country in java.util.Locale.
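
A minimal sketch of the two constructors and the two methods mentioned above. The names LocaleDemo and describe are made up, and the country name is requested in English via getDisplayCountry(Locale.US) only to make the result predictable; the no-argument getDisplayCountry() uses the default locale of the machine. Note that recent JDKs (19 and later) mark the Locale constructors as deprecated, although they still work.

```java
import java.util.Locale;

public class LocaleDemo {
    // Shows the language code and the English display name of the country.
    static String describe(Locale locale) {
        return locale.getLanguage() + " -> " + locale.getDisplayCountry(Locale.US);
    }

    public static void main(String[] args) {
        Locale language = new Locale("pt");                // language only
        Locale languageAndCountry = new Locale("pt", "BR"); // language and country

        System.out.println(describe(language));            // pt ->
        System.out.println(describe(languageAndCountry));  // pt -> Brazil
    }
}
```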


R$ 1.100,57
¤ 1.100,57

Similar to java.text.DateFormat, the class java.text.NumberFormat formats numbers (currency, percent, and so on). Since NumberFormat is abstract, there are several factory methods to get a concrete instance.

.. and so on

The method parse() is available as well; it returns a Number object.
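
A minimal sketch of NumberFormat with an explicit locale. The names NumberFormatDemo, currency, plain and parse are made up for this illustration, and Locale.US is an assumption chosen so that the separators and the currency symbol are predictable.

```java
import java.text.NumberFormat;
import java.text.ParseException;
import java.util.Locale;

public class NumberFormatDemo {
    // Formats the value as a currency amount for the given locale.
    static String currency(double value, Locale locale) {
        return NumberFormat.getCurrencyInstance(locale).format(value);
    }

    // Formats the value as a plain number for the given locale.
    static String plain(double value, Locale locale) {
        return NumberFormat.getInstance(locale).format(value);
    }

    // Parses locale-formatted text back into a Number.
    static Number parse(String text, Locale locale) {
        try {
            return NumberFormat.getInstance(locale).parse(text);
        } catch (ParseException e) { // checked, so it must be handled or declared
            throw new IllegalArgumentException("Not a number: " + text, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(currency(1100.57, Locale.US)); // $1,100.57
        System.out.println(plain(1100.57, Locale.US));    // 1,100.57
        System.out.println(parse("1,100.57", Locale.US).doubleValue());
    }
}
```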

I/O - Getting a better understanding of the API

Let's start with the class File.

Notice that the file is not created until line 13 is reached. To memorize it, remember that creating a File object (line 10) doesn't throw any exception, but the method createNewFile() can throw an IOException.
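
A minimal sketch of this behaviour: creating the File object touches nothing on disk, and only createNewFile() creates the file. The names FileDemo and createAndCheck, and the file name inside the temporary directory, are assumptions made for this example.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;

public class FileDemo {
    // Returns { existed before, was created by this call, exists afterwards }.
    static boolean[] createAndCheck(File file) {
        boolean before = file.exists();
        boolean created;
        try {
            created = file.createNewFile(); // the only call here that can throw IOException
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new boolean[] { before, created, file.exists() };
    }

    public static void main(String[] args) {
        // new File(...) only builds a path object; no file is created yet.
        File file = new File(System.getProperty("java.io.tmpdir"),
                "file-demo-" + System.nanoTime() + ".txt");
        boolean[] result = createAndCheck(file);
        System.out.println(result[0] + " " + result[1] + " " + result[2]); // false true true
        file.delete(); // clean up
    }
}
```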


Serialization is the mechanism that allows you to save the state of objects.
Find below the keys to making a class serializable.

First of all, the class has to implement the interface Serializable (look at line 11). To save the state of an object we need the classes FileOutputStream and ObjectOutputStream; the writeObject(object) method does the writing. To retrieve the serialized object from the file we need FileInputStream and ObjectInputStream; the readObject() method does the reading.
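
The steps above can be sketched as follows. The class Cat, the helper saveAndRestore and the temporary file are hypothetical and exist only for this illustration.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerializeDemo {
    // Hypothetical class; implementing Serializable is all that is needed here.
    static class Cat implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        final int lives;

        Cat(String name, int lives) {
            this.name = name;
            this.lives = lives;
        }
    }

    // Writes the cat to a temporary file and reads it back as a new object.
    static Cat saveAndRestore(Cat cat) {
        try {
            File file = File.createTempFile("cat", ".ser");
            try {
                try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(file))) {
                    out.writeObject(cat);
                }
                try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(file))) {
                    return (Cat) in.readObject(); // readObject() returns Object, so cast
                }
            } finally {
                file.delete();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Cat restored = saveAndRestore(new Cat("Tom", 9));
        System.out.println(restored.name + " " + restored.lives); // Tom 9
    }
}
```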


- If you don't want to save a member of an object, you must mark it as transient
- If a superclass implements Serializable, its subclasses are automatically Serializable too
- If a subclass implements Serializable and a superclass doesn't, the superclass constructor will be invoked during deserialization and the inherited instance variables will get their initial values instead of the saved ones

Let's see an example:
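
What follows is a minimal sketch of the transient and superclass rules, not the original listing. The classes Animal and Dog and the helper copy are hypothetical, and a byte array is used in place of a file to keep the example short.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class TransientDemo {
    // Not Serializable: its no-arg constructor runs again on deserialization.
    static class Animal {
        int weight = 42;
    }

    static class Dog extends Animal implements Serializable {
        private static final long serialVersionUID = 1L;
        String name;
        transient int age; // skipped by serialization

        Dog(String name, int age) {
            this.name = name;
            this.age = age;
        }
    }

    // Serializes the dog into memory and deserializes it into a new object.
    static Dog copy(Dog dog) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(dog);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Dog) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Dog dog = new Dog("Rex", 5);
        dog.weight = 100;
        Dog restored = copy(dog);
        System.out.println(restored.name);   // Rex (saved)
        System.out.println(restored.age);    // 0   (transient, back to the default)
        System.out.println(restored.weight); // 42  (reset by the Animal constructor)
    }
}
```

Only name survives the round trip: age is transient, and weight belongs to the non-serializable superclass.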

Let's see what's been saved.

- If you want to save the state of an object that has a reference to another object which does not implement Serializable, you have to mark that reference as transient and write the private methods writeObject() and readObject() to save and restore its state yourself.

- Finally, static variables are not saved by serialization: they belong to the class, and as you've seen serialization is about objects
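
A minimal sketch of the writeObject()/readObject() technique for a reference to a non-serializable object. The classes Collar and Dog and the helper copy are hypothetical, and a byte array is used in place of a file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class CustomSerialDemo {
    // Does NOT implement Serializable.
    static class Collar {
        final int size;
        Collar(int size) { this.size = size; }
    }

    static class Dog implements Serializable {
        private static final long serialVersionUID = 1L;
        String name;
        transient Collar collar; // must be transient, Collar cannot be serialized

        Dog(String name, Collar collar) {
            this.name = name;
            this.collar = collar;
        }

        // Called by the serialization mechanism; must be private with this signature.
        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject();  // normal fields first
            out.writeInt(collar.size); // then the collar state, by hand
        }

        private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
            in.defaultReadObject();          // same order as in writeObject
            collar = new Collar(in.readInt());
        }
    }

    // Serializes the dog into memory and deserializes it into a new object.
    static Dog copy(Dog dog) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(dog);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Dog) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Dog restored = copy(new Dog("Rex", new Collar(3)));
        System.out.println(restored.name + " " + restored.collar.size); // Rex 3
    }
}
```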

Working with Strings and StringBuffer/StringBuilder

In this section we'll discuss String and how its objects are created, look at debated statements like "Strings are immutable" and see what's going on behind the scenes. StringBuffer and StringBuilder will also be in focus.

Let's take a look at how the string objects are created and understand the immutability.

As you can see, it prints "Book" because we never assign the newly created String "Java - Book" to a reference, so that object is on the heap but nothing refers to it.
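
The effect described above can be reproduced with a minimal sketch. The class and method names are made up for this illustration.

```java
public class ImmutabilityDemo {
    // The result of concat() is thrown away, so s still refers to "Book".
    static String ignoredResult() {
        String s = "Book";
        "Java - ".concat(s); // creates "Java - Book", but nothing refers to it
        return s;
    }

    // Here the reference is moved to the new object; "Book" itself is unchanged.
    static String assignedResult() {
        String s = "Book";
        s = "Java - ".concat(s);
        return s;
    }

    public static void main(String[] args) {
        System.out.println(ignoredResult());  // Book
        System.out.println(assignedResult()); // Java - Book
    }
}
```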

Common methods
Let's see common methods in action.
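
A minimal sketch with a handful of frequently used String methods. The selection of methods, the sample strings and the helper tidy are assumptions made for this example; every call returns a new value and leaves the original String untouched.

```java
public class StringMethodsDemo {
    // Chains three methods; each one returns a new String.
    static String tidy(String raw) {
        return raw.trim().replace('_', ' ').toLowerCase();
    }

    public static void main(String[] args) {
        String s = "Java Book";
        System.out.println(s.charAt(2));                     // v
        System.out.println(s.length());                      // 9
        System.out.println(s.substring(5));                  // Book
        System.out.println(s.substring(0, 4));               // Java (end index is exclusive)
        System.out.println(s.replace('a', 'o'));             // Jovo Book
        System.out.println("Java".concat(" Book"));          // Java Book
        System.out.println("JAVA".equalsIgnoreCase("java")); // true
        System.out.println(s.toUpperCase());                 // JAVA BOOK
        System.out.println(tidy("  Java_Book "));            // java book
    }
}
```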

Pool of String

You can also use the method intern() of the String class: it returns the string from the pool that is equal to this one according to equals(), adding it to the pool first if it is not there yet.
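
A minimal sketch of the pool at work, comparing references with == and contents with equals(). The names InternDemo and compare are made up for this example.

```java
public class InternDemo {
    // Returns { literal == created, literal.equals(created), literal == created.intern() }.
    static boolean[] compare() {
        String literal = "java";             // taken from the string pool
        String created = new String("java"); // always a new object outside the pool
        return new boolean[] {
            literal == created,
            literal.equals(created),
            literal == created.intern()
        };
    }

    public static void main(String[] args) {
        boolean[] result = compare();
        System.out.println(result[0]); // false: two different objects
        System.out.println(result[1]); // true:  same characters
        System.out.println(result[2]); // true:  intern() returns the pooled instance
    }
}
```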

StringBuffer and StringBuilder friends of memory

Both classes are intended to help you handle Strings without wasting memory: as we mentioned, Strings are immutable, and if we have to modify lots and lots of them, StringBuffer and StringBuilder will help.
The only difference between StringBuffer and StringBuilder is that StringBuilder is not thread-safe, which means its methods aren't synchronized; as you can imagine, this makes StringBuilder faster than StringBuffer.

The most important thing here is that StringBuffer and StringBuilder are mutable, unlike String, and that StringBuffer is the one with synchronized methods.
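
A minimal sketch showing that these classes change the object itself, so no reassignment is needed. The names BuilderDemo, build and buffer are made up for this illustration.

```java
public class BuilderDemo {
    // Every call modifies the same StringBuilder object.
    static String build() {
        StringBuilder sb = new StringBuilder("abc");
        sb.append("def");     // abcdef
        sb.reverse();         // fedcba
        sb.insert(3, "---");  // fed---cba
        return sb.toString();
    }

    // StringBuffer offers the same methods, but they are synchronized.
    static String buffer() {
        StringBuffer sb = new StringBuffer("abc");
        sb.append("def").delete(1, 3); // calls can be chained; removes indexes 1 and 2
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(build());  // fed---cba
        System.out.println(buffer()); // adef
    }
}
```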

Unusual things in Exceptions

I called this post "Unusual things in Exceptions" because you're going to see things that aren't usually used in the real world, as well as stuff that you have to memorize for the OCJP certification.

Let's start with the class hierarchy:

The code above does not compile: you cannot catch a checked exception (other than Exception or Throwable) if the try block cannot "throw" it.

The finally block is always called: even if there is a return statement in the try/catch block, the finally block will be executed before the method actually returns.
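
The order of execution can be checked with a minimal sketch that records each step in a list. The names FinallyDemo and compute are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

public class FinallyDemo {
    // Records the steps it goes through and returns from inside the try block.
    static int compute(List<String> log) {
        try {
            log.add("try");
            return 1;           // the value is ready, but finally runs first
        } finally {
            log.add("finally");
        }
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        int value = compute(log);
        System.out.println(log);   // [try, finally]
        System.out.println(value); // 1
    }
}
```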

In this case the try block does "throw" an exception through the creation of a FileOutputStream, but we still get a compiler error, because the catch that handles FileNotFoundException can never be reached. This happens when an earlier catch already handles one of its superclasses, such as IOException.
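
For contrast, here is a minimal sketch of an ordering that does compile: the more specific FileNotFoundException is caught before its superclass IOException. The names CatchOrderDemo and open are made up, and the current directory is passed in only because opening a directory for writing is a reliable way to trigger FileNotFoundException.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;

public class CatchOrderDemo {
    // Tries to open the target for writing (this creates or truncates a regular file).
    static String open(File target) {
        try (FileOutputStream out = new FileOutputStream(target)) {
            return "opened";
        } catch (FileNotFoundException e) { // subclass first
            return "not found";
        } catch (IOException e) {           // superclass afterwards
            return "other I/O problem";
        }
        // Swapping the two catch blocks would make the second one unreachable
        // and the class would no longer compile.
    }

    public static void main(String[] args) {
        System.out.println(open(new File("."))); // not found (a directory cannot be opened)
    }
}
```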

That's the rule for checked exceptions: if a method can throw one, it must either declare it (throws) so that the calling method deals with it, or catch it (try/catch).

Unlike checked exceptions, exceptions under RuntimeException and Error do not have to be caught or declared, but if you want to, you can still catch them with a try/catch block.
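
Both rules can be compared in a minimal sketch. All class and method names here are hypothetical.

```java
import java.io.IOException;

public class CheckedDemo {
    // Checked: the throws clause is mandatory, otherwise this does not compile.
    static void checked() throws IOException {
        throw new IOException("disk problem");
    }

    // Unchecked: no throws clause needed.
    static void unchecked() {
        throw new IllegalStateException("bug");
    }

    // The caller of checked() must catch IOException or declare it too.
    static String callChecked() {
        try {
            checked();
            return "ok";
        } catch (IOException e) {
            return "caught " + e.getMessage();
        }
    }

    // Catching a RuntimeException is optional, but allowed.
    static String callUnchecked() {
        try {
            unchecked();
            return "ok";
        } catch (RuntimeException e) {
            return "caught " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(callChecked());   // caught disk problem
        System.out.println(callUnchecked()); // caught bug
    }
}
```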

Common Exceptions

Let's take a look at the common exceptions; watch out for where they come from.

Autoboxing Overloading - Part 3

I've created a new topic for overloading because it involves autoboxing and, as you'll see, there are some tricks that need to be looked at closely.
As you know, to overload a method you have to change its parameter list, and the compiler decides which method to invoke.
Let's see a simple example of an overloaded method:

OK, that's an easy one. The important thing here is the statement at line 19: it invokes the method doSomething(int i) even though we're passing a short value. A short can be widened to an int because int is bigger than short, but it cannot be implicitly narrowed to a byte; that's why doSomething(int i) was called instead of doSomething(byte b). Also, if the method doSomething(int i) did not exist, the method doSomething(double f) would be invoked.
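
The selection described above can be sketched as follows. The return values and the second group of methods named fallback are assumptions added for this illustration; fallback shows what happens when no int version exists.

```java
public class OverloadDemo {
    static String doSomething(byte b)   { return "byte"; }
    static String doSomething(int i)    { return "int"; }
    static String doSomething(double f) { return "double"; }

    // Same idea without an int version.
    static String fallback(byte b)   { return "byte"; }
    static String fallback(double f) { return "double"; }

    public static void main(String[] args) {
        short s = 5;
        // short widens to int, the smallest type that can hold it among the overloads.
        System.out.println(doSomething(s)); // int
        // With no int version, the short is widened all the way to double.
        System.out.println(fallback(s));    // double
    }
}
```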

Now take a look at the following image:

The image tells us that widening beats boxing and var-args, and that boxing beats var-args; keep this rule in mind when reading overloaded methods.
Let's see it in action
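
A minimal sketch of the three priorities, one pair of overloads per rule. All method names and return values are made up for this example.

```java
public class PriorityDemo {
    // Widening vs boxing
    static String go(long x)    { return "long"; }
    static String go(Integer x) { return "Integer"; }

    // Boxing vs var-args
    static String box(Integer x) { return "Integer"; }
    static String box(int... x)  { return "int..."; }

    // Widening vs var-args
    static String wide(long a, long b) { return "long, long"; }
    static String wide(int... x)       { return "int..."; }

    public static void main(String[] args) {
        int i = 5;
        System.out.println(go(i));      // long        (widening beats boxing)
        System.out.println(box(i));     // Integer     (boxing beats var-args)
        System.out.println(wide(i, i)); // long, long  (widening beats var-args)
    }
}
```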


You cannot widen an Integer wrapper to a Long, so keep in mind that widening from one wrapper to another is not possible.
Looking closely at the doSomething() method, what happens first is boxing (int to Integer), and then the Integer is passed as a Number, because Integer IS-A Number.
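
A minimal sketch of "box, then widen". The body of doSomething and the commented-out method needsLong are assumptions made for this illustration.

```java
public class BoxThenWidenDemo {
    // Accepts any wrapper that IS-A Number and reports its runtime class.
    static String doSomething(Number n) {
        return n.getClass().getSimpleName();
    }

    // static void needsLong(Long l) { }
    // needsLong(5); // would not compile: int boxes to Integer, and Integer is not a Long

    public static void main(String[] args) {
        int i = 5;
        // int is boxed to Integer, then the Integer reference is widened to Number.
        System.out.println(doSomething(i));  // Integer
        System.out.println(doSomething(5L)); // Long
    }
}
```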

Finally, sometimes the compiler cannot tell which overloaded method should be invoked, which results in a compiler error.