Wednesday, December 30, 2015

Understand spring security easily – concept view

Spring security is designed to provide both authentication and authorization to Java applications.

This article describes Spring Security from a general, conceptual point of view, to give you a whole picture of how Spring Security works in the most common usage. Other articles are:

Understand spring security easily – developer view (to be continued)  

Understand spring security easily – annotation example (to be continued)  

0. Basic scenario

In this series we focus only on the most popular scenario: a web application secured with username + password access, where passwords are stored in a database. This article is based on Spring Security 4.x.

1. Key concepts

Credential – namely the password in our username + password scenario.

Principal – you can think of it as a kind of identification of a user. It includes the username, the password and all the authorities that the user has. Most authentication mechanisms within Spring Security return an instance of UserDetails as the principal.

UserDetails – just an interface in package org.springframework.security.core.userdetails. As said above, an instance of UserDetails is always used as the identification of a user. What does this mean? It means that when you read your database to get all the information for a user, you finally get an instance of UserDetails.

The 3 most used methods of UserDetails are getUsername(), getPassword() and getAuthorities().

Spring Security provides an implementation, org.springframework.security.core.userdetails.User. But in practice, in a Spring project with an ORM, you will normally have your own implementation of UserDetails. It often looks like:

public class CustomUser extends YourUserEntity implements UserDetails {
 //...
}

Spring Security uses the UserDetails instance created from the database to check the information provided by the browser. Now the question is: where does "create the UserDetails instance" happen? In UserDetailsService.

UserDetailsService – this interface has only one method, UserDetails loadUserByUsername(String username). In a real project you also need to provide an implementation of this interface, and it often looks like:

public class CustomUserDetailsService implements UserDetailsService {
  @Override
  public UserDetails loadUserByUsername(String username) throws UsernameNotFoundException {
    // access the database by DAO or Spring Data repository
    CustomUser userInDatabase = (CustomUser) yourUserEntityRepository.findByUsername(username);
    if (userInDatabase == null) {
      throw new UsernameNotFoundException(username);
    }
    return userInDatabase;
  }
}

This is the place where you put your own code to access the database and load the user information. (We defined CustomUser as a child of YourUserEntity, remember?)

2. Filter Chain

Spring Security is mainly built on servlet filters. A Filter has a doFilter(..., FilterChain chain) method. In doFilter there is always a call to chain.doFilter(), which divides the filter into 2 pieces: code before chain.doFilter() runs before the request reaches any servlet, and code after chain.doFilter() runs after the request has been processed and before the response is sent back to the browser.
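The two halves can be sketched with simplified stand-ins for the Servlet API. The Filter and Chain types below are hypothetical mock-ups for illustration only, not the real javax.servlet interfaces, so the example is self-contained and runnable:

```java
import java.util.Arrays;
import java.util.List;

public class FilterChainDemo {

    // Hypothetical stand-ins for javax.servlet.Filter / FilterChain (simplified).
    interface Filter {
        void doFilter(StringBuilder request, Chain chain);
    }

    static class Chain {
        private final List<Filter> filters;
        private int pos = 0;

        Chain(List<Filter> filters) {
            this.filters = filters;
        }

        void doFilter(StringBuilder request) {
            if (pos < filters.size()) {
                filters.get(pos++).doFilter(request, this);
            } else {
                request.append("servlet;"); // the request finally reaches the servlet
            }
        }
    }

    static String trace() {
        Filter securityFilter = (request, chain) -> {
            request.append("before;");  // runs before the request reaches any servlet
            chain.doFilter(request);    // hand over to the rest of the chain
            request.append("after;");   // runs after processing, before the response goes back
        };
        StringBuilder request = new StringBuilder();
        new Chain(Arrays.asList(securityFilter)).doFilter(request);
        return request.toString();
    }

    public static void main(String[] args) {
        System.out.println(trace()); // before;servlet;after;
    }
}
```

The trace shows exactly the two-piece behavior described above: the filter's pre-processing, then the servlet, then the filter's post-processing on the way back out.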

There are many filters in Spring Security and the order of these filters matters. There is a filter list in the Spring Security reference. There are 10+ filters in Spring Security, but check these key ones:

  • UsernamePasswordAuthenticationFilter – takes the username + password from your HTTP POST, creates an authentication token and verifies the password. (Authentication)
  • ExceptionTranslationFilter – if the user is not authenticated, redirects to the login page.
  • FilterSecurityInterceptor – checks whether the logged-in user has the right to access the target URL. (Authorization)

These 3 filters are the key to understanding the workflow of Spring Security authentication and authorization.

Tuesday, December 29, 2015

"Config method" in Spring framework

1. Concept

What is a config method in Spring? Any method that is annotated with @Autowired is a config method.

What's the difference between a normal method and a config method in Spring? A config method is automatically invoked when the bean instance is created, after the constructor but before @PostConstruct. The parameters of the config method are autowired from the application context.

The name of the method doesn't matter, and the number of parameters doesn't matter. In fact we often put @Autowired on a setter method; that's just a special case of a Spring config method.

2. Usage of config method

A generic config method is not as widely used as field injection, setter injection or constructor injection, but config methods are used in Spring Security. According to the official Spring Security reference, the first step to configure Spring Security is to extend WebSecurityConfigurerAdapter like below.

import org.springframework.beans.factory.annotation.Autowired;

import org.springframework.context.annotation.*;
import org.springframework.security.config.annotation.authentication.builders.*;
import org.springframework.security.config.annotation.web.configuration.*;

@EnableWebSecurity
public class SecurityConfig extends WebSecurityConfigurerAdapter {

  @Autowired
  public void configureGlobal(AuthenticationManagerBuilder auth) throws Exception {
    auth.inMemoryAuthentication().withUser("user").password("password").roles("USER");
  }
}

This is a good example of using a config method. A config method is automatically invoked when the bean is instantiated; that's why the document says

The name of the configureGlobal method is not important.

Because it is just a config method, it will be automatically invoked with an AuthenticationManagerBuilder bean from the context. The purpose of this method is just to use the AuthenticationManagerBuilder to set up an authentication provider before the real logic begins.

3. See also

spring framework javadoc of @Autowired

Wednesday, December 23, 2015

Break down package java.util.concurrent.locks

Since JDK 5, Java has included a powerful enhancement for concurrency, the package java.util.concurrent. This package contains 2 subpackages; one of them is java.util.concurrent.locks, which provides the notion of locks for synchronization. (BTW, the other subpackage is java.util.concurrent.atomic, a small toolkit of classes that support lock-free thread-safe programming on single variables, making common operations like i++ atomic in a concurrent environment.)

This article gives a big picture of the package java.util.concurrent.locks, to help you get a better understanding of it.

This article assumes you already know the basic usage of the locks, so it will not go into usage details, but focuses on the relations of all the interfaces and classes as a whole.

1. Hierarchy diagram

[Diagram: class hierarchy of java.util.concurrent.locks]

There are only 3 interfaces in this package (the green boxes). The diagram also has 4 concrete classes. Only the 2 classes in pink are usually created directly with the keyword new; the other two cannot be created by a direct new, because their constructors are not public but protected.

2. More explanation

A Condition instance is used to replace the low-level synchronization monitor. (If you want to know more about Java's built-in low-level synchronization mechanism, see here.)

There is no public implementation of this interface in the JDK. The only way to create a Condition instance is by the newCondition() method of a Lock object.

The methods await() and signal() are designed to replace the built-in wait() and notify() of every Java Object.
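As a minimal sketch, a waiting thread uses await()/signal() on a Condition obtained from a Lock, exactly where it would have used wait()/notify() on a monitor:

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

public class ConditionDemo {
    private static final Lock lock = new ReentrantLock();
    private static final Condition ready = lock.newCondition(); // the only way to get a Condition
    private static boolean flag = false;

    static String run() {
        StringBuilder log = new StringBuilder();
        Thread waiter = new Thread(() -> {
            lock.lock();
            try {
                while (!flag) {     // guard against spurious wakeups
                    ready.await();  // replaces Object.wait()
                }
                log.append("woke");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                lock.unlock();
            }
        });
        waiter.start();

        lock.lock();
        try {
            flag = true;
            ready.signal();         // replaces Object.notify()
        } finally {
            lock.unlock();
        }
        try {
            waiter.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return log.toString();      // "woke"
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Note the while loop around await(): just like with wait(), the condition must be rechecked after waking, because wakeups can be spurious (and here the flag also covers the case where signal() happens before the waiter parks).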

The interfaces Lock and ReadWriteLock have no parent-child relation! Although Lock may sound like the parent of ReadWriteLock, in fact they are not related at all.

All lock implementations in this package are reentrant, which means a thread that already holds a lock can successfully call lock() again without blocking. So there are ways to get the lock hold count in a program, to know how many times unlock() must be called. See the methods getHoldCount() and getReadHoldCount()/getWriteHoldCount().
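A quick runnable sketch of reentrancy and the hold count, using ReentrantLock.getHoldCount():

```java
import java.util.concurrent.locks.ReentrantLock;

public class HoldCountDemo {
    static int reenter() {
        ReentrantLock lock = new ReentrantLock();
        lock.lock();
        lock.lock();                        // same thread re-acquires without blocking
        int held = lock.getHoldCount();     // 2 at this point
        while (lock.getHoldCount() > 0) {
            lock.unlock();                  // must unlock once per lock() call
        }
        return held;
    }

    public static void main(String[] args) {
        System.out.println(reenter()); // 2
    }
}
```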

ReentrantReadWriteLock.ReadLock and ReentrantReadWriteLock.WriteLock (the 2 white boxes in the diagram) cannot be initialized with the keyword new since they don't have public constructors. So instances of these two classes cannot live independently; they are always accompanied by a ReentrantReadWriteLock instance.

Hope you now have a clearer view of the package java.util.concurrent.locks.

Wednesday, December 16, 2015

Use Spring Test without @RunWith(SpringJUnit4ClassRunner.class)

This is a new feature in Spring Framework 4.2. Now you can use other JUnit runners, like Parameterized or MockitoJUnitRunner, without losing the spring-test benefits (all the features you love from spring-test, like Spring dependency injection, auto-rollback transactions for tests, etc.).

In this article, a simple hello-world-level JUnit test case is provided, using the JUnit Parameterized runner with spring-test support enabled.

0. What you need

  • JDK 1.7 +
  • Spring framework 4.2 + ( 4.2.1.RELEASE is used in this demo)
  • Maven 3.2+ (this demo is a Maven project, but Maven is not necessary to enable spring-test support in other JUnit runners)

1. Define pom.xml

<project xmlns="http://maven.apache.org/POM/4.0.0" 
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 
                      http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <groupId>com.shengwang.demo</groupId>
  <artifactId>spring-test-simple</artifactId>
  <version>1</version>
  <packaging>jar</packaging>

  <name>spring-test-simple</name>
  <url>http://maven.apache.org</url>

  <dependencies>
    <!-- Spring framework -->
    <dependency>
      <groupId>org.springframework</groupId>
      <artifactId>spring-context</artifactId>
      <version>4.2.1.RELEASE</version>
    </dependency>

    <!-- Spring test -->
    <dependency>
      <groupId>org.springframework</groupId>
      <artifactId>spring-test</artifactId>
      <version>4.2.1.RELEASE</version>
      <scope>test</scope>
    </dependency>
    
    <!-- JUnit test -->
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>4.11</version>
      <scope>test</scope>
    </dependency>

  </dependencies>
  
  <build>
    <plugins>
      <!-- Use Java 1.7 -->
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-compiler-plugin</artifactId>
        <version>2.5.1</version>
        <configuration>
          <source>1.7</source>
          <target>1.7</target>
        </configuration>
      </plugin>
    </plugins>
  </build>
</project>

The pom specifies 3 dependencies: spring-context, spring-test and junit. It also sets the Java version to 1.7.

2. Define Java Class

There are 3 classes in this demo. The first is HelloService.java, a Spring bean used as the test target.

package com.shengwang.demo;

import org.springframework.stereotype.Service;

@Service
public class HelloService {

  public String sayHello(String name) {
    return "Hello " + name;
  }
}

The second is JavaConfig.java, as Spring context configuration.

package com.shengwang.demo;

import org.springframework.context.annotation.ComponentScan;
import org.springframework.context.annotation.Configuration;

@Configuration
@ComponentScan
public class JavaConfig {}

The last is the JUnit test case, HelloServiceTest.java, which uses Parameterized as the runner.

package com.shengwang.demo;

import java.util.Arrays;
import java.util.Collection;

import org.junit.ClassRule;
import org.junit.Rule;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.junit.runners.Parameterized;
import org.junit.runners.Parameterized.Parameters;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.test.context.ContextConfiguration;
import org.springframework.test.context.junit4.rules.SpringClassRule;
import org.springframework.test.context.junit4.rules.SpringMethodRule;

@RunWith(Parameterized.class)
@ContextConfiguration(classes=JavaConfig.class)  // specify context config
public class HelloServiceTest {
  
  // -------------------------------------------
  //  spring test support requirement from 4.2
  // -------------------------------------------
  @ClassRule
  public static final SpringClassRule SPRING_CLASS_RULE= new SpringClassRule();
  @Rule
  public final SpringMethodRule  springMethodRule = new SpringMethodRule();
  // -------------------------------------------
  //  spring test support requirement over
  // -------------------------------------------
  
  private String name;
  
  @Autowired
  HelloService service;

  public HelloServiceTest(String name) {
    this.name = name;
  }
  
  @Parameters
  public static Collection<String[]> data() {
    return Arrays.asList(new String[][] {
        {"Tom"},{"Jerry"}  
    });
  }
  
  @Test
  public void testSayHello() {
    service.sayHello(name);
  }
  
}

The test case enables spring-test support in 3 steps:

  1. Use @ContextConfiguration to config the Spring TestContext.
  2. Add a SpringClassRule static variable.
  3. Add a SpringMethodRule field variable.

The project's hierarchy looks like below:

[Screenshot: project hierarchy]

Now the test gets all its abilities from spring-test. The @Autowired dependency injection works perfectly.


Wednesday, December 9, 2015

Most used Hibernate properties during development

Hibernate has many properties that can be very helpful during the development phase. In this article we list some of the most used Hibernate properties for developing/testing persistence-layer applications.

1. Meaning of the most used Hibernate properties

1.1 Print out Hibernate-generated SQL

hibernate.show_sql = true

1.2 Automatically create tables according to entities

hibernate.hbm2ddl.auto = create

Change to create-drop if you want to drop the created tables when the application ends. Useful for prototypes and testing.

1.3 Run SQL scripts at beginning of application

hibernate.hbm2ddl.import_files = /path/of/sqlfile

This is most used to insert some test data before unit test cases run.

1.4 Use c3p0 connection pool

# minimum connections in pool
hibernate.c3p0.min_size=5
# maximum connections in pool
hibernate.c3p0.max_size=20
# timeout to remove an idle connection, in seconds
hibernate.c3p0.timeout=1800
# number of prepared statements to cache
hibernate.c3p0.max_statements=50
# how often to validate a connection, in seconds
hibernate.c3p0.idle_test_period=3000

1.5 Use Improved Naming Strategy ("clientAddress"->"client_address")

hibernate.ejb.naming_strategy  = org.hibernate.cfg.ImprovedNamingStrategy
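The rule is roughly: lower-case everything and put an underscore before each letter that was upper case. A hypothetical, simplified re-implementation of that conversion (the real logic lives inside org.hibernate.cfg.ImprovedNamingStrategy and handles more edge cases) looks like:

```java
public class NamingDemo {
    // Simplified sketch of the camelCase -> snake_case rule that
    // ImprovedNamingStrategy applies to unquoted names.
    static String toSnakeCase(String name) {
        StringBuilder sb = new StringBuilder();
        for (char c : name.toCharArray()) {
            if (Character.isUpperCase(c)) {
                if (sb.length() > 0) {
                    sb.append('_');            // word boundary before each capital
                }
                sb.append(Character.toLowerCase(c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toSnakeCase("clientAddress")); // client_address
    }
}
```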

2. Use Hibernate properties in JPA persistence.xml

If you use Hibernate as the JPA provider, you can set Hibernate properties in persistence.xml.

<?xml version="1.0" encoding="UTF-8"?>
<persistence version="2.1"
  xmlns="http://xmlns.jcp.org/xml/ns/persistence"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://xmlns.jcp.org/xml/ns/persistence 
                      http://xmlns.jcp.org/xml/ns/persistence/persistence_2_1.xsd">
  <persistence-unit name="persistenceUnitName"  transaction-type="RESOURCE_LOCAL">
    <provider>org.hibernate.ejb.HibernatePersistence</provider>
    <exclude-unlisted-classes>false</exclude-unlisted-classes>
    <properties>
      <!-- common jdbc configuration -->
      <property name="javax.persistence.jdbc.driver" value="org.h2.Driver" />
      <property name="javax.persistence.jdbc.url" value="jdbc:h2:mem:mydb" />
      <property name="javax.persistence.jdbc.user" value="sa" />
      <property name="javax.persistence.jdbc.password" value="" />
    
      <!-- ==================== -->
      <!-- hibernate properties -->
      <!-- ==================== -->
      <property name="hibernate.hbm2ddl.auto" value="create" />
      <property name="hibernate.show_sql" value="true"/>
      <property name="hibernate.ejb.naming_strategy" value="org.hibernate.cfg.ImprovedNamingStrategy"/>
    </properties>
  </persistence-unit>
</persistence>

3. Use Hibernate properties in Spring Boot


Spring Boot does a lot of auto-configuration. So there is no need to use hibernate.hbm2ddl.import_files to point at a SQL script; just put the script at src/main/resources/import.sql. Spring Boot will auto-configure it if spring-boot-starter-data-jpa is in the pom.


There is also no need for hibernate.hbm2ddl.auto. Spring Boot auto-detects whether the database used is an embedded one, like HSQL, H2 or Derby, or an external database. If an embedded database is used, the default is create-drop; otherwise, for an external database, the default is none (no tables will be auto-created). You can set this behavior in Spring Boot's configuration file application.properties, with spring.jpa.hibernate.ddl-auto=create, for example.


Other Hibernate properties can be set with the prefix 'spring.jpa.properties.', for example by adding the following lines to /src/main/resources/application.properties:


# hibernate special properties, with prefix "spring.jpa.properties."
spring.jpa.properties.hibernate.dialect=org.hibernate.dialect.H2Dialect
spring.jpa.properties.hibernate.c3p0.min_size=3
spring.jpa.properties.hibernate.c3p0.max_size=10
spring.jpa.properties.hibernate.show_sql=true

4. What's More


For the complete Hibernate 4.x configuration properties, check the official Hibernate reference.

Tuesday, December 1, 2015

Remote Debug in Eclipse

Remote debugging a Java application means the application runs on a JVM different from the one used by Eclipse. The Java application can run on the same host as your Eclipse, as a separate process, or on a remote host.

In practice, remote debugging happens when there are external environments in production that you don't have in your development environment, or when your code works in the development environment but has problems in production.

Remote debugging lets you connect to the running application and do everything you can do in a debugger, like breakpoints, single-stepping and watching variables, just like debugging a local application.

To make remote debugging possible, extra JVM options have to be set when running the application.

In this article, we create a hello-world Java application, run it from the command line, then use Eclipse to connect to it. This application could run on any remote host, but for the demo it will run on the same host as Eclipse.

1. Create Project in Eclipse

Create a Java project in Eclipse with a main class.

package com.shengwang.demo;

import java.io.Console;

public class DemoMain {

  public static void main(String[] args) {
    Console console = System.console();
    console.readLine("Please hit Enter to continue...");
    System.out.println("Hello World! - Remote Debug in Eclipse");
  }
}

The demo project has only 1 class, DemoMain.java.




Compile the project and it's ready to run. In your case, this step means you have finished your project's coding in Eclipse.


2. Run with extra JVM options


To enable remote debugging, the application has to be run with extra JVM options:


-Xdebug -Xrunjdwp:server=y,transport=dt_socket,suspend=n,address=0.0.0.0:8000


JDWP is the debug agent; it will listen on port 8000, which is the default debug port in most cases.




3. Run Debug in Eclipse


Now that the application is running, let's debug it remotely.


First, select your project in Eclipse's package explorer (this makes it easy to create the debug configuration).




Then click "Run -> Debug Configurations" from the menu. Double-click "Remote Java Application" to create a debug configuration for the selected project.




Make sure the host and port are where your application runs and the port the debug agent listens on. Then hit "Debug" to connect to the remote application. Almost done; you can debug the remote application like you usually do. Let's set a breakpoint in our source code.





Go back to our running application and hit Enter to let it continue; the breakpoint gets hit and Eclipse automatically switches to the Debug perspective, like in local debugging.



Monday, November 30, 2015

Understand Persistence Context Collision in JEE

When dealing with the persistence layer in a JEE application, the persistence context is the magic that makes your entity instances different from normal POJOs. The persistence context manages the entity instances to keep them synchronized with the database.

A persistence context collision happens when invoking a stateful session bean from a stateless session bean's method, if both beans have their own EntityManager defined. (BTW, using a stateful bean from a stateless bean is itself not a good idea, but this article just focuses on the persistence context.)

In this article, for simplicity, when we mention a stateless bean we mean a stateless bean with an EntityManager variable whose methods operate on persistence, and likewise for a stateful bean. In short, we only talk about beans in the persistence layer.

0. Basic rules about persistence context

Here are some basic rules about persistence context:

There will be only one active persistence context at any time for a transaction.

The JEE container can propagate the persistence context between different EntityManager variables in a single transaction. (Different EntityManager field variables of different beans can use the same persistence context.)

A stateless bean usually uses a transaction-scoped persistence context, which means that when the transaction is over, the persistence context is also gone. (When the bean's method ends, the transaction is gone, and the persistence context is also gone.)

A stateful bean usually uses an extended persistence context. The persistence context is created when the bean instance is created, and is only destroyed when the stateful bean is removed. The extended persistence context is associated with a transaction when a method of the stateful bean is called. When the method ends, the transaction is gone, but the persistence context doesn't go with the transaction; it stays for the next transaction, until the whole stateful bean is removed by the container.

A stateful bean always uses its own extended persistence context; if the active persistence context is not the extended one, a javax.ejb.EJBException will be thrown. This is called a persistence context collision. It usually happens when calling a stateful bean from a stateless bean.

Let's see a simple example of a persistence context collision.

1. Define pom.xml

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.shengwang.demo</groupId>
  <artifactId>jee-persistence-context-collision</artifactId>
  <packaging>war</packaging>
  <version>1.0</version>
  <name>jee-persistence-context-collision Maven Webapp</name>
  <url>http://maven.apache.org</url>

  <properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    <!-- where glassfish 4.0 is installed -->
    <glassfish.home>D:\glassfish4.0</glassfish.home>
  </properties>

  <dependencies>
    <dependency>
      <groupId>javax</groupId>
      <artifactId>javaee-api</artifactId>
      <version>7.0</version>
    </dependency>
  </dependencies>

  <build>
    <finalName>jee-persistence-context-collision</finalName>
    <plugins>
      <!-- Use Java 1.7 -->
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-compiler-plugin</artifactId>
        <version>2.5.1</version>
        <configuration>
          <source>1.7</source>
          <target>1.7</target>
        </configuration>
      </plugin>

      <!-- use mvn cargo:run to deploy and start server -->
      <plugin>
        <groupId>org.codehaus.cargo</groupId>
        <artifactId>cargo-maven2-plugin</artifactId>
        <inherited>true</inherited>
        <configuration>
          <container>
            <containerId>glassfish4x</containerId>
            <type>installed</type>
            <home>${glassfish.home}</home>
          </container>
          <configuration>
            <type>existing</type>
            <home>${glassfish.home}/glassfish/domains</home>
          </configuration>
        </configuration>
      </plugin>
    </plugins>
  </build>
</project>

The pom has 1 dependency, for JEE 7, and 2 plugins. The first plugin specifies the Java version (1.7); the second one is for deploying the package to a local GlassFish 4.0 with the Maven command line. The demo uses GlassFish 4.0 as the JEE container.


2. Define entity class


A very simple entity class, Client.java, with 2 fields: int clientId and String name.

package com.shengwang.demo.entity;

import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;

@Entity
public class Client {
  @Id
  @GeneratedValue(strategy = GenerationType.IDENTITY)
  @Column(name = "CLIENT_ID")
  private int clientId;

  private String name;

  /* getter setter omitted */

  @Override
  public String toString() {
    return "{" + clientId + "," + name + "}";
  }
}

The entity is trivial.


3. Define stateful bean

package com.shengwang.demo.session;

import javax.ejb.Stateful;
import javax.persistence.EntityManager;
import javax.persistence.PersistenceContext;
import javax.persistence.PersistenceContextType;

import com.shengwang.demo.entity.Client;

@Stateful
public class MyStatefulBean {
  @PersistenceContext(type = PersistenceContextType.EXTENDED) // extended
  EntityManager em;
  Client client;

  public String changeClientName() {
    if (client == null) {
      client = em.find(Client.class, 1);
    }
    client.setName(client.getName() + "_" + "hello");

    return client.toString();
  }
}

This stateful bean is just for the demo, so it doesn't make much sense. Its only method changes the first client's name, adding a suffix to it.


4. Define stateless bean

package com.shengwang.demo.session;

import javax.ejb.EJB;
import javax.ejb.Stateless;
import javax.persistence.EntityManager;
import javax.persistence.PersistenceContext;

import com.shengwang.demo.entity.Client;

@Stateless
public class MyStatelessBean {
  @PersistenceContext
  EntityManager em;

  @EJB
  MyStatefulBean statefulBean;

  public String methodA() {
    String c1 = em.find(Client.class, 1).toString(); // any operation of em
    String c2 = statefulBean.changeClientName();

    return c1 + "," + c2;
  }
}

The stateless bean has one method, methodA. It calls em.find() first, then calls the stateful bean's method. Calling methodA() will cause a persistence context collision!


Why? Because the transaction-scoped persistence context is created lazily, it (PC-A) will be created only when em.find() is called, and this persistence context is now the active one. But a stateful bean with an extended persistence context initializes its persistence context eagerly, which means the stateful bean already has a persistence context (PC-B) when it is initialized. Now the active persistence context propagated to the stateful bean, PC-A, is not the extended persistence context PC-B. So a collision happens and an exception is thrown, as we will see when we run it below.


5. Define a servlet as EJB client


To use the beans we defined above, let's define a simple servlet.

package com.shengwang.demo.servlet;

import java.io.IOException;
import java.io.PrintWriter;

import javax.ejb.EJB;
import javax.servlet.ServletException;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import com.shengwang.demo.session.MyStatefulBean;
import com.shengwang.demo.session.MyStatelessBean;

@WebServlet(name = "testServlet", urlPatterns = { "/test" })
public class TestServlet extends HttpServlet {
  private static final long serialVersionUID = 1L;

  @EJB
  MyStatelessBean statelessBean;

  @EJB
  MyStatefulBean statefulBean;

  @Override
  protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws ServletException, IOException {
    String str = statelessBean.methodA();

    PrintWriter out = resp.getWriter();
    out.printf("%s", str);
    out.flush();
  }
}

In the servlet, the stateless bean's only method is invoked.


6. Config persistence.xml


Usually the META-INF/persistence.xml is simple for JEE applications; it tells the server which data source the application wants to use.

<?xml version="1.0" encoding="UTF-8"?>
<persistence version="2.1" xmlns="http://xmlns.jcp.org/xml/ns/persistence"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://xmlns.jcp.org/xml/ns/persistence
                                 http://xmlns.jcp.org/xml/ns/persistence/persistence_2_1.xsd">
  <persistence-unit name="demo-persistence-unit" transaction-type="JTA">
    <jta-data-source>jdbc/MySQLDataSource</jta-data-source>
  </persistence-unit>
</persistence>

There's a data source on the server with the JNDI name jdbc/MySQLDataSource.




Finally, before running the demo, the project hierarchy looks like below.

[Screenshot: project hierarchy]

7. Run the demo


Run the demo with the Maven command line:

mvn clean verify cargo:run

This command will start the GlassFish server and deploy our application to it. Then access the servlet with a browser. You can see the exception: javax.ejb.EJBException: There is an active transactional persistence context for the same EntityManagerFactory as the current stateful session bean's extended persistence context.




8. What's more


What will happen if we make a little change to the stateless bean?

@Stateless
public class MyStatelessBean {
  @PersistenceContext
  EntityManager em;

  @EJB
  MyStatefulBean statefulBean;

  public String methodA() {
    // These 2 lines switch order
    String c2 = statefulBean.changeClientName();
    String c1 = em.find(Client.class, 1).toString();

    return c1 + "," + c2;
  }
}

The order of the two lines is switched. Now the demo runs without any exception.

Why? Because when the stateful bean is invoked, there is no persistence context yet, so the stateful bean's method can use its own extended persistence context to run. Then em.find() is invoked, and a transaction-scoped persistence context is created on the first EntityManager operation. So these 2 lines use different persistence contexts! If this is understood, then the output also makes sense to you.



Furthermore, if you really need to call the stateful bean from the stateless bean in its original order, which causes the persistence context collision, here are some workarounds (think twice before you do this):


1. Mark the stateful bean (or its method invoked by the stateless bean) with TransactionAttributeType.REQUIRES_NEW. Then there will be 2 transactions, for the stateless and stateful bean respectively. Each transaction can have its own persistence context. In a scenario similar to the one above, entities may have different values in different contexts.


2. Mark the stateful bean (or its method invoked by the stateless bean) with TransactionAttributeType.NOT_SUPPORTED, if you only invoke the stateful bean's read-only operations from the stateless bean. (Servers like GlassFish may need the JDBC connection pool configured to enable non-transactional connections.)


3. Use an application-managed EntityManager instead of the default container-managed EntityManager in the stateful bean.

Monday, November 23, 2015

How to setup a maven JEE project in Eclipse

Let's suppose you are new to JEE programming. After reading the official Oracle JEE tutorial for a few days, you decide to get your hands wet. You download and install a JEE container, GlassFish 4 in this tutorial, on your PC. You have JDK, Maven and Eclipse all ready. So, what's next?

This demo shows how to create a Maven project in Eclipse that can be deployed automatically to a local GlassFish server.

0. What you need

  • JDK 7+
  • Glassfish 4.0
  • Maven
  • Eclipse Java EE IDE (Luna 4.4.2 used in this demo)

1. Create an empty Maven project in Eclipse

[Screenshots: New Maven Project wizard]

Since the demo is about a servlet, choose the webapp archetype. For EJB you can choose the simplest one, "maven-archetype-quickstart". Then click Next until finished.

2. Modify pom.xml

Modify the pom file for 4 reasons:

  • Change the default packaging from "jar" to "war"
  • Add JEE dependency
  • Change default Java version to 1.7 
  • Configure the Cargo plugin so we can use the Maven command line to deploy the JEE package to the server

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.shengwang.demo</groupId>
  <artifactId>jee-servlet-demo</artifactId>
  <packaging>war</packaging>
  <version>1.0</version>
  <name>jee-servlet-demo Maven Webapp</name>
  <url>http://maven.apache.org</url>

  <properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    <!-- This is where I have my glassfish installed -->
    <glassfish.home>D:\glassfish4.0</glassfish.home>
  </properties>

  <dependencies>
    <!-- JEE dependency -->
    <dependency>
      <groupId>javax</groupId>
      <artifactId>javaee-api</artifactId>
      <version>7.0</version>
    </dependency>
  </dependencies>

  <build>
    <finalName>jee-servlet-demo</finalName>

    <plugins>
      <!-- Use Java 1.7 -->
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-compiler-plugin</artifactId>
        <version>2.5.1</version>
        <configuration>
          <source>1.7</source>
          <target>1.7</target>
        </configuration>
      </plugin>

      <!-- config cargo to use locally installed Glassfish -->
      <plugin>
        <groupId>org.codehaus.cargo</groupId>
        <artifactId>cargo-maven2-plugin</artifactId>
        <inherited>true</inherited>
        <configuration>
          <container>
            <containerId>glassfish4x</containerId>
            <type>installed</type>
            <home>${glassfish.home}</home>
          </container>
          <configuration>
            <type>existing</type>
            <home>${glassfish.home}/glassfish/domains</home>
          </configuration>
        </configuration>
      </plugin>
    </plugins>
  </build>
</project>

3. Change Servlet version


The just-created project may have an error about the servlet version, since the archetype in Eclipse is not actively maintained; the project is still using servlet version 2.3. There's also a bug in Eclipse that prevents you from changing the version directly: you can only disable the Dynamic Web Module facet and set it again.


Right click on the project name, choose "Project Facets", then unselect Dynamic Web Module first.


image


Uncheck Dynamic Web Module and click OK to finish. Then open Project Facets again and set it back.


image


4. Set JEE application related config 


In this demo, the servlet needs a web.xml to deploy. If you write an EJB that uses persistence, you may also need the JPA config file persistence.xml.

<web-app xmlns="http://java.sun.com/xml/ns/javaee"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://java.sun.com/xml/ns/javaee
                      http://java.sun.com/xml/ns/javaee/web-app_3_0.xsd"
  version="3.0">

  <display-name>Demo for jee servlet hello world config</display-name>

</web-app>

The web.xml for this JEE servlet is almost empty. There are no <servlet> and <servlet-mapping> entries as needed for a plain servlet container (like Tomcat) deployment, because a JEE server can automatically register servlets annotated with @WebServlet.


5. Define Java class


We have only one class to write, HelloServlet.java. If the src/main/java directory does not exist, create it by hand.

package com.shengwang.demo;

import java.io.IOException;
import java.io.PrintWriter;
import javax.servlet.ServletException;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

@WebServlet(name = "helloServlet", urlPatterns = { "/hello" })
public class HelloServlet extends HttpServlet {
  private static final long serialVersionUID = 1L;

  @Override
  protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws ServletException, IOException {
    PrintWriter out = resp.getWriter();
    out.printf("%s", "hello world from a JEE servlet");
    out.flush();
  }
}

This class uses the JEE annotation @WebServlet to provide an HTTP service at URL localhost:8080/context_name/urlPattern. context_name is set in pom.xml by <finalName>; urlPattern is the urlPattern set in the Java code. So for this demo the URL is http://localhost:8080/jee-servlet-demo/hello


All done. Now this 3-file project hierarchy looks like:


image


6. Run


Use the maven command line to auto-deploy our servlet to the local GlassFish server.

mvn clean verify cargo:run

The console output looks like below; the cargo maven plugin will start the server and deploy automatically.


image


Now access our hello-world-level JEE servlet in a browser.


image


Now if you open the GlassFish admin console, you can see our demo is correctly deployed. (Login required; for the cargo plugin the default user is 'admin' and the default password is 'adminadmin'.)


image

Sunday, November 15, 2015

Low-level synchronization in Java

Low-level synchronization is the mechanism used before the package java.util.concurrent was introduced in Java 1.5. You may never use it, since modern Java provides higher-level sync mechanisms, but understanding low-level sync helps you understand high-level counterparts like java.util.concurrent.locks.Condition, which not only duplicates the low-level functionality but also provides more flexibility.

The low-level sync mechanism requires 2 things:

  • keyword synchronized
  • methods wait/notify(notifyAll)

BTW, you may notice that the wait()/notify() methods are built into every Java Object.

image

In this article, we build a simple producer/consumer model. Both work on an ArrayList, but the consumer thread blocks if the list is empty. As soon as a producer thread puts an Integer into the list, the consumer can resume from blocking and continue. (In reality, java.util.concurrent.BlockingQueue should be used instead of writing your own.)

0. How it works

In low-level sync, a lock can be acquired from every object. We call it a monitor lock; the monitor, technically speaking, is the object whose lock you are acquiring. In this demo an ArrayList instance (called queue, see the Java definition below) is the monitor we acquire the lock from.

The keyword synchronized(...) is used to acquire locks. Although it looks like a method, it's a keyword (you can NOT Ctrl+click into its implementation like you can for a method of an Object). A synchronized block is always based on an object instance, the monitor, which is the parameter of synchronized(...).

wait() and notify() can only be called after the lock is acquired, which means these methods can only be called within a synchronized block.

When wait() is invoked, the lock is immediately released and the thread suspends. You may think of it as "the thread steps back from running and enters some kind of waiting room". The thread stays in the "waiting room" until notify() is called; then the thread tries to acquire the lock again. After the lock is re-acquired, the wait() method returns.

When notify() is invoked, "one of the waiting threads gets out of the waiting room"; normally this thread will soon get CPU time and run again. notify() doesn't release the lock, so it is normally placed just before the end of the synchronized block; when the synchronized block ends, the lock is released. If there are multiple threads in the "waiting room", which one gets out is random. notifyAll() makes all waiting threads leave the waiting room, but only one of them can get the lock, so these threads will run one by one; the running order is not guaranteed.

1. Demo

There are 3 classes: a consumer, a producer and a main class. First, define the MyProducer class.

class MyProducer implements Runnable {
  private List<Integer> queue;

  public MyProducer(List<Integer> queue) {
    this.queue = queue;
  }

  public void put(Integer e) throws InterruptedException {
    synchronized (queue) {
      System.out.printf("put %d to queue\n", e);
      queue.add(e);
      queue.notify();
    }
  }

  @Override
  public void run() {
    try {
      put(new Integer((int) (Math.random() * 10))); // random 0 ~ 9
    } catch (InterruptedException e) {
      e.printStackTrace();
    }
  }
}

The producer class implements the Runnable interface and just puts a random number into the list. The instance variable queue is passed in by the constructor; the lock is acquired on it by calling synchronized(queue). notify() is invoked to tell any thread that may be staying in the "waiting room" that the queue is not empty anymore.


Second, define the MyConsumer class.

class MyConsumer implements Runnable {
  private List<Integer> queue;

  public MyConsumer(List<Integer> queue) {
    this.queue = queue;
  }

  public Integer take() throws InterruptedException {
    synchronized (queue) {
      while (queue.isEmpty()) {
        queue.wait();
      }
      return queue.remove(0);
    }
  }

  @Override
  public void run() {
    try {
      System.out.printf("Try to take from queue\n");
      System.out.printf("take %d from queue\n\n", take());
    } catch (InterruptedException e) {
      e.printStackTrace();
    }
  }
}

The consumer tries to take an element out of the list. If the list is empty, the consumer thread waits.


Finally the main class.

public class MyArrayBlockingQueue {
  static private List<Integer> queue = new ArrayList<>();

  public static void main(String[] args) throws InterruptedException {
    MyProducer pr = new MyProducer(queue);
    MyConsumer cr = new MyConsumer(queue);

    Thread p1 = new Thread(pr);
    Thread p2 = new Thread(pr);

    Thread c1 = new Thread(cr);
    Thread c2 = new Thread(cr);

    p1.start();
    Thread.sleep(1000);
    c1.start();
    Thread.sleep(1000);
    c2.start();
    Thread.sleep(5000);
    p2.start();
  }
}

The main class creates 2 producer threads and 2 consumer threads. The first producer thread p1 puts an integer into the list and the first consumer thread c1 takes it out. Then the second consumer c2 tries to take from the empty list; it blocks until 5 seconds later, when the second producer p2 puts an integer into the list.


2. What's more


Now you should understand how the low-level synchronization mechanism works and the usage of the wait()/notify() methods of every object. In practice, you should use high-level sync mechanisms like Lock, Condition and the many other powerful tools provided in the packages java.util.concurrent and java.util.concurrent.locks.
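As a taste of the high-level counterpart, here is a sketch of the same blocking put/take rewritten with Lock and Condition (the class name ConditionQueue is made up for this illustration): lock()/unlock() replaces the synchronized block, await() replaces wait(), and signal() replaces notify().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

public class ConditionQueue {
    private final List<Integer> queue = new ArrayList<>();
    private final Lock lock = new ReentrantLock();
    private final Condition notEmpty = lock.newCondition();

    public void put(Integer e) {
        lock.lock();               // replaces entering synchronized(queue)
        try {
            queue.add(e);
            notEmpty.signal();     // counterpart of queue.notify()
        } finally {
            lock.unlock();         // replaces leaving synchronized(queue)
        }
    }

    public Integer take() throws InterruptedException {
        lock.lock();
        try {
            while (queue.isEmpty()) {
                notEmpty.await();  // counterpart of queue.wait()
            }
            return queue.remove(0);
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        final ConditionQueue q = new ConditionQueue();
        new Thread(new Runnable() {
            public void run() { q.put(42); }
        }).start();
        System.out.println(q.take()); // blocks until the producer thread puts 42
    }
}
```

One advantage over the low-level style is that a single Lock can have several Conditions (e.g. notEmpty and notFull), so you can wake up exactly the kind of waiter you mean to.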

Thursday, November 12, 2015

Understand Glassfish JDBC configuration by inspecting JDBC DataSource API

When configuring a DataSource on the JEE container GlassFish, there are 2 things to set up: JDBC Resources and JDBC Connection Pools.

image

What's the difference, and why are there two items? What's the relation between them? We can answer that by inspecting the Java JDBC API, mainly the package javax.sql.

1. How to connect to DB

To work with a DB in Java, the application must have a connection (java.sql.Connection) to the database. There are 2 ways to get a connection in JDBC:

  • Method 1: the static method DriverManager.getConnection(url). This is a very old way, still valid, but only used in small standalone applications; it doesn't support pooling.
  • Method 2: the DataSource interface. Recommended. In most cases it works with a JNDI lookup to get one from a container.

2. The hierarchy of DataSource

 

jdbc

There are some points to notice about this diagram:

They are all interfaces. No concrete classes at all.

There are 3 interfaces related to data source, DataSource/ConnectionPoolDataSource/XADataSource.

Application Java code will ONLY come into contact with the green part: DataSource and Connection.

A ConnectionPoolDataSource instance can create PooledConnection instances (pooled... you know, the name tells all); an XADataSource can create XAConnection instances (for distributed transactions).

ConnectionPoolDataSource, although it may sound like one, is not a child of DataSource. In other words, a ConnectionPoolDataSource instance can't simply be cast to a DataSource. So inside a DataSource implementation, the real work is somehow delegated to a ConnectionPoolDataSource/XADataSource.

The same thing applies to Connection: Connection has nothing to do with PooledConnection, although they look similar. No casting, but delegation.
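These "no casting" claims can be checked with a couple of lines against the plain JDK (the class name JdbcHierarchyCheck is just for this sketch):

```java
import javax.sql.ConnectionPoolDataSource;
import javax.sql.DataSource;
import javax.sql.PooledConnection;
import java.sql.Connection;

public class JdbcHierarchyCheck {
    public static void main(String[] args) {
        // ConnectionPoolDataSource is NOT a subtype of DataSource
        System.out.println(DataSource.class.isAssignableFrom(ConnectionPoolDataSource.class)); // false
        // Connection is likewise unrelated to PooledConnection
        System.out.println(Connection.class.isAssignableFrom(PooledConnection.class)); // false
    }
}
```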

Up to now you should understand the relationship between JDBC Resources and JDBC Connection Pools in the GlassFish administration GUI. Think of a JDBC Resource as a DataSource and a JDBC Connection Pool as a ConnectionPoolDataSource.

3. Using MySQL in Glassfish

Let's use MySQL's official developer guide on "Using Connector/J with GlassFish" as verification. In that document, the configuration has 2 steps:

  • Step 1: create a Connection Pool, filling in information like the DB's IP address, username and password used to make "real" connections to the database.
  • Step 2: create a JDBC Resource. Choose a pool created in step 1 and set a JNDI name for it. We can say the JDBC Resource HAS-A pool.

These steps pretty much demonstrate the hierarchy of DataSource in the JDBC API; once that is understood, the configuration steps make sense and are easy to remember.

Tuesday, November 10, 2015

Break down class "Files" of java 7 NIO.2

Since Java 7, file-related I/O has been redesigned. The following 3 classes/interfaces are key for file-related operations: Path, Paths, and Files. They are all in the package java.nio.file. Among these 3, Files is the most fundamental one.

0. Brief about Path and Paths

Paths is a concrete class which only has the static method get(...) to create Path instances. So Paths is just a factory class whose only usage is to get a Path instance from an input String or URI passed to the static method get(...).

Path is an interface designed to replace the old-style java.io.File class. The first thing to notice is that Path is not a class but an interface, so you can't create an instance with the keyword new; that's why Paths was introduced to do the instantiation. The second thing to know about Path is that its methods do no real I/O, just string manipulation; no method of Path touches anything on the disk. For example, the method to normalize the path "/usr/local/../bin/../tmp" to "/usr/tmp" is located in Path.
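That normalization example can be run directly; normalize() resolves the '..' components purely on the string level, so the path doesn't even have to exist on disk (output shown for a Unix-style file system):

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class NormalizeDemo {
    public static void main(String[] args) {
        Path messy = Paths.get("/usr/local/../bin/../tmp");
        // pure string manipulation: no disk access, the path need not exist
        System.out.println(messy.normalize()); // /usr/tmp on Unix
    }
}
```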

Then who does the "real" I/O operations? The class Files — this beauty has it all!

1. Anatomy of class Files

Files is a concrete class that has only static methods, and plenty of them. Almost all methods need a Path instance as their first parameter.

java_nio_file

There seems to be a lot of stuff, but take it easy; let's go through it step by step. The functions can be categorized into 4 groups. The diagram contains the most used methods of Files, but not all of them, and the method signatures are simplified for easy understanding without bothering with too many details.

  • group 1, basic manipulation. Create/Delete/Rename/Move a file or directory on the file system.
  • group 2, go through directory. 2 methods used to list directory contents without and with subdirectories respectively.
  • group 3, read/write content of file.
  • group 4. get/set file related attributes (the rest parts on the above diagram with pink/blue/green colors)

1.1 Basic I/O manipulation

The methods in this group are pretty self-explanatory. By utilizing them you should be able to solve the following:

  • How to create/copy/rename/move/delete a single file?
  • How to create/copy/delete empty directory, rename/move non-empty directory?

There is no single method that can copy or delete a non-empty directory; you have to do that with the methods in group 2.

1.2 Go through directory

There are 2 methods in this group. Method newDirectoryStream() works just like the command-line 'ls' or 'dir': it does not recurse into subdirectories. On the contrary, walkFileTree() recursively goes through every subdirectory.

How to list a directory, only one tier?

public static void listDirectory(String dir) throws IOException {
  Path path = Paths.get(dir);
  DirectoryStream<Path> dirStream = Files.newDirectoryStream(path, "*.java");

  for (Path p : dirStream) {
    // do something with every file/subdir in this directory
    System.out.println(p.getFileName());
  }
}

How to go through a directory recursively, including all subdirectories?

public static void goThroughDirectoryRecursively(String dir) throws IOException {
  Path path = Paths.get(dir);
  Files.walkFileTree(path, new SimpleFileVisitor<Path>() {
    @Override
    public FileVisitResult visitFile(Path path, BasicFileAttributes attrs) throws IOException {
      System.out.println(path); // print out name of every file/directory
      return FileVisitResult.CONTINUE;
    }
  });
}

The walkFileTree() usage is a standard incarnation of "Visitor" design pattern.
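The same visitor style also fills the gap noted in group 1: deleting a non-empty directory. A sketch (the method name deleteRecursively is made up here) could look like:

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

public class DeleteRecursivelyDemo {
    // delete a non-empty directory by visiting files first, directories afterwards
    public static void deleteRecursively(String dir) throws IOException {
        Files.walkFileTree(Paths.get(dir), new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) throws IOException {
                Files.delete(file);
                return FileVisitResult.CONTINUE;
            }
            @Override
            public FileVisitResult postVisitDirectory(Path d, IOException exc) throws IOException {
                Files.delete(d); // the directory is empty by the time we get here
                return FileVisitResult.CONTINUE;
            }
        });
    }

    public static void main(String[] args) throws IOException {
        // build a small throw-away tree, then remove it
        Path root = Files.createTempDirectory("del-demo");
        Files.createDirectories(root.resolve("sub"));
        Files.createFile(root.resolve("sub").resolve("a.txt"));
        deleteRecursively(root.toString());
        System.out.println(Files.exists(root)); // false
    }
}
```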


1.3 Read/write file contents


This group is for reading or writing the contents of a file. Use Reader/Writer for text files and InputStream/OutputStream for binary files. There are also helper methods that read all the content at once, for small files.

public static void readSingleFile(String fileName) throws IOException {
  Path path = Paths.get(fileName);
  BufferedReader reader = Files.newBufferedReader(path, StandardCharsets.UTF_8);

  // print the file content to screen
  String line;
  while ((line = reader.readLine()) != null) {
    System.out.println(line);
  }
  reader.close();
}
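The "read all at once" helpers mentioned above are readAllLines() and readAllBytes(). A small sketch using a throw-away temp file:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

public class ReadAllDemo {
    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("read-all-demo", ".txt");
        Files.write(tmp, Arrays.asList("hello", "world"), StandardCharsets.UTF_8);

        // whole file as a list of lines (for text) or a byte array (for binary)
        List<String> lines = Files.readAllLines(tmp, StandardCharsets.UTF_8);
        byte[] bytes = Files.readAllBytes(tmp);

        System.out.println(lines); // [hello, world]
        System.out.println(bytes.length); // 12 on Unix: each line is followed by '\n'
        Files.delete(tmp);
    }
}
```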

1.4 Get/set file attributes


Files and directories on a file system can have many attributes. Some are basic attributes like creation/access/modification time. Some attributes are specific to Windows, like whether the file is hidden; some are specific to Unix, like whether the file is writable by other users.


There are 3 ways to get/set file attributes:

  • Use simple shortcut static methods provided by class Files to test and set (pink on the diagram)
  • Use attribute-related classes like XXXFileAttributes and XXXFileAttributeView (blue on the diagram, see explanation below)
  • Use the powerful shortcut static methods getAttribute()/setAttribute() (green on the diagram, see below)

1.4.1 simple shortcut


The simple shortcut methods are... simple. Yes, they are straightforward. But the simple shortcuts don't cover all attributes, just the most used ones. You may also notice there are fewer methods for set than for get (there is a 'hole' in the upper-right corner of the diagram). Under the hood, the shortcut methods just use the attribute-related classes we will see below.


1.4.2 attributes related classes


Check the blue part of the diagram. The left 3, named like XXXFileAttributes, are read-only. The right 3, named like XXXFileAttributeView, are used for file attribute modification. The XXX in the class names is called the view-name; we'll see the usage of view-names very soon.


Here is a code snippet to read file attributes on Windows.

// test if a file is read-only on Windows
public static boolean isFileReadOnly(String fileName) throws IOException {
  Path path = Paths.get(fileName);
  DosFileAttributes dosAttr = Files.readAttributes(path, DosFileAttributes.class);
  return dosAttr.isReadOnly();
}

Here is a code snippet to modify file attributes on Linux.

// set file permission to "rw-r--r--" on Linux
public static void modifyFilePermission(String fileName) throws IOException {
  Path path = Paths.get(fileName);
  PosixFileAttributeView posixAttrView = Files.getFileAttributeView(path, PosixFileAttributeView.class);
  posixAttrView.setPermissions(PosixFilePermissions.fromString("rw-r--r--"));
}

1.4.3 powerful shortcut


Unlike the simple shortcuts, the powerful shortcuts are only a pair of methods, getAttribute(path, attributeString...) and setAttribute(path, attributeString...), but they can cover all file attributes. The format of the string parameter attributeString is [view-name:]attribute-name.

  • The default value for view-name is basic
  • The attribute-name is in lower camel case
  • Type casting is needed because get/set only return or accept Object instances as values
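The set side works the same way. A sketch of the setAttribute()/getAttribute() pair using the default basic view on a throw-away temp file:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;

public class AttributeDemo {
    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("attr-demo", ".txt");

        // view-name omitted, so "basic:" is assumed
        Files.setAttribute(tmp, "lastModifiedTime", FileTime.fromMillis(0));

        // get/set deal in Object, hence the cast
        FileTime t = (FileTime) Files.getAttribute(tmp, "basic:lastModifiedTime");
        System.out.println(t.toMillis()); // 0

        Files.delete(tmp);
    }
}
```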

We had a code snippet isFileReadOnly(...) above to test whether a file is read-only on Windows. Here is the equivalent.

// test if a file is read-only on Windows
public static boolean isFileReadOnly(String fileName) throws IOException {
  Path path = Paths.get(fileName);
  return (boolean) Files.getAttribute(path, "dos:readonly");
}

2. Recap


Getting a clear view of the class Files gives you a good understanding of the file-related I/O functions in Java 7+ NIO.2. Again, let's review the 4 groups:

  • group 1, basic manipulation. Create/Delete/Rename/Move a file or directory on the file system.
  • group 2, go through directory. 2 methods used to list directory contents without and with subdirectories respectively.
  • group 3, read/write content of file.
  • group 4. get/set file related attributes (the rest parts on the above diagram with pink/blue/green colors)

How to understand <? super Child> in Java Generics

In Java, polymorphism does NOT apply to generic types. So suppose class Child extends class Parent; then of the following 2 lines, only the first passes compilation.

Parent var = new Child();  // compiler is happy with this
ArrayList<Parent> myList = new ArrayList<Child>(); // compilation error

The way to bring the idea of inheritance into Java generic types is using <? extends Parent> or <? super Child>. In this article we'll see how to understand the meaning of syntax <? super Child> in Java generics. 


Check another article on how to understand syntax <? extends Parent> in Java generics.


0. Clarify concept

interface Animal { }
class Dog implements Animal { }
class Cat implements Animal { }

public void test() {
  // make sure you understand these 2 lines
  List<Animal> pList = new ArrayList<Animal>();
  pList.add(new Dog()); // It's fine
}

There is a List<Animal> variable pList. Can a Dog instance be added to this list? The answer is yes, because a Dog IS-A Animal.


1. Basic meaning


<? super Child> refers to any class that sees class Child as its descendant.


<? super Child> can be used for variable definitions and method parameters, but is mostly used for the latter.


2. Make a collection (kind of) write-only


If the generic syntax <? super Child> is used in a method parameter definition, then the parameter is (kind of) write-only. Usually it's used with a collection parameter.


Why? Let's imagine you have a method addDogToList defined like:

void addDogToList(List<? super Dog> myList) {
  myList.add(new Dog()); // designed to add Dog to the collection

  Cat obj = (Cat) myList.get(0); // Compiles OK, but risky!
                                 // Never explicitly cast types
                                 // when using generics
}

addDogToList has a parameter myList of type List<? super Dog>. So the following usages are both correct, since a Dog instance can be added to either a Dog list or an Animal list (see chapter 0, clarify concept, above):

addDogToList(new ArrayList<Dog>());
addDogToList(new ArrayList<Animal>());

We just explained why "write" to that list is OK, but why "write-only"? Technically speaking, you can read elements from a collection marked by <? super Dog>, but the return type is Object, which means you can cast it to any class you want and the compiler will let you pass anyway. Like what we did above, casting an element to type Cat; at runtime that normally means disaster. Furthermore, the whole point of generics is to eliminate type casting for the sake of type safety. That's why we call it "kind of write-only" (you can read, but don't).


3. Recap


When you use a collection variable in <? super Child> style, it means: "Hey, I'm going to add Child instances into this collection; just make sure the argument passed in can hold new Child instances."
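The JDK itself uses exactly this pattern. Collections.copy(List<? super T> dest, List<? extends T> src) writes into dest and only reads from src — the classic "PECS" (producer extends, consumer super) rule:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class PecsDemo {
    public static void main(String[] args) {
        List<Number> dest = new ArrayList<Number>(Arrays.asList(0, 0, 0));
        List<Integer> src = Arrays.asList(1, 2, 3);

        // dest is the <? super Integer> side: Integers are written into it
        // src is the <? extends Number> side: elements are only read from it
        Collections.copy(dest, src);

        System.out.println(dest); // [1, 2, 3]
    }
}
```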

Monday, November 9, 2015

How to understand <? extends Parent> in Java Generics

In Java, polymorphism does NOT apply to generic types. So suppose class Child extends class Parent; then of the following 2 lines, only the first passes compilation.

Parent var = new Child();  // compiler is happy with this
ArrayList<Parent> myList = new ArrayList<Child>(); // compilation error

The way to bring the idea of inheritance into Java generic types is using <? extends Parent> or <? super Child>. In this article we'll see how to understand the meaning of syntax <? extends Parent> in Java generics. 


Check another article on how to understand syntax <? super Child> in Java generics.


1. Basic meaning


<? extends Parent> means any class that sees class Parent as its ancestor. Parent can be either a class or an interface (yes, an interface is OK even though the keyword extends is used).


<? extends Parent> can be used for variable definitions and method parameters.


2. Make a collection read-only


If the generic syntax <? extends Parent> is used in a method parameter definition, then the parameter is read-only. Usually it's used with a collection parameter, and that collection is read-only inside the method.


Why? Let's imagine you have a method printName defined like:

interface Animal {
  String getName();
}
class Dog implements Animal { /* omit getName implementation */ }
class Cat implements Animal { /* omit getName implementation */ }

void printName(List<? extends Animal> animals) {
  // Good to read from the list
  for (Animal animal : animals) {
    System.out.println(animal.getName());
  }

  // Error when writing to the list, compilation fails
  animals.add(new Dog());
}

printName has a parameter animals of type List<? extends Animal>. So the following usages are both correct, since Dog and Cat both implement Animal:

printName(new ArrayList<Dog>());
printName(new ArrayList<Cat>());

Now let's answer the question about why read-only. In method printName, every element out of the list is guaranteed to be IS-A type Animal; that's why reading from the list is OK. But when trying to add a new element, the compiler has no idea whether the input list animals holds Dogs or Cats, because anyone can call printName with either a Dog list or a Cat list. This information is unknown to the compiler, so all the Java compiler can do is fail there, to avoid a severe mistake like adding a Dog instance into a Cat list.


3. Recap


When you use a collection variable in <? extends Parent> style, it means: "Hey, I'm going to use elements of this collection, and every one of them fulfills the IS-A type Parent requirement. But I promise to never ever add anything back into this collection."
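A concrete use of the read-only style: a method that sums any list of numbers, whether it holds Integer, Double or any other Number subtype (the method name sumAll is made up for this sketch):

```java
import java.util.Arrays;
import java.util.List;

public class ExtendsDemo {
    // reads from the list, never writes to it
    static double sumAll(List<? extends Number> nums) {
        double total = 0;
        for (Number n : nums) {
            total += n.doubleValue();
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumAll(Arrays.asList(1, 2, 3)));  // 6.0
        System.out.println(sumAll(Arrays.asList(1.5, 2.5))); // 4.0
    }
}
```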

Thursday, November 5, 2015

Glob in Java file related match

Globs are not regular expressions. Globs are simpler and older than regular expressions. For historical reasons, globs are mostly used for file name or file path filtering.

The Java 7 NIO package has 2 places where globs appear as file name or file path filters. One is when you create a PathMatcher to test whether a java.nio.file.Path instance matches a pattern:

Path path = Paths.get("abc.java");
PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:*.java");
boolean isJavaFile = matcher.matches(path); // true

The other is when you create a DirectoryStream to iterate over all files and subdirectories (not including files under subdirectories, just like the commands 'dir' or 'ls'):

Path dir = Paths.get("/tmp");
try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, "[vt]*")) {
  for (Path path : stream) {
    System.out.println(path.getFileName()); // only files starting with 'v' or 't'
  }
}

Here are the rules for globs:


  • * matches any characters, but does not cross a directory boundary
  • ** matches any characters, including across directory boundaries
  • ? matches exactly ONE character
  • [] same as in regular expressions, e.g. [0-9] matches any ONE digit
  • {} matches a collection of patterns separated by commas ','. For example, {A*,b} means either a string starting with 'A' or the single char 'b'

More examples:

path                      glob                  match result
/tmp/src/main/Demo.java   *.java                false
/tmp/src/main/Demo.java   **.java               true
/tmp/src/main/Demo.java   /tmp/*/*.java         false
/tmp/src/main/Demo.java   /tmp/**/*.java        true
/tmp/src/main/Demo.java   /tmp/**/[dD]*.java    true
/tmp/src/main/Demo.java   /tmp/**/[aA]*.java    false
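The table can be verified with PathMatcher directly; here is a sketch checking the first four rows (the helper method matches is made up for this illustration):

```java
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.Paths;

public class GlobDemo {
    static boolean matches(String glob, String pathName) {
        FileSystem fs = FileSystems.getDefault();
        Path path = Paths.get(pathName);
        return fs.getPathMatcher("glob:" + glob).matches(path);
    }

    public static void main(String[] args) {
        String p = "/tmp/src/main/Demo.java";
        System.out.println(matches("*.java", p));         // false: * stops at '/'
        System.out.println(matches("**.java", p));        // true: ** crosses '/'
        System.out.println(matches("/tmp/*/*.java", p));  // false: one '*' per level
        System.out.println(matches("/tmp/**/*.java", p)); // true
    }
}
```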

Tuesday, November 3, 2015

A brief in Java string format

When formatting a Java String using System.out.printf or System.out.format, the syntax of the format specifier is:

%[arg_index$][flags][width][.precision]<conversion_char>

Here is more explanation:
  • arg_index    An integer with suffix '$'. Begins with 1, not 0. The position of the argument.
  • flags
    - Left-align; default is right-align
    + Include +/- before a number
    0 Pad a number with zeros
    , Use comma to separate groups of digits
    ( Enclose negative numbers in parentheses; '-' will not display
  • width  The width of the printed String, very useful to get table-like output
  • precision  For float/double, specifies the digits after '.'
  • conversion_char
    b boolean
    c char
    d integer
    f float/double
    s string

Below is a simple example for demonstration.

System.out.printf("<%-3s><%6s><%10s><%10s>\n", "Id", "Gender", "Name", "Balance");
System.out.printf("<%3$03d><%2$6s><%1$10s><%4$+10.2f>\n", "Tom", "M", 1, 688.972); // arg_index
System.out.printf("<%03d><%6s><%10s><%+10.2f>\n", 2, "M", "Jerry", -12.345);
System.out.printf("<%03d><%6s><%10s><%(10.2f>\n", 3, "F", "Rose", -12.345);

The "<" and ">" are added to make the borders visible. The outputs are:

<Id ><Gender><      Name><   Balance>
<001><     M><       Tom><   +688.97>
<002><     M><     Jerry><    -12.35>
<003><     F><      Rose><   (12.35)>

Friday, October 30, 2015

How to use embedded Java DB (Derby) in maven project

For now, Java DB is actually just Apache Derby under a different name; in the rest of this article we will call it Derby. It comes with the JDK installation. (Although what's normally used in a maven project is not the same binary installed in the local JDK directory.)

Using embedded Java DB means the database runs in the same JVM as your application. The Java DB engine actually starts when you try to connect to it via JDBC; when the application exits, the database exits too. If you choose to run Java DB totally in memory, the data is gone when the JVM stops. Alternatively, you can store the data on the local file system to keep it across multiple runs.

Java DB (Derby) is mostly used for convenience during development: no external database is needed even if your code needs to play with a relational database.

0. What you need

  • JDK 6+ (JDK 7 in this demo)
  • Maven 3.2 +

1. POM file

There's only one dependency needed to use the Derby database.

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <groupId>com.shengwang.demo</groupId>
  <artifactId>javadb-derby-embedded-basic</artifactId>
  <version>1.0</version>
  <packaging>jar</packaging>

  <name>javadb-derby-embedded-basic</name>
  <url>http://maven.apache.org</url>

  <dependencies>
    <dependency>
      <groupId>org.apache.derby</groupId>
      <artifactId>derby</artifactId>
      <version>10.8.3.0</version>
    </dependency>
  </dependencies>

  <!-- Use Java 1.7 -->
  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-compiler-plugin</artifactId>
        <version>2.5.1</version>
        <configuration>
          <source>1.7</source>
          <target>1.7</target>
        </configuration>
      </plugin>
    </plugins>
  </build>
</project>

Since the version of Java DB that comes with JDK 7 is 10.8.3.x, we also use 10.8.3.0 in our maven project. If you just use the in-memory database (see below), the version used in maven doesn't really matter. But if the database files are stored on the file system and you want to check the database content after the application finishes, you'd better use the same version as the JDK's, so the 'ij' tool in $JAVA_HOME/db/bin can open the database without version conflicts.


2. When the embedded database starts


The database starts when your Java code tries to connect to it using standard JDBC. How Derby works depends on the way you connect to it, in other words, on the connection URL. Suppose we need to connect to a database named 'demo'.


In-memory database, url looks like: jdbc:derby:memory:demo;create=true


'demo' is the database name and can be any string you choose; 'memory' is a keyword telling Derby to go into all-in-memory mode.


File-based database, url looks like: jdbc:derby:c:\Users\shengw\MyDB\demo;create=true


'c:\Users\shengw\MyDB\demo' is the directory where the database files are saved on the local file system. (In Java source it's actually jdbc:derby:c:\\Users\\shengw\\MyDB\\demo;create=true because of String escaping.)


'create=true' is a Derby connection attribute that creates the database if it doesn't exist. For an in-memory database, this attribute is mandatory.


3. A complete example


This is a complete hello-world-level example using an embedded Derby database in a Maven project. HelloJavaDb.java is listed below.

package com.shengwang.demo;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class HelloJavaDb {
  Connection conn;

  public static void main(String[] args) throws SQLException {
    HelloJavaDb app = new HelloJavaDb();

    app.connectionToDerby();
    app.normalDbUsage();
  }

  public void connectionToDerby() throws SQLException {
    // -------------------------------------------
    // URL format is
    // jdbc:derby:<local directory to save data>
    // -------------------------------------------
    String dbUrl = "jdbc:derby:c:\\Users\\shengw\\MyDB\\demo;create=true";
    conn = DriverManager.getConnection(dbUrl);
  }

  public void normalDbUsage() throws SQLException {
    Statement stmt = conn.createStatement();

    // drop table
    // stmt.executeUpdate("Drop Table users");

    // create table
    stmt.executeUpdate("Create table users (id int primary key, name varchar(30))");

    // insert 2 rows
    stmt.executeUpdate("insert into users values (1,'tom')");
    stmt.executeUpdate("insert into users values (2,'peter')");

    // query
    ResultSet rs = stmt.executeQuery("SELECT * FROM users");

    // print out query result
    while (rs.next()) {
      System.out.printf("%d\t%s\n", rs.getInt("id"), rs.getString("name"));
    }
  }
}

The demo uses the Derby database: it creates a table 'users', inserts 2 rows into the table and prints the query result set. The whole Maven project hierarchy is:


(screenshot: Maven project hierarchy)



After running the HelloJavaDb example, you can verify the database contents, because we are not running Derby in all-in-memory mode. After the run, the database files appear on the local file system like this.


(screenshot: database files on the local file system)


If you connect to the database from the command line, you can see the 2 rows added by your Java code. 'ij' is a tool provided by Derby that works as a SQL client.


(screenshot: querying the 'users' table with ij)

Tuesday, October 27, 2015

How to use Java parallel Fork/Join framework - Hello world example

Since Java 7, the fork/join framework has been part of the Java API. The main difference between fork/join and other multithreading mechanisms like executors or thread pools is the focus: traditional multithreading focuses on "let every task have a chance to run simultaneously", while the fork/join framework focuses on "saturate the CPU and make full use of the hardware resources".

0. Why fork/join?

So what's the problem with traditional thread pools? Why was fork/join introduced when there were already many ways to work in parallel? The fork/join framework is normally used with a single task, which is BIG. Suppose the CPU has 4 cores. To make maximum use of the quad-core CPU, you don't want any core idle while the others are still busy. But if you split the big task into 4 subtasks and give one to each core using a traditional thread pool, then whenever a thread terminates, a core becomes idle while the remaining cores are still struggling. That's a waste of CPU resources. The fork/join framework prevents this from happening: every core keeps busy until the whole job is done. When a thread (CPU core) is idle or done with its workload, it tries to help other threads instead of just sitting there doing nothing. In the fork/join framework, this is called work stealing.

Work stealing is the key feature of the fork/join framework.
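Work stealing can actually be observed through ForkJoinPool's monitoring API. A small sketch (the class names StealCountDemo/BusyTask are made up for illustration) that splits a do-nothing task very finely, so queued subtasks are available to steal, and then reads getStealCount():

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

public class StealCountDemo {
  // Tiny task that keeps splitting, so idle threads have queued work to steal
  static class BusyTask extends RecursiveAction {
    final int n;
    BusyTask(int n) { this.n = n; }
    @Override protected void compute() {
      if (n <= 1) return;                 // small enough: nothing left to split
      BusyTask left = new BusyTask(n / 2);
      BusyTask right = new BusyTask(n - n / 2);
      left.fork();                        // queue one half for possible stealing
      right.compute();
      left.join();
    }
  }

  public static void main(String[] args) {
    ForkJoinPool pool = new ForkJoinPool();
    pool.invoke(new BusyTask(100_000));
    // getStealCount() reports how many queued tasks were taken by other threads
    System.out.println("steals >= 0: " + (pool.getStealCount() >= 0));
  }
}
```

On a single-core machine the steal count may stay 0; with multiple hardware threads it is usually well above zero.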

1. Working theory

In brief, the fork/join framework also uses a thread pool, java.util.concurrent.ForkJoinPool, but unlike a traditional thread pool, every thread in this pool has its own queue, and every thread can access the other threads' queues. The queue of each thread can be treated as a workload buffer.

1.1 What happens when a thread first gets a task

The fork/join framework starts in this way; suppose the first thread to get the task is called thread A:

step 1. When thread A gets a task, if it's small enough, it does the real calculation; if it's still big, the task is cut into 2 subtasks.

step 2. Thread A keeps working on one of the subtasks and puts the other into its own queue.

step 3. An idle thread, thread B, can take subtasks out of thread A's queue, which is called work stealing. Then on thread B, the same process repeats from step 1.

After a big task is submitted to one thread initially, it soon propagates to ALL threads of the fork/join thread pool. It's worth mentioning that thread A keeps cutting recursively: cut task -> queue 1/2 -> cut 1/2 -> queue 1/4 -> cut 1/4 -> queue 1/8 ... until the task is small enough. Recursion is also a feature of the fork/join framework.

By default the fork/join thread pool has exactly as many threads as the hardware threads your CPU can run simultaneously. For example, on a quad-core CPU with Hyper-Threading (2 threads on each physical core), the pool will have 4 * 2 = 8 threads. So after a task is given to the fork/join pool, all threads / all CPU cores will be occupied.
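This default can be checked in a couple of lines: the no-argument ForkJoinPool constructor uses Runtime.getRuntime().availableProcessors() as the parallelism (the class name PoolSizeDemo is just for illustration):

```java
import java.util.concurrent.ForkJoinPool;

public class PoolSizeDemo {
  public static void main(String[] args) {
    int hardwareThreads = Runtime.getRuntime().availableProcessors();
    ForkJoinPool pool = new ForkJoinPool(); // default constructor
    // Default parallelism equals the number of hardware threads the JVM sees
    System.out.println(pool.getParallelism() == hardwareThreads);
    pool.shutdown();
  }
}
```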

1.2 What happens when a thread finishes a subtask

If thread X gets a task and splits it into 2 subtasks, it puts one half into its queue and starts working on the other half. When the second half is done, it checks whether the first half is done:

  • If the first half is done, thread X can continue on to work stealing.
  • If the first half has been stolen and is being processed by another thread, thread X has to wait until that half finishes.
  • If the first half is still in the queue, thread X starts to process the first half itself, recursively: cut the first half, queue one 1/4 and work on the other 1/4.

2. Java API

At the API level: to put a subtask into the queue, call fork(); to process a piece of work, call compute(); to wait for the rest to finish, call join(). These 3 methods are the key methods of the fork/join framework, and they are also where the framework's name comes from.

Always call fork() before compute() and join(), so other threads have the chance to help share the workload.
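The fork -> compute -> join ordering can be sketched with a tiny summing task (illustrative code, not from the original post; the class names SumDemo/SumTask are made up):

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

public class SumDemo {
  // Sums the integers in the range [from, to) recursively
  static class SumTask extends RecursiveTask<Long> {
    final int from, to;
    SumTask(int from, int to) { this.from = from; this.to = to; }

    @Override protected Long compute() {
      if (to - from <= 1_000) {           // small enough: sum directly
        long s = 0;
        for (int i = from; i < to; i++) s += i;
        return s;
      }
      int mid = (from + to) / 2;
      SumTask left = new SumTask(from, mid);
      SumTask right = new SumTask(mid, to);
      left.fork();                        // 1. fork: queue one half for stealing
      long rightResult = right.compute(); // 2. compute: work on the other half
      long leftResult = left.join();      // 3. join: wait for (or take back) the first
      return leftResult + rightResult;
    }
  }

  public static void main(String[] args) {
    long sum = new ForkJoinPool().invoke(new SumTask(0, 100_000));
    System.out.println(sum); // sum of 0..99999 = 4999950000
  }
}
```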

In package java.util.concurrent, 4 classes are key to the fork/join framework.

  • ForkJoinPool - Thread pool for the fork/join framework. Implements the ExecutorService interface.
  • ForkJoinTask - Abstract class with the fork() and join() methods; parent class of the next 2 children.
  • RecursiveTask - Abstract class extending ForkJoinTask; its only abstract method is compute().
  • RecursiveAction - Abstract class extending ForkJoinTask; its only abstract method is compute().

The only difference between RecursiveTask and RecursiveAction is that RecursiveTask's compute() returns a value, while RecursiveAction's compute() doesn't. (Task has a return value, action doesn't.)
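As a minimal sketch of the "action" variant (the class names ActionDemo/DoubleTask are made up for illustration), a RecursiveAction mutates shared state in place instead of returning a value. Here invokeAll(), inherited from ForkJoinTask, is used as a convenience that forks and joins both halves:

```java
import java.util.Arrays;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

public class ActionDemo {
  // RecursiveAction: compute() returns nothing; the task doubles array items in place
  static class DoubleTask extends RecursiveAction {
    final long[] data;
    final int from, to; // [from, to)
    DoubleTask(long[] data, int from, int to) {
      this.data = data; this.from = from; this.to = to;
    }

    @Override protected void compute() {
      if (to - from <= 100) {             // small enough: do the real work
        for (int i = from; i < to; i++) data[i] *= 2;
      } else {
        int mid = (from + to) / 2;
        // invokeAll forks one subtask, computes the other, then joins both
        invokeAll(new DoubleTask(data, from, mid), new DoubleTask(data, mid, to));
      }
    }
  }

  public static void main(String[] args) {
    long[] data = new long[1_000];
    Arrays.fill(data, 3);
    new ForkJoinPool().invoke(new DoubleTask(data, 0, data.length));
    System.out.println(data[0] + " " + data[999]); // prints "6 6"
  }
}
```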

3. Demo

We have a big char array with 100M items. Every item in this array is one upper-case letter from A-Z. The application counts how many letter 'A's are in this big array. Using the fork/join framework, the array is divided into small ranges for each thread to go through. Let's first see the main class.

package com.shengwang.demo;

import java.util.concurrent.ForkJoinPool;

public class ForkJoinDemo {
  private static final int ARRAY_SIZE = 100_000_000;
  private static char[] letterArray = new char[ARRAY_SIZE];

  private static int countLetterUsingForkJoin(char key) {
    int total = 0;
    ForkJoinPool pool = new ForkJoinPool(); // create thread pool for fork/join
    CountLetterTask task = new CountLetterTask(key, letterArray, 0, ARRAY_SIZE);
    total = pool.invoke(task); // submit the task to fork/join pool

    pool.shutdown();
    return total;
  }

  public static void main(String[] args) {
    char key = 'A';
    // fill the big array with A-Z randomly
    for (int i = 0; i < ARRAY_SIZE; i++) {
      letterArray[i] = (char) (Math.random() * 26 + 65); // A-Z
    }

    int count = countLetterUsingForkJoin(key);
    System.out.printf("Using ForkJoin, found %d '%c'\n", count, key);
  }
}

The main class is simple: main() first fills the big array with random upper-case letters, then calls countLetterUsingForkJoin(), in which a ForkJoinPool is created and the task is submitted to it. After the whole task finishes and the final result is obtained, the pool shuts down and the result is returned. The task class CountLetterTask is the kernel of this demo and is shown below.

package com.shengwang.demo;

import java.util.concurrent.RecursiveTask;

class CountLetterTask extends RecursiveTask<Integer> {

  private static final long serialVersionUID = 1L;
  private static final int ACCEPTABLE_SIZE = 10_000;
  private char[] letterArray;
  private char key;
  private int start;
  private int stop;

  public CountLetterTask(char key, char[] letterArray, int start, int stop) {
    this.key = key;
    this.letterArray = letterArray;
    this.start = start;
    this.stop = stop;
  }

  @Override
  protected Integer compute() {
    int count = 0;
    int workLoadSize = stop - start;
    if (workLoadSize < ACCEPTABLE_SIZE) {
      // String threadName = Thread.currentThread().getName();
      // System.out.printf("Calculation [%d-%d] in Thread %s\n",start,stop,threadName);
      for (int i = start; i < stop; i++) {
        if (letterArray[i] == key)
          count++;
      }
    } else {
      int mid = start + workLoadSize / 2;
      CountLetterTask left = new CountLetterTask(key, letterArray, start, mid);
      CountLetterTask right = new CountLetterTask(key, letterArray, mid, stop);

      // fork (push to queue) -> compute -> join
      left.fork();
      int rightResult = right.compute();
      int leftResult = left.join();
      count = leftResult + rightResult;
    }
    return count;
  }
}

Let's go through class CountLetterTask. It extends RecursiveTask<Integer>, which means the final result of the task is an Integer. To avoid making a copy of the original big array, a reference to it is passed in as a constructor parameter. The current task's size is defined by the start (inclusive) and stop (exclusive) indexes into the array. The criterion for whether the current task is small enough is the constant ACCEPTABLE_SIZE: a subtask dealing with fewer than 10k items of the array is considered "small enough".


The most interesting part is the compute() method. It first checks whether the current task is small enough; if so, it does the real calculation. If not, the array range is divided into 2 parts: one task becomes two subtasks, each also a CountLetterTask instance. It puts the first part into the queue, then calls compute() on the second half. The task is recursively cut down until it is "small enough". Then join() makes sure the whole task is done. Remember, fork() has to run before compute() and join().


4. Run 


(screenshot: CPU usage while running the demo)


From the screenshot, CPU resources are fully used for the big task. (Since the task takes less than 30 ms to finish on my PC, the screenshot actually comes from an even bigger array run in a loop many times.)

About The Author

Has been a senior software developer and project manager for 10+ years. Dedicated himself to Alcatel-Lucent and China Telecom, delivering software solutions.
