Sunday, December 30, 2012

Apache Maven (II) - Project Object Model

This is my second post about Apache Maven. In my last post, Apache Maven (I), we saw how to install Maven and build a minimal project. Today, we are going to see what the Project Object Model (POM) is.

The POM is the descriptor of a Maven project. A project may consist of a single XML file (pom.xml) or of several, scattered around its directory tree. In each 'pom.xml', you provide all the information required to describe what the project is about.

In this post we will see only a part of the POM, but you can find a full guide with all the elements and their descriptions on the POM page of the official Maven website: http://maven.apache.org/pom.html

POM elements


The root element of 'pom.xml' is 'project':

<project xmlns="http://maven.apache.org/POM/4.0.0" 
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://maven.apache.org/POM/4.0.0                      
                           http://maven.apache.org/xsd/maven-4.0.0.xsd"> 
  <modelVersion>4.0.0</modelVersion> 
...

</project>

Don't worry about the attributes of the 'project' element. They define the namespace and the location of the schema used for validation. Just keep them as in the example.

The 'modelVersion' element defines which version of the project object model is being used. Version 4.0.0 is the one accepted by Maven 2 and 3.

Essential identity elements

There are three elements that identify a project: 'groupId', 'artifactId' and 'version'. Their values are used to name the output package (artifact) and to place it at a specific path in the repository. The local repository is usually at '~/.m2/repository'. From now on, we will refer to that path as '$M2_REPO'.
  • groupId: Generally, this element identifies the company / organization / group that creates the project. The 'groupId' can use "dot notation", but it is not required. We can find 'groupId's like "com.mybusiness.mydepartment" or just "myfantasticbusiness". All the artifacts of a 'groupId' are stored under a directory named after its value. If the value is a dotted string, Maven creates a subdirectory for each segment of the 'groupId'. For "com.mybusiness.mydepartment" you will find the project output under "$M2_REPO/com/mybusiness/mydepartment".
  • artifactId: It indicates the unique base name of the primary artifact generated by this project. An artifact could be a jar, war, ear, ... Its value is generally the name the project is known by. In the repository, each artifact gets a directory named after its 'artifactId' inside the directory of its 'groupId'.
  • version: It indicates the version of the artifact generated by the project. The repository can store multiple versions of the same artifact. For each version you will find a directory with the artifacts inside.
An artifact generated by Maven has the following name structure: <artifactId>-<version>.<extension> (for example, myfirstapp-1.0.jar).
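
As a small sketch of how the three identity elements map to a repository path, here is a fragment that reuses the sample coordinates from this post. The path in the comment assumes the default jar packaging and that the project has been installed into the local repository with 'mvn install'.

```xml
<!-- Identity elements of the sample project used in this post -->
<groupId>com.mycompany.mydepartment</groupId>
<artifactId>myfirstapp</artifactId>
<version>1.0</version>

<!-- Expected location after 'mvn install' (assuming jar packaging):
     $M2_REPO/com/mycompany/mydepartment/myfirstapp/1.0/myfirstapp-1.0.jar
     The POM itself is stored next to it as myfirstapp-1.0.pom -->
```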

If you are using the inheritance mechanism (we will see it below), you are not required to define the 'groupId' and 'version' elements explicitly.

Other basic elements

There are other elements that, although not required, are highly recommended.
  • packaging: It is the project's artifact type. Maven core provides the following types: pom, jar, maven-plugin, ejb, war, ear, rar and par. That list can be extended by some plugins. If it is omitted, Maven uses jar.
  • name: The display name used for the project.
  • url: The location where the project's site can be found.
  • description: A basic description of your project.
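
A minimal sketch of these elements, placed as direct children of 'project'. The name, URL and description below are made-up values for illustration only.

```xml
<!-- Optional but recommended descriptive elements (illustrative values) -->
<packaging>jar</packaging>
<name>My First App</name>
<url>http://www.example.com/myfirstapp</url>
<description>A sample application used to explore the POM.</description>
```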

Build process

There is another element that is very important in the POM. Although you are not required to declare it, you need to know what it is for and how to set it up. It is the 'build' element, where we can configure everything about the build process.

The 'build' element can be found in two different scopes. The first one is the project scope: 'build' is declared as a child of the 'project' element, and its settings apply to all projects/subprojects that 'pom.xml' covers. The other one is inside a profile (we will work with profiles in later posts). In your project you can define several profiles and activate their settings depending on some parameters. The 'build' element can be declared or redeclared in each 'profile' element to modify the behavior of the project depending on the profiles chosen at build time.

<project xmlns="http://maven.apache.org/POM/4.0.0"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
                      http://maven.apache.org/xsd/maven-4.0.0.xsd">
  
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.mycompany.mydepartment</groupId>
  <artifactId>myfirstapp</artifactId>
  <version>1.0</version>

  <build>
    <!-- Here you can set up your build element -->
  </build>

  <profiles>
    <profile>

      <build>
        <!-- Here you can set up your build element -->
      </build>
    </profile>
  </profiles>
</project>

Let's see the most frequently used elements:
  • defaultGoal: The goal to execute if none is given on the command line.
  • directory: The directory where Maven writes all the build output. The default value is '${basedir}/target', that is, a directory named 'target' inside the project directory.
  • finalName: The name of the bundled project. By default it is set to ${artifactId}-${version}.
  • filters: It defines a set of properties files (e.g. sample.properties), each containing a list of key=value pairs, that will be applied to the resources declared in the 'resources' element with 'filtering' set to true.
  • resources: This element contains a set of 'resource' elements that describe files associated with the project that do not contain code. Each 'resource' element selects a set of files through 'include' and 'exclude' elements, which use path patterns to determine which files of which directories are included and which are not.
  • plugins: This element contains a set of 'plugin' elements, each describing a Maven plugin to be used in the build process. There is a wide variety of plugins. Essentially, in each 'plugin' element you indicate which plugin to use, identifying it by its 'groupId', 'artifactId' and 'version'. Then you configure it (each plugin has its own configuration parameters) and optionally attach a goal that the plugin provides to a specific phase of the build life cycle (I will also talk about this in a following post).
  • pluginManagement: It also has a 'plugins' element as a child. The main difference between the 'plugins' element under 'build' and the one under 'pluginManagement' is that the configuration found in 'pluginManagement' is not executed directly by Maven. Each 'plugin' description in 'pluginManagement' is applied to the corresponding plugin declared in the 'plugins' list of 'build'. The advantage of configuring plugins through 'pluginManagement' is that all these descriptions can be inherited by child subprojects.
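
To make the list above more concrete, here is a sketch of a 'build' element that uses each of these children. The file paths, the patterns, the final name and the plugin version are assumptions chosen for illustration; check the documentation of each plugin for the parameters it really supports.

```xml
<build>
  <defaultGoal>install</defaultGoal>
  <directory>${basedir}/target</directory>
  <!-- Illustrative name: the bundle would be target/myfirstapp.jar -->
  <finalName>myfirstapp</finalName>

  <!-- Properties files whose key=value pairs feed resource filtering
       (hypothetical path) -->
  <filters>
    <filter>src/main/filters/sample.properties</filter>
  </filters>

  <resources>
    <resource>
      <directory>src/main/resources</directory>
      <filtering>true</filtering>
      <includes>
        <include>**/*.properties</include>
      </includes>
      <excludes>
        <exclude>**/*.bak</exclude>
      </excludes>
    </resource>
  </resources>

  <!-- Declares version and configuration, but does not run anything by itself -->
  <pluginManagement>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-compiler-plugin</artifactId>
        <version>2.5.1</version> <!-- assumed version -->
        <configuration>
          <source>1.6</source>
          <target>1.6</target>
        </configuration>
      </plugin>
    </plugins>
  </pluginManagement>

  <!-- Plugins actually used by this build; the settings above are applied here -->
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-compiler-plugin</artifactId>
    </plugin>
  </plugins>
</build>
```

Note that, with the 'includes' shown here, only the '.properties' files under 'src/main/resources' would be copied; drop the 'includes' element to copy everything in that directory.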

Reporting your project

Besides building your project, Maven can also publish information about it. When you execute the command 'mvn site', you are telling Maven to generate the project site with all its reports. There are a lot of plugins that provide extra information, such as generating javadoc, checking code style, looking for bugs, etc. You can set up your site generation using the 'reporting' element.

As with the 'build' element, you can use it in two different scopes. You can find it as a child of the 'project' element or inside a particular 'profile'.

<project xmlns="http://maven.apache.org/POM/4.0.0"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
                      http://maven.apache.org/xsd/maven-4.0.0.xsd">
  
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.mycompany.mydepartment</groupId>
  <artifactId>myfirstapp</artifactId>
  <version>1.0</version>

   ...

  <reporting>
    <!-- Here you can set up your reporting element -->
  </reporting>

  <profiles>
    <profile>

      <reporting>
        <!-- Here you can set up your reporting element -->
      </reporting>
    </profile>
  </profiles>
</project>
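
As a rough idea of what goes inside 'reporting', the following sketch adds the javadoc report to the generated site. The plugin version is an assumption; use whichever release is current for you.

```xml
<reporting>
  <plugins>
    <!-- Generates the API documentation as part of 'mvn site' -->
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-javadoc-plugin</artifactId>
      <version>2.9</version> <!-- assumed version -->
    </plugin>
  </plugins>
</reporting>
```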


We will look at this element in more detail when we talk about plugin configuration in the following posts.

Dependencies

It's time to introduce one of the most important features of Maven: a mechanism that solves the problem of managing and handling library dependencies in our project. I will explain it in more detail in following posts, but it is important to know that we can use a 'dependencies' element to include in our project all the libraries we need to compile, run or test our app.

<project>

    ... 

  <dependencies>
    <dependency>
       <groupId>org.apache.commons</groupId>
       <artifactId>commons-io</artifactId>
       <version>1.3.2</version>
    </dependency>
    <dependency>
       <groupId>org.apache.commons</groupId>
       <artifactId>commons-math3</artifactId>
       <version>3.0</version>
    </dependency>
  </dependencies>
</project> 

You must provide the 'groupId', 'artifactId' and 'version' of each 'dependency' element you need in your project. You can optionally provide the scope of the dependency (compile, test, runtime ...) and its type.
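
For instance, a dependency that is only needed to compile and run the tests could be declared like this. JUnit and its version are used here just as an illustration; 'jar' is the default type, so the 'type' element could be omitted.

```xml
<dependencies>
  <dependency>
    <groupId>junit</groupId>
    <artifactId>junit</artifactId>
    <version>4.11</version>
    <type>jar</type>
    <!-- Only on the classpath for tests -->
    <scope>test</scope>
  </dependency>
</dependencies>
```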

Maven handles all these dependencies for you. It downloads the jars from the remote repository, installs them in your local repository and, depending on the packaging type of your project, includes them in your output artifact.

The 'dependencies' element, as a direct child of the 'project' element, performs all these actions for you. But we can also find a 'dependencies' element inside a 'dependencyManagement' element. It allows the developer to centralize the configuration of dependencies (defining versions, scopes, ...) and to use them later in the project's 'dependencies' element.

The main advantage of this technique is that the information provided in 'dependencyManagement' can be inherited by other subprojects, so you do not have to repeat these parameters in each subproject file.
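
A minimal sketch of this idea, split in two fragments. The first one would live in the parent 'pom.xml' and fixes the version and scope; the second one would live in a subproject and only names the dependency. JUnit and its version are illustrative assumptions.

```xml
<!-- Parent pom.xml: declares how the dependency should be used,
     without adding it to any project yet -->
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>4.11</version>
      <scope>test</scope>
    </dependency>
  </dependencies>
</dependencyManagement>
```

```xml
<!-- Subproject pom.xml: version and scope come from the parent -->
<dependencies>
  <dependency>
    <groupId>junit</groupId>
    <artifactId>junit</artifactId>
  </dependency>
</dependencies>
```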

Inheritance

One of the concepts that reduces duplication in build management with Maven is inheritance between projects. In Maven, inheritance is easily defined in the POM. The concept is similar to inheritance in object-oriented languages: a "parent" project passes its configuration on to its children, as long as they declare that they inherit from it. The parent's packaging must be 'pom'. The elements that can be inherited by child projects are:
  • dependencies information
  • developers and contributors
  • reports lists
  • plugin lists
  • plugin executions with matching ids
  • plugin configuration
To make a project inherit from another one, you define a 'parent' element with the identification elements of the parent.
In the following sample, you can see how 'mysubprojectapp' inherits from 'myfirstapp'.

<project xmlns="http://maven.apache.org/POM/4.0.0"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
                      http://maven.apache.org/xsd/maven-4.0.0.xsd">
  
  <modelVersion>4.0.0</modelVersion>
  <artifactId>mysubprojectapp</artifactId>

  <parent>
    <groupId>com.mycompany.mydepartment</groupId>
    <artifactId>myfirstapp</artifactId>
    <version>1.0</version>
    <relativePath>../pom.xml</relativePath>
  </parent>

</project>

You will notice that the child project does not declare 'groupId' or 'version'. They are also inherited from the parent. If you want to override them, you just have to define them in the child.

In fact, inheritance is implicitly applied to every POM you write. All POMs inherit from the Super POM. All the default values and all the configuration that you don't write are taken from the Super POM. You can take a look at it at http://maven.apache.org/ref/3.0.4/maven-model-builder/super-pom.html

Aggregation

Maven provides a mechanism to divide a project into modules, which makes each part more self-contained and helps us achieve lower coupling in our project designs.
A project that uses aggregation is also known as a multimodule project. A project with 'pom' packaging can declare a list of 'module' elements inside a 'modules' element. Each 'module' value should be the relative path to the directory that contains the 'pom.xml' file of that module.

<project xmlns="http://maven.apache.org/POM/4.0.0"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
                      http://maven.apache.org/xsd/maven-4.0.0.xsd">
  
  <modelVersion>4.0.0</modelVersion>
  <artifactId>myfirstapp</artifactId>
  <groupId>com.mycompany.mydepartment</groupId>
  <packaging>pom</packaging>
  <version>1.0</version>


  <modules>
    <module>api</module>
    <module>impl</module>
    <module>app</module>
  </modules>

</project>

You don't have to worry about the order of the modules or about dependencies between them. As long as each POM declares its dependencies, Maven will work out which module to build first.

Usually, projects combine aggregation with inheritance to handle all their modules. It's easy to define versions and scopes in the parent and also use it to aggregate all the children. That way, you can run the build process on all your modules at once.
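
As a sketch of that combination, this could be the 'pom.xml' of the 'impl' module listed in the aggregator above. It inherits from 'myfirstapp' and depends on the 'api' module, which is how Maven would know that 'api' has to be built first. The artifactIds 'api' and 'impl' are assumptions that simply mirror the module directory names.

```xml
<project xmlns="http://maven.apache.org/POM/4.0.0"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
                      http://maven.apache.org/xsd/maven-4.0.0.xsd">

  <modelVersion>4.0.0</modelVersion>

  <!-- Inheritance: groupId and version come from the parent -->
  <parent>
    <groupId>com.mycompany.mydepartment</groupId>
    <artifactId>myfirstapp</artifactId>
    <version>1.0</version>
    <relativePath>../pom.xml</relativePath>
  </parent>

  <artifactId>impl</artifactId>

  <!-- Dependency on a sibling module: it determines the build order -->
  <dependencies>
    <dependency>
      <groupId>${project.groupId}</groupId>
      <artifactId>api</artifactId>
      <version>${project.version}</version>
    </dependency>
  </dependencies>

</project>
```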

It's also possible to have more than one level of inheritance and aggregation. You will often find a scenario like this:

my-project-pom (type: pom)
   my-module-1 (type: pom)
      my-module1-api (type: jar)
      my-module1-impl (type: jar)
      my-webapp1 (type: war)
   my-module-2 (type: pom)
      my-module2-api (type: jar)
      my-module2-impl (type: jar)
      my-webapp2 (type: war)

Other information elements

You can use more elements to describe the project and its environment. For instance: Organization, Licenses, Developers, Contributors, Repositories, Plugin Repositories, Distribution Management, SCM, Issue Management, Continuous Integration Management, Mailing Lists and Prerequisites.

We will cover some of these elements in upcoming posts to automate tasks such as releasing code, publishing artifacts, scheduling automatic builds, etc.
