Catalogue application deployment

The H3Africa catalogue is a Java web application built with the Spring Boot framework. Unlike plain Spring, Spring Boot reduces the configuration complexity that is a known drawback of the Spring framework.

Below we describe how to deploy the catalogue in an Ubuntu virtual machine (VM).

Download the source:

Note

The deployment procedure is intended for system administrators or users with Unix knowledge, i.e. those who are comfortable working on the command line in a terminal.

Access the VM as root or as a regular user; the commands below that need root access are prefixed with sudo. Access is normally done with the ssh command. You may be prompted for a password if access to the VM is password protected.

ssh hocine@catalogue.sanbi.ac.za

The catalogue is an open source application released under the MIT license, so it can be forked and modified. The source code is hosted on GitHub. Clone it locally

git clone https://github.com/h3abionet/h3acatalog.git

A new folder h3acatalog will be created in the current location.

Install Neo4j

Neo4j is a graph database management system. Relational databases work with static tables, i.e. the number and names of the columns must be defined before a table is created. Neo4j, in contrast, is schemaless: the information is modelled as nodes and relationships and stored as a graph.

Neo4j requires Java to be installed beforehand. To install it run

sudo apt install default-jre default-jre-headless

To verify the installation, check in a terminal that the java executable is available by printing its version

java -version

With Java installed, it is time to install Neo4j. Add the Neo4j repository signing key to the apt keyring, and the repository itself to the list of apt sources. Both commands need root access.

wget --no-check-certificate -O - https://debian.neo4j.org/neotechnology.gpg.key | sudo apt-key add -
echo 'deb http://debian.neo4j.org/repo stable/' | sudo tee /etc/apt/sources.list.d/neo4j.list

Finally, update the package index and install Neo4j

sudo apt update
sudo apt install neo4j

Check whether Neo4j is running with the first command below; if it is not, start it with the second

sudo systemctl status neo4j
sudo systemctl start neo4j

Create an administrator user in the graph database

Before the catalogue application can be used, an administrator account must exist to carry out some required tasks, for example enabling registered users.

Note

Users with the roles ARCHIVE, BIOBANK, DBAC and ADMIN can be created only by the catalogue administrator. Users with the RESEARCHER role have to use the online registration form. An automatically generated email is sent to registered users to verify their email address (see the application.properties file for the mail server configuration). Users can log in to the catalogue only after the catalogue administrator has validated the information they provided.

Running the catalogue for the first time automatically creates the catalogue administrator in the database, with predefined values that can be changed in SetupLoadData.java in the source code. It is also possible to create the catalogue administrator manually using Neo4j itself. Neo4j offers several ways to interact with its database, including the cypher-shell command-line tool and the browser interface at http://<your VM domain>:7474. Below we show how to create a new user account with each of them. A user account needs at least a username, a BCrypt-hashed password and a role. Several online services can hash a human-readable password, e.g. [BCrypt Calculator](https://www.dailycred.com/article/bcrypt-calculator)

Open Neo4j in your preferred browser and run the following Cypher query

CREATE (:NeoUser {username: 'admin', password: '$2a$09$KyQwV3Hv0Nu4jmP753i.FOB7nFfDF3SofT1MalcohIS4ZWyysnklK', role: 'ADMIN'})

The query creates a new node with the properties listed between the curly braces, i.e. the admin user is created.

Let's run the same query in cypher-shell. Open a terminal and execute /usr/bin/cypher-shell. You will be asked for a username and a password; the defaults generated by Neo4j are neo4j/neo4j. After logging in, run the following script:

:begin
CREATE (:NeoUser {username: 'admin', password: '$2a$09$KyQwV3Hv0Nu4jmP753i.FOB7nFfDF3SofT1MalcohIS4ZWyysnklK', role: 'ADMIN'});
:commit

Notice how the query is enclosed between a :begin and a :commit statement. The former opens a transaction and the latter commits it. This is important, as the commit is what persists the entered information in the database.
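The same statement can also be generated from the command line instead of being typed by hand. The sketch below is only an illustration: make_user_query is a hypothetical helper name, not part of the catalogue or of Neo4j, and it does no escaping, so it assumes the values contain no single quotes.

```shell
# Print a Cypher statement that creates a catalogue user node.
# Arguments: username, BCrypt hash, role.
# Assumption: none of the values contains a single quote (no escaping is done).
make_user_query() {
    printf "CREATE (:NeoUser {username: '%s', password: '%s', role: '%s'});\n" "$1" "$2" "$3"
}

# Single quotes keep the shell from expanding the "$" characters in the hash.
make_user_query admin '$2a$09$KyQwV3Hv0Nu4jmP753i.FOB7nFfDF3SofT1MalcohIS4ZWyysnklK' ADMIN
```

The printed statement can be pasted into cypher-shell or piped to it, for example by appending | cypher-shell -u neo4j -p <your Neo4j password> to the last line.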

Install Apache Maven

Download Apache Maven (version 3.3.9 in this example) to the /opt folder. Writing to /opt normally requires root access.

cd /opt/
sudo wget http://www-eu.apache.org/dist/maven/maven-3/3.3.9/binaries/apache-maven-3.3.9-bin.tar.gz

Extract the downloaded archive and rename the extracted folder to maven

sudo tar -xvzf apache-maven-3.3.9-bin.tar.gz
sudo mv apache-maven-3.3.9 maven

Set up the environment variables. Add the following lines to the ~/.bashrc file

export M2_HOME=/opt/maven
export PATH=${M2_HOME}/bin:${PATH}

Save and close the file, then reload the .bashrc

source ~/.bashrc

Run mvn -version to verify the installation has been successfully configured.
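The two export lines above can also be added to ~/.bashrc without opening an editor. The following is a minimal sketch; append_once is a hypothetical helper name introduced here, and it only adds a line when that exact line is not already in the file, so running it twice does not duplicate the entries.

```shell
# Append a line to a file only if that exact line is not already present.
# Arguments: the line to add, the file to add it to.
append_once() {
    line=$1
    file=$2
    touch "$file"
    # -x matches whole lines only, -F treats the line as a fixed string.
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}
```

For example, append_once 'export M2_HOME=/opt/maven' ~/.bashrc followed by append_once 'export PATH=${M2_HOME}/bin:${PATH}' ~/.bashrc. The single quotes matter: they write ${M2_HOME} and ${PATH} literally, so the variables are expanded each time .bashrc is loaded rather than once at the time of writing.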

Run the catalogue

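From within the catalogue source code folder, run the following command to start the application. Maven compiles the sources first, so a JDK must be installed (for example with sudo apt install default-jdk); the JRE installed for Neo4j is not sufficient on its own.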

mvn spring-boot:run

In your preferred browser, open a new tab and enter http://<your VM domain name>:8080 to start working with the admin user created previously.

Note

Ensure that the /tmp folder of your VM is writable; otherwise you may encounter the following error when running the catalogue. As the message itself indicates, also check that Neo4j is running.

[ERROR] Failed to execute goal org.springframework.boot:spring-boot-maven-plugin:1.5.15.RELEASE:run (default-cli) on project h3acatalogue: An exception occurred while running. null: InvocationTargetException: Could not open Neo4j Session for transaction; nested exception is org.neo4j.ogm.exception.ConnectionException: Error connecting to graph database using Bolt: Unable to connect to localhost:7687, ensure the database is running and that there is a working network connection to it.
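Both conditions named in this note and in the error message can be checked before starting the application. This is a minimal sketch, assuming bash and the coreutils timeout command are available; dir_writable and port_open are illustrative helper names, and 7687 is the Bolt port taken from the error message above.

```shell
# Succeeds if the given directory exists and is writable by the current user.
dir_writable() {
    [ -d "$1" ] && [ -w "$1" ]
}

# Succeeds if a TCP connection to host ($1) and port ($2) opens within 2 seconds.
# Relies on bash's /dev/tcp pseudo-device, so bash must be installed.
port_open() {
    timeout 2 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

dir_writable /tmp || echo "WARNING: /tmp is not writable"
port_open localhost 7687 || echo "WARNING: cannot connect to localhost:7687 (is Neo4j running?)"
```

If neither warning is printed, the two causes discussed here can be ruled out.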

Portability

A known advantage of Java applications is their portability. Although the application is developed on a given platform, e.g. Unix, the Java virtual machine (JVM) can run it on a different platform, e.g. Windows, without any change to the source code or any additional configuration. Moreover, the whole application, together with its dependencies, can be packaged into a single Java ARchive (JAR) file, which further improves portability.

To create the JAR file, run the following Maven command from within the source code folder. This process may take a few minutes. The clean phase deletes the previous build folder, and the package phase rebuilds the application, downloading all the dependencies. The command starts the application and runs the tests before archiving. If no failure is detected, the JAR file is created and placed in the target folder.

mvn clean package

Previously we showed how to run the application from within the source code folder using Maven. Let's now see how to run it using only the created JAR file:

java -Dserver.port=3000 -Dlogging.path=/var/log/h3acatalog/ -jar path_to_jar/h3acatalog-0.0.1-SNAPSHOT.jar

The last command starts the application on port 3000 and writes the logs to the folder /var/log/h3acatalog.
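The JAR file name contains the project version, so it changes whenever the version changes. As a sketch of one way to avoid hard-coding it, the hypothetical helper find_jar below prints the first file ending in .jar that it finds in a folder; it assumes the folder holds a single application JAR, as the target folder does after mvn clean package.

```shell
# Print the path of the first JAR file found directly in the given folder.
# Returns a non-zero status if the folder contains no JAR file.
find_jar() {
    for jar in "$1"/*.jar; do
        # When nothing matches, the pattern is left unexpanded and -f fails.
        [ -f "$jar" ] && { printf '%s\n' "$jar"; return 0; }
    done
    return 1
}
```

From the source code folder, the application could then be started with java -Dserver.port=3000 -Dlogging.path=/var/log/h3acatalog/ -jar "$(find_jar target)".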

Run the catalogue using Supervisor

Supervisor is a system for controlling processes and monitoring their state. We will use Supervisor to start, stop and check the status of the catalogue application. Install Supervisor from your terminal with the following command. At the time of writing, the available version is 3.0b2-1

sudo apt-get install supervisor

Create a new file inside /etc/supervisor/conf.d/ and give it a name with extension .conf. Add the following configuration lines to it

[program:h3acatalog]
command=/etc/alternatives/java -Dserver.port=3000 -Dlogging.path=/var/log/h3acatalog/ -jar path_to/h3acatalog-0.0.1-SNAPSHOT.jar
directory=/home/hocine/h3acatalog
user=hocine
autostart=true
autorestart=true
startsecs=10
startretries=3
stdout_logfile=/var/log/h3acatalog/stdout.log
stderr_logfile=/var/log/h3acatalog/stderr.log

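The lines are straightforward and self-explanatory. After saving the file, run sudo supervisorctl reread followed by sudo supervisorctl update so that Supervisor picks up the new program. With Supervisor installed and configured, you can check the status of the process, start it or stop it:

sudo supervisorctl status h3acatalog
sudo supervisorctl start h3acatalog
sudo supervisorctl stop h3acatalog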
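Supervisor is generally reported to refuse a program whose log file folder does not exist, so it is worth making sure that the folder used by stdout_logfile and stderr_logfile in the configuration above is present before starting the program. A minimal sketch follows; ensure_parent_dir is a hypothetical helper name, and the user running it is assumed to have permission to create the folder.

```shell
# Create the parent folder of the given log file if it is missing,
# then confirm that the current user can write to it.
ensure_parent_dir() {
    dir=$(dirname -- "$1")
    mkdir -p -- "$dir" && [ -w "$dir" ]
}
```

For /var/log/h3acatalog/stdout.log the folder lives under /var/log, so in practice it is created as root, for example with sudo mkdir -p /var/log/h3acatalog followed by sudo chown hocine /var/log/h3acatalog so that the user named in the configuration can write to it.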

Broad access with Nginx

Nginx is a high-performance web and reverse proxy server. We will use it to intercept client HTTP requests and redirect them to the catalogue application.

The same way that supervisor was installed, use the apt tool to install Nginx.

sudo apt install nginx

Nginx can serve more than one site. To add a new site, change directory to /etc/nginx/sites-available and create a new file named h3acatalog. Open the newly created file and add the lines below.

upstream app_servers {
        server localhost:3000;
}

server {
        listen 80;
        server_name h3acatalog.sanbi.ac.za;
        return 301 https://h3acatalog.sanbi.ac.za$request_uri;
        server_tokens off;
}

server {
        listen 443 ssl;
        ssl_certificate /etc/nginx/ssl/sanbi.pem;
        ssl_certificate_key /etc/nginx/ssl/sanbi.key;
        ssl_session_timeout 10m;
        ssl_protocols TLSv1 TLSv1.1 TLSv1.2;
        server_name h3acatalog.sanbi.ac.za;
        location / {
                proxy_pass http://app_servers;
                proxy_set_header Host $host;
                proxy_set_header X-Real-IP $remote_addr;
                proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
                proxy_set_header X-Forwarded-Host $server_name;
                proxy_set_header X-Forwarded-Proto https;
                proxy_read_timeout 1200s;
        }
}

To enable the site, run the following command, which creates a symlink to the site file in /etc/nginx/sites-enabled. Afterwards, check the configuration with sudo nginx -t and reload Nginx with sudo systemctl reload nginx.

sudo ln -s /etc/nginx/sites-available/h3acatalog /etc/nginx/sites-enabled

That's it! Enjoy :-)