Installing & configuring SKIL for single/multi node(s)

In this post, we'll show how to configure SKIL for both single-node and multi-node setups. We'll start with the installation and configuration of a single SKIL node, and then extend it to multiple nodes.

In this setup, we'll configure MySQL and Zookeeper on a separate, independent node and link all our SKIL instances to them. We recommend against using the embedded Zookeeper that comes bundled with SKIL, especially in production, because you can lose data when you stop the SKIL server.

Note

Make sure you have SKIL version 1.0.2 or later for the multi-node configuration to work.

Operating System

We're going to use a clean installation of CentOS 7.4 as the operating system. Make sure you have enough disk space to follow along. About 30GB would be enough but 100GB of free disk space is recommended.
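
To check the disk-space guidance before you begin, something like the following could be used. It is a minimal sketch that assumes a POSIX `df` and `awk`; `free_gb` and `disk_verdict` are illustrative names, not part of SKIL or CentOS.

```shell
# Print the free space, in whole GB, on the filesystem that holds the given path.
free_gb() {
    df -Pk "$1" | awk 'NR==2 { print int($4 / 1024 / 1024) }'
}

# Compare a free-space figure (GB) with the 30GB minimum and 100GB recommendation.
disk_verdict() {
    if [ "$1" -ge 100 ]; then
        echo "recommended"
    elif [ "$1" -ge 30 ]; then
        echo "minimum"
    else
        echo "insufficient"
    fi
}
```

For example, `disk_verdict "$(free_gb /)"` reports on the root filesystem.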

Installing Zookeeper

First install Java 8:

sudo yum install -y java-1.8.0-openjdk-devel 

Then execute the following commands to install Zookeeper through the RPM provided by Cloudera:

sudo yum install -y http://archive.cloudera.com/cdh5/one-click-install/redhat/6/x86_64/cloudera-cdh-5-0.x86_64.rpm
sudo yum install -y zookeeper

Initialize and start Zookeeper:

sudo zookeeper-server-initialize 
sudo zookeeper-server start 

To verify the Zookeeper installation, observe the status by executing:

zookeeper-server status 

If it's working correctly, you'll see the following status message:

JMX enabled by default 
Using config: /etc/zookeeper/conf/zoo.cfg 
Mode: standalone 

Otherwise, you'll see something like this:

JMX enabled by default 
Using config: /etc/zookeeper/conf/zoo.cfg 
Error contacting service. It is probably not running. 
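
If you want to script this check, the two outputs above can be told apart by looking for the `Mode:` line. The following is a minimal sketch; `zk_mode` is an illustrative helper name, and it assumes the status text has the shape shown above.

```shell
# Read `zookeeper-server status` output on stdin.
# Print the reported mode (for example "standalone"),
# or "not-running" when no Mode: line is present.
zk_mode() {
    awk '/^Mode:/ { print $2; found = 1 }
         END { if (!found) print "not-running" }'
}
```

A possible use is `zookeeper-server status 2>&1 | zk_mode`, which merges both output streams before parsing.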

You can also verify the installation by connecting the Zookeeper client to the server:

zookeeper-client

When it's connected, you can list the directories inside by executing:

ls /

You should observe the following output:

[zk: localhost:2181(CONNECTED) 0] ls /
[zookeeper] 
[zk: localhost:2181(CONNECTED) 1]  

Press Ctrl+C to disconnect from the server.

Installing MySQL

We'll use MySQL 5.5 when configuring SKIL.

Adding a MySQL YUM repository

Download the MySQL repository package and add it to the system. The package is named for MySQL 5.7, but it also provides the 5.5 repository, which we'll select in the next step. Open a shell and run the following:

wget https://repo.mysql.com//mysql57-community-release-el7-11.noarch.rpm 
sudo rpm -Uvh mysql57-community-release-el7-11.noarch.rpm

Selecting install version

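Now disable the MySQL 5.7 community repository and enable the 5.5 one instead. The first and last commands list the MySQL repositories so you can confirm the change. If yum-config-manager is not found, install the yum-utils package first.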

yum repolist all | grep mysql 
sudo yum-config-manager --disable mysql57-community 
sudo yum-config-manager --enable mysql55-community 
yum repolist enabled | grep mysql

Installing

Install MySQL using the following command:

sudo yum install mysql-community-server
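
Because the repository package is named for 5.7, it can be worth confirming that the 5.5 series is what actually got installed. The sketch below extracts the release series from the server's version string. `mysql_series` is an illustrative name, and the expected string format ("mysqld  Ver 5.5.x for Linux ...") is an assumption about what `mysqld --version` prints.

```shell
# Read a `mysqld --version` line on stdin and print the major.minor series.
# Prints nothing if the line does not look like "... Ver X.Y.Z ...".
mysql_series() {
    sed -n 's/.*Ver \([0-9][0-9]*\.[0-9][0-9]*\)\.[0-9].*/\1/p'
}
```

For example, `mysqld --version | mysql_series` should print 5.5 on this setup if the assumptions above hold.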

Starting MySQL server

To start the server:

sudo systemctl start mysqld.service

Checking server status

You can check the status of the server with the following command:

sudo systemctl status mysqld.service

Securing initial accounts

For reference, visit https://dev.mysql.com/doc/refman/5.5/en/default-privileges.html

Execute:

mysql -u root

To set a password for the root account, run the following statements:

UPDATE mysql.user SET Password = PASSWORD('new_password') WHERE User = 'root';
FLUSH PRIVILEGES;

The FLUSH statement causes the server to reread the grant tables. Without it, the password change remains unnoticed by the server until you restart it.

After updating the password, access the MySQL shell using:

mysql -u root -p

You'll then be prompted for the new password (in this case, "new_password").

Creating a user for SKIL with permissions to create and access databases

Instead of configuring SKIL with the root credentials for MySQL, we'll create a new user for it. SKIL is going to create several databases. One of them stores a Flyway migrations table for the various SKIL databases. The other is for the model-history server. Similarly, other databases would be created as needed.

A single database named “skil” is also required, and must be created prior to running SKIL with MySQL.

While in the MySQL shell, run the following statements to create the database and a new user (skil) with the required permissions:

CREATE DATABASE skil_migrations;
GRANT ALL PRIVILEGES ON *.* TO 'skil'@'%' IDENTIFIED BY 'skil';
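
The statements above use "skil" as both the username and the password. If you would rather choose your own credentials, a small helper can generate the same two statements for you. This is only a sketch: `skil_user_sql` is an illustrative name, and it refuses values that contain single quotes or backslashes rather than trying to escape them.

```shell
# Print the CREATE DATABASE and GRANT statements for a given user and password.
# Values with single quotes or backslashes are rejected to keep the SQL well formed.
skil_user_sql() {
    user=$1
    pass=$2
    case "$user$pass" in
        *"'"*|*\\*)
            echo "quotes and backslashes are not supported" >&2
            return 1
            ;;
    esac
    printf 'CREATE DATABASE skil_migrations;\n'
    printf "GRANT ALL PRIVILEGES ON *.* TO '%s'@'%%' IDENTIFIED BY '%s';\n" "$user" "$pass"
}
```

One way to use it is `skil_user_sql skil 'your-password' | mysql -u root -p`. Remember to use the same credentials later in SKIL_DB_USER and SKIL_DB_PASSWORD.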

Now press Ctrl+D to log out of the current MySQL session and log in as the newly created user:

mysql -u skil -p

Type your password for this new user. (Here it's "skil".)

Create a test database (skil_test_database) to see if the permissions are all good. You can delete it immediately afterward.

CREATE DATABASE skil_test_database;
DROP DATABASE IF EXISTS skil_test_database;

Press Ctrl+C to disconnect.

Save the IP address of this machine. We'll need it to configure SKIL with this machine's Zookeeper and MySQL installation.

Allowing incoming connections from SKIL nodes

You need to open ports 2181 (Zookeeper) and 3306 (MySQL) so that the SKIL nodes can connect to both services. You can do that by executing the following commands:

sudo systemctl start firewalld
sudo firewall-cmd --zone=public --add-port=2181/tcp --permanent
sudo firewall-cmd --zone=public --add-port=3306/tcp --permanent
sudo firewall-cmd --reload
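
To confirm from a SKIL node that the two ports are reachable, a plain TCP connection test is enough. The following is a minimal sketch that assumes bash and the coreutils `timeout` command are available on the node; `port_open` is an illustrative name.

```shell
# Succeed if a TCP connection to host $1 on port $2 can be opened within 3 seconds.
# Uses bash's /dev/tcp pseudo-device, so it needs bash but no extra packages.
port_open() {
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}
```

For example, `port_open [Host Address] 2181 && echo "Zookeeper reachable"`, and likewise for port 3306. On the Zookeeper/MySQL machine itself, `sudo firewall-cmd --zone=public --list-ports` lists the ports that are currently open.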

SKIL Installation

Both SKIL installations (single-node and multi-node) follow the same steps. Only the configuration file needs to be changed after installation. So, let's get started!

Single Node

For single-node installation, execute the following commands from a shell:

sudo yum install -y https://skildistro.blob.core.windows.net/skildistro/skil-server-1.0.2-1.x86_64.rpm
sudo yum install -y https://skildistro.blob.core.windows.net/skildistro/skil-server-miniconda-1.0.2-1.x86_64.rpm
sudo yum install -y https://skildistro.blob.core.windows.net/skildistro/skil-server-spark-1.0.2-1.x86_64.rpm

Be patient with the miniconda installation. It usually takes around 5-20 minutes, depending on your internet and disk speeds.

This will install a SKIL node on your machine.

Now we'll look at the important configurations to make it work well with your setup.

Configuring the installation

To configure the Zookeeper and MySQL settings, among others, you'll have to edit /etc/skil/skil-env.sh.

Execute the following command to start editing the configuration file:

sudo nano /etc/skil/skil-env.sh 

For MySQL, append the following:

SKIL_USE_EMBEDDED_DB=false
SKIL_DB_NAME=skil_migrations
SKIL_DB_DRIVER=com.mysql.jdbc.Driver
SKIL_DB_URL=jdbc:mysql://[Host Address]:3306/skil_migrations
SKIL_DB_USER=skil
SKIL_DB_PASSWORD=skil
MODEL_HISTORY_SERVER_LAUNCH_DEFAULT=false
ZEPPELIN_LAUNCH_DEFAULT=false

And for Zookeeper, append the following:

ZOOKEEPER_HOST=[Host Address]
ZOOKEEPER_PORT=2181 
ZOOKEEPER_EMBEDDED=false 

Replace [Host Address] with the IP address of your Zookeeper and MySQL machine.
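
Since every SKIL node needs the same settings, it can be convenient to generate the fragment from the host address instead of typing it on each machine. The sketch below prints the lines shown above with the address filled in. `render_skil_env` is an illustrative name, and the address 10.0.0.5 in the usage note is a placeholder.

```shell
# Print the MySQL and Zookeeper settings for skil-env.sh,
# using $1 as the address of the Zookeeper/MySQL machine.
render_skil_env() {
    host=$1
    cat <<EOF
SKIL_USE_EMBEDDED_DB=false
SKIL_DB_NAME=skil_migrations
SKIL_DB_DRIVER=com.mysql.jdbc.Driver
SKIL_DB_URL=jdbc:mysql://${host}:3306/skil_migrations
SKIL_DB_USER=skil
SKIL_DB_PASSWORD=skil
MODEL_HISTORY_SERVER_LAUNCH_DEFAULT=false
ZEPPELIN_LAUNCH_DEFAULT=false
ZOOKEEPER_HOST=${host}
ZOOKEEPER_PORT=2181
ZOOKEEPER_EMBEDDED=false
EOF
}
```

For example, `render_skil_env 10.0.0.5 | sudo tee -a /etc/skil/skil-env.sh` appends the settings to the configuration file. Run it only once per node, since it appends rather than replaces.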

Now press Ctrl+X, then Y and Enter, to save the changes and exit nano.

Install the mysql-jdbc driver

Install the connector package, then link it into SKIL's lib directory:

sudo yum install -y mysql-connector-java 
sudo ln -s /usr/share/java/mysql-connector-java.jar /opt/skil/lib/mysql-connector-java.jar 
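
A symlink can exist while pointing at a file that is missing, for example if the connector package did not install. A quick check along the following lines could catch that before SKIL is started. It is only a sketch, and `link_ok` is an illustrative name.

```shell
# Succeed only if $1 is a symbolic link and its target exists.
link_ok() {
    [ -L "$1" ] && [ -e "$1" ]
}
```

For example, `link_ok /opt/skil/lib/mysql-connector-java.jar && echo "driver link is valid"`.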

Enable and start SKIL

To start the SKIL server now and have it start automatically on boot, execute the following:

sudo systemctl daemon-reload 
sudo systemctl enable skil 
sudo systemctl start skil 

Track logs

You can run the following command to see the SKIL log:

tail -f /var/log/skil/skil.log 

Verify your setup

Connect to any SKIL node by opening http://<skil_host>:9008 in a browser, and log in with the username admin and the password admin.
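
The server may not answer on port 9008 immediately after it is started. If you want to wait for it from a script, a simple polling loop with curl could look like the sketch below. `wait_for_ui` is an illustrative name, and the function treats any HTTP response as a sign that the server is listening; it does not check the login.

```shell
# Poll URL $1 up to $2 times, sleeping $3 seconds between attempts.
# Succeeds as soon as curl gets any HTTP response; fails if all attempts fail.
wait_for_ui() {
    url=$1
    tries=$2
    delay=$3
    i=0
    while [ "$i" -lt "$tries" ]; do
        if curl -s -o /dev/null --max-time 5 "$url"; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}
```

For example, `wait_for_ui "http://<skil_host>:9008" 30 10 && echo "SKIL UI is up"` tries for roughly five minutes before giving up.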

The Agents tab should show the number of agents (machines) available.

Multi-Node

You can keep adding new nodes by repeating the SKIL installation and configuration steps on each one, pointing every node at the same Zookeeper and MySQL machine.

Once you have repeated these steps on multiple machines, you'll be able to see all of them in the Agents tab.
