Building Apache Zeppelin 0.6.0 on Spark 1.5.2 & 1.6.0 in cluster mode

I run Ubuntu 14.04 Trusty on a multinode Hadoop cluster. The Hadoop distribution is Hortonworks 2.3.4. Spark was installed through the Ambari Web UI, and the running version is 1.5.2 (upgraded to 1.6.0).

This post explains how I built and set up Apache Zeppelin 0.6.0 on Spark 1.5.2 and 1.6.0.


Non root account

The Apache Zeppelin creators recommend not using the root account. For this service, I created a new user, zeppelin.
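As a minimal sketch, the account could be created along these lines, assuming the standard useradd tool on Ubuntu. The helper names are hypothetical; only the user name zeppelin comes from this setup.

```shell
# Hypothetical helpers; only the user name "zeppelin" comes from this post.

# Succeeds if the named account already exists.
user_exists() {
    id -u "$1" >/dev/null 2>&1
}

# Creates the account with a home directory (-m) and bash as the login
# shell (-s), unless the account already exists. Needs sudo rights.
ensure_service_user() {
    if user_exists "$1"; then
        echo "user $1 already exists"
    else
        sudo useradd -m -s /bin/bash "$1"
    fi
}
```

Calling `ensure_service_user zeppelin` once is enough; running it again only reports that the user exists.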

Java 7

Zeppelin uses Java 7. My system has Java 8, so I installed Java 7 just for Zeppelin. The installation was done as user zeppelin, in the directory that JAVA_HOME points to below.

JAVA_HOME is added to the user’s .bashrc:

export JAVA_HOME=/home/zeppelin/prerequisities/jdk1.7.0_79
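To confirm that JAVA_HOME really points at a Java 7 runtime, a small check like the following may help. It is a sketch with hypothetical function names, and it assumes the usual `java -version` banner, whose first line looks like `java version "1.7.0_79"`.

```shell
# Succeeds if a `java -version` banner line reports a 1.7 runtime.
is_java7_banner() {
    case "$1" in
        *'version "1.7.'*) return 0 ;;
        *) return 1 ;;
    esac
}

# Checks the JDK that JAVA_HOME points to.
# java prints its version banner on stderr, hence the 2>&1.
check_java_home() {
    banner=$("$JAVA_HOME/bin/java" -version 2>&1 | head -n 1)
    if is_java7_banner "$banner"; then
        echo "OK: $banner"
    else
        echo "Not Java 7: $banner" >&2
        return 1
    fi
}
```

Running `check_java_home` as user zeppelin, after reloading .bashrc, shows which runtime Zeppelin would pick up.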

Zeppelin log directory

Create the Zeppelin log directory.

sudo mkdir /var/log/zeppelin

Change ownership.

sudo chown zeppelin:zeppelin /var/log/zeppelin

If this is not done, Zeppelin writes its log files to a folder named logs inside the current folder.
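A quick way to verify the result of the two commands above is to test, as user zeppelin, that the directory exists and is writable. This is only a sketch; the function name is hypothetical.

```shell
# Succeeds if the directory exists and the current user may write to it.
log_dir_ready() {
    [ -d "$1" ] && [ -w "$1" ]
}
```

For example, `log_dir_ready /var/log/zeppelin && echo "log directory is ready"` prints the message only when the ownership change took effect.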

Clone and Build

Log in as user zeppelin and go to the user's home directory.


Clone the source code from GitHub.

git clone incubator-zeppelin

Zeppelin has a home now.


Go into Zeppelin home and build Zeppelin:

mvn clean package -Pspark-1.5 -Dspark.version=1.5.2 -Dhadoop.version=2.7.1 -Phadoop-2.6 -Pyarn -DskipTests

Build order:

[Screenshot: zeppelin build start]

7:31 minutes later, Zeppelin is successfully built.

[Screenshot: zeppelin build success]


If you try something like either of the following two commands:

mvn clean package -Pspark-1.5 -Dspark.version=1.5.0 -Dhadoop.version=2.7.1 -Phadoop-2.7 -Pyarn -DskipTests
mvn clean package -Pspark-1.5 -Dspark.version=1.5.2 -Dhadoop.version=2.7.1 -Phadoop-2.7 -Pyarn -DskipTests

The build will succeed, but this warning will appear at the bottom of the build report:

[WARNING] The requested profile "hadoop-2.7" could not be activated because it does not exist.

The Hadoop profile passed to Maven must be hadoop-2.6, even though the actual Hadoop version is 2.7.x. The actual version still goes into -Dhadoop.version.
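The rule can be captured in a small script that assembles the Maven command and scans a saved build log for the warning. This is a sketch under stated assumptions: the function names are hypothetical, only the mapping of 2.6.x and 2.7.x to hadoop-2.6 is taken from this post, and the Spark profile is assumed to follow the spark-<major.minor> naming seen in the command above.

```shell
# Maps a Hadoop version to the Maven profile to pass with -P.
# Anything outside 2.6.x/2.7.x is rejected here; check the profiles in
# Zeppelin's pom.xml for other versions.
hadoop_profile_for() {
    case "$1" in
        2.6.*|2.7.*) echo "hadoop-2.6" ;;
        *) echo "unknown Hadoop version: $1" >&2; return 1 ;;
    esac
}

# Prints (does not run) the build command for the given versions.
print_build_command() {
    spark_version="$1"    # e.g. 1.5.2
    hadoop_version="$2"   # e.g. 2.7.1
    hadoop_profile=$(hadoop_profile_for "$hadoop_version") || return 1
    spark_profile="spark-${spark_version%.*}"   # 1.5.2 -> spark-1.5
    echo "mvn clean package -P$spark_profile -Dspark.version=$spark_version -Dhadoop.version=$hadoop_version -P$hadoop_profile -Pyarn -DskipTests"
}

# Succeeds if a saved Maven log contains the missing-profile warning.
has_missing_profile_warning() {
    grep -q 'could not be activated because it does not exist' "$1"
}
```

`print_build_command 1.5.2 2.7.1` prints the same command that was used for the successful build. If the Maven output is saved to a file, `has_missing_profile_warning` on that file flags a mistyped profile even when the build itself reports success.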


Copy hive-site.xml from the Hive configuration folder. The path below applies to the Hortonworks distribution; users of other distributions should check where the file is located.

sudo cp /etc/hive/conf/hive-site.xml $ZEPPELIN_HOME/conf

Change ownership of the file.

sudo chown zeppelin:zeppelin $ZEPPELIN_HOME/conf/hive-site.xml

Go to Zeppelin home and create the environment file from its template in the conf directory.

cp conf/ conf/

Open it and add the following variables:

export JAVA_HOME=/home/zeppelin/prerequisities/jdk1.7.0_79
export HADOOP_CONF_DIR=/etc/hadoop/conf
export ZEPPELIN_JAVA_OPTS="-Dhdp.version="
export ZEPPELIN_LOG_DIR=/var/log/zeppelin

The value in the third line depends on the Hortonworks build. Find your HDP version by executing:

hdp-select status hadoop-client

If your Hortonworks version is 2.3.4, the output is:

hadoop-client –
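The version can also be extracted from that output with a script instead of being copied by hand. The sketch below assumes the status line has the shape `hadoop-client - <version>`, so the version is the third whitespace-separated field. The function names are hypothetical.

```shell
# Extracts the version from a status line of the assumed shape
# "hadoop-client - <version>"; fails if there is no third field.
hdp_version_from_status() {
    version=$(printf '%s\n' "$1" | awk '{print $3}')
    [ -n "$version" ] || return 1
    printf '%s\n' "$version"
}

# Builds the value for ZEPPELIN_JAVA_OPTS from a version string.
zeppelin_java_opts_for() {
    printf '%s\n' "-Dhdp.version=$1"
}
```

On a cluster node, `zeppelin_java_opts_for "$(hdp_version_from_status "$(hdp-select status hadoop-client)")"` would print the string to put between the quotes of ZEPPELIN_JAVA_OPTS.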

Zeppelin daemon

Start Zeppelin from Zeppelin home:

./bin/ start

Status after starting the daemon:

[Screenshot: zeppelin start]

To check whether the service is up:

./bin/ status


[Screenshot: zeppelin status]

Zeppelin can be restarted in the following way:

./bin/ restart


[Screenshot: zeppelin restart]

Stopping Zeppelin:

./bin/ stop


[Screenshot: zeppelin stop]

Configuring interpreters in Zeppelin

Apache Zeppelin comes with many default interpreters, and it is also possible to create your own. How to configure the default Spark and Hive interpreters is covered in this post.

    1. If you are referring to the Hortonworks distribution and using Apache Zeppelin:
      There are ways to add services to Ambari yourself. I haven't done it, I have only read about it.
      I run Zeppelin from the command line on my Hortonworks distribution and it works. No Ambari for Zeppelin.


