Adding service to HDP using REST API

One way of adding new service to the HDP is by using a graphical interface Ambari. In this post, it is explained how the same is done using Ambari’s REST API. The service added in this post is Pig. Pig is not a “classical” service, rather a package, but from REST API’s point of view, it is a service.

The documentation on how to do this is dated April 21 2014. I have followed it and eventually made it work. The documentation can be found here.

The cluster

I am using AWS EC2 services, operating system is Centos7.

I have an Ambari server, version 2.6.2 and an HDP cluster version 2.6.5. This should work on other versions as well.

My cluster has one NameNode, on which Ambari is installed as well, and one DataNode. The services installed are the bare minimum – HDFS, YARN, MapReduce2, Zookeeper and Hive.

The goal

Install Pig client on the NameNode and on the DataNode.

Adding Pig to the cluster

Variables

export AMBARI_SERVER=PUBLIC_IP
export MASTER_DNS=NAMENODE_PRIVATE_DNS
export SLAVE_DNS=DATANODE_PRIVATE_DNS
export CLUSTER_NAME=mincluster2

Create service on the Cluster

curl -u admin:admin -H "X-Requested-By:ambari" -i -X POST -d '{"ServiceInfo":{"service_name":"PIG"}}' 'http://'$AMBARI_SERVER':8080/api/v1/clusters/'$CLUSTER_NAME'/services'

Pig service is added to the list of services in Ambari.

Pig added - Configs
Pig added - Summary

Check for service on the cluster

curl -k -u admin:admin -H "X-Requested-By:ambari" -i -X GET 'http://'$AMBARI_SERVER':8080/api/v1/clusters/'$CLUSTER_NAME'/services/PIG'

curl - service info
The service is registered on the cluster.

Add components to the service

curl -k -u admin:admin -H "X-Requested-By:ambari" -i -X POST -d '{"RequestInfo":{"context":"Install PIG"}, "Body":{"HostRoles":{"state":"INSTALLED"}}}' 'http://'$AMBARI_SERVER':8080/api/v1/clusters/'$CLUSTER_NAME'/services/PIG/components/PIG'

Running the curl command from previous step to check if the component is added returns the following:

curl - service info after component.png
The component has been added according to the “components” element in the JSON output. The state of the service is still “UNKNOWN”.

Creating configuration is on the next page.

Adding Hive, Tez & Pig in Ambari

I have 4 Hadoop environments, all running distribution Hortonworks, versions are either 2.3.4 or 2.4. I have installed HDFS, MapReduce and YARN first and the need is to add Hive.

When installing Hive, Pig and Tez follow with it whether you want it or not.

I already have an existing MySql database (because of Ranger) and this post describes how to install Hive and use an existing MySql for metastore. Installing Hive with a new MySql is actually easier.

  1. On Ambari server, from the CLI, run the following
    sudo ambari-server setup --jdbc-db=mysql --jdbc-driver=/usr/share/java/mysql-connector-java.jar
    

    Output:

    Using python  /usr/bin/python
    Setup ambari-server
    Copying /usr/share/java/mysql-connector-java.jar to /var/lib/ambari-server/resources
    JDBC driver was successfully initialized.
    Ambari Server ‘setup’ completed successfully.

  2. Log in to Ambari as administrator
  3. From the Actions drop down menu on the left side of the screen, click Add Service
    flume-add service
  4. Choose services
    Check services Tez, Hive and Pig. If you pick only Hive, the installation wizard will remind you that you have to set up Tez and Pig packages as well.
    choose services
  5. Assign masters
    In this case, I am installing Hive on my namenode. This can always be changed – it is possible to move services to other instances (why do you think my namenode is called md-namenode2? ;))
    assign masters
  6. Assign Slaves and Clients
    Tez Client, HCat Client, Hive Client and Pig Client are going to be installed to this host(s).
    In this case I am installing it on the same server as Hive server, on “more serious” clusters I install the clients where they belong – the client server.
    assign slaves
  7. Customize Services
    On the MySql Server used for Hive metastore run the following commands as root

    CREATE USER 'hive'@'localhost' IDENTIFIED BY 'hive';
    CREATE USER 'hive'@'%' IDENTIFIED BY 'hive';
    FLUSH PRIVILEGES;
    

     

  8. Set up connection string to the metastore
    Choose “Existing MySQL Database”hive metastore setup

    Note: If there is a problem connecting to the database when testing the connection, check also in the my.cnf on the MySql server if the following property is uncommented:

    bind-address           = 127.0.0.1

    Comment it (# in front of the line), since we are connecting to the server from other hosts than localhost.

  9. Review
    review
    If the installation details are acceptable, proceed with the installation.
  10. When the installation is complete. The installed services are now available
    service available
    Do not forget to restart the services if Ambari suggests so!

Error during installation

resource_management.core.exceptions.Fail: Applying Directory[‘/usr/hdp/2.4.0.0-169/tez/conf’] failed, looped symbolic links found while resolving /etc/tez/conf

The solution to it run the following on the Hive server (md-namenode2 in this example):

unlink /etc/tez/conf