

What Is Hue? Hue Hadoop Tutorial Guide for Beginners

Introduction

Big Data is becoming popular among many organizations today. Hue belongs to the Big Data Hadoop world, and in this blog we will cover the basics of Hue and the way it is used within the Big Data ecosystem. When it comes to Big Data, organizations expect their developers to deliver quick and cost-effective solutions.

Big Data Hadoop Ecosystem Components

The first tool that comes to mind here is Apache Hadoop. Software developers may have a general idea about Hadoop, but when it comes to actually implementing it, their expectations can turn out to be quite different from reality.

Ecosystem Components

Hadoop is an ecosystem, not a single tool. It includes a number of components that can crack the data challenges of companies of any size, whether small, medium, or large. Any developer planning to learn Hadoop must also become familiar with these components.

The Hadoop ecosystem consists of the following nine components:

  • HDFS: HDFS, the Hadoop Distributed File System, is the file system in which data is stored. The stored data may be structured or unstructured, for example relational database tables or log files respectively.
  • MapReduce: MapReduce is a programming model designed to process big data. MapReduce jobs run on large clusters of commodity hardware.
  • Spark: Spark is a large-scale data processing engine. It processes data in memory, which is why it is becoming popular among big data experts.
  • Pig: Pig is a data flow language that can process even huge datasets. It can read data directly from HDFS and process it as MapReduce jobs on the cluster.
  • Hive: Hive is essentially SQL on Hadoop, a distributed data warehouse on top of HDFS. SQL professionals need to learn very little to use Hive; Hive queries are translated to MapReduce jobs for processing.
  • Sqoop: Sqoop provides a way to transfer data between relational databases and HDFS.
  • Flume: Flume also transfers data, but it is used to move unstructured data, such as log streams, into HDFS.
  • HBase: HBase is the Hadoop database and is different from HDFS. It is a column-oriented, distributed database built on top of HDFS.
  • Oozie: Oozie is a workflow scheduler used to schedule Hadoop jobs. It can start and stop Pig, MapReduce, or Hive jobs and schedule them according to the availability of resources.

Hadoop is essentially a collection of components that work together to provide improved functionality to the system. New tools are launched regularly, and existing ones keep improving, adding to the performance of the Hadoop ecosystem.
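To give a feel for what interacting with these components looks like without a web UI, here is a minimal sketch of writing a file to HDFS with Hadoop's Java FileSystem API. The NameNode address and target path are placeholders for a local test cluster, not values from any particular setup:

```java
// Minimal sketch: writing a file to HDFS through the Java FileSystem API.
// The fs.defaultFS value and the target path are placeholders.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsWriteExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://localhost:8020"); // assumed NameNode address
        try (FileSystem fs = FileSystem.get(conf);
             FSDataOutputStream out = fs.create(new Path("/user/demo/sample.txt"))) {
            out.writeUTF("hello hadoop"); // the written file is then visible to the rest of the ecosystem
        }
    }
}
```

Tools such as Hue, described next, put a graphical layer over exactly this kind of operation.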

What Is Hue? Hadoop User Experience, a Web UI

Hadoop Hue (Hadoop User Experience) is an open-source web user interface for Hadoop components. Users can access Hue right from within the browser, and it enhances the productivity of Hadoop developers. It was developed by Cloudera as an open-source project. Through Hue, users can interact with HDFS and MapReduce applications. With Hue, users do not have to rely on the command line interface to work with the Hadoop ecosystem.

Features of Hue

Hue offers much more than just a web interface for Hadoop developers. It provides the following features, which have made it a popular tool among Hadoop developers:

  • Hadoop API Access
  • HDFS File Browser
  • Job Browser and Job Designer
  • User Admin Interface
  • Hive Query Editor
  • Pig Query Editor
  • Hadoop Shell Access
  • Oozie Interface for Workflows
  • Separate Interface for Solr Searches

These features make Hue a foremost choice for Hadoop developers, and it is commonly included in Hadoop cluster installations. All basic Hadoop features can be accessed through Hue, so people who are not familiar with the command line interface can still use all of its functionality. The following image shows the user interface of Hue:

Apache Hue Hadoop Tutorial

Hue Components - Hue Hadoop Tutorial Guide for Beginners

Hue itself has many components through which users can take advantage of the Hadoop ecosystem and use it effectively:

Hue Components

HDFS Browser

While working with the Hadoop ecosystem, one of the most important features is the HDFS File Browser, through which users can work with HDFS files interactively. Hue provides an HDFS interface through which all required operations can be performed on HDFS. If you do not want to work through the command line interface, it can be of great help.

In the Hue interface, click “File Browser” in the top-right corner. A file browser opens through this link; the following image shows this interface. For the current or default path, it lists all of the files along with their properties. From here, the user can delete or download existing files and upload new ones:

File Browser
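For comparison, the listing that the File Browser displays can also be produced programmatically. The sketch below uses Hadoop's Java FileSystem API; the NameNode address and directory path are placeholders:

```java
// Minimal sketch: listing a directory the way the Hue File Browser does,
// using the Java FileSystem API (NameNode address and path are placeholders).
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ListHdfsDirectory {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://localhost:8020"); // assumed NameNode address
        try (FileSystem fs = FileSystem.get(conf)) {
            for (FileStatus status : fs.listStatus(new Path("/user/demo"))) {
                // The File Browser shows the same properties: name, size, and owner
                System.out.printf("%s\t%d bytes\t%s%n",
                        status.getPath().getName(), status.getLen(), status.getOwner());
            }
        }
    }
}
```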

Job Browser

A Hadoop cluster runs many jobs, and developers often need to know which job is currently running, which has completed successfully, and which has failed. Through the Job Browser, you can access all job-related information right from inside the browser. Hue has a button that lists the jobs and their status. The following image shows the Job Browser screen of Hue:

Hue Job Browser

The above image shows a MapReduce job that finished successfully. Along with the Job ID, the Application Type, Name, Status, and Duration of the job are listed, together with its submission time and the name of the user who submitted it. Four color codes are used to show the job status, as listed below:

  • Successful Jobs – Green
  • Currently Running Jobs – Yellow
  • Failed Jobs – Red
  • Manually Killed Jobs – Black

If the user needs more information about any job, clicking the job or its Job ID opens the job details. Further job-related information is also available; in the above case, two subtasks were performed for the listed job, a Map task and a Reduce task, as shown in the image below:

Hue Recent Tasks

The recent tasks for the job, namely the Map and Reduce tasks, are displayed here. Other job-related properties, such as metadata, can also be accessed easily from the same place. Information such as the user who submitted the job, its total execution duration, its start and end times, and its temporary storage paths can also be listed and checked through the Hue job interface, as shown in the image below:

Hue Job Browser
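The application list that the Job Browser renders can also be pulled programmatically. The sketch below uses the YARN client API and assumes a YARN-based (MRv2) cluster; it only illustrates the kind of information the Job Browser surfaces:

```java
// Minimal sketch: listing cluster applications with the YARN client API,
// roughly what the Hue Job Browser shows in its table (assumes a YARN cluster).
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.api.records.ApplicationReport;
import org.apache.hadoop.yarn.client.api.YarnClient;

public class ListYarnApplications {
    public static void main(String[] args) throws Exception {
        YarnClient yarn = YarnClient.createYarnClient();
        yarn.init(new Configuration()); // reads yarn-site.xml from the classpath
        yarn.start();
        for (ApplicationReport app : yarn.getApplications()) {
            // ID, name, state, and submitting user, as in the Job Browser columns
            System.out.printf("%s\t%s\t%s\t%s%n",
                    app.getApplicationId(), app.getName(),
                    app.getYarnApplicationState(), app.getUser());
        }
        yarn.stop();
    }
}
```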

Hive Query Editor

Now let us look at the Hive Query Editor. It allows us to write Hive SQL queries right inside the editor and shows the results there as well. The Hue editor makes querying data easier and quicker. The user writes SQL-like queries, and executing them launches MapReduce jobs that process the data; these jobs can be monitored in the Job Browser even while they are still running. The query result is displayed in the browser. A bar-chart result is shown in the following window:

Hive Query Editor

Charts produced as the result of a query can easily be saved to disk or exported to another file. Besides bar charts, you can also produce many other chart types, such as pie charts and line charts.
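Behind the editor, the query is plain HiveQL run against the Hive service. The sketch below issues the same kind of aggregate query through Hive's JDBC (HiveServer2) interface; it assumes the hive-jdbc driver is on the classpath, and the host, port, and employees table are placeholders:

```java
// Minimal sketch: running an aggregate HiveQL query over JDBC.
// The connection URL, credentials, and table name are placeholders.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HiveQueryExample {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hive.jdbc.HiveDriver"); // assumes hive-jdbc on the classpath
        String url = "jdbc:hive2://localhost:10000/default"; // assumed HiveServer2 endpoint
        try (Connection conn = DriverManager.getConnection(url, "hive", "");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(
                     "SELECT department, COUNT(*) AS cnt FROM employees GROUP BY department")) {
            while (rs.next()) {
                // One row per department; Hue would offer to chart these values
                System.out.println(rs.getString("department") + "\t" + rs.getLong("cnt"));
            }
        }
    }
}
```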

Database Browsers

All of the available data store tables can be displayed, exported, and imported through the Database Browser. The following image shows the database tables. When you click on a particular table, you can access its details. Right from within the user interface you can view and browse the data, visualize table contents, and inspect the columns of any particular table along with their names. The following image shows the table information, or metadata, of a table:

Metastore Manager

From this interface, we can browse the data and even check the actual file location of the current table.
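The same metadata, including the table's file location, can also be retrieved by running DESCRIBE FORMATTED over a Hive JDBC connection. A minimal sketch, with employees as a placeholder table name, is shown below:

```java
// Minimal sketch: reading table metadata (columns, owner, HDFS location)
// with DESCRIBE FORMATTED over Hive JDBC. Endpoint and table name are placeholders.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class DescribeHiveTable {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hive.jdbc.HiveDriver"); // assumes hive-jdbc on the classpath
        try (Connection conn = DriverManager.getConnection(
                     "jdbc:hive2://localhost:10000/default", "hive", "");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("DESCRIBE FORMATTED employees")) {
            while (rs.next()) {
                // Rows include column definitions plus properties such as Location and Owner
                System.out.println(rs.getString(1) + "\t" + rs.getString(2));
            }
        }
    }
}
```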

Oozie Workflows

Hue also provides an interface for Oozie workflows. All past and current workflows of the Hadoop cluster can be checked through this interface. Again, color codes are used to show the workflow status:

  • Successful Jobs – Green
  • Running Jobs – Yellow
  • Failed Jobs – Red

The following image shows an example of this:
Oozie Dashboard Workflows

New workflows can also be designed through this interface. A built-in Oozie editor lets you create new workflows using a simple drag-and-drop interface.
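Workflows built in the editor are ultimately submitted to the Oozie server, which can also be done programmatically. The sketch below uses the Oozie Java client; the server URL, HDFS application path, and the nameNode/jobTracker properties are placeholders for a local setup:

```java
// Minimal sketch: submitting an existing workflow definition with the Oozie
// Java client. All URLs, paths, and property values are placeholders.
import java.util.Properties;
import org.apache.oozie.client.OozieClient;

public class SubmitOozieWorkflow {
    public static void main(String[] args) throws Exception {
        OozieClient oozie = new OozieClient("http://localhost:11000/oozie"); // assumed Oozie URL
        Properties conf = oozie.createConfiguration();
        // Points at a directory on HDFS that already contains workflow.xml
        conf.setProperty(OozieClient.APP_PATH, "hdfs://localhost:8020/user/demo/workflow");
        conf.setProperty("nameNode", "hdfs://localhost:8020");
        conf.setProperty("jobTracker", "localhost:8032");
        String jobId = oozie.run(conf); // submit and start the workflow
        System.out.println("Submitted workflow " + jobId
                + ", status: " + oozie.getJobInfo(jobId).getStatus());
    }
}
```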

Conclusion

Here, we have given an introduction to Hadoop along with a detailed description of the Hue tools. Hue provides an easy-to-use interface that covers all the steps of working with the Hadoop ecosystem, and you can access all Hadoop services through it. With this detailed Hue Hadoop Tutorial guide, you can start your basic work and gain enough knowledge to use the Hue platform quickly.


