What are some basic Hive commands for beginners?

Basic Hive commands include creating a database (`CREATE DATABASE`), creating a table (`CREATE TABLE`), and loading data into a table (`LOAD DATA`).

How do you create a database in Hive?

Use the command `CREATE DATABASE database_name;`. For example: `CREATE DATABASE sales_db;` creates a database named 'sales_db'.

What is the syntax for querying data in Hive?

The syntax for querying data is similar to SQL. Example: `SELECT * FROM table_name WHERE condition;` retrieves data from a table based on the condition.

How can you join two tables in Hive?

Use the `JOIN` keyword. Example: `SELECT a.col1, b.col2 FROM table1 a JOIN table2 b ON a.id = b.id;` joins two tables on a common column.

What are some advanced Hive commands?

Advanced commands include `CREATE EXTERNAL TABLE`, the `PARTITIONED BY` and `CLUSTERED BY` clauses of `CREATE TABLE`, and `ANALYZE TABLE`. These are used to optimize data storage and processing.

How do you drop a table in Hive?

Use the command `DROP TABLE table_name;`. For example: `DROP TABLE employee;` deletes the 'employee' table.

Frequently Used Hive Commands in HQL with Examples

Introduction

Apache Hive is a data warehouse infrastructure built on the Hadoop framework that is well suited to data summarization, data analysis, and data querying. It is mainly used to manage large datasets that reside in a distributed storage system.

Before it became an open-source Apache project in the Hadoop ecosystem, Hive was originally developed at Facebook. It was designed to impose structure on large datasets and to query that structured data with a SQL-like language called HQL (Hive Query Language).

Hive has become popular because its tables resemble those of relational databases. If you know how to work with SQL, working with Hive will be straightforward. Many users worldwide query data with HQL.

About HQL (Hive Query Language)

HQL is a simple SQL-like query language used to manage and query large datasets, aimed at enterprises that work with large volumes of data every day. It is easy to work with HQL if you know SQL. Experienced Hive programmers can also plug in custom MapReduce scripts to perform more sophisticated analysis.

Hive works on top of Hadoop's distributed storage.
Hive offers a range of tools to enable quick data ETL (Extract/Transform/Load).
It can impose structure on a variety of data formats.
With Hive, you can access data stored in HDFS or in other storage systems such as HBase.

Hive Limitations

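Hive is not suitable for OLTP (online transaction processing); it is designed for OLAP (online analytical processing).
Hive was built for appending or overwriting data. Row-level updates and deletes are available only on ACID transactional tables (Hive 0.14 and later).
Subquery support is limited: Hive accepts subqueries in the FROM clause and, since version 0.13, in the WHERE clause with IN and EXISTS, subject to restrictions.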

Hive or Pig – Which framework is better?

Hive is a data warehouse infrastructure with a declarative, SQL-like language suited to managing structured datasets, while Pig is a data-flow language suited to exploring extremely large datasets. This is why users who already know SQL often prefer Hive over Pig.

Hive Commands in HQL with Examples

So far, we have discussed the basics of Hive and why it is popular among organizations. Now we will focus on Hive commands in HQL, with examples. These are frequently used commands that every Hive programmer should know, whether beginner or experienced. Let us go through each command in turn so that you can start working quickly.

Data Definition Language (DDL)

DDL is used to build or modify tables and other objects stored in the database. Examples of DDL statements include CREATE, DROP, SHOW, TRUNCATE, DESCRIBE, and ALTER.

1. Create Database in Hive

The first step when working with Hive is to create a new database. Open the Hive shell and run `CREATE DATABASE database_name;`. For example, `CREATE DATABASE sales_db;` creates a database named 'sales_db'.
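As a minimal sketch of the fuller syntax, the statement below adds the optional clauses that `CREATE DATABASE` accepts. The comment text and the property name and value are hypothetical and only illustrate the form.

```sql
-- General form (clauses in brackets are optional):
-- CREATE DATABASE [IF NOT EXISTS] database_name
--   [COMMENT 'description']
--   [LOCATION 'hdfs_path']
--   [WITH DBPROPERTIES ('key' = 'value', ...)];

-- IF NOT EXISTS avoids an error when the database is already there.
CREATE DATABASE IF NOT EXISTS sales_db
COMMENT 'Sales data for reporting'                  -- hypothetical description
WITH DBPROPERTIES ('creator' = 'analytics_team');   -- hypothetical property
```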

2. DROP Database in Hive

As the name suggests, the DROP command deletes a database that was created earlier. By default, Hive drops a database in RESTRICT mode, which means the command fails if the database still contains tables. To delete a database together with its tables, add the CASCADE keyword.

If you try to drop a database that does not exist, Hive reports an error. Adding the IF EXISTS clause suppresses that error. The general syntax is `DROP DATABASE [IF EXISTS] database_name [RESTRICT|CASCADE];`.
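The two drop modes described above can be sketched as follows, reusing the 'sales_db' example name. The two statements are alternatives, not a sequence to run together.

```sql
-- Default (RESTRICT) behaviour: succeeds only if sales_db holds no tables.
DROP DATABASE IF EXISTS sales_db;

-- CASCADE: drops every table in sales_db first, then the database itself.
DROP DATABASE IF EXISTS sales_db CASCADE;
```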

3. DESCRIBE Database in Hive

The DESCRIBE command displays the metadata associated with a database, such as its comment and its location on the filesystem. Adding the EXTENDED keyword also shows the database properties. The general syntax is `DESCRIBE DATABASE [EXTENDED] database_name;`.
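For illustration, assuming the 'sales_db' database from the earlier example exists:

```sql
-- Shows the database name, comment, location, and owner.
DESCRIBE DATABASE sales_db;

-- Same output plus any DBPROPERTIES set on the database.
DESCRIBE DATABASE EXTENDED sales_db;
```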

4. ALTER database in Hive

If you want to change the metadata associated with a database, use the ALTER command. It can set database properties and change the owner of the database, which may be either a user or a role. The general forms are `ALTER DATABASE database_name SET DBPROPERTIES (...);` and `ALTER DATABASE database_name SET OWNER [USER|ROLE] name;`.
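A short sketch of both uses follows. The property key, the user `data_admin`, and the role `analysts` are hypothetical names chosen for the example.

```sql
-- Attach or update a key/value property on the database.
ALTER DATABASE sales_db SET DBPROPERTIES ('edited_by' = 'data_admin');

-- Transfer ownership to a user ...
ALTER DATABASE sales_db SET OWNER USER data_admin;

-- ... or to a role.
ALTER DATABASE sales_db SET OWNER ROLE analysts;
```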

5. SHOW database in Hive

To check which databases exist in the Hive metastore, use the SHOW command. `SHOW DATABASES;` lists all databases that currently exist.
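The listing can also be filtered by name. In the sketch below, the pattern `'sales*'` is only an example; `*` matches any sequence of characters.

```sql
-- List every database.
SHOW DATABASES;

-- List only the databases whose names start with "sales".
SHOW DATABASES LIKE 'sales*';
```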

6. USE database in Hive

The USE command sets the current database for the session, so that the queries that follow run against that database. The general syntax is `USE database_name;`.
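A minimal session might look like this, assuming 'sales_db' exists. `current_database()` is a built-in function in Hive 0.13 and later.

```sql
-- Make sales_db the current database for this session.
USE sales_db;

-- Confirm which database is active.
SELECT current_database();

-- Switch back to the built-in default database.
USE default;
```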

DDL command for Tables in Hive

So far, we have covered the DDL commands for databases: how to create, drop, describe, alter, list, and select a database. With those basics in place, it is time to start working with tables in Hive using DDL commands. They are simple to use, as the following sections show.

1. How to create a table in Hive?

The CREATE TABLE command creates a table in an existing database to store data in rows and columns. For example, if you want to create a table named "Employee", its fields could include the name, address, phone number, email id, occupation, and so on. You can also add a LOCATION clause to specify where the table's data should be stored within HDFS.

Hive also lets you copy the schema of an existing table without its data. In other words, only the structure is copied to the new table, and data can be added later. This speeds up table creation and saves the programmer from retyping the column definitions.
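The "Employee" example above could be written roughly as follows. The column types, the comma delimiter, and the HDFS path are assumptions made for illustration; adjust them to your data and cluster layout.

```sql
CREATE TABLE IF NOT EXISTS employee (
  name       STRING,
  address    STRING,
  phone      STRING,
  email      STRING,
  occupation STRING
)
COMMENT 'Employee contact details'
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY ','            -- assumes comma-separated input files
STORED AS TEXTFILE
LOCATION '/data/sales_db/employee';   -- hypothetical HDFS path; the clause is optional
```

If LOCATION is omitted, Hive stores the table under the database's directory in its warehouse.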
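Copying only the schema is done with `CREATE TABLE ... LIKE`. In this sketch, `employee_archive` is a hypothetical name for the new table.

```sql
-- Creates an empty table with the same columns and storage settings as employee.
CREATE TABLE IF NOT EXISTS employee_archive LIKE employee;
```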

2. DROP table command in Hive

The DROP command removes a table. For a managed table, it deletes both the table's metadata and its data; for an external table, only the metadata is removed and the underlying files stay in place. When the filesystem Trash is enabled, the data is moved to Trash and can be recovered if needed. To delete the data permanently, add the PURGE option to the DROP command so that the data skips the Trash.
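Both variants are sketched here with the 'employee' table. They are alternatives, so run one or the other.

```sql
-- Recoverable drop: data goes to Trash if Trash is configured.
DROP TABLE IF EXISTS employee;

-- Permanent drop: PURGE bypasses the Trash (Hive 0.14 and later).
DROP TABLE IF EXISTS employee PURGE;
```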

3. Truncate table command in Hive

The TRUNCATE command deletes all rows from a table while keeping the table definition (its columns and metadata) in place. Treat it as irreversible: depending on the Hive version and the Trash configuration, the removed data may not be recoverable. The general syntax is `TRUNCATE TABLE table_name;`.
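A minimal example, assuming 'employee' is a managed table:

```sql
-- General form: TRUNCATE TABLE table_name [PARTITION (partition_spec)];

-- Removes every row; the table and its columns remain.
TRUNCATE TABLE employee;
```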

4. Alter table commands in Hive

With the ALTER command, you can change a table's structure or metadata, for example by renaming the table, adding or renaming columns, or setting table properties.

In addition, the DESCRIBE command shows the metadata associated with a table, and the SHOW TABLES command lists the tables available in a particular database.
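A few common ALTER TABLE forms are sketched below. The new names (`staff`, `department`, `phone_number`) are hypothetical, and the statements assume the 'employee' table has a `phone` column of type STRING.

```sql
-- Rename the table.
ALTER TABLE employee RENAME TO staff;

-- Append a new column to the end of the schema.
ALTER TABLE staff ADD COLUMNS (department STRING COMMENT 'Department name');

-- Rename a column; the type must be restated.
ALTER TABLE staff CHANGE phone phone_number STRING;

-- Set or update table-level metadata.
ALTER TABLE staff SET TBLPROPERTIES ('comment' = 'Staff contact details');
```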
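These two inspection commands can be sketched as follows, assuming the 'sales_db' database and 'employee' table from the earlier examples.

```sql
-- List the tables in the current database, or in a named one.
SHOW TABLES;
SHOW TABLES IN sales_db;

-- Show column names and types.
DESCRIBE employee;

-- Show columns plus location, owner, table type, and other details.
DESCRIBE FORMATTED employee;
```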

Final Words:

That’s all for today! We have discussed the basic DDL commands that help you create and manage databases and tables in Hive. To learn more about Hive commands in HQL with examples, you can join the JanBask Training Hadoop training and certification program.
