Hive interview questions - 3

12. What is BINARY data type in Hive ?
The BINARY data type is for storing variable-length binary data.

12. How timestamp stored in Hive ?
TIMESTAMP data type stores timestamps with nanosecond precision
the to_utc_timestamp and from_utc_timestamp functions make it possible to do timezone conversions.
DATE data type stores a date with year, month and day components

13. What are the Complex data types ?
Hive has four complex types: ARRAY, MAP, STRUCT, and UNION. ARRAY and MAP are like their namesakes in Java, whereas a STRUCT is a record type that encapsulates a set of named fields. A UNION specifies a choice of data types; values must match exactly one of these types. Complex type declarations must specify the type of the fields in the collection, using an angled bracket notation.
CREATE TABLE complex (
c1 ARRAY<INT>,
c2 MAP<STRING, INT>,
c3 STRUCT<a:STRING, b:INT, c:DOUBLE>,
c4 UNIONTYPE<STRING, INT>
);

14. How to do type Conversions in hive?
TINYINT will be converted to an INT, the reverse conversion will not occur.
The implicit conversion rules can be summarized as follows. Any numeric type can be implicitly converted to a wider type, or to a text type (STRING, VARCHAR, CHAR) . All the text types can be implicitly converted to another text type. Perhaps surprisingly, they can also be converted to DOUBLE or DECIMAL. BOOLEAN types cannot be converted to any other type, and they cannot be implicitly converted to any other type in expressions. TIMESTAMP and DATE can be implicitly converted to a text type. You can perform explicit type conversion using CAST. For example, CAST('1' AS INT) will convert the string '1' to the integer value 1. If the cast fails - as it does in CAST('X' AS INT), for example - the expression returns NULL.

15. What are the different types of Tables in Hive ?
A Hive table is logically made up of the data being stored and the associated metadata describing the layout of the data in the table.Hive stores the metadata in a relational database - and not in HDFS, hive use an embedded Derby database to store metadata information.The Metastore stores all the information about the tables, their partitions, the schemas, the columns and their types, the table locations etc. This information can be queried or modified using a thrift interface and as a result it can be called from clients in different programming languages.

16. Explain Managed Tables and External Tables ?
When you create a table in Hive, by default Hive will manage the data, which means that Hive moves the data into its warehouse directory. Alternatively, you may create an external table, which tells Hive to refer to the data that is at an existing location outside the warehouse directory. The difference between the two table types is seen in the LOAD and DROP semantics.

CREATE TABLE managed_table (dummy STRING);
LOAD DATA INPATH '/user/tom/data.txt' INTO table managed_table;
will move the file hdfs://user/tom/data.txt into Hive’s warehouse directory for the
managed_table table, which is hdfs://user/hive/warehouse/managed_table.5

An external table behaves differently. You control the creation and deletion of the data.
The location of the external data is specified at table creation time:
CREATE EXTERNAL TABLE external_table (dummy STRING)
LOCATION '/user/tom/external_table';
LOAD DATA INPATH '/user/tom/data.txt' INTO TABLE external_table;

17. What are the different types of tables available in HIve?
There are two types. Managed table and external table. In managed table both the data an schema in under control of hive but in external table only the schema is under control of Hive.

18. Is Hive suitable to be used for OLTP systems? Why?
No Hive does not provide insert and update at row level. So it is not suitable for OLTP system.

12345678910