HiveQL CREATE TABLE Statement

From GM-RKB
Jump to navigation Jump to search

A HiveQL CREATE TABLE Statement is a SQL CREATE TABLE Statement expressed in HiveQL (to create a Hive table).

  • Example(s):
    • CREATE TABLE IF NOT EXISTS mydb.employees
      (name STRING COMMENT 'Employee name', salary FLOAT COMMENT 'Employee salary', subordinates ARRAY<STRING> COMMENT 'Names of subordinates'
    • CREATE TABLE ca_employees
      AS SELECT name, salary, address FROM employees WHERE se.state = 'CA';
    • CREATE TABLE sales
      (sales_order_id BIGINT, order_date STRING)
      PARTITIONED BY (country STRING, year INT) ;
    • CREATE EXTERNAL TABLE tablename
      (tablecol STRING)
      ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n' STORED AS TEXTFILE LOCATION '/home/admin/userdata';
  • Counter-Example(s):
  • See: HiveQL DML.


References

2013

http://stackoverflow.com/questions/11053567/inserting-data-into-hive-table

    • I think the best way is:
      1. ) Copy data into HDFS (if it is not already there)
      2. ) Create external table over your CSV like this

        CREATE EXTERNAL TABLE TableName (id int, name string) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n' STORED AS TEXTFILE LOCATION 'place in HDFS';

2012

  • http://my.safaribooksonline.com/book/databases/hadoop/9781449326944/4dot-hiveql-data-definition/creatingtables_html
    • The CREATE TABLE statement follows SQL conventions, but Hive’s version offers significant extensions to support a wide range of flexibility where the data files for tables are stored, the formats used, etc. We discussed many of these options in Text File Encoding of Data Values and we’ll return to more advanced options later in Chapter 15. In this section, we describe the other options available for the CREATE TABLE statement, adapting the employees table declaration we used previously in Collection Data Types:

      CREATE TABLE IF NOT EXISTS mydb.employees (name STRING COMMENT 'Employee name', salary FLOAT COMMENT 'Employee salary', subordinates ARRAY<STRING> COMMENT 'Names of subordinates'


  • http://my.safaribooksonline.com/book/databases/hadoop/9781449326944/5dot-hiveql-data-manipulation/_creating_tables_and_loading_them_in_one_query_html
    • You can also create a table and insert query results into it in one statement:

      CREATE TABLE ca_employees AS SELECT name, salary, address FROM employees WHERE se.state = 'CA';

      This table contains just the name, salary, and address columns from the employee table records for employees in California. The schema for the new table is taken from the SELECT clause.

      A common use for this feature is to extract a convenient subset of data from a larger, more unwieldy table.

      This feature can’t be used with external tables. Recall that “populating” a partition for an external table is done with an ALTER TABLE statement, where we aren’t “loading” data, per se, but pointing metadata to a location where the data can be found.