Previous Topic: Hive Support SummaryNext Topic: Define Hive Database Roles


Define Hive CTAS

In Hive, tables can also be created and populated by the results of a query in one create-table-as-select (CTAS) statement. The table created by CTAS is atomic, meaning that the table is not seen by other users until all the query results are populated. Therefore, other users will either see the table with the complete results of the query or will not see the table at all.

To define properties for a Hive CTAS:

  1. In the Model Explorer, right-click CTAS and click New.

    An instance of CTAS is created.

  2. Right-click the instance and click Properties.

    The Hive CTAS Editor opens.

    Note: Click New New icon in property editors to create a new object on the toolbar to create a new CTAS. Use the Enter filter text box to filter a very large list of tables to quickly locate the one that you want to define.

    Name

    Specifies the physical name of the table. Change the physical name of the table in this field.

    Database

    Specifies the database to which the CTAS belongs. Select a database from the drop-down list or click New New icon in property editors to create a new object to create one.

    Touch

    Specifies whether you want to read metadata and write it back. This is similar to causing pre/post execute hooks to fire.

    Generate

    Generates SQL during forward engineering. Clear the check box if you do not want to generate SQL.

  3. Click the General tab and work with the following options:
    Physical Name

    Detects the Physical Name of the table. Set the inheritance and override options here.

    Location

    Indicates that a specific distributed file system directory must be used to store data files.

    Table Properties

    Lets you specify additional table properties.

    Like Table

    Lets you specify the table from which to copy table definition exactly (without copying its data). The new table contains no rows.

    Like View

    Lets you specify the view from which to copy view definition exactly (without copying its data). It creates a table by adopting the schema of a view (fields and partition columns) using defaults for SerDe and file formats.

    Concatenate

    Specifies whether multiple small RC or ORC files contained in a table or partition should be merged into larger files.

    Compact

    Specifies whether data should be compacted periodically to save space and optimize data access. Select the type of compaction from the drop-down list:

    MAJOR

    Takes one or more delta files and the base file for the bucket and rewrites them into a new base file per bucket.

    MINOR

    Takes a set of existing delta files and rewrites them to a single delta file per bucket.

    Overwrite Table Properties

    Lets you specify the table properties that can be overwritten.

  4. Click the Storage tab and work with the following options:
    Fields Terminated By

    Specifies a delimiter for structure fields. You can include any single character, but the default is '\001'.

    Fields Escaped By

    Specifies the escape sequence.

    Collection Items Terminated By

    Specifies a delimiter for array items. You can include any single character, but the default is '\002'.

    Map Keys Terminated By

    Specifies a delimiter for map keys. You can include any single character, but the default is '\003'.

    Lines Terminated By

    Specifies a delimiter for map keys. You can include any single character, but the default is '\n'.

    Null Defined As

    Specifies a custom NULL format apart from the Stored As options mentioned above.

    Serde Handler Class

    Specifies the name of a Java class that implements the Hive SerDe interface.

    Stored As

    Specifies the type of file in which data is to be stored. The file can be a TEXTFILE, SEQUENCEFILE, RCFILE, or BINARY SEQUENCEFILE.

    Input Format

    Lets you specify your own Java class if you want Hive to read from a different file format. The value of this property depends on the Stored As value.

    Output Format

    Lets you specify your own Java class if you want Hive to write to a different file format. The value of this property depends on the Stored As value.

    Stored By

    Specifies the name of a Java class that implements the Hive StorageHandler interface.

    Serde Properties

    Specifies SerDe properties to be associated with the storage handler class.

  5. Work with the other tabs in the editor to access additional design features, for example:
  6. (Optional) Click the Comment tab and enter any comments that you want to associate with the object.
  7. (Optional) Click the UDP tab to work with user-defined properties for the object.
  8. (Optional) Click the Notes tab to view and edit user notes.
  9. (Optional) Click the Extended Notes tab to view or edit user notes.
  10. Click Close.

    The table is defined and the Hive Table Editor closes.

For more information on table properties, refer to Hive documentation.