Previous Topic: Define Hive MacrosNext Topic: Define Hive User IDs


Define Hive Table Properties

Use the Hive Table Editor to define table properties. This procedure assumes you are working in a physical model, with Hive defined as the target server.

To define properties for a Hive table:

  1. On the Model menu, click Tables .

    The Hive Table Editor opens.

  2. In the navigation grid, select the table that you want to define and work with the following options:

    Note: Click New New icon in property editors to create a new object on the toolbar to create a new table. Use the Enter filter text box to filter a very large list of tables to quickly locate the one that you want to define.

    Physical Name

    Specifies the physical name of the table. Change the physical name of the table in this field.

    Database

    Specifies the database to which the table belongs. Select a database from the drop-down list or click New New icon in property editors to create a new object to create one.

    Table Type

    Specifies the type of the table. Select whether the table is a managed or external table.

    Temporary

    Indicates that the table is temporary.

    Physical Only

    Specifies whether the table is suppressed from a logical model and appears in a physical model only.

    Touch

    Specifies whether you want to read metadata and write it back. This is similar to causing pre/post execute hooks to fire.

    Generate

    Generates SQL during forward engineering. Clear the check box if you do not want to generate SQL.

  3. Click the General tab and work with the following options:
    Physical Name

    Detects the Physical Name of the table. Set the inheritance and override options here.

    Location

    Indicates that a specific distributed file system directory must be used to store data files.

    Table Properties

    Lets you specify additional table properties.

    Like Table

    Lets you specify the table from which to copy table definition exactly (without copying its data). The new table contains no rows.

    Like View

    Lets you specify the view from which to copy view definition exactly (without copying its data). It creates a table by adopting the schema of a view (fields and partition columns) using defaults for SerDe and file formats.

    Concatenate

    Specifies whether multiple small RC or ORC files contained in a table or partition should be merged into larger files.

    Compact

    Specifies whether data should be compacted periodically to save space and optimize data access. Select the type of compaction from the drop-down list:

    MAJOR

    Takes one or more delta files and the base file for the bucket and rewrites them into a new base file per bucket.

    MINOR

    Takes a set of existing delta files and rewrites them to a single delta file per bucket.

    Overwrite Table Properties

    Lets you specify the table properties that can be overwritten.

  4. Click the Storage tab and work with the following options:
    Fields Terminated By

    Specifies a delimiter for structure fields. You can include any single character, but the default is '\001'.

    Fields Escaped By

    Specifies the escape sequence.

    Collection Items Terminated By

    Specifies a delimiter for array items. You can include any single character, but the default is '\002'.

    Map Keys Terminated By

    Specifies a delimiter for map keys. You can include any single character, but the default is '\003'.

    Lines Terminated By

    Specifies a delimiter for map keys. You can include any single character, but the default is '\n'.

    Null Defined As

    Specifies a custom NULL format apart from the Stored As options mentioned above.

    Serde Handler Class

    Specifies the name of a Java class that implements the Hive SerDe interface.

    Stored As

    Specifies the type of file in which data is to be stored. The file can be a TEXTFILE, SEQUENCEFILE, RCFILE, or BINARY SEQUENCEFILE.

    Input Format

    Lets you specify your own Java class if you want Hive to read from a different file format. The value of this property depends on the Stored As value.

    Output Format

    Lets you specify your own Java class if you want Hive to write to a different file format. The value of this property depends on the Stored As value.

    Stored By

    Specifies the name of a Java class that implements the Hive StorageHandler interface.

    Serde Properties

    Specifies SerDe properties to be associated with the storage handler class.

  5. Click the Bucketing and Partition tab and work with the following options:
    Bucket Columns

    Select the columns based on which you want to distribute rows across buckets.

    Bucket Sorted Columns

    Select the column based on which you want the rows to be sorted within each bucket.

    Bucket Option
    Number of Buckets

    Specifies the number of buckets to be created.

    Partition Elements
    Name

    Specifies the name of the partition.

    Physical Data Type

    Specifies the physical data type for the partition element. You can change the data type using the drop-down. The drop-down list contains the data types related to the current domain.

    Comment

    Displays the notes entered by the user. Click the Edit icon to add comments.

    Partitions

    Specifies partition specification and location once partition elements are created.

  6. Click the Skewed Options tab and work with the following options:
    Skewed Columns

    Lets you select the columns that you want to skew.

    Skewed Option

    Specifies whether a skewed table uses the list bucketing feature, which creates subdirectories for skewed values.

    Skewed Column Values

    Specifies the values based on which you want to skew the table.

  7. Click the Statistics tab to view the following options:
    Number Of Partitions

    Specifies the number of partitions the table has.

    Number Of Files

    Specifies the number of files the table has.

    Number Of Rows

    Specifies the number of rows the table has

    Raw Data Size

    Specifies the size of the original dataset.

    Total Size

    Specifies the amount of storage the original dataset takes.

    Note: The above information is fetched only when reverse engineering from database is done and you have chosen to store statistical information.

  8. Click the Volumetrics tab and work with the following options:
    Initial Row Count

    Defines row count for the table. Enter a value in the field.

    Max Rows

    Defines row sizing for the table. Enter a value in the field.

    Growth By Month

    Defines growth increments for the table. Enter a value in the field.

  9. Work with the other tabs in the editor to access additional design features, for example:
  10. (Optional) Click the Comment tab and enter any comments that you want to associate with the object.
  11. (Optional) Click the Where Used tab to view where the object is used within the model.
  12. (Optional) Click the UDP tab to work with user-defined properties for the object.
  13. (Optional) Click the History tab to view the history information for the object.
  14. (Optional) Click the Notes tab to view and edit user notes.
  15. (Optional) Click the Extended Notes to view or edit additional user notes.
  16. Click Close.

    The table is defined and the Hive Table Editor closes.

For more information on table properties, refer to Hive documentation.