Defining Databricks CTAS

The following properties are applicable to a Databricks CTAS object.

| Tab | Property | Description | Additional Information |
|---|---|---|---|
| General | Name | Specifies the name of the CTAS | |
| | Database | Specifies the database to which the CTAS belongs | |
| | Generate | Specifies whether a DDL statement is generated for the CTAS during forward engineering | |
| | Physical Name | Specifies the physical name of the CTAS | |
| | Location | Specifies the directory path where the CTAS data is stored | |
| | Table Properties | Specifies a list of key-value pairs used to tag the table definition | |
| | Using | Specifies the file format for the CTAS | For example, TEXT. |
| | Options | Specifies the table options to optimize table behavior | You can enter key-value pairs to define options for the table. |
| | Version As | Specifies the version of a Delta CTAS | This option is part of temporal specification and references a Delta table using its version. |
| | Timestamp As | Specifies a timestamp of a Delta CTAS | This option is part of temporal specification and references a Delta table at the specified point in time. |
| Storage | Fields Terminated By | Specifies a character that defines column separator | For example, comma ','. |
| | Fields Escaped By | Defines the escape mechanism | |
| | Collection Items Terminated By | Specifies a character that defines collection item separator | For example, underscore '_'. |
| | Map Keys Terminated By | Specifies a character to define a map key separator | For example, colon ':'. |
| | Lines Terminated By | Specifies a row separator character | For example, new line '\n'. |
| | Null Defined As | Defines a value for NULL | For example, 'foonull'. |
| | Serde Handler Class | Specifies a fully-qualified class of a custom SerDe | Default classes are available when Stored As is set to ORC, AVRO, or PARQUET. |
| | Stored As | Specifies the file format for the CTAS | |
| | Input Format | Specifies the input format for the CTAS | Default input formats are available when Stored As is set to TEXTFILE, SEQUENCE FILE, ORC, PARQUET, AVRO, or RCFILE. |
| | Output Format | Specifies the output format of the CTAS | Default output formats are available when Stored As is set to TEXTFILE, SEQUENCE FILE, ORC, PARQUET, AVRO, or RCFILE. |
| | Stored By | Specifies a non-native table format using a storage handler | For example, org.apache.hadoop.hive.hbase.HBaseStorageHandler |
| | Serde Properties | Specifies a list of key-value pairs used to define SerDe properties | |
| Select | Available Tables and Views | Specifies a list of tables and views selected under From | Under Available Tables and Views, select the columns that you want to include in CTAS. Then, click . |
| | Columns | Specifies the selected CTAS columns | |
| | Select Type | Specifies the expression type to indicate whether duplicate rows are returned | All: Indicates that the statement returns all rows, including duplicate rows. Distinct: Indicates that the statement discards duplicate rows and returns only the remaining rows. |
| | Alias | Specifies a temporary name given to a table, column, or expression present in a query based on the selected column | |
| | Expression | Specifies the expression for a selected column | Available only when the selected column is a user-defined function expression |
| From | Available Tables and Views | Specifies a list of available tables and views | Under Available Tables and Views, select the tables or views from which you want to select columns. Then, click . |
| | From | Specifies a list of selected tables or views from which columns are selected | |
| | Alias | Specifies an alternate name for the selected table or view | |
| User Defined SQL | User-Defined SQL | Specifies whether the SQL code used in Forward Engineering is defined using the User Defined SQL tab | |
| | SQL | Specifies the SQL code used during Forward Engineering | |
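
To make the properties above more concrete, the following Python sketch shows one way a subset of them might map onto the clauses of a Databricks CREATE TABLE ... AS SELECT statement. It is an illustration under stated assumptions, not the DDL generator used by Forward Engineering: the helper names (kv_list, select_query, ctas_using, ctas_hive) and the sample table sales.orders are hypothetical, values are not escaped, and the exact DDL that the tool emits may differ. The sketch keeps the data-source form (USING, OPTIONS, LOCATION, TBLPROPERTIES) separate from the Hive-format form (ROW FORMAT, STORED AS), because Databricks SQL documents these as two distinct CREATE TABLE syntaxes.

```python
# Illustrative only: these helpers are hypothetical and are not the tool's DDL generator.
# Values are inserted as-is; no quoting or escaping is attempted.

def kv_list(pairs):
    """Render key-value pairs as 'k' = 'v', ... for OPTIONS or TBLPROPERTIES."""
    return ", ".join(f"'{k}' = '{v}'" for k, v in pairs.items())


def select_query(columns, tables, distinct=False):
    """Build the query from Select-tab columns and From-tab tables.

    columns and tables are lists of (expression_or_name, alias_or_None).
    """
    def item(name, alias):
        return f"{name} AS {alias}" if alias else name

    keyword = "SELECT DISTINCT" if distinct else "SELECT"  # Select Type: Distinct / All
    cols = ", ".join(item(n, a) for n, a in columns)
    srcs = ", ".join(item(n, a) for n, a in tables)
    return f"{keyword} {cols} FROM {srcs}"


def ctas_using(name, query, database=None, using=None, options=None,
               location=None, tblproperties=None):
    """Map General-tab properties onto the data-source (USING) form."""
    parts = [f"CREATE TABLE {database}.{name}" if database else f"CREATE TABLE {name}"]
    if using:
        parts.append(f"USING {using}")
    if options:
        parts.append(f"OPTIONS ({kv_list(options)})")
    if location:
        parts.append(f"LOCATION '{location}'")
    if tblproperties:
        parts.append(f"TBLPROPERTIES ({kv_list(tblproperties)})")
    parts.append(f"AS {query}")
    return "\n".join(parts)


def ctas_hive(name, query, fields=None, lines=None, null_as=None, stored_as=None):
    """Map a few Storage-tab properties onto the Hive-format form."""
    parts = [f"CREATE TABLE {name}"]
    delimited = []
    if fields:
        delimited.append(f"FIELDS TERMINATED BY '{fields}'")
    if lines:
        delimited.append(f"LINES TERMINATED BY '{lines}'")
    if null_as:
        delimited.append(f"NULL DEFINED AS '{null_as}'")
    if delimited:
        parts.append("ROW FORMAT DELIMITED " + " ".join(delimited))
    if stored_as:
        parts.append(f"STORED AS {stored_as}")
    parts.append(f"AS {query}")
    return "\n".join(parts)


query = select_query(
    columns=[("o.region", None), ("o.amount", "total")],
    tables=[("sales.orders", "o")],
    distinct=True,
)
delta_ddl = ctas_using("orders_copy", query, database="sales", using="DELTA",
                       tblproperties={"owner": "analytics"})
text_ddl = ctas_hive("orders_text", query, fields=",", lines=r"\n",
                     null_as="foonull", stored_as="TEXTFILE")
print(delta_ddl)
print(text_ddl)
```

The two builders are kept apart on purpose: combining USING with STORED AS or ROW FORMAT in one statement is not something this sketch attempts.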

Selecting the User-Defined SQL check box ensures that the SQL code on the User Defined SQL tab is used in Forward Engineering. Hence, if you have selected the User-Defined SQL check box and then change the CTAS properties, you must update the SQL code on the User Defined SQL tab accordingly so that the changes are reflected in Forward Engineering.
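
The consequence of this behavior can be sketched in a few lines of Python. This is a minimal model under the assumption that Forward Engineering simply emits the user-defined SQL whenever the check box is selected; the function name forward_engineering_sql and the sample statements are hypothetical and are not part of the product.

```python
# Hypothetical model of the check box behavior; not the tool's implementation.

def forward_engineering_sql(generated_ddl, user_defined_sql, use_user_defined):
    """Return the SQL that would be emitted under the stated assumption."""
    if use_user_defined:
        return user_defined_sql
    return generated_ddl


# DDL derived from the CTAS properties at the time the user-defined SQL was written.
generated_v1 = "CREATE TABLE sales.orders_copy USING DELTA AS SELECT * FROM sales.orders"
user_sql = generated_v1

# A later property change (Select Type set to Distinct) affects only the generated DDL.
generated_v2 = ("CREATE TABLE sales.orders_copy USING DELTA "
                "AS SELECT DISTINCT * FROM sales.orders")

emitted = forward_engineering_sql(generated_v2, user_sql, use_user_defined=True)
print(emitted == generated_v2)
```

With the check box selected, the emitted statement is still the older user-defined SQL, which is why the User Defined SQL tab has to be updated by hand after a property change.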