Thursday, 27 April 2017

Teradata - Relational Database Management System (RDBMS)

              Teradata is one of the popular Relational Database Management System. It is mainly suitable for building large-scale data warehousing applications. Teradata achieves this by the concept of parallelism. It is developed by the company called Teradata.
Features of  Teradata
Following are some of the features of Teradata −
  • ·        Unlimited Parallelism − Teradata database system is based on Massively Parallel Processing (MPP) Architecture. MPP architecture divides the workload evenly across the entire system. Teradata system splits the task among its processes and runs them in parallel to ensure that the task is completed quickly.
  • ·   Shared Nothing Architecture − Teradata’s architecture is called as Shared Nothing Architecture. Teradata Nodes, its Access Module Processors (AMPs) and the disks associated with AMPs work independently. They are not shared with others.
  •    Linear Scalability − Teradata systems are highly scalable. They can scale up to 2048 Nodes. For example, you can double the capacity of the system by doubling the number of AMPs.
  •   Connectivity − Teradata can connect to Channel-attached systems such as Mainframe or Network-attached systems.
  •   Mature Optimizer − Teradata optimizer is one of the matured optimizers in the market. It has been designed to be parallel since its beginning. It has been refined for each release.
  •     SQL − Teradata supports industry standard SQL to interact with the data stored in tables. In addition to this, it provides its own extension.
  •   Robust Utilities − Teradata provides robust utilities to import/export data from/to Teradata systems such as FastLoad, MultiLoad, FastExport and TPT.
  •   Automatic Distribution − Teradata automatically distributes the data evenly to the disks without any manual intervention.

          Teradata Architecture is based on Massively Parallel Processing (MPP) architecture. The major components of Teradata are Parsing Engine, BYNET and Access Module Processors (AMPs). The following diagram shows the high-level architecture of a Teradata Node.



The key components of Teradata are as follows −
·        Node − It is the basic unit in Teradata System. Each individual server in a Teradata system is referred as a Node. A node consists of its own operating system, CPU, memory, own copy of Teradata RDBMS software and disk space. A cabinet consists of one or more Nodes.
·       Parsing Engine − Parsing Engine is responsible for receiving queries from the client and preparing an efficient execution plan. The responsibilities of parsing engine are −
o   Receive the SQL query from the client
o   Parse the SQL query check for syntax errors
o   Check if the user has required privilege against the objects used in the SQL query
o   Check if the objects used in the SQL actually exists
o   Prepare the execution plan to execute the SQL query and pass it to BYNET
o   Receives the results from the AMPs and send to the client
·        Message Passing Layer − Message Passing Layer called as BYNET, is the networking layer in Teradata system. It allows the communication between PE and AMP and also between the nodes. It receives the execution plan from Parsing Engine and sends to AMP. Similarly, it receives the results from the AMPs and sends to Parsing Engine.
·      Access Module Processor (AMP) − AMPs, called as Virtual Processors (Vprocs) are the one that actually stores and retrieves the data. AMPs receive the data and execution plan from Parsing Engine, performs any data type conversion, aggregation, filter, sorting and stores the data in the disks associated with them. Records from the tables are evenly distributed among the AMPs in the system. Each AMP is associated with a set of disks on which data is stored. Only that AMP can read/write data from the disks.
·              Relational Database Management System (RDBMS) is a DBMS software that helps to interact with databases. They use Structured Query Language (SQL) to interact with the data stored in tables.
                  Database is a collection of logically related data. They are accessed by many users for different purposes. For example, a sales database contains entire information about sales which is stored in many tables.

Teradata - SELECT Statement

  SELECT statement is used to retrieve records from a table.

Syntax

Following is the basic syntax of SELECT statement.
SELECT 
column 1, column 2, ..... 
FROM  
tablename;

Teradata - Joins

         Join is used to combine records from more than one table. Tables are joined based on the common columns/values from these tables.
There are different types of Joins available.
  • Inner Join
  • Left Outer Join
  • Right Outer Join
  • Full Outer Join
  • Self Join
  • Cross Join
  • Cartesian Production Join

Teradata - SubQueries

         A subquery returns records from one table based on the values from another table. It is a SELECT query within another query. The SELECT query called as inner query is executed first and the result is used by the outer query. Some of its salient features are −
·        A query can have multiple subqueries and subqueries may contain another subquery.
·        Subqueries doesn't return duplicate records.
·     If the subquery returns only one value, you can use = operator to use it with the outer query. If it returns multiple values you can use IN or NOT IN.

Teradata - Table Types

  Teradata supports the following table types to hold temporary data.
  • Derived Table
  • Volatile Table
  • Global Temporary Table

Teradata - Statistics

               Teradata optimizer comes up with an execution strategy for every SQL query. This execution strategy is based on the statistics collected on the tables used within the SQL query. Statistics on the table is collected using COLLECT STATISTICS command. Optimizer requires environment information and data demographics to come up with optimal execution strategy.

Teradata - Compression

          Compression is used to reduce the storage used by the tables. In Teradata, compression can compress up to 255 distinct values including NULL. Since the storage is reduced, Teradata can store more records in a block. This results in improved query response time since any I/O operation can process more rows per block. Compression can be added at table creation using CREATE TABLE or after table creation using ALTER TABLE command.

No comments:

Post a Comment

Popular Posts