Sunday, February 25, 2018

Viva Questions For Data Warehousing And Data Mining



1 DWDM Stands For Data Warehousing And Data Mining.

2 ETL Stands For Extract, Transform, And Load.

3 TPS Stands For Transaction Processing System.

4 MIS Stands For Management Information System.

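5 OLTP Vs. OLAP

  OLTP :-  OLTP Stands For Online Transaction Processing. It Handles Many Short, Day-To-Day Transactions Such As Inserts And Updates.

  OLAP :-  OLAP Stands For Online Analytical Processing. It Supports Complex Queries Over Large Volumes Of Historical Data For Analysis And Reporting.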
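
The Contrast Can Be Sketched In A Few Lines Of Python Using The Standard sqlite3 Module. This Is Only An Illustration: The sales Table, Its Columns, And The run_demo Function Are Assumptions Made For The Example, And A Real OLAP System Would Normally Run On A Separate Warehouse Rather Than On The Transactional Database.

```python
import sqlite3


def run_demo():
    """Contrast OLTP-style writes with an OLAP-style query (illustrative only)."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table used only for this sketch.
    conn.execute(
        "CREATE TABLE sales (id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
    )

    # OLTP style: many short transactions, each touching a single row.
    orders = [("North", 120.0), ("South", 80.0), ("North", 45.5)]
    for region, amount in orders:
        with conn:  # each insert is committed as its own transaction
            conn.execute(
                "INSERT INTO sales (region, amount) VALUES (?, ?)",
                (region, amount),
            )

    # OLAP style: one read-only query that aggregates over many rows.
    totals = conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
    ).fetchall()
    conn.close()
    return totals


if __name__ == "__main__":
    print(run_demo())
```

The Inserts Stand In For Day-To-Day Transactions, While The GROUP BY Query Stands In For The Kind Of Summary An Analyst Would Request.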

6 What Is Data Warehouse?

A Data Warehouse Is A Federated Repository For All The Data That An Enterprise's Various Business Systems Collect. The Repository May Be Physical Or Logical.

7 What Is Data Mining?

Data Mining Is The Process Of Sorting Through Large Data Sets To Identify Patterns And Establish Relationships To Solve Problems Through Data Analysis. Data Mining Tools Allow Enterprises To Predict Future Trends.

8 ETL Process In Warehousing.

Extraction Of Data

During Extraction, The Desired Data Is Identified And Extracted From Many Different Sources, Including Database Systems And Applications. Very Often, It Is Not Possible To Identify The Specific Subset Of Interest In Advance, So More Data Than Necessary Has To Be Extracted And The Relevant Data Is Identified At A Later Point In Time. Depending On The Source System's Capabilities (For Example, Operating System Resources), Some Transformations May Take Place During This Extraction Process. The Size Of The Extracted Data Varies From Hundreds Of Kilobytes Up To Gigabytes, Depending On The Source System And The Business Situation. The Same Is True For The Time Delta Between Two (Logically) Identical Extractions: The Time Span May Vary From Days Or Hours Down To Minutes Or Near Real-Time. Web Server Log Files, For Example, Can Easily Grow To Hundreds Of Megabytes In A Very Short Period Of Time.
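
One Common Way To Handle The Time Delta Between Two Extractions Is To Remember The Timestamp Of The Last Extracted Row And Pull Only Newer Rows Next Time. The Python Sketch Below Shows That Idea Under Stated Assumptions: The customers Table, The updated_at Column, And The Functions build_source And extract_since Are All Hypothetical, And An In-Memory SQLite Database Stands In For A Real Source System.

```python
import sqlite3


def build_source():
    """Create a hypothetical source table with a last-modified timestamp."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER, name TEXT, updated_at TEXT)")
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?, ?)",
        [
            (1, "Asha", "2018-02-24T09:00:00"),
            (2, "Ravi", "2018-02-25T10:30:00"),
            (3, "Meena", "2018-02-25T11:45:00"),
        ],
    )
    return conn


def extract_since(conn, watermark):
    """Return rows changed after `watermark` and the new watermark.

    ISO 8601 timestamps in one fixed format sort correctly as plain strings,
    which is why a simple string comparison is enough here.
    """
    rows = conn.execute(
        "SELECT id, name, updated_at FROM customers "
        "WHERE updated_at > ? ORDER BY updated_at",
        (watermark,),
    ).fetchall()
    # Keep the old watermark when nothing new was found.
    new_watermark = rows[-1][2] if rows else watermark
    return rows, new_watermark


if __name__ == "__main__":
    source = build_source()
    changed, mark = extract_since(source, "2018-02-25T00:00:00")
    print(changed, mark)
```

Running The Extraction More Often Shrinks The Time Delta And The Size Of Each Batch, At The Cost Of More Load On The Source System.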

Transportation Of Data

After Data Is Extracted, It Has To Be Physically Transported To The Target System Or To An Intermediate System For Further Processing. Depending On The Chosen Way Of Transportation, Some Transformations Can Be Done During This Process, Too. For Example, A SQL Statement Which Directly Accesses A Remote Target Through A Gateway Can Concatenate Two Columns As Part Of The SELECT Statement.
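
The Column Concatenation Mentioned Above Can Be Sketched In Python With The Standard sqlite3 Module. Two In-Memory SQLite Databases Stand In For The Remote Systems And The Gateway, And The employees And employees_stage Tables, Their Columns, And The transport Function Are Assumptions Made For This Example. SQLite Uses The || Operator For Concatenation; Other Databases May Use CONCAT Instead.

```python
import sqlite3


def transport(source, target):
    """Copy rows from source to target, concatenating two columns on the way."""
    # The transformation happens inside the SELECT, so no separate step is needed.
    rows = source.execute(
        "SELECT id, first_name || ' ' || last_name FROM employees ORDER BY id"
    ).fetchall()
    target.execute(
        "CREATE TABLE IF NOT EXISTS employees_stage (id INTEGER, full_name TEXT)"
    )
    with target:  # commit the whole batch as one transaction
        target.executemany("INSERT INTO employees_stage VALUES (?, ?)", rows)
    return len(rows)


def make_source():
    """Build a hypothetical source database for the demonstration."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employees (id INTEGER, first_name TEXT, last_name TEXT)"
    )
    conn.executemany(
        "INSERT INTO employees VALUES (?, ?, ?)",
        [(1, "Grace", "Hopper"), (2, "Edgar", "Codd")],
    )
    return conn


if __name__ == "__main__":
    src, tgt = make_source(), sqlite3.connect(":memory:")
    transport(src, tgt)
    print(tgt.execute("SELECT * FROM employees_stage ORDER BY id").fetchall())
```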

Scalability Is A Key Concern In ETL. Many Long-Time Users Of Oracle Database Are Experts In Programming Complex Data Transformation Logic Using PL/SQL. Alternatives Exist For Many Such Data Manipulation Operations, Particularly Implementations That Take Advantage Of Oracle's SQL Functionality For ETL And The Parallel Query Infrastructure.

9 What Is CRISP-DM?

CRISP-DM Stands For Cross-Industry Standard Process For Data Mining. The CRISP-DM Methodology Provides A Structured Approach To Planning A Data Mining Project. It Is A Robust And Well-Proven Methodology. We Do Not Claim Any Ownership Over It. We Did Not Invent It. We Are However Evangelists Of Its Powerful Practicality, Its Flexibility And Its Usefulness When Using Analytics To Solve Thorny Business Issues. It Is The Golden Thread That Runs Through Almost Every Client Engagement. The CRISP-DM Model Is Shown On The Right.

This Model Is An Idealised Sequence Of Events. In Practice Many Of The Tasks Can Be Performed In A Different Order And It Will Often Be Necessary To Backtrack To Previous Tasks And Repeat Certain Actions. The Model Does Not Try To Capture All Possible Routes Through The Data Mining Process.

The Six Phases Of The Process Are:

Business Understanding
Data Understanding
Data Preparation
Modeling
Evaluation
Deployment

