School of Continuing Studies

XBUS-502 Data Sources and Storage

Sep 14—21, 2019
2 days
Washington, District of Columbia, United States
USD 833
USD 416 per day
Sep 28—Oct 5, 2019
2 days
Washington, District of Columbia, United States
USD 833
USD 416 per day
Feb 8—22, 2020
2 days
Washington, District of Columbia, United States
USD 833
USD 416 per day

Disclaimer

Coursalytics is an independent platform to find, compare, and book executive courses. Coursalytics is not endorsed by, sponsored by, or otherwise affiliated with School of Continuing Studies.

Description

Course Details

Before any analysis is possible, relevant data sources must be found and accessed. This class surveys common data sources, including relational databases, non-relational data stores, and web-based data sources, with hands-on examples.

Course Objectives

Upon successful completion of the course, students will:

  • Understand the pros and cons of the various types of databases
  • Access data from web-based APIs
  • Query both relational and non-relational data stores

Eli Broad College of Business

Data Mining and Management Strategies (Online)

Next dates

Oct 1, 2019—Mar 31, 2020
Online
USD 2480

Description

Managers are constantly inundated with information, with data points arriving by the hour, minute and even second. There is demand for data-savvy managers who can filter through the noise, optimize business performance now and identify opportunities that can make a big impact in the future. Often, the real issues and challenges facing the business are not on the surface or easily identifiable. This course will help you uncover and explore hidden patterns in the data, providing insight to predict, experiment and continuously refine strategic decisions with big business impact.

Examine techniques and algorithms for knowledge discovery in databases, from data pre-processing and transformation to model validation and post-processing. In this 100% online eight-week course, you’ll explore marketing business processes that increasingly rely on analytics, including customer acquisition, marketing segmentation and understanding customer lifetime value. Use analytical tools to develop models to support these business processes.

What You’ll Learn

Enterprise Database and Data Models

  • Key differences between data and information
  • An understanding of enterprise database environments
  • Define specific challenges with data cleansing
  • The elements that make up a data model

Extracting Data from a Database

  • The role of queries in extracting data from a database
  • How to implement advanced queries in Microsoft® Access (or other database environment) using a visual querying language
  • How to write queries using Structured Query Language (SQL)
  • Recognize how SQL supports extract, transform and load (ETL) operations that prepare data for analytics model development

Large Scale Implementation of Hadoop® MR

  • An understanding of brute-force and parallel approaches and the differences between them
  • Core concepts, advantages and supporting programs of Apache™ Hadoop®
  • Identify the components of MapReduce
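
As a rough illustration of the MapReduce components named above, the following single-process Python sketch walks through the map, shuffle and reduce phases for a word count. It is not Hadoop code and is not taken from the course materials; the function names (map_phase, shuffle_phase, reduce_phase, word_count) are hypothetical and chosen only for this example.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence on a list of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())
```

In a real cluster the map and reduce calls would run in parallel on different machines; here they run sequentially so the data flow is easy to follow.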

Getting Data: Social Networks and Geolocalization

  • Structure of a web page and how to obtain HTML files
  • The advantages of web crawlers and how to get data page by page
  • How to conduct text analysis: identifying human text, common issues, and resource libraries
  • The ethical implications of using publicly available data

Unstructured Data, Graphs and Networks

  • How to apply the right data structure for a problem
  • The differences between graph, node and edge properties
  • Define what degree means and analyze and interpret the degree distribution
  • Concept of clustering coefficient and what it can mean for your data
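
To make the terms degree, degree distribution and clustering coefficient more concrete, here is a minimal Python sketch for an undirected graph stored as an adjacency dictionary. The helper names (build_graph, degree_distribution, clustering_coefficient) and the sample edge list are assumptions made for illustration, not part of the course.

```python
from collections import Counter
from itertools import combinations


def build_graph(edges):
    """Build an undirected adjacency dictionary from (node, node) pairs."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def degree_distribution(adjacency):
    """Map each degree value to the number of nodes that have that degree."""
    return Counter(len(neighbors) for neighbors in adjacency.values())


def clustering_coefficient(adjacency, node):
    """Fraction of pairs of a node's neighbors that are themselves connected."""
    neighbors = adjacency[node]
    k = len(neighbors)
    if k < 2:
        return 0.0  # fewer than two neighbors: no pairs to connect
    links = sum(1 for u, v in combinations(neighbors, 2) if v in adjacency[u])
    return 2 * links / (k * (k - 1))


# Hypothetical sample: a triangle a-b-c with one extra node d attached to c.
graph = build_graph([("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")])
```

In this sample, node c has degree 3 but only one of its three neighbor pairs is connected, so its clustering coefficient is 1/3, while node a sits entirely inside the triangle and scores 1.0.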

Clustering: Understanding the Relationship of Things

  • Concept of clustering and necessary conditions
  • Continuous and discrete distances and their different implications for clustering
  • How to use bootstrapping to find a good business solution
  • Min, max and mean merging and why it is important to understand these relations

Classifications: Putting Things Where They Belong

  • What classification does and its key components
  • The elements of classification and how to use a decision tree
  • How to apply the idea of impurity to tree induction
  • Discrete and continuous classes and their role in supporting classification
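
The idea of impurity in tree induction can be sketched in a few lines of Python. The example below assumes Gini impurity, which is only one of several possible impurity measures, and picks the split threshold on a single numeric feature that gives the lowest weighted impurity. The function names (gini_impurity, split_impurity, best_threshold) are hypothetical.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


def split_impurity(left, right):
    """Impurity of a split, weighted by the size of each side."""
    total = len(left) + len(right)
    return (len(left) / total) * gini_impurity(left) + (
        len(right) / total
    ) * gini_impurity(right)


def best_threshold(values, labels):
    """Try midpoints between sorted distinct values; return (threshold, impurity).

    Returns None when there are fewer than two distinct values to split on.
    """
    distinct = sorted(set(values))
    best = None
    for low, high in zip(distinct, distinct[1:]):
        threshold = (low + high) / 2
        left = [lab for val, lab in zip(values, labels) if val <= threshold]
        right = [lab for val, lab in zip(values, labels) if val > threshold]
        score = split_impurity(left, right)
        if best is None or score < best[1]:
            best = (threshold, score)
    return best
```

A tree-induction algorithm would apply a search like this to every feature at each node and recurse on the two sides of the chosen split; that recursion is left out here to keep the sketch short.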

Classifications: Advanced Methods

  • Statistical and classification methods—when you would use each
  • What issues to consider when only training data is available
  • Advantages and disadvantages of Artificial Neural Networks (ANN)
  • The limits, constraints and differences of classifiers

Curriculum

8-Week Course

Enterprise Database and Data Models

  • Course Introduction
  • Enterprise Data
  • Types of Enterprise Databases
  • Database Management Systems and Relational Database Design
  • Enterprise Data and Multidimensional Databases
  • Creating Multidimensional Data
  • Sourcing the Data in Data Cubes
  • The Role of ETL in Analytics
  • More on ETL
  • Extracting Data from a Database
  • Conceptual Database Model
  • Cardinality

Extracting Data from a Database

  • Basic Queries
  • Advanced Queries - Part 1
  • Advanced Queries - Part 2
  • Structured Query Language (SQL)
  • SQL Functionality for ETL and SQL Server

Large Scale Implementation of Hadoop® MR

  • Single vs. Parallel Approach
  • Hadoop®: One Framework for Multiple Problems
  • Hadoop® Architecture and File System
  • Hadoop® Distributed File System (HDFS)
  • The MapReduce Paradigm
  • MapReduce Examples
  • MapReduce Streaming
  • Hadoop® Zoo

Getting Data: Social Networks and Geolocalization

  • How the Web Works
  • Anatomy of an HTML Page
  • Parsing
  • Web Crawlers
  • Web Spiders
  • API
  • Text Analysis
  • Ethics

Unstructured Data, Graphs and Networks

  • Nodes and Edges
  • Degree Distributions and Hubness
  • Small World Property - Degrees of Separation
  • Centrality, Betweenness, and Closeness
  • Clustering Coefficient
  • Network Motifs
  • Modularity
  • Data Formats

Clustering: Understanding the Relationship of Things

  • The Idea Behind Clustering
  • Types of Clusters
  • Distances Between Points
  • K-Means Clustering
  • Not Every Cluster Is a Good Cluster
  • How Good Are My Clusters?
  • Hierarchical Clustering
  • Min, Max, and Mean

Classifications: Putting Things Where They Belong

  • The Idea Behind Classification
  • Reading and Interpreting a Classification Tree
  • Making a Decision Tree
  • Alternative Impurity Measures
  • Expansion to 2D
  • How Good Is My Classifier?
  • But I Only Have Training Data
  • A Brief Look at Association Rule Mining

Classifications: Advanced Methods

  • Rule-Based Classifier
  • Extracting Rules
  • Nearest Neighbors
  • Classifiers – Defined Boundaries
  • Artificial Neural Networks
  • Limits, Boundary Conditions and Choosing the Right Classifier
  • Clustering vs. Classification
  • Outlier and Anomaly Detection

Who should attend

This course is designed for professionals who want to deepen their understanding of how big data can be mined and managed to uncover information. By exploring relational databases and predictive modeling techniques, the course helps professionals understand how this process works with various types of data.