Decision Support Systems

Decision Support Systems 2011/2012 Week 1. Lecture 1

Outline
- Course presentation
- Decision Support Systems: An Overview

Decision Support Systems: The Course

Faculty
Francisco Melo (fmelo@inesc-id.pt)
Office hours: Monday, 9h30-11h00, and Wednesday, 14h00-15h30 (S. Polivalente, Pav. Informática II)
Contacting me by e-mail:
- E-mail mainly for logistic issues
- Otherwise, use office hours
For news, keep an eye on the webpage: http://fenix.ist.utl.pt/disciplinas/sad/2011-2012/1-semestre/

Bibliography
Main:
- J. Han, M. Kamber. Data Mining: Concepts and Techniques. Morgan Kaufmann Publishers, 2nd edition, 2006. (There is a new edition of 2011, but I'll stick to the previous one.)

Bibliography (cont.)
Auxiliary:
- R. Kimball, M. Ross. The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling. Wiley Computer Publishing, 2nd edition, 2002.
- T. Mitchell. Machine Learning. McGraw Hill, 1997.
- B. Schölkopf, A. J. Smola. Learning with Kernels: Support Vector Machines, Regularization, Optimization and Beyond. MIT Press, 2002.

Classes
- Lectures:
  - Slide presentations of textbook material (slides will be available online)
  - Lecture notes, when needed (will also be available online)
- Lab sessions:
  - Groups of 3 (preferably)
  - SQL Server 2008
  - Exercises and practical tasks
  - Begin on September 26th (but keep an eye on the webpage)

Grading
Three components:
- Data warehousing project: 30% of final grade. Consists of 4 homeworks that follow lab sessions.
- Data mining project: 30% of final grade. Consists of 4 homeworks that follow lab sessions.
- Final examination: 40% of final grade.

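Grading (cont.)
The final grade is given by:
  NF = 0.3 NDW + 0.3 NDM + 0.4 NE
where NDW is the data warehousing project grade, NDM the data mining project grade and NE the examination grade.
To pass the course you must satisfy all of the conditions below:
- NDW ≥ 9.5
- NDM ≥ 9.5
- NE ≥ 9.5
All grades are posted on the course website.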
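The grading rule can be written out as a short Python sketch. The function names are hypothetical, and the sketch assumes that the comparison lost from the slide is "greater than or equal to 9.5" for each component.

```python
def final_grade(ndw: float, ndm: float, ne: float) -> float:
    """Weighted final grade: 30% DW project, 30% DM project, 40% exam."""
    return 0.3 * ndw + 0.3 * ndm + 0.4 * ne


def passes(ndw: float, ndm: float, ne: float, minimum: float = 9.5) -> bool:
    """Assumption: every component must reach the minimum on its own."""
    return min(ndw, ndm, ne) >= minimum


if __name__ == "__main__":
    grades = (14.0, 12.0, 10.0)  # hypothetical component grades
    print(round(final_grade(*grades), 2), passes(*grades))
```

Note that a high weighted average does not compensate for a single component below the minimum, because the conditions are checked per component.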

Grading (cont.)
- Projects/homework assignments should be completed in groups of three students.
- Project grades will be given individually upon discussion, if deemed necessary by the faculty.
- Although discussion between groups is allowed, it should always be kept in general terms. Students are not allowed to show, share or discuss specific solutions, either physically or electronically.
- Similarly, you may consult references (both on paper and online) for ideas about how to tackle specific problems. However, the solutions delivered should result from original work by the students.

Important Dates
Data-warehousing project:
1. Issued: Sep. 25. Due: Oct. 3 (at the end of lecture)
2. Issued: Oct. 2. Due: Oct. 10 (at the end of lecture)
3. Issued: Oct. 9. Due: Oct. 17 (at the end of lecture)
4. Issued: Oct. 16. Due: Oct. 24 (at the end of lecture)

Important Dates
Data-mining project:
1. Issued: Oct. 23. Due: Oct. 31 (at the end of lecture)
2. Issued: Nov. 6. Due: Nov. 14 (at the end of lecture)
3. Issued: Nov. 20. Due: Nov. 28 (at the end of lecture)
4. Issued: Dec. 4. Due: Dec. 12 (at the end of lecture)

Important Dates
Examinations:
- Jan. 07, 2012
- Jan. 31, 2012 (Recurso, i.e. the resit exam)

Syllabus
- Introduction (Chap. 1)
- Data pre-processing (Chap. 2)
- Data warehousing (Chap. 3)
  - Multidimensional data model
  - Data warehouse architecture
  - Online analytical processing (OLAP)
- Data cube computation (Chap. 4)

Syllabus (cont.)
- Pattern mining (Chap. 5)
  - Itemset mining
  - Mining association rules
- Clustering (Chap. 7)
  - k-means
  - Hierarchical methods
  - Expectation-maximization
- Supervised learning (Chap. 6)
  - Decision tree learning
  - Bayesian learning
  - Learning sets of rules
  - Artificial neural networks
  - Support vector machines
  - Model selection

Decision Support Systems: What Is This All About? or DSS in 60 minutes

Databases: Storing and Accessing Data
Database: a computerized system to maintain information and make it available on demand.
[Diagram labels: Database; Database Management System; Application programs; End users]

Example (Rel. Database)

Suppliers
S#  SNAME  STATUS  CITY
S1  Smith  20      London
S2  Jones  10      Paris
S3  Blake  30      Paris
S4  Clark  20      London
S5  Adams  30      Athens

Parts
P#  PNAME  COLOR  WEIGHT  CITY
P1  Nut    Red    12      London
P2  Bolt   Green  17      Paris
P3  Screw  Blue   17      Rome
P4  Screw  Red    14      London
P5  Cam    Blue   12      Paris
P6  Cog    Red    19      London

Shipments
S#  P#  QTY
S1  P1  300
S1  P2  200
S1  P3  400
S1  P4  200
S1  P5  100
S1  P6  100
S2  P1  300
S2  P2  400
S3  P2  200
S4  P2  200
S4  P4  300
S4  P5  400

[Annotations on the slide: Primary key; Attributes; Relation; Tuple]

Example (Query)
Using the Suppliers, Parts and Shipments relations of the previous example: which supplier city ships the largest total quantity?

  SELECT CITY, SUM(QTY) AS TOTAL
  FROM SUPPLIERS NATURAL JOIN SHIPMENTS
  GROUP BY CITY
  ORDER BY TOTAL DESC
  LIMIT 1;
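The question on this slide can be checked with SQLite from Python. This is a minimal sketch rather than the course's SQL Server setup: the column names sno and pno stand in for S# and P# (an assumption, since # is awkward in identifiers), the join is written explicitly on the supplier number, and only Suppliers and Shipments are joined, because Parts also has a CITY column and a natural join would match on it as well. ORDER BY with LIMIT 1 picks the city with the largest total.

```python
import sqlite3

# Rows copied from the Suppliers and Shipments relations on the slide.
SUPPLIERS = [
    ("S1", "Smith", 20, "London"), ("S2", "Jones", 10, "Paris"),
    ("S3", "Blake", 30, "Paris"), ("S4", "Clark", 20, "London"),
    ("S5", "Adams", 30, "Athens"),
]
SHIPMENTS = [
    ("S1", "P1", 300), ("S1", "P2", 200), ("S1", "P3", 400),
    ("S1", "P4", 200), ("S1", "P5", 100), ("S1", "P6", 100),
    ("S2", "P1", 300), ("S2", "P2", 400), ("S3", "P2", 200),
    ("S4", "P2", 200), ("S4", "P4", 300), ("S4", "P5", 400),
]


def top_shipping_city():
    """Return (city, total shipped quantity) for the top supplier city."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE suppliers (sno TEXT PRIMARY KEY, sname TEXT, "
        "status INTEGER, city TEXT)"
    )
    con.execute("CREATE TABLE shipments (sno TEXT, pno TEXT, qty INTEGER)")
    con.executemany("INSERT INTO suppliers VALUES (?, ?, ?, ?)", SUPPLIERS)
    con.executemany("INSERT INTO shipments VALUES (?, ?, ?)", SHIPMENTS)
    row = con.execute(
        """
        SELECT s.city, SUM(sh.qty) AS total
        FROM suppliers AS s
        JOIN shipments AS sh ON sh.sno = s.sno
        GROUP BY s.city
        ORDER BY total DESC
        LIMIT 1
        """
    ).fetchone()
    con.close()
    return row


if __name__ == "__main__":
    print(top_shipping_city())
```

With the slide's data, London suppliers (S1 and S4) ship 1300 + 900 = 2200 units in total, against 900 for Paris.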

Another (Larger) Example

Customer
CUSTID  NAME         ADDRESS                AGE  INCOME     CREDITCAT  CUSTCAT
C1      Maria Silva  Av. Liberdade, n. 123  31   60,000.00  1          3

Item
ITEMID  NAME       BRAND    CATEGORY         TYPE      PRICE     MADEIN  SUPPLIER  COST
I3      hi-res-tv  Mochiba  High resolution  TV        988.00    Japan   NikoX     600.00
I8      laptop     Cell     Laptop           computer  1,369.00  USA     Cell      983.00

Employee
EMPID  NAME            CATEGORY            GROUP    SALARY      COMMISSION
E55    Santos, Manuel  home entertainment  manager  100,000.00  2%

WorksAt
EMPID  BRANCHID
E55    B1

Purchases
TRANSID  CUSTID  EMPID  DATE        TIME   PAYMENT  AMOUNT
T100     C1      E55    21/03/2011  14:35  VISA     1,357.00

Branch
BRANCHID  NAME     ADDRESS
B1        Colombo  Av. Lusíada

ItemsSold
TRANSID  ITEMID  QTY
T100     I3      1

What is the pair of items most frequently bought together per branch/time of year?
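To make the closing question concrete, here is a minimal Python sketch that counts co-purchased item pairs. Only transaction T100 with item I3 comes from the slide; the other transactions and item I9 are hypothetical, added so that there is something to count. The sketch ignores the "per branch/time of year" part, which would amount to grouping the transactions first and running the same count per group.

```python
from collections import Counter
from itertools import combinations

# T100 is from the slide; T101-T104 and item I9 are made up for illustration.
TRANSACTIONS = {
    "T100": ["I3"],
    "T101": ["I3", "I8"],
    "T102": ["I3", "I8", "I9"],
    "T103": ["I8", "I9"],
    "T104": ["I3", "I8"],
}


def most_frequent_pair(transactions):
    """Return ((item_a, item_b), count) for the most co-purchased pair."""
    counts = Counter()
    for items in transactions.values():
        # Sort and deduplicate so each pair has one canonical form.
        for pair in combinations(sorted(set(items)), 2):
            counts[pair] += 1
    return counts.most_common(1)[0] if counts else None


if __name__ == "__main__":
    print(most_frequent_pair(TRANSACTIONS))
```

Counting every pair this way grows quadratically with basket size, which is one reason dedicated itemset-mining methods exist.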

Interpreting Data
In the presence of huge amounts of data, how can we extract information that is:
- non-trivial
- implicit
- previously unknown
- potentially useful

Interpreting Data (cont.)
Interpretation of data can benefit from:
- Information-friendly ways to represent data: data warehousing
- Automated methods to extract information: data mining

Representing Data

Representing Data
Examples of data representation:

Month  Net profit
Jan.   10,974.00
Feb.    5,944.00
Mar.    4,846.00
Apr.    2,056.00
May     2,250.00
Jun.    3,896.00
Jul.    3,366.00
Aug.    4,936.00
Sep.    4,786.00
Oct.    3,000.00
Nov.    3,566.00
Dec.    2,376.00

Database Example

Sales
NAME   COLOR   SIZE  QTY
skirt  dark    S      2
skirt  dark    M      5
skirt  dark    L      1
skirt  pastel  S     11
skirt  pastel  M      9
skirt  pastel  L     15
skirt  white   S      2
skirt  white   M      5
skirt  white   L      3
dress  dark    S      2
dress  dark    M      6
dress  dark    L     12
dress  pastel  S      4
dress  pastel  M      3
dress  pastel  L      3
dress  white   S      2
dress  white   M      3
dress  white   L      0
shirt  dark    S      2
shirt  dark    M      6
shirt  dark    L      6
pants  white   L      2

Database Example
In the Sales relation (NAME, COLOR, SIZE, QTY):
- Measure attributes: attributes that measure some quantity and can be aggregated (here, QTY).
- Dimension attributes: attributes that define the dimensions along which measure attributes can be viewed (here, NAME, COLOR and SIZE).

Database Example (cont.)
Quantities from the Sales relation by NAME (rows) and COLOR (columns), with SIZE = all:

NAME   dark  pastel  white  TOTAL
skirt     8      35     10     53
dress    20      10      5     35
shirt    14       7     28     49
pants    20       2      5     27
TOTAL    62      54     48    164

This is a cross-tabulation. It is not a relation!
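A cross-tabulation like the one above can be derived from the Sales tuples by summing QTY over SIZE. The following Python sketch uses only the skirt and dress rows, since those are the only items fully listed on the slide; the function name cross_tab is hypothetical.

```python
from collections import defaultdict

# Skirt and dress rows of the Sales relation: (NAME, COLOR, SIZE, QTY).
SALES = [
    ("skirt", "dark", "S", 2), ("skirt", "dark", "M", 5), ("skirt", "dark", "L", 1),
    ("skirt", "pastel", "S", 11), ("skirt", "pastel", "M", 9), ("skirt", "pastel", "L", 15),
    ("skirt", "white", "S", 2), ("skirt", "white", "M", 5), ("skirt", "white", "L", 3),
    ("dress", "dark", "S", 2), ("dress", "dark", "M", 6), ("dress", "dark", "L", 12),
    ("dress", "pastel", "S", 4), ("dress", "pastel", "M", 3), ("dress", "pastel", "L", 3),
    ("dress", "white", "S", 2), ("dress", "white", "M", 3), ("dress", "white", "L", 0),
]


def cross_tab(rows):
    """Sum QTY by NAME (rows) and COLOR (columns), aggregating over SIZE."""
    table = defaultdict(lambda: defaultdict(int))
    for name, color, _size, qty in rows:
        table[name][color] += qty
        table[name]["TOTAL"] += qty      # row total
        table["TOTAL"][color] += qty     # column total
        table["TOTAL"]["TOTAL"] += qty   # grand total
    return {name: dict(cols) for name, cols in table.items()}


if __name__ == "__main__":
    for name, cols in cross_tab(SALES).items():
        print(name, cols)
```

The skirt and dress rows reproduce the corresponding rows of the slide's cross-tabulation; the TOTAL row differs from the slide because shirt and pants are left out here.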

Aggregation
Data can be aggregated across different dimensions. Quantities by NAME (rows) and SIZE (columns), with COLOR = all:

NAME    S   M   L  TOTAL
skirt  15  19  19     53
dress   8  12  15     35
shirt  23   8  18     49
pants  18   6   3     27
TOTAL  64  45  55    164

Aggregation (cont.)
Data can be aggregated at different granularities.
[Figure: 3-dimensional cuboid of Sales quantities over NAME (skirt, dress, shirt, pants, all), COLOR (dark, pastel, white, all) and SIZE (small, medium, large, all)]

Aggregation (cont.)
We can define a hierarchy over attribute values along specific dimensions.

NameCat
NAME   CATEGORY
skirt  womenswear
dress  womenswear
shirt  menswear
pants  menswear

NAME is less general (more attribute values are possible); CATEGORY is more general.

Aggregation (cont.)
Data can be aggregated at different resolutions.

By CATEGORY and SIZE:
CATEGORY     S   M   L  TOTAL
womenswear  23  31  34     88
menswear    41  14  21     76
TOTAL       64  45  55    164

By NAME and SIZE:
NAME    S   M   L  TOTAL
skirt  15  19  19     53
dress   8  12  15     35
shirt  23   8  18     49
pants  18   6   3     27
TOTAL  64  45  55    164

Roll-up goes from the NAME table to the CATEGORY table; drill-down goes the other way.
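Roll-up along the NAME to CATEGORY hierarchy can be sketched in a few lines of Python. The quantities and the hierarchy are taken from the slides; the function name roll_up and the dictionary layout are assumptions made for illustration.

```python
# Quantities by NAME and SIZE (COLOR = all), as on the slide.
BY_NAME = {
    "skirt": {"S": 15, "M": 19, "L": 19},
    "dress": {"S": 8, "M": 12, "L": 15},
    "shirt": {"S": 23, "M": 8, "L": 18},
    "pants": {"S": 18, "M": 6, "L": 3},
}

# The NameCat hierarchy: each NAME value maps to a more general CATEGORY.
NAME_TO_CATEGORY = {
    "skirt": "womenswear",
    "dress": "womenswear",
    "shirt": "menswear",
    "pants": "menswear",
}


def roll_up(by_name, hierarchy):
    """Aggregate NAME-level quantities up to the CATEGORY level."""
    by_category = {}
    for name, sizes in by_name.items():
        bucket = by_category.setdefault(hierarchy[name], {})
        for size, qty in sizes.items():
            bucket[size] = bucket.get(size, 0) + qty
    return by_category


if __name__ == "__main__":
    print(roll_up(BY_NAME, NAME_TO_CATEGORY))
```

Drill-down cannot be computed from the rolled-up table alone: it needs the finer-grained data, which is why a warehouse keeps the detailed level available.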

Data Warehousing
Provides architectures and tools to:
- Organize
- Understand
- Use
data, assisting in strategic decision-making.

Analyzing Data

Analyzing Data
Using the example database above (Customer, Item, Employee, WorksAt, Purchases, Branch, ItemsSold): is there any relation between customers' incomes and the average price of the laptops they purchase?
Statistics is our friend!

Analyzing Data (cont.)
Analyzing data beyond the obvious is hard. Some challenges:
- Incomplete data
- Noisy data
- Inconsistent data

Example:
[Figure annotations: Outlier; Missing value?; What about noise?]

Example: Noisier data (trend less clear)

Example:

Extracting Useful Information
What if you want to predict your missing value?
[Figure annotation: Missing value?]

Extracting Useful Information (cont.)
We can use the observed data to build a model and use it to predict:
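As an illustration of building a model from observed data and using it to predict a missing value, here is a minimal Python sketch. The slide does not say which model is used; a straight line fitted by ordinary least squares is chosen here purely as an assumption, and the data points are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Evaluate the fitted line at x."""
    slope, intercept = model
    return slope * x + intercept


if __name__ == "__main__":
    # Hypothetical observations; the value at x = 3 is missing.
    xs = [1.0, 2.0, 4.0, 5.0]
    ys = [3.0, 5.0, 9.0, 11.0]
    model = fit_line(xs, ys)
    print(predict(model, 3.0))
```

Choosing a straight line is itself an assumption about the data, which is the kind of learning bias the later slides point out.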

Machine Learning
Machine learning/data mining: discipline that devises methods to extract useful information (relations) from data.
The more data you have, the more you can learn.

But...
Accounting for the noise in the model can really make a difference!

If You Had to Predict...
Which of the two predictions would you prefer? WHY?

There's No Free Lunch!
You always assume something about the data: LEARNING BIAS

Data Mining
Analyze data to extract useful (implicit) information:
- Frequent patterns [HK, Chap. 5]
- Clusters [HK, Chap. 7]
- Relations (functions) [HK, Chap. 6]
[HK] J. Han, M. Kamber. Data Mining: Concepts and Techniques. Morgan Kaufmann Publishers, 2nd edition, 2006.

That's It