Academic Year 102, Semester 2 - 1766 Categorical Data Analysis - Course Information

Enrollment Analysis

This course has a capacity of 70 students; 10 have enrolled, leaving 60 places.

Grading

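Item | Weight (%) | Description
Regular assessment | 40 | Learning attitude (including attendance), homework, quiz scores, class discussion and participation
Midterm exam | 30 | Written test + PPT oral presentation
Final exam | 30 | Written test + PPT written report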
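The weights above can be read as a weighted average. The following is a minimal sketch, assuming each component is scored on a 0-100 scale (the page does not state the scale); the key names and the sample scores are hypothetical.

```python
# Weights in percent, taken from the grading table.
WEIGHTS = {"regular_assessment": 40, "midterm_exam": 30, "final_exam": 30}


def course_total(scores):
    """Weighted course total, assuming every component score is on a 0-100 scale."""
    if sum(WEIGHTS.values()) != 100:
        raise ValueError("weights must sum to 100")
    return sum(WEIGHTS[item] * scores[item] for item in WEIGHTS) / 100


# Hypothetical scores, for illustration only.
example_scores = {"regular_assessment": 80, "midterm_exam": 70, "final_exam": 90}
print(course_total(example_scores))  # (40*80 + 30*70 + 30*90) / 100 = 80.0
```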

Instructor

林雅俐

Educational Objectives

Categorical Data Analysis (CDA) focuses on categorical response (target) variables: binary, multi-category, and ordinal. Starting from common descriptive statistics (frequencies and proportions) and statistical graphics (such as pie charts and bar charts), the course moves through probability distributions and sampling distributions to inference on population parameters, two-way and three-way contingency table analysis, and the construction of decision trees and predictive models for the target variable. Frequency tables and contingency tables are used to describe the association between qualitative target and predictor variables. Patterns in the data may be modeled in a way that accounts for randomness and uncertainty in the observations, and the models are then used to draw inferences about the process or population being studied. These methods apply to a wide variety of academic disciplines, from the natural and social sciences to the humanities, government, and business.

The course uses the statistical software SAS Enterprise Guide (SAS EG), SAS Enterprise Miner (SAS EM), and SPSS for data analysis, and introduces data mining methods from a statistical point of view: association analysis (market basket case), neural networks (fraud case), decision trees (credit status case), and logistic regression models (credit status case). Students will learn to analyze massive, complicated data and to extract useful hidden information from high-dimensional, large-scale databases, turning raw data into valuable information that supports management decision-making.

Course Overview

Categorical data analysis, which deals with qualitative or discrete quantitative data, is one of the most important statistical tools in use today. In recent years it has played a fundamental role in analyzing polychotomous data, particularly in the social and health sciences. This course introduces statistical theories and models for analyzing categorical data. The main topics are:
(1) Likelihood-based inference on measures of association for two-dimensional and three-dimensional contingency tables under different assumptions.
(2) Generalized linear (mixed) models, with emphasis on binary (Poisson) regression and logit models.
(3) Repeated categorical data modeling, such as generalized estimating equation approaches and quasi-likelihood methods.
(4) Asymptotic results and other advanced topics.
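As an illustration of topic (1), the sketch below computes two standard quantities for a 2x2 contingency table: the sample odds ratio with an approximate 95% Wald interval, and the likelihood-ratio statistic G^2 for independence. It is not taken from the course materials, which use SAS and SPSS; the function names and the counts are hypothetical, and only the Python standard library is assumed.

```python
import math


def odds_ratio(table):
    """Sample odds ratio for a 2x2 table [[n11, n12], [n21, n22]]."""
    (n11, n12), (n21, n22) = table
    return (n11 * n22) / (n12 * n21)


def odds_ratio_wald_ci(table, z=1.96):
    """Approximate 95% Wald interval for the odds ratio, built on the log scale."""
    (n11, n12), (n21, n22) = table
    log_or = math.log(odds_ratio(table))
    se = math.sqrt(1 / n11 + 1 / n12 + 1 / n21 + 1 / n22)
    return math.exp(log_or - z * se), math.exp(log_or + z * se)


def g_squared(table):
    """Likelihood-ratio statistic G^2 = 2 * sum(observed * log(observed / expected))."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    g2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if observed > 0:  # a zero cell contributes nothing to the sum
                g2 += 2 * observed * math.log(observed / expected)
    return g2


def p_value_df1(statistic):
    """Upper-tail chi-squared probability with 1 degree of freedom (2x2 tables)."""
    return math.erfc(math.sqrt(max(statistic, 0.0) / 2))


# Hypothetical counts, for illustration only.
counts = [[30, 10], [15, 25]]
g2 = g_squared(counts)
print(odds_ratio(counts), odds_ratio_wald_ci(counts), g2, p_value_df1(g2))
```

The Wald interval and the G^2 test both rely on large-sample approximations, so they are less reliable when cell counts are small.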

Course Information

Reference Books

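1. *Agresti, A. and Franklin, C. (2009), Statistics: The Art and Science of Learning from Data, 2nd edition, Pearson Education, Inc. (distributed by 東華書局/新月圖書)
2. *曾淑峰, 林志弘, and 翁玉麟 (September 2012), 資料採礦應用：以SAS Enterprise Miner為工具 [Data Mining Applications: Using SAS Enterprise Miner as the Tool], 梅霖文化事業有限公司 (ISBN: 978-986-6511-60-8)
3. *Slaughter, S.J. and Delwiche, L.D., translated by 蔡宏明 and 蔡秉諺 (November 2011), SAS Enterprise Guide實用工具書 [SAS Enterprise Guide Practical Handbook], 梅霖文化事業有限公司 (ISBN: 978-986-6511-58-5)
4. 邱皓政* (October 2010), 量化研究與統計分析：SPSS資料分析範例 [Quantitative Research and Statistical Analysis: SPSS Data Analysis Examples], 5th edition, 五南圖書股份有限公司.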