
Help wanted: Data Warehouse and Data Mining, Problem Set 1. How should these be solved?

Popularity: 177   Posted: 2016-05-05 16:11:00.0
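Sorry, the questions are in English. If the points on offer seem too few, I can add more! Any help from the experts here would be much appreciated, thanks!
10 points per question!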

I. Data warehouse design
(1) Enumerate three classes of schemas that are popularly used for modeling data warehouses.
(2) Draw a snowflake schema diagram for the Big_University data warehouse, which consists of four dimensions (student, course, semester, and instructor) and two measures (count and avg_grade). At the lowest conceptual level, avg_grade stores the actual grade of a student; at higher conceptual levels, it stores the average grade for the given student, course, semester, and instructor.
(3) Starting with the base cuboid (student, course, semester, instructor), what specific OLAP operations (e.g., roll-up from "semester" to "year") should be performed in order to list the average grade of each student who has taken a "CS" course?
(4) If each dimension has 5 levels (including all), e.g., student < major < status < university < all, how many cuboids does this data cube contain (including the base cuboid and the apex cuboid)?
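
For question (4), a minimal sketch of the usual counting argument: each cuboid picks exactly one level per dimension, so when the level counts already include "all", the total is simply the product of the per-dimension level counts. The function name `count_cuboids` and the dictionary layout are illustrative assumptions, not part of the question.

```python
from math import prod

def count_cuboids(levels_per_dimension):
    """Number of cuboids when each count already includes the top level 'all'.

    A cuboid is one choice of level per dimension, so the counts multiply.
    """
    return prod(levels_per_dimension)

# Assumption taken from the question: all four dimensions have 5 levels,
# with 'all' counted as one of the 5.
levels = {"student": 5, "course": 5, "semester": 5, "instructor": 5}
total = count_cuboids(levels.values())
print(total)  # 625
```

If a course uses the convention where the level counts exclude "all", the same product applies after adding 1 to each count.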

II. Data cube computation
Suppose a base cuboid has 3 dimensions, (A, B, C), with the following numbers of cells: |A| = 1,000,000, |B| = 100, and |C| = 1,000. Suppose each dimension is partitioned evenly into 10 portions for chunking.
(1) Assuming each dimension has only one level, draw the complete lattice of the cube.
(2) If each cube cell stores one measure of 4 bytes, what is the total size of the computed cube if the cube is dense?
(3) If the cube is very sparse, describe an effective multidimensional array structure for storing the sparse cube.
(4) State the order for computing the chunks in the cube that requires the least amount of space, and compute the total amount of main memory space required for computing the 2-D planes.
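
The arithmetic behind questions (1), (2), and (4) can be sketched as follows. This assumes the question refers to chunk-based multiway array aggregation, where the chunks are scanned with the smallest dimension varying fastest, so that one whole 2-D plane, one row of chunks of a second plane, and one chunk of the third plane are held in memory at once. The helper names (`lattice`, `dense_cube_cells`, `min_plane_memory_cells`) are made up for illustration.

```python
from itertools import combinations
from math import prod

# Values given in the question.
CARD = {"A": 1_000_000, "B": 100, "C": 1_000}
PARTS = 10           # each dimension is split into 10 equal portions
BYTES_PER_CELL = 4   # one 4-byte measure per cell

def lattice(dims):
    """All cuboids as tuples of dimension names, from the apex () to the base."""
    return [c for k in range(len(dims) + 1) for c in combinations(dims, k)]

def dense_cube_cells(card):
    """Sum of cell counts over every cuboid of a fully dense cube."""
    # prod of an empty sequence is 1, which is the single apex cell.
    return sum(prod(card[d] for d in cuboid) for cuboid in lattice(sorted(card)))

def min_plane_memory_cells(card, parts):
    """Cells held in memory for the three 2-D planes under the assumed rule.

    Scan order is ascending cardinality (d1 fastest, d3 slowest):
    whole d1-d2 plane + one row of the d1-d3 plane + one chunk of the d2-d3 plane.
    """
    d1, d2, d3 = sorted(card, key=card.get)
    chunk = {d: card[d] // parts for d in card}
    whole_plane = card[d1] * card[d2]
    one_row = card[d1] * chunk[d3]
    one_chunk = chunk[d2] * chunk[d3]
    return (d1, d2, d3), whole_plane + one_row + one_chunk

print(lattice(sorted(CARD)))                     # 8 cuboids for 3 dimensions
cells = dense_cube_cells(CARD)
print(cells, cells * BYTES_PER_CELL)             # 101101101101 404404404404
order, mem_cells = min_plane_memory_cells(CARD, PARTS)
print(order, mem_cells, mem_cells * BYTES_PER_CELL)  # ('B', 'C', 'A') 20100000 80400000
```

Under these assumptions the dense cube holds 101,101,101,101 cells (about 404 GB at 4 bytes each), and the B, C, A scan order needs 20,100,000 cells (80,400,000 bytes) for the 2-D planes. A different memory-accounting convention would change the second figure.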

III. Mining association rules
Suppose we have the following transactional data.
    TID     Items_bought
    T100    {K, A, D, B}
    T200    {D, A, C, E, B}