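A collection of best papers from the major data mining conferences; I will analyze their contents in later posts:
The main conferences are KDD, SIGMOD, VLDB, ICML, and SIGIR.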
KDD (Data Mining) | ||
2013 | Simple and Deterministic Matrix Sketching | Edo Liberty, Yahoo! Research |
2012 | Searching and Mining Trillions of Time Series Subsequences under Dynamic Time Warping | Thanawin Rakthanmanon, University of California Riverside; et al. |
2011 | Leakage in Data Mining: Formulation, Detection, and Avoidance | Shachar Kaufman, Tel-Aviv University; et al. |
2010 | Large linear classification when data cannot fit in memory | Hsiang-Fu Yu, National Taiwan University; et al. |
2010 | Connecting the dots between news articles | Dafna Shahaf & Carlos Guestrin, Carnegie Mellon University |
2009 | Collaborative Filtering with Temporal Dynamics | Yehuda Koren, Yahoo! Research |
2008 | Fastanova: an efficient algorithm for genome-wide association study | Xiang Zhang, University of North Carolina at Chapel Hill; et al. |
2007 | Predictive discrete latent factor models for large scale dyadic data | Deepak Agarwal & Srujana Merugu, Yahoo! Research |
2006 | Training linear SVMs in linear time | Thorsten Joachims, Cornell University |
2005 | Graphs over time: densification laws, shrinking diameters and possible explanations | Jure Leskovec, Carnegie Mellon University; et al. |
2004 | A probabilistic framework for semi-supervised clustering | Sugato Basu, University of Texas at Austin; et al. |
2003 | Maximizing the spread of influence through a social network | David Kempe, Cornell University; et al. |
2002 | Pattern discovery in sequences under a Markov assumption | Darya Chudova & Padhraic Smyth, University of California Irvine |
2001 | Robust space transformations for distance-based operations | Edwin M. Knorr, University of British Columbia; et al. |
2000 | Hancock: a language for extracting signatures from data streams | Corinna Cortes, AT&T Laboratories; et al. |
1999 | MetaCost: a general method for making classifiers cost-sensitive | Pedro Domingos, Universidade Técnica de Lisboa |
1998 | Occam's Two Razors: The Sharp and the Blunt | Pedro Domingos, Universidade Técnica de Lisboa |
1997 | Analysis and Visualization of Classifier Performance: Comparison under Imprecise Class and Cost Di... | Foster Provost & Tom Fawcett, NYNEX Science and Technology |
SIGMOD (Databases) | ||
2013 | Massive Graph Triangulation | Xiaocheng Hu, The Chinese University of Hong Kong; et al. |
2012 | High-Performance Complex Event Processing over XML Streams | Barzan Mozafari, Massachusetts Institute of Technology; et al. |
2011 | Entangled Queries: Enabling Declarative Data-Driven Coordination | Nitin Gupta, Cornell University; et al. |
2010 | FAST: fast architecture sensitive tree search on modern CPUs and GPUs | Changkyu Kim, Intel; et al. |
2009 | Generating example data for dataflow programs | Christopher Olston, Yahoo! Research; et al. |
2008 | Serializable isolation for snapshot databases | Michael J. Cahill, University of Sydney; et al. |
2008 | Scalable Network Distance Browsing in Spatial Databases | Hanan Samet, University of Maryland; et al. |
2007 | Compiling mappings to bridge applications and databases | Sergey Melnik, Microsoft Research; et al. |
2007 | Scalable Approximate Query Processing with the DBO Engine | Christopher Jermaine, University of Florida; et al. |
2006 | To search or to crawl?: towards a query optimizer for text-centric tasks | Panagiotis G. Ipeirotis, New York University; et al. |
2004 | Indexing spatio-temporal trajectories with Chebyshev polynomials | Yuhan Cai & Raymond T. Ng, University of British Columbia |
2003 | Spreadsheets in RDBMS for OLAP | Andrew Witkowski, Oracle; et al. |
2001 | Locally adaptive dimensionality reduction for indexing large time series databases | Eamonn Keogh, University of California Irvine; et al. |
2000 | XMill: an efficient compressor for XML data | Hartmut Liefke, University of Pennsylvania |
1999 | DynaMat: a dynamic view management system for data warehouses | Yannis Kotidis & Nick Roussopoulos, University of Maryland |
1998 | Efficient transparent application recovery in client-server information systems | David Lomet & Gerhard Weikum, Microsoft Research |
1998 | Integrating association rule mining with relational database systems: alternatives and implications | Sunita Sarawagi, IBM Research; et al. |
1997 | Fast parallel similarity search in multimedia databases | Stefan Berchtold, University of Munich; et al. |
1996 | Implementing data cubes efficiently | Venky Harinarayan, Stanford University; et al. |
VLDB (Databases) | ||
2013 | DisC Diversity: Result Diversification based on Dissimilarity and Coverage | Marina Drosou & Evaggelia Pitoura, University of Ioannina |
2012 | Dense Subgraph Maintenance under Streaming Edge Weight Updates for Real-time Story Identification | Albert Angel, University of Toronto; et al. |
2011 | RemusDB: Transparent High-Availability for Database Systems | Umar Farooq Minhas, University of Waterloo; et al. |
2010 | Towards Certain Fixes with Editing Rules and Master Data | Shuai Ma, University of Edinburgh; et al. |
2009 | A Unified Approach to Ranking in Probabilistic Databases | Jian Li, University of Maryland; et al. |
2008 | Finding Frequent Items in Data Streams | Graham Cormode & Marios Hadjieleftheriou, AT&T Laboratories |
2008 | Constrained Physical Design Tuning | Nicolas Bruno & Surajit Chaudhuri, Microsoft Research |
2007 | Scalable Semantic Web Data Management Using Vertical Partitioning | Daniel J. Abadi, Massachusetts Institute of Technology; et al. |
2006 | Trustworthy Keyword Search for Regulatory-Compliant Records Retention | Soumyadeb Mitra, University of Illinois at Urbana-Champaign; et al. |
2005 | Cache-conscious Frequent Pattern Mining on a Modern Processor | Amol Ghoting, Ohio State University; et al. |
2004 | Model-Driven Data Acquisition in Sensor Networks | Amol Deshpande, University of California Berkeley; et al. |
2001 | Weaving Relations for Cache Performance | Anastassia Ailamaki, Carnegie Mellon University; et al. |
1997 | Integrating Reliable Memory in Databases | Wee Teck Ng & Peter M. Chen, University of Michigan |
ICML (Machine Learning) | ||
2013 | Vanishing Component Analysis | Roi Livni, The Hebrew University of Jerusalem; et al. |
2013 | Fast Semidifferential-based Submodular Function Optimization | Rishabh Iyer, University of Washington; et al. |
2012 | Bayesian Posterior Sampling via Stochastic Gradient Fisher Scoring | Sungjin Ahn, University of California Irvine; et al. |
2011 | Computational Rationalization: The Inverse Equilibrium Problem | Kevin Waugh, Carnegie Mellon University; et al. |
2010 | Hilbert Space Embeddings of Hidden Markov Models | Le Song, Carnegie Mellon University; et al. |
2009 | Structure preserving embedding | Blake Shaw & Tony Jebara, Columbia University |
2008 | SVM Optimization: Inverse Dependence on Training Set Size | Shai Shalev-Shwartz & Nathan Srebro, Toyota Technological Institute at Chicago |
2007 | Information-theoretic metric learning | Jason V. Davis, University of Texas at Austin; et al. |
2006 | Trading convexity for scalability | Ronan Collobert, NEC Labs America; et al. |
2005 | A support vector method for multivariate performance measures | Thorsten Joachims, Cornell University |
1999 | Least-Squares Temporal Difference Learning | Justin A. Boyan, NASA Ames Research Center |
SIGIR (Information Retrieval) | ||
2013 | Beliefs and Biases in Web Search | Ryen W. White, Microsoft Research |
2012 | Time-Based Calibration of Effectiveness Measures | Mark Smucker & Charles Clarke, University of Waterloo |
2011 | Find It If You Can: A Game for Modeling Different Types of Web Search Success Using Interaction Data | Mikhail Ageev, Moscow State University; et al. |
2010 | Assessing the Scenic Route: Measuring the Value of Search Trails in Web Logs | Ryen W. White, Microsoft Research |
2009 | Sources of evidence for vertical selection | Jaime Arguello, Carnegie Mellon University; et al. |
2008 | Algorithmic Mediation for Collaborative Exploratory Search | Jeremy Pickens, FX Palo Alto Lab; et al. |
2007 | Studying the Use of Popular Destinations to Enhance Web Search Interaction | Ryen W. White, Microsoft Research; et al. |
2006 | Minimal Test Collections for Retrieval Evaluation | Ben Carterette, University of Massachusetts Amherst; et al. |
2005 | Learning to estimate query difficulty: including applications to missing content detection and dis... | Elad Yom-Tov, IBM Research; et al. |
2004 | A Formal Study of Information Retrieval Heuristics | Hui Fang, University of Illinois at Urbana-Champaign; et al. |
2003 | Re-examining the potential effectiveness of interactive query expansion | Ian Ruthven, University of Strathclyde |
2002 | Novelty and redundancy detection in adaptive filtering | Yi Zhang, Carnegie Mellon University; et al. |
2001 | Temporal summaries of new topics | James Allan, University of Massachusetts Amherst; et al. |
2000 | IR evaluation methods for retrieving highly relevant documents | Kalervo Järvelin & Jaana Kekäläinen, University of Tampere |
1999 | Cross-language information retrieval based on parallel texts and automatic mining of parallel text... | Jian-Yun Nie, Université de Montréal; et al. |
1998 | A theory of term weighting based on exploratory data analysis | Warren R. Greiff, University of Massachusetts Amherst |
1997 | Feature selection, perceptron learning, and a usability case study for text categorization | Hwee Tou Ng, DSO National Laboratories; et al. |
1996 | Retrieving spoken documents by combining multiple index sources | Gareth Jones, University of Cambridge; et al. |
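I recommend the website below, with thanks to its author for the effort of compiling it. It mainly collects the best papers from various top conferences.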
http://jeffhuang.com/best_paper_awards.html