Original article: 5 free e-books for machine learning mastery
Author: Serdar Yegulalp; Translator: 赖信涛; Editor: 仲培艺
There are few subjects in computing as fascinating, or intimidating, as machine learning. Let's face it -- you can't master machine learning in a weekend, and at the very least it requires a good grasp of the underlying mathematical principles.
That said, if you have the math chops, you'll want to augment your use of machine learning frameworks (there are plenty to pick from) with a good understanding of the theory behind them.
Here are five high-quality, free-to-read texts that provide introductions to and explanations of machine learning's ins and outs. Some have code examples, but most focus on formulas and theory; in principle, they can be applied to any number of languages, frameworks, or problems.
A Course in Machine Learning
The gist: A highly readable text designed to provide an extremely beginner-friendly approach to the topic. The book is a work in progress -- some sections are still marked TODO -- but what it lacks in completeness, it makes up in sheer accessibility.
Target audience: Anyone with a good grasp of calculus, probability, and linear algebra. No expertise in any specific language is required.
Code content: Some pseudocode; the majority of what's presented is concepts and formulas.
The Elements of Statistical Learning
The gist: A 500-plus-page text that covers what the authors describe as "learning from data," the processes of employing statistics that are the underpinnings for machine learning. It's been through two editions and 10 printings since 2001, for good reason -- it covers a massive amount of territory and isn't limited to any one field.
Target audience: Those who already have a good foundation in math and statistics and don't need a lot of hand-holding to translate their math skills into good code.
Code content: None. This isn't a software development text; this is about foundational concepts around machine learning.
Bayesian Reasoning and Machine Learning
The gist: Bayesian methods are behind everything from spam filters to pattern recognition, so they constitute a major field of study for machine-learning mavens. This text walks through all the major aspects of Bayesian statistics, and how they apply to common scenarios in machine learning.
Target audience: Anyone with a good grasp of calculus, probability, and linear algebra.
Code content: Lots! Each chapter contains both pseudocode and links to a toolkit of actual code demos. That said, the code is not in Python or R, but is code for the commercial MATLAB environment, although GNU Octave can work as an open source substitute.
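To give a flavor of the Bayesian reasoning behind a spam filter, here is a minimal sketch of Bayes' rule in Python. It is not taken from the book (whose demos target MATLAB/Octave); the function name `posterior_spam` and all of the probabilities are made-up values chosen for illustration only.

```python
def posterior_spam(p_spam, p_word_given_spam, p_word_given_ham):
    """Return P(spam | word) using Bayes' rule.

    p_spam             -- prior probability that a message is spam
    p_word_given_spam  -- probability the word appears in a spam message
    p_word_given_ham   -- probability the word appears in a non-spam message
    """
    p_ham = 1.0 - p_spam
    # Total probability of seeing the word in any message (the evidence).
    evidence = p_word_given_spam * p_spam + p_word_given_ham * p_ham
    return p_word_given_spam * p_spam / evidence


# Illustrative numbers only: 40% of mail is spam, and the word shows up
# in 50% of spam messages but only 5% of legitimate ones.
p = posterior_spam(p_spam=0.4, p_word_given_spam=0.5, p_word_given_ham=0.05)
print(round(p, 3))  # roughly 0.87
```

Seeing the word raises the spam probability from the 40% prior to about 87%. A real filter would combine evidence from many words and estimate the probabilities from data.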
Gaussian Processes for Machine Learning
The gist: Gaussian processes are part of the family of analyses used by Bayesian methods. This text focuses on how Gaussian processes can be used in common machine learning tasks like classification, regression, and model training.
Target audience: Roughly the same as "Bayesian Reasoning and Machine Learning."
Code content: Most of the code featured in the book is pseudocode, but like "Bayesian Reasoning and Machine Learning," the appendices include examples for MATLAB/Octave.
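For readers who want a concrete picture of Gaussian process regression, the sketch below computes the posterior mean of a zero-mean Gaussian process with a squared-exponential kernel, in plain Python. It is an illustration under stated assumptions, not code from the book: the helper names (`rbf`, `solve`, `gp_posterior_mean`), the kernel choice, the length scale, the noise level, and the toy data are all hypothetical.

```python
import math


def rbf(a, b, length_scale=1.0):
    """Squared-exponential (RBF) kernel between two scalars."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build the augmented matrix [A | b] without mutating the inputs.
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back-substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def gp_posterior_mean(x_train, y_train, x_new, noise=1e-6):
    """Posterior mean at x_new: k_*^T (K + noise * I)^-1 y."""
    n = len(x_train)
    K = [[rbf(x_train[i], x_train[j]) + (noise if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    alpha = solve(K, y_train)
    return sum(rbf(x_new, x_train[i]) * alpha[i] for i in range(n))


# Toy data, chosen only for illustration.
x_train = [0.0, 1.0, 2.0]
y_train = [0.0, 1.0, 0.0]
mean_at_train = gp_posterior_mean(x_train, y_train, 1.0)   # close to 1.0
mean_far_away = gp_posterior_mean(x_train, y_train, 10.0)  # close to 0.0
```

With almost no noise, the prediction at a training input nearly reproduces the observed value. Far from the data it falls back to the prior mean of zero.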
Machine Learning
The gist: A collection of essays on different and highly specific aspects of machine learning. Some are more general and philosophical; others are focused on specific problem domains, such as "Machine Learning Methods for Spoken Dialogue Simulation and Optimization."
Target audience: Intended for lay readers as well as the more technically inclined.
Code content: Virtually none, although formulas abound. Read for flavor.