
Principles of Machine Learning -- Before You Start (Translation)

Published: 2023-09-18 15:13:54

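The whole world is learning AI, and of course I can't be an exception. Self-driving cars, face recognition, robots everywhere... So, starting today, I will begin translating the whole of Principles of Machine Learning. It has 7 chapters plus an introduction, and if there are labs along the way, I will work through them together with you. So now, let's begin our machine learning journey!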


Welcome to the principles of Machine Learning! My name is Cynthia Rudin. >> And I’m Steve Elston. >> Now machine learning is everywhere. This is the time for machine learning;

it’s becoming mainstream, it’s in the search engines we use every day, it’s in the bank teller machines reading our checks, it’s in our smartphone assistants like Cortana, it’s – you know,

jobs in machine learning are in every industry and we are thrilled to be able to give you an introduction to machine learning in this course. So let Steve and me introduce ourselves first.

So I am an associate professor of computer science and electrical and computer engineering at Duke, and an associate professor of statistics at MIT, and my main expertise is in machine learning and data mining.

My lab is called the Prediction Analysis Lab. And I have a PhD from Princeton University, and a lot of the work that I do is applied machine learning, and it’s applied to problems in the electric power industry, in healthcare,

and in computational criminology. >>

Hi and I’m Steve Elston. I’m a co-founder and principal consultant at a data science consultancy in Seattle called Quantia Analytics. I’ve been working in predictive analytics and machine learning for several decades now.

I’ve been a long-term R, S/S-PLUS, and Python user and developer; I started using S when it was a Bell Labs project and of course more – you know, in the recent decade moved to R like everybody else.

I’m currently an advisor on Azure machine learning and some other analytics products to Microsoft, and I’ve worked in a variety of industries:

payment fraud prevention, telecommunication, capital markets including things like market credit risk models, clearing, and collateral management,

and also worked in several industrial areas such as forecasting for logistics management.

And I have a PhD also from Princeton University and mine is in geophysics. >> Now when I first learned about machine learning, I thought it was magic.

A way for computers to predict the future, just by seeing the past. And you know, it’s a way for computers to learn on their own how to solve problems that I can’t solve, and that’s exactly what’s going on.

Computers are learning, just from observing what’s happened in the past. But it’s nothing like magic. Now machine learning, in addition to being a really useful toolbox for industrial applications,

it also gives you a perspective about the way your mind works. So let’s say that I asked you why you could learn and why a computer can’t, right, what would you say?

Would you say that it’s because you’ve seen more of the world than a computer has?

I mean, I think that’s not particularly true anymore, because we have lots of pictures and video and sound now that we could feed to any computer.

Is it because there are more connections in your brain than in a computer? Well that might be part of it, but lots of creatures with much smaller brains than my computer can still learn,

so that’s not it. Maybe you could argue that a brain is more flexible in some ways than a computer; maybe you could think your brain is somehow more open to identifying new types of patterns than your computer,

and that’s why you can learn perhaps.

The interesting thing is that actually that’s not quite the way it is; in fact, it’s sort of the opposite.

Your brain is really good at identifying only certain kinds of patterns; in fact, these are the types of patterns that it’s expecting.

The fact that humans can learn is not so much a consequence of the human brain being flexible, as it is of the human brain being inflexible,

being wired to identify exactly the types of patterns that it comes across, right. Natural images, real sounds, patterns of behavior… these are – you know, these are things that we’re really good at identifying. Humans are absolutely awful at identifying patterns in large databases, right, we can’t – we just can’t learn in some settings, and what enables us to learn in the settings we can learn in is the way that our brains are wired. It’s the structure in our mind; it’s not the flexibility, it’s the limited flexibility.

It’s just that structure. Okay so what is the field of machine learning exactly? It completely revolves around setting up structures in the computer that limit its flexibility and allow it to learn.

Okay, setting up these structures is really a form of statistical modelling, and that’s what we’re going to do in this course. And once you can teach a computer to learn, there are a huge number of applications that you can use it on.
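
As a rough illustration of the point above, that limiting a model's flexibility is what lets it learn, here is a minimal Python sketch. It is not from the course: the toy data and the names `fit_line` and `memorizer` are made up for this example. It compares a rigid model (a straight line fitted by least squares) with a fully flexible one (a lookup table) on an input neither has seen.

```python
# Illustrative sketch only: the data and function names are assumptions,
# not material from the course.

def fit_line(xs, ys):
    """Least-squares fit of y = a*x + b, a deliberately rigid structure."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def memorizer(xs, ys):
    """Fully flexible 'model': a lookup table with no structure at all."""
    table = dict(zip(xs, ys))
    return lambda x: table.get(x)  # None for any input it has not seen


# Toy training data that roughly follows y = 2x.
train_x = [1.0, 2.0, 3.0, 4.0]
train_y = [2.1, 3.9, 6.2, 7.8]

a, b = fit_line(train_x, train_y)
lookup = memorizer(train_x, train_y)

# Ask both models about x = 5.0, which was not in the training data.
line_prediction = a * 5.0 + b
lookup_prediction = lookup(5.0)

print("line:", line_prediction)
print("lookup table:", lookup_prediction)
```

The lookup table reproduces the training data exactly but has nothing to say about a new input, while the line, because it is constrained to one simple shape, can extrapolate.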

>> So, let’s talk about a few of the applications that we’ll use both for our demos and for the labs that you’re going to do hands-on in this course. So first off, we’re going to do a classification example,

and we’ll be coming back to this at several points in the course – actually each of these examples,

and so we’re going to work on classifying diabetes patients who have been in a hospital for treatment and we want to classify the ones who are at high risk that they’re going to be readmitted to the hospital;

that is, that somehow their treatment or the follow up to their treatment or something isn’t likely to be sufficient and they’re going to wind up being re-hospitalized, which is, as you can imagine a serious problem.

It’s expensive, it’s dangerous for the patients, etc. so there’s a lot of reasons why this is an important area. We’re going to look at forecasting; forecasting for demand is used all over the place from warehouse management to power generation.

In particular, we’re going to look at forecasting demand for rented bicycles, and so that will be an – again, an application we’ll come back to at several points in this course.

A lot of these things are done in clustering and segmentation, and we’re going to look at segmenting people by their income level, and that’s an –

again, an analog for lots of different things that are done in everything from political science to marketing. And finally, we’re going to look at how a recommender works;

we’re going to use a restaurant database of Mexican restaurants and compute some recommendations for some of the customers who have written reviews for these Mexican restaurants. >>

Okay now as I mentioned, humans are lousy at finding patterns in large databases, and so here are some of the applications that we’re working on in my lab that use large databases and machine learning,

and in all of these applications, the answer is really in the data. It really is, and by providing the computer with the proper machine learning structure to find important patterns, we can really make headway into societal problems.

For instance, we’ve been looking at power grid failures and personalized advertising, and healthcare applications.

>> So, why would you want to continue with this course? What should you expect to get out of this course? Well first off, it’s going to be a hands-on introduction to machine learning.

We have some great labs laid out here, there’s going to be demos – so you’re going to gain some practical experience at working with data and applying machine learning algorithms of various types to those data.

We’re going to look at actually all the major focus areas in machine learning, so we’ll cover a wide variety of algorithms,

methods and techniques. We’re going to use Azure machine learning quite a bit for demos and for your labs; and why actually we’re doing this, it’s not only a great environment,

but it’s also a great learning environment because a lot of the tedious stuff is kind of taken care of for you, so there’s a lot of things you won’t have to spend time on when you do your weekly labs.

Nonetheless, we’ll do a significant amount of data cleaning and visualization using R and/or Python; you can pick which path you’re on. So you’ll be building some skills with that. And we hope that as you go along here, as you work on these examples, as you listen to the theory lectures, you start to build some intuition around analytics and machine learning and how it all fits together, and mostly gain intuition for what’s a useful result, what’s adding value, and what’s going in the direction you or, say, your boss wants you to go. And we’re going to minimize the math; there’s not going to be any heavy theory, so if you remember a little bit of calculus and some minimal linear algebra, you should be good to go here.

So what are we going to cover specifically in this course? In the first module, we’re going to discuss an introduction to classification, and classification, in the history of machine learning, is kind of where machine learning largely grew out of.

Then we’re going to talk about regression, and there are many regression methods that are important in machine learning, and they have an even longer history in statistics, going back to the late 19th century.

We’re going to then talk about how do you – once you have machine learning models, how do you evaluate the performance? How do you know what to do to improve that performance?

We’re going to then look at some more modern powerful methods like tree and ensemble learning methods and if you don’t know what that means, stay tuned you’ll find out a lot about it. 

And we’re going to look at optimization-based learning methods such as support vector machines and neural networks. And we’ll finish up with clustering and recommenders. >>

So, as you’re taking this course, we hope you will take some steps to get the most out of it to maximize your learning experience.

So overall, think about the fact that this course is going to be over 6 weeks, we have one module per week over those 6 weeks so you can kind of plan your time and your work that way.

For each module, we have lectures, demos, and labs; and the labs derive from the lectures and the demos and they are for you to do on your own to reinforce key learning concepts.

And you’ll perform the labs using – as I already mentioned, Azure machine learning, but also either R or Python, and I suggest you decide if you’re going to use R or Python.

Every lab has the same materials or the same steps in either language; it doesn’t matter in terms of the learning experience. If you’re very ambitious of course you can try both, but for most people just doing one or the other is going to be just great. So some of you want to get the certificate from this course, so what do you need to know? 

First off, you need a 70% score to pass and get the certificate, and that score is divided between assessments at the end of each of the 6 modules, and the final exam. 

So each module assessment that – or all those module assessments together are half your grade, 50% of your grade, and on each question for the assessment, you actually get two tries so if you mess it up the first time don’t panic, 

you get another chance. The other half of your grade is a final exam at the end of the class. This one you only get one try per question, but by then you’ve been through the lectures, you’ve seen all the demos,

and you’ve done all the labs, and so you should – you know, be in a great position to ace that. 
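
The grading arithmetic described above can be sketched in a few lines of Python. This is only an illustration: the function names and the sample scores are invented, and it assumes the six module assessments are weighted equally within their 50% share, which the lecture does not actually state.

```python
# Illustrative sketch only: names and sample scores are made up.
# Assumption: the six module assessments count equally within their 50% share.

PASS_THRESHOLD = 70.0  # percent needed for the certificate, per the lecture


def course_score(module_scores, final_exam_score):
    """Combine module assessments (50%) and the final exam (50%), in percent."""
    module_average = sum(module_scores) / len(module_scores)
    return 0.5 * module_average + 0.5 * final_exam_score


def earns_certificate(score):
    """True when the combined score reaches the pass threshold."""
    return score >= PASS_THRESHOLD


# Hypothetical learner: six module assessment scores and a final exam score.
modules = [80, 75, 90, 60, 85, 70]
exam = 65

score = course_score(modules, exam)
print(round(score, 2), earns_certificate(score))
```

In this made-up case a final exam score below 70 still leads to a pass, because the module assessments pull the combined score above the threshold.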

So we hope you get a lot out of this course, and we’re looking forward to presenting it, and I think it’s going to be a really great, informative class to get yourself bootstrapped into the wonderful world of machine learning!
