AlphaGo 2.0 Already Learns by Itself; the Goal Is Use in Science and Medicine
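May 23: Ke Jie 9-dan, currently the world's number one Go player, played Black here this afternoon and lost to the computer Go program AlphaGo after 289 moves by the narrow margin of a quarter stone. He now trails 0:1 in the best-of-three "human vs. machine" match.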
After the game, the AlphaGo team took questions from the media and explained the new version of AlphaGo. The new version has become more powerful and now learns by itself.
Q: Is this AlphaGo a "pure" version of AlphaGo? In other words, does it learn by itself without relying at all on the game records of human masters?
Demis Hassabis: I'm not sure if I understand the question correctly, but AlphaGo initially learned from human games, and most of its learning now comes from its own play against itself. But of course, to truly test what it knows, we have to play against human experts, because playing the game against itself is not going to expose its weaknesses; it will obviously fix those during self-play. So we really have to test it against the world's best players.
David Silver: Perhaps I could just add to that. One of the innovations of AlphaGo-Master is that it relies much more on learning from itself. In this version, AlphaGo has actually become its own teacher, learning from moves taken from examples of its own searches, and it relies much less on human data than previous versions. One of our goals in doing so is to make it more and more general, so that its principles can be applied to other domains beyond Go.
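Q: I understand that the Master version is V25. Is the AlphaGo now playing Ke Jie a newer version? I would also like to know whether this is the last time we will see AlphaGo. Will AlphaGo become a tool that helps professional players keep improving their skills, or will it say goodbye to us from here on?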
David Silver: Maybe I can answer the first part of that question, regarding the technology inside AlphaGo. AlphaGo-Master is a new version of AlphaGo, and we worked very hard to improve the fundamental algorithm that is used in AlphaGo. In fact, it turns out that the algorithm often matters more than the amount of data or the amount of compute that goes into it. If you get the algorithms right and make them general and powerful enough, then they can really progress very rapidly. In fact, AlphaGo-Master uses 10 times less computation and is trained in a matter of weeks rather than months, compared to the version that played against Lee Sedol last year. So it is a different version, and at least in self-play performance it is considerably stronger. We are here to find out whether it is indeed as strong as it seems in self-play, or whether it has weaknesses that can be exposed.
Demis Hassabis: As for the second part of the question, I'll answer that. Later on in the event we will be announcing the next steps for AlphaGo, so I don't want to say anything in advance of that, but we will be talking about it later in the week. One thing I do want to say is that with the last version of AlphaGo we published all the technical details and results of the AlphaGo program in an article in the scientific journal Nature. That allowed other companies, Tencent and Japanese companies, to make their own versions of AlphaGo, and some of them are very strong now as well. As I'm sure you all know, they are playing online, probably at 9-dan level. We plan to publish more details of the new version of AlphaGo in the next few months. So we will review those technical details, and then again other teams and academic labs will be able to implement their own versions of this AlphaGo-Master architecture.
Q: As more and more top players become unwilling to play against AlphaGo, will you consider having AlphaGo play against AlphaGo?
Demis Hassabis: As I said, we want to use AlphaGo as a tool for the Go community to improve their knowledge about the game. We hope to release some details about the architecture we are using, and maybe also some of the games that AlphaGo plays against itself, so we may make an announcement about this later in the week. But don't forget, the reason we are ultimately developing these technologies is also to use them more widely in areas of science and medicine, and to try to help human experts in those areas. So we have a lot of work ahead of us in the coming years.
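Game recap:
· Game one of the human-vs-machine match: Ke Jie plays Black and moves first, seeking variation within a traditional opening
· AlphaGo shows its strength in the middle game; Ke Jie is put to the test and sinks into long thought
· AlphaGo reads the whole board clearly and takes the initiative; Ke Jie stakes everything on an attack against a large group
· Ke Jie searches hard for a chance to turn the game around in the endgame; AlphaGo's move 144 is a slight surprise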