The 30 Most Popular Open-Source Machine Learning Projects on GitHub in 2017

2018-01-06 12:23:00.0

Selected from Mybridge

Which machine learning projects drew the most attention in 2017? Mybridge has compiled a Top 30 list, and every project below comes with its GitHub link.

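We compared nearly 8,800 open-source machine learning projects and picked the best 30 to list here. This is a highly competitive list, covering outstanding machine learning libraries, datasets, and applications open-sourced between January and December 2017. Mybridge AI ranks them by popularity, engagement, and recency. To give you a sense of scale: they average 3,558 GitHub stars.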
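The average quoted above can be checked against the star counts printed in the entries below. This is a minimal sketch, assuming the figure is taken over the 30 ranked entries only (the MUSE aside under No.1 is left out) and using the star counts exactly as they appear in this article, not live GitHub data.

```python
# Star counts as quoted in the 30 ranked entries of this list (No.1 to No.30).
# Assumption: MUSE, mentioned as an aside under No.1, is not part of the average.
stars = [
    11786, 9747, 8672, 8113, 5731, 5462, 4843, 3683, 3861, 3371,
    3310, 3087, 2847, 2629, 2780, 2578, 2571, 2387, 2369, 2188,
    1967, 1961, 1954, 1658, 1494, 1490, 1283, 1188, 899, 845,
]

# Arithmetic mean over the ranked entries.
average = sum(stars) / len(stars)
print(f"{len(stars)} projects, {sum(stars)} stars in total, average {average:.0f}")
```

Under that assumption the counts in this article reproduce the quoted average; including MUSE would pull the figure down, which suggests it was not counted.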

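Open-source projects are valuable to data scientists: by reading the source code, we can build more powerful projects on the work of those who came before. Now you can try out last year's best projects for yourself.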

No.1

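FastText: a library for fast text representation and classification, from Facebook (GitHub 11,786 stars)

Link: https://github.com/facebookresearch/fastText

Related reading: Facebook releases a new version of fastText: extended to mobile, with tutorials added

See also MUSE: multilingual unsupervised/supervised word embeddings, based on FastText (GitHub 695 stars)

Link: https://github.com/facebookresearch/MUSE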

No.2

Deep-photo-styletransfer: code and data for the paper "Deep Photo Style Transfer" by Fujun Luan of Cornell University (GitHub 9747 stars)

Link: https://github.com/luanfujun/deep-photo-styletransfer

No.3

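face recognition: the simplest face recognition API for Python and the command line, from Adam Geitgey (GitHub 8672 stars)

Link: https://github.com/ageitgey/face_recognition

Related reading: A Python-based open-source face recognition library: offline recognition accuracy as high as 99.38%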

No.4

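Magenta: music and art generation with machine intelligence (GitHub 8113 stars)

Link: https://github.com/tensorflow/magenta

Related reading: How does Google's Magenta project teach neural networks to compose music?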

No.5

Sonnet: a TensorFlow-based neural network library, from Malcolm Reynolds of DeepMind (GitHub 5731 stars)

Link: https://github.com/deepmind/sonnet

Related reading: DeepMind open-sources Sonnet: build neural networks quickly in TensorFlow

No.6

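deeplearn.js: a hardware-accelerated machine learning library for the web, from Nikhil Thorat of the Google Brain team (GitHub 5462 stars)

Link: https://github.com/PAIR-code/deeplearnjs

Related reading: Google open-sources DeepLearn.js: hardware-accelerated machine learning in the browser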

No.7

Fast Style Transfer: fast style transfer in TensorFlow, from Logan Engstrom of MIT (GitHub 4843 stars)

Link: https://github.com/lengstrom/fast-style-transfer

No.8

Pysc2: the StarCraft II learning environment, from Timo Ewalds and others at DeepMind (GitHub 3683 stars)

Link: https://github.com/deepmind/pysc2

No.9

AirSim: an open-source simulator for autonomous vehicles built on Unreal Engine, from Shital Shah and others at Microsoft Research (GitHub 3861 stars)

Link: https://github.com/Microsoft/AirSim

No.10

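Facets: a visualization tool for machine learning datasets, from Google Brain (GitHub 3371 stars)

Link: https://github.com/PAIR-code/facets

Related reading: Google open-sources the machine learning visualization tool Facets: a fresh way to look at data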

No.11

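Style2Paints: an AI tool for colorizing anime line art, from Soochow University (GitHub 3310 stars)

Link: https://github.com/lllyasviel/style2paints

Related reading: Style2paints: a professional AI tool for automatically colorizing anime line art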

No.12

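Tensor2Tensor: a library for generalized sequence-to-sequence models, from Ryan Sepassi of Google Brain (GitHub 3087 stars)

Link: https://github.com/tensorflow/tensor2tensor

Related reading: One model library to learn them all: Google open-sources the modular deep learning system Tensor2Tensor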

No.13

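CycleGAN and pix2pix in PyTorch: image-to-image translation in PyTorch, from Jun-Yan Zhu, a PhD student at UC Berkeley (GitHub 2847 stars)

Link: https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix

Related reading: You draw the doodle, AI generates the cat picture: image translation with edges2cats explained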

No.14

Faiss: a library for efficient similarity search and clustering of dense vectors, from Facebook (GitHub 2629 stars)

Link: https://github.com/facebookresearch/faiss

No.15

Fashion-mnist: a MNIST-like dataset of fashion products, from Han Xiao of Zalando Tech (GitHub 2780 stars)

Link: https://github.com/zalandoresearch/fashion-mnist

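No.16

ParlAI: a framework for training and evaluating AI models on a variety of publicly available dialog datasets, from Alexander Miller of Facebook (GitHub 2578 stars)

Link: https://github.com/facebookresearch/ParlAI

Related reading: Facebook open-sources the AI framework ParlAI: train and evaluate dialog models with ease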

No.17

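Fairseq: a sequence-to-sequence toolkit from FAIR (GitHub 2571 stars)

Link: https://github.com/facebookresearch/fairseq

Related reading: Facebook proposes a new CNN-based approach to machine translation: more accurate than Google's and nine times faster (open-sourced)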

No.18

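Pyro: deep universal probabilistic programming with Python and PyTorch, from Uber AI Labs (GitHub 2387 stars)

Link: https://github.com/uber/pyro

Related reading: Uber and Stanford University open-source Pyro, a deep probabilistic programming language based on PyTorch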

No.19

iGAN: interactive image generation powered by GANs (GitHub 2369 stars)

Link: https://github.com/junyanz/iGAN

Related reading: UC Berkeley and Adobe open-source iGAN, a deep learning image editing tool

No.20

Deep-image-prior: image restoration with neural networks but without any learning, from Dmitry Ulyanov of Skoltech (GitHub 2188 stars)

Link: https://github.com/DmitryUlyanov/deep-image-prior

No.21

Face classification: real-time face detection and emotion/gender classification using a Keras CNN model and OpenCV, trained on the fer2013/imdb datasets (GitHub 1967 stars)

Link: https://github.com/oarriaga/face_classification

No.22

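Speech to Text WaveNet: end-to-end sentence-level English speech recognition built with DeepMind's WaveNet and TensorFlow, from Namju Kim of Kakao Brain (GitHub 1961 stars)

Link: https://github.com/buriburisuri/speech-to-text-wavenet

Related reading: DeepMind WaveNet narrows the gap between machine-synthesized and human speech by 50%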

No.23

StarGAN: a unified generative adversarial network for multi-domain image-to-image translation (GitHub 1954 stars)

Link: https://github.com/yunjey/StarGAN

No.24

ML-Agents: Unity machine learning agents, from Arthur Juliani of Unity3D (GitHub 1658 stars)

Link: https://github.com/Unity-Technologies/ml-agents

No.25

Deep Video Analytics: a distributed visual search and visual data analytics platform, from Akshay Bhat of Cornell University (GitHub 1494 stars)

Link: https://github.com/AKSHAYUBHAT/DeepVideoAnalytics

No.26

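OpenNMT: open-source neural machine translation in Torch (GitHub 1490 stars)

Link: https://github.com/OpenNMT/OpenNMT

Related reading: Harvard's NLP group open-sources the neural machine translation toolkit OpenNMT: now production-ready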

No.27

Pix2PixHD: synthesizing and manipulating 2048×1024 images with conditional GANs, from NVIDIA AI scientist Ming-Yu Liu (GitHub 1283 stars)

Link: https://github.com/NVIDIA/pix2pixHD

No.28

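Horovod: a distributed training framework for TensorFlow, from the Uber engineering team (GitHub 1188 stars)

Link: https://github.com/uber/horovod

Related reading: Horovod explained: Uber's open-source distributed deep learning framework for TensorFlow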

No.29

AI-Blocks: a powerful and intuitive WYSIWYG interface that lets anyone create machine learning models (GitHub 899 stars)

Link: https://github.com/MrNothing/AI-Blocks

No.30

Voice Conversion with Non-Parallel Data: deep neural networks for voice conversion (voice style transfer) in TensorFlow, from Dabi Ahn of the Kakao Brain team (GitHub 845 stars)

Link: https://github.com/andabi/deep-voice-conversion

Original article: https://medium.mybridge.co/30-amazing-machine-learning-projects-for-the-past-year-v-2018-b853b8621ac7

Source: 機器之心 (Jiqizhixin)
