SGPT: GPT sentence embeddings for semantic search

SGPT: GPT Sentence Embeddings for Semantic Search 2022/02 https://arxiv.org/abs/2202.08904 Model and code: https://github.com/Muennighoff/sgpt Architecture of the proposed method: the SGPT cross-encoder on the left (a) and the SGPT bi-encoder on the right (b). Re-ranking on BEIR …

2022-02-23

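WebGPT: answering questions with citations by operating a web search

DeepLearning NLP PaperReading GPT

WebGPT: Browser-assisted question-answering with human feedback 2021/12 https://arxiv.org/abs/2112.09332 The web search environment for humans (left) and for the model (right). On ELI5, the rate at which WebGPT was preferred over human demonstrations (left) and the rate at which WebGPT was preferred over the ELI5 reference answers (…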

2022-02-04

data2vec: a general framework that pre-trains on images, speech, and language with the same method

DeepLearning Pre-Training Self-Supervised Transformer PaperReading

Data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language 2022/01 https://ai.facebook.com/research/data2vec-a-general-framework-for-self-supervised-learning-in-speech-vision-and-language Speech, natural language processing, …

2022-02-04

AlphaCode: placing within the top 54% in competitive-programming code generation

DeepLearning NLP Programming Transformer CodeGeneration PaperReading

Competition-Level Code Generation with AlphaCode 2022/02 https://storage.googleapis.com/deepmind-media/AlphaCode/competition_level_code_generation_with_alphacode.pdf Proposes AlphaCode, a system that generates code from natural language. The model is a huge (up to 41B) …

2022-01-17

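Reading the paper that proposed learning distributed representations with word embeddings

DeepLearning NLP PaperReading

Deep learning models for language processing often learn word embeddings jointly with the language model, and I believe this is the paper that first proposed doing so. Looking at the Wikipedia entry on word embeddings, this appeared to be the first method in the modern form. Please point it out if I am wrong …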

2022-01-17

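ConvS2S: a convolution-based translation model

DeepLearning NLP PaperReading

Convolutional sequence to sequence learning https://arxiv.org/abs/1705.03122 2017 ICML Proposes a translation model built from convolutions and attention. An encoder-decoder without RNNs: the input sequence is encoded with convolutions, and the decoder uses dot-product attention between convolutions over the past ground-truth sequence and the encoded context …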

2022-01-13

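Detic: an object detection method that can detect 21,000 classes

Detecting Twenty-thousand Classes using Image-level Supervision https://arxiv.org/abs/2201.02605 2022/01 A two-stage model based on Faster R-CNN. By training the classification part after region proposal, it can detect objects from the 21,000 ImageNet classes: Detector with imag…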

2022-01-12

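Generalized category discovery

DeepLearning UnsupervisedLearning Self-Supervised

Generalized Category Discovery https://arxiv.org/abs/2201.02609 2022/01 Proposes the task of generalized category discovery: given a labeled image set and an unlabeled image set, classify every image in the unlabeled set. Unlabeled images may come from existing categories or new …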

2022-01-01

Widget captioning: adding descriptions to mobile UI elements

DeepLearning ImageCaptioning PaperReading Transformer

Widget Captioning: Generating Natural Language Description for Mobile User Interface Elements https://arxiv.org/abs/2010.04295 EMNLP 2020 Proposes widget captioning, the task of adding descriptions to the UI elements of Android apps. For screen readers and other …

2021-12-29

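Transformer: a large improvement on translation with an attention-only model

DeepLearning NLP Transformer PaperReading

Attention Is All You Need https://arxiv.org/abs/1706.03762 2017/06, NeurIPS 2017 Using attention with neither RNNs nor CNNs, it reaches 28.4 BLEU on WMT 2014 English-to-German translation, a 2-point improvement over the previous SOTA. Because RNNs are autoregressive, computation within a single sample cannot be parallelized (and sequence lengths differ…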

2021-12-29

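RoFormer: a model with relative position embeddings that copes well with long texts

DeepLearning Transformer PaperReading NLP

RoFormer: Enhanced Transformer with Rotary Position Embedding https://arxiv.org/abs/2104.09864 2021/04 A transformer that expresses relative position embeddings as rotation matrices. They are applied as a product with each token, which in effect rotates each token vector. Between tokens …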
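The rotation idea in the excerpt above can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names `rotate` and `dot`, the toy vectors, and the `base=10000.0` frequency schedule are assumptions made for illustration. It rotates each consecutive pair of vector components by an angle proportional to the token position, then checks that the dot product of two rotated vectors depends only on the offset between their positions.

```python
import math


def rotate(vec, position, base=10000.0):
    """Rotate each consecutive 2-D pair of `vec` by an angle proportional to `position`.

    Hypothetical helper for illustration; `base` follows the common
    sinusoidal frequency schedule and is an assumption here.
    """
    dim = len(vec)
    out = []
    for i in range(0, dim, 2):
        theta = position * base ** (-i / dim)  # per-pair rotation angle
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out.extend([x * c - y * s, x * s + y * c])  # 2-D rotation
    return out


def dot(a, b):
    """Plain dot product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


# Toy query/key vectors (even dimension required by the pairing above).
q = [1.0, 0.5, -0.3, 2.0]
k = [0.2, -1.0, 0.7, 0.4]

score_a = dot(rotate(q, 3), rotate(k, 1))    # positions 3 and 1, offset 2
score_b = dot(rotate(q, 10), rotate(k, 8))   # positions 10 and 8, offset 2
score_c = dot(rotate(q, 10), rotate(k, 1))   # positions 10 and 1, offset 9

print(score_a, score_b, score_c)
```

In this sketch `score_a` and `score_b` agree up to floating-point error because both pairs share the same offset, while `score_c` differs. That is the sense in which an absolute rotation of each token yields a relative position signal in the attention score.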

2021-12-27

Multimodal Bitransformer: a VQA model that needs no cross-modal pre-training

DeepLearning Vision-Language Transformer

Supervised Multimodal Bitransformers for Classifying Images and Text https://arxiv.org/abs/1909.02950 2019 Architecture For VQA, by combining separately pre-trained image and text encoders and applying SA in a BERT-based model, like ViLBERT …

2021-12-24

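Jigsaw: improving the accuracy of large-language-model code generation by adding pre- and post-processing

Jigsaw: Large Language Models meet Program Synthesis https://arxiv.org/abs/2112.02969 ICSE'22, 2021/12/06 Large pre-trained language models (GPT-3, Codex; referred to as PTLMs) can generate code from natural language, but a post-processing module based on variable-name transformation and AST-to-AST transformation …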

2021-12-24

Florence: a foundation model applicable to diverse downstream tasks in vision

DeepLearning Pre-Training Vision-Language Transformer

Florence: A New Foundation Model for Computer Vision 2021/11/22 https://arxiv.org/abs/2111.11432 Fig.2 Overview of building Florence In the image domain, diverse downstream tasks (classification, retrieval, object detection, VQA, image captioning, video retrieval, action…

2021-02-24

Transform and Tell: caption generation from news articles and images

Transform and Tell: Entity-Aware News Image Captioning paper https://arxiv.org/abs/2012.00364 Hanting Chen, Yunhe Wang, Tianyu Guo, Chang Xu, Yiping Deng, Zhenhua Liu, Siwei Ma, Chunjing Xu, Chao Xu, Wen Gao github https://github.com/alasd…

2021-02-08

OSCAR: new SoTA on six vision-language tasks using tags from object detection results

DeepLearning ImageCaptioning Transformer NLP Pre-Training

Oscar: Object-Semantics Aligned Pre-training for Vision-Language Tasks paper https://arxiv.org/abs/2004.06165 github https://github.com/microsoft/Oscar Dataset: COCO etc. project Summary: What is it? Language embeddings, object detection features of the image …

2021-02-08

CLIP: a zero-shot transfer model that can produce a classifier for arbitrary classes

Learning Transferable Visual Models From Natural Language Supervision paper https://cdn.openai.com/papers/Learning_Transferable_Visual_Models_From_Natural_Language.pdf github https://github.com/openai/CLIP Dataset: WebImageText(WIT) pr…

2020-12-21

Predicting EOS in a language model apparently hurts generalization

DeepLearning PaperReading NLP

The EOS Decision and Length Extrapolation paper https://arxiv.org/abs/2010.07174 Benjamin Newman, John Hewitt, Percy Liang, Christopher D. Manning github https://github.com/bnewm0609/eos-decision Dataset: Dyck-(k, m), SCAN, WMT2009 pro…

2020-12-17

pix2code: code generation from UI images

DeepLearning ImageCaptioning UIDesign pix2code CodeGeneration PaperReading

pix2code: Generating Code from a Graphical User Interface Screenshot paper https://arxiv.org/abs/1705.07962 Tony Beltramelli github https://github.com/tonybeltramelli/pix2code Dataset: published on GitHub project https://uizard.io/research/#p…

2020-12-08

TransCoder: unsupervised translation between programming languages

DeepLearning CodeGeneration PaperReading UnsupervisedLearning

Unsupervised Translation of Programming Languages paper https://arxiv.org/abs/2006.03511 Marie-Anne Lachaux, Baptiste Roziere, Lowik Chanussot, Guillaume Lample github https://github.com/facebookresearch/TransCoder Dataset: Google BigQ…

2020-10-05

Examining domain shift across chest X-ray datasets: Can we trust deep learning based diagnosis? The impact of domain shift in chest radiograph classification

DeepLearning PaperReading MedicalImaging

Can we trust deep learning based diagnosis? The impact of domain shift in chest radiograph classification paper https://arxiv.org/abs/1909.01940 Eduardo H. P. Pooch, Pedro L. Ballester, Rodrigo C. Barros github Dataset: ChestX-ray14, C…

2020-10-01

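'use strict'; turns error-prone constructs into errors

javascript node.js

(For beginners) An overview of JavaScript strict mode qiita.com It turns constructs that are not errors as far as JavaScript is concerned, but are pitfalls, into errors in order to reduce bugs. It fixes mistakes that make JavaScript hard to optimize. Planned for future ECMAScript …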

2020-09-26

Differences between zero_grad() on a PyTorch model and on an optimizer

DeepLearning Pytorch Python

I noticed that PyTorch has both an optimizer zero_grad() and a model zero_grad(), so I looked into the difference. When the optimizer targets model.parameters(), that is, all parameters, the two are the same. Optimizer.zero_grad() and nn.Module.zero_gra…

2020-09-26

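gcc consists of three main steps

Unix Linux Books Programming

gcc consists of three main steps. 1. Convert the C source to assembly: gcc invokes the C compiler, which converts the .c file into a .s file of assembly code for the target machine. The C compiler has phases such as preprocessing, lexical analysis, parsing, and code generation. 2. The assembly code …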

2020-09-17

Translation between programming languages: Tree-to-tree Neural Networks for Program Translation

DeepLearning LSTM TreeStructure PaperReading CodeGeneration

Tree-to-tree Neural Networks for Program Translation paper https://arxiv.org/abs/1802.03691 Xinyun Chen, Chang Liu, Dawn Song NeurIPS 2018 github Dataset project Summary: What is it? The first work to use deep learning for translation between programming languages. …

2020-09-11

An LSTM for hierarchical structure: Ordered Neurons: Integrating Tree Structures into Recurrent Neural Networks

DeepLearning PaperReading LSTM TreeStructure

Ordered Neurons: Integrating Tree Structures into Recurrent Neural Networks paper https://arxiv.org/abs/1810.09536 Yikang Shen, Shawn Tan, Alessandro Sordoni, Aaron Courville github https://github.com/yikangshen/Ordered-Neurons Dataset …

2020-09-10

Attention that allows second- and higher-order feature interactions: X-Linear Attention Networks for Image Captioning

DeepLearning PaperReading ImageCaptioning

X-Linear Attention Networks for Image Captioning paper https://arxiv.org/abs/2003.14080 Yingwei Pan, Ting Yao, Yehao Li, and Tao Mei github https://github.com/JDAI-CV/image-captioning Dataset: COCO project Summary: What is it? image cap…

2020-09-10

Generating programs that describe figures: Learning to Infer Graphics Programs from Hand-Drawn Images

DeepLearning CodeGeneration PaperReading pix2code

Learning to Infer Graphics Programs from Hand-Drawn Images paper http://papers.nips.cc/paper/7845-learning-to-infer-graphics-programs-from-hand-drawn-images https://arxiv.org/abs/1707.09627 (long version) github https://github.com/ellisk42/TikZ …

2020-09-09

A review paper on GUI code generation: Front End Development Automation Tool: Missing Features?

DeepLearning UIDesign pix2code CodeGeneration PaperReading

Front End Development Automation Tool: Missing Features? paper https://ieeexplore.ieee.org/document/9033956 Hasitha Hiran Walpola, Guhanathan Poravi github Dataset project Summary: What is it? A survey of models that generate code from GUI images…

2020-09-09

Style-aware DSL generation from sketches: CSSSketch2Code: An Automatic Method to Generate Web Pages with CSS Style

DeepLearning UIDesign pix2code CodeGeneration PaperReading

CSSSketch2Code: An Automatic Method to Generate Web Pages with CSS Style paper https://dl.acm.org/doi/abs/10.1145/3292448.3292455 github Dataset project Summary: What is it? Generates a DSL from sketches (not screenshots) of web pages with an encoder…