Discussion:
[Fonc] DeepAPI
Anatoly Levenchuk
2016-05-30 12:57:35 UTC
In 2012-2013, FONC studied API search («method finder» in VPRI jargon,
http://www.vpri.org/pdf/rn2013002_locatr.pdf,
http://www.vpri.org/pdf/m2009012_fnd_sine.pdf).



A new approach to this has appeared: http://arxiv.org/abs/1605.08535



Deep API Learning



<http://arxiv.org/find/cs/1/au:+Gu_X/0/1/0/all/0/1> Xiaodong Gu,
<http://arxiv.org/find/cs/1/au:+Zhang_H/0/1/0/all/0/1> Hongyu Zhang,
<http://arxiv.org/find/cs/1/au:+Zhang_D/0/1/0/all/0/1> Dongmei Zhang,
<http://arxiv.org/find/cs/1/au:+Kim_S/0/1/0/all/0/1> Sunghun Kim

(Submitted on 27 May 2016)

Developers often wonder how to implement a certain functionality (e.g., how
to parse XML files) using APIs. Obtaining an API usage sequence based on an
API-related natural language query is very helpful in this regard. Given a
query, existing approaches utilize information retrieval models to search
for matching API sequences. These approaches treat queries and APIs as
bag-of-words (i.e., keyword matching or word-to-word alignment) and lack a
deep understanding of the semantics of the query.
We propose DeepAPI, a deep learning based approach to generate API usage
sequences for a given natural language query. Instead of a bags-of-words
assumption, it learns the sequence of words in a query and the sequence of
associated APIs. DeepAPI adapts a neural language model named RNN
Encoder-Decoder. It encodes a word sequence (user query) into a fixed-length
context vector, and generates an API sequence based on the context vector.
We also augment the RNN Encoder-Decoder by considering the importance of
individual APIs. We empirically evaluate our approach with more than 7
million annotated code snippets collected from GitHub. The results show that
our approach generates largely accurate API sequences and outperforms the
related approaches.
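
To make the bag-of-words versus sequence distinction in the abstract concrete, here is a minimal sketch. It is not the DeepAPI model: the weights are random and untrained, there is no decoder, and the names `bag_of_words`, `make_encoder`, and `HIDDEN` are hypothetical, chosen only for illustration. It shows only that a recurrent encoder folds a query of any length into a fixed-length context vector that depends on word order, whereas a bag of words does not.

```python
import math
import random
from collections import Counter

HIDDEN = 4  # length of the context vector (arbitrary choice for this sketch)


def bag_of_words(query):
    """Order-insensitive representation: a multiset of the query's words."""
    return Counter(query.lower().split())


def make_encoder(vocab, seed=0):
    """Build a toy, untrained RNN encoder with random weights (assumption:
    a plain tanh recurrence, not the architecture used in the paper)."""
    rng = random.Random(seed)
    embed = {w: [rng.uniform(-1, 1) for _ in range(HIDDEN)] for w in vocab}
    w_rec = [[rng.uniform(-1, 1) for _ in range(HIDDEN)] for _ in range(HIDDEN)]

    def encode(query):
        h = [0.0] * HIDDEN  # hidden state, updated once per word
        for word in query.lower().split():
            x = embed[word]
            # h_t = tanh(W * h_{t-1} + x_t)
            h = [
                math.tanh(sum(w_rec[i][j] * h[j] for j in range(HIDDEN)) + x[i])
                for i in range(HIDDEN)
            ]
        return h  # final state serves as the fixed-length context vector

    return encode


if __name__ == "__main__":
    encode = make_encoder(["convert", "string", "to", "int", "parse", "xml", "file"])
    q1, q2 = "convert string to int", "convert int to string"
    print(bag_of_words(q1) == bag_of_words(q2))  # same words, so same bag
    print(encode(q1) == encode(q2))              # word order changes the vector
    print(len(encode("parse xml file")), len(encode(q1)))
```

In the approach the abstract describes, a decoder would then generate the API sequence from such a context vector; that part is left out here.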



Best regards,

Anatoly
Brian Rice
2016-05-30 16:46:10 UTC
This is an interesting technique, but it relies on a large repository of
existing practice, so, if I understand correctly, it is suitable for APIs
that have already had a significant amount of human effort poured into
using them. Of course, credit to the authors for explaining this tidily in
their "Threats to Validity" section.

The goal (to paraphrase) is to solve the NxM problem, which is more or less
how to pair up (arbitrary?) agents with mutually unmatched surface areas.
_______________________________________________
Fonc mailing list
http://mailman.vpri.org/mailman/listinfo/fonc_mailman.vpri.org