Koders, Krugle, Codase, Google Code Search

I found a Google Talk on a topic that's related to the motto of my blog - "good coders code, great reuse".

In this talk, Professor Tao Xie speaks about his research on using public code repositories together with code search engines to find common API usage patterns and anti-patterns.

His research software uses four code search engines: Koders, Krugle, Codase, and Google Code Search.

He suggests reading Raphael Volz's analysis for more information about these search engines.

Tao has developed three tools, which use the aforementioned search engines:

  • PARSEWeb for finding API usage patterns,
  • XWeb for finding forgotten exception handlers, and
  • NEGWeb for finding misuses of API calls.

See the code mining project website for more information.
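
To make "finding API usage patterns" a bit more concrete, here is a toy sketch in Python. It is not the algorithm used by PARSEWeb, XWeb, or NEGWeb; it only counts which call tends to follow a given API call across a handful of samples, and flags samples that open a resource without closing it. The call sequences and both function names are made up for illustration.

```python
from collections import Counter

# Hypothetical call sequences, one per code sample returned by a search.
samples = [
    ["fopen", "fread", "fclose"],
    ["fopen", "fgets", "fclose"],
    ["fopen", "fread", "fclose"],
    ["fopen", "fread"],  # possible misuse: the file is never closed
]


def next_call_counts(sequences, api):
    """Count which call immediately follows `api` across all sequences."""
    counts = Counter()
    for seq in sequences:
        for current, following in zip(seq, seq[1:]):
            if current == api:
                counts[following] += 1
    return counts


def samples_missing(sequences, opener, closer):
    """Return indexes of sequences that call `opener` but never `closer`."""
    return [
        i for i, seq in enumerate(sequences)
        if opener in seq and closer not in seq
    ]


print(next_call_counts(samples, "fopen").most_common())
print(samples_missing(samples, "fopen", "fclose"))
```

With only four samples the counts say very little, which is why the number of available code samples matters so much for this kind of mining.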

The lecture is delivered in a very academic manner and is hard to follow. Make sure you are really interested in the topic before watching it.

Some excerpts from the lecture:

  • [04:26] A problem with data mining on source code is that it might not have enough data points (usages of API) to discover common patterns.
  • [04:58] It is crucial to have a lot of data points to get good results out of data mining.
  • [08:37] Google Code Search indexes publicly hosted SVN and CVS repositories.
  • [09:20] Example of searching for C stdlib's fopen usage on Google Code Search (query: "lang:C file:.c$ fopen\s*\(").
  • [11:08] Example of the same search on Krugle.
  • [16:40] Code search engines return partial code samples. Various heuristics are used for type inference.
  • [22:05] Example of integrating Tao's PARSEWeb into Eclipse.
  • [28:15] Interesting idea of constructing and issuing multiple queries to find more code samples.
  • [36:20] A study showed that proper deallocation of resources after an exception resulted in a 17% performance increase.
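
The fopen query from [09:20] combines a language filter, a file-name filter, and a regular expression. As a rough local illustration of what the filters select, here is a small Python sketch. The file names and snippets are hypothetical stand-ins for search results, and the file-name check escapes the dot, which is slightly stricter than the `.c$` in the original query.

```python
import re

# Regular expression from the [09:20] query: "fopen", optional
# whitespace, then a literal opening parenthesis.
FOPEN_CALL = re.compile(r"fopen\s*\(")

# Stricter variant of the query's file filter (the dot is escaped here).
C_FILE = re.compile(r"\.c$")

# Hypothetical snippets standing in for search results.
samples = {
    "a.c": 'FILE *f = fopen("data.txt", "r");',
    "b.c": 'fp = fopen (path, "wb");',
    "c.c": "/* fopen is declared in stdio.h */",
    "notes.txt": 'fopen("ignored.txt", "r");',
}


def find_fopen_calls(files):
    """Return names of .c files whose text contains an fopen call."""
    return sorted(
        name for name, text in files.items()
        if C_FILE.search(name) and FOPEN_CALL.search(text)
    )


print(find_fopen_calls(samples))
```

The comment in `c.c` mentions fopen without calling it, so the `\s*\(` part of the expression filters it out, and `notes.txt` is dropped by the file-name filter.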

I'd like to hear some comments on websites that you use for finding code examples!

Comments

Utopiah
July 17, 2008, 05:37

Well, I sent you an email with this video a few weeks ago (June 8th) and I'm glad to see you've put your summary online. Why? Because when it comes to easy access to information, learning videos are great, BUT accessing a specific part without metadata is hard. By providing your summaries you actually provide anchors into the videos. That helps with direct access, but it also helps with *memorizing*: after seeing a video once, when someone reads your summary (as I'm doing now) it re-activates their memories (cf. http://www.wired.com/medtech/health/magazine/16-05/ff_wozniak?currentPage=2 ).

Cheers,
Utopiah.

July 17, 2008, 13:52

Really useful thing! Thanks, I didn't know.

Amy Hamilton
April 19, 2014, 13:54

SymbolHound is also an alternative to Google Code Search. My complaint about Krugle is that the relevancy of its results is not very good. Please post more information here so that I can write about Google Code Search.
