good coders code, great reuse

Good judgement comes from experience, and experience comes from bad judgement.

Fred Brooks

Video Lectures 13 May 2008 08:00 am

Here is a video lecture by Google’s Director of Research, Peter Norvig. The full title of this lecture is “Theorizing from Data: Avoiding the Capital Mistake”.

In 1891 Sir Arthur Conan Doyle said that “it is a capital mistake to theorize before one has data.” These words still remain true today.

In this talk Peter gives insight into what large amounts of data can do for problems in language understanding, translation and information extraction. The talk is accompanied by a bunch of examples from various Google services.

Moments from the lecture:

  • [00:35] Peter Norvig came to Google from NASA in 2001 because that’s where the data was.
  • [01:30] Peter says that the way to make progress in AI (Artificial Intelligence) is to have more data. If you don’t have data you won’t make progress just with fancy algorithms.
  • [04:40] In 2001 a meta study of several different algorithms for disambiguating words in sentences showed that the worst algorithms performed better than the best algorithms if they were trained with a larger word database. Link to original meta study paper: Scaling to Very Very Large Corpora for Natural Language Disambiguation
  • [06:30] It took at least 30 years to go from a linguistic text collection of 1 million words (10^6 words, the Brown Corpus) to what we now have on the Internet (around 100 trillion words, or 10^14 words).
  • [06:55] Google harvested one trillion words (10^12) from the net, counted them up and published them to the Linguistic Data Consortium. Announcement here, you can buy 6 DVDs of the words here (the price is $150).
  • [10:00] Example: Google Sets was the first experiment done using large amounts of data. It’s a clustering algorithm which returns a group of similar words. Try “dog and cat” and then “more and cat” :)
  • [11:55] Example: Google Trends shows the popularity of search terms based on data collected over time from searches performed by users.
  • [13:15] Example: Query refinement suggestions.
  • [13:40] Example: Question answering.
  • [15:30] Principles of machine reading - concepts, relational templates, patterns.
  • [16:32] Example of learning relations and patterns with machine reading.
  • [18:40] Learning classes and attributes (for example, computer games and their manufacturers).
  • [21:18] Statistical Machine Translation (See Google Language Tools).
  • [24:25] Example of Chinese to English machine translation.
  • [26:27] Main components of machine translation are Translation Model, Language Model and Decoding Algorithm.
  • [29:35] More data helps!
  • [29:45] Problem: How many bits to use to store probabilities?
  • [31:10] Problem: How to reduce space used for storing words from training data during translation process?
  • [35:25] Three turning points in the history of development of information.
  • [37:00] Q and A!
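
The "count the words up and use the counts" idea from the [06:55] and [26:27] moments can be tried on a toy scale. What follows is only a minimal sketch of a bigram language model, not the system described in the talk: the tiny corpus, the add-one smoothing and the function names `count_ngrams` and `bigram_log_prob` are all assumptions made for illustration.

```python
from collections import Counter
import math


def count_ngrams(tokens, n):
    """Count every run of n consecutive tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bigram_log_prob(sentence, unigrams, bigrams, vocab_size):
    """Log-probability of a sentence under an add-one smoothed bigram model."""
    tokens = sentence.split()
    log_p = 0.0
    for prev, word in zip(tokens, tokens[1:]):
        numerator = bigrams[(prev, word)] + 1
        denominator = unigrams[(prev,)] + vocab_size
        log_p += math.log(numerator / denominator)
    return log_p


# Toy corpus (an assumption; the corpus in the talk is many orders of magnitude larger).
corpus = "the dog chased the cat and the cat chased the mouse".split()
unigrams = count_ngrams(corpus, 1)
bigrams = count_ngrams(corpus, 2)
vocab_size = len(unigrams)

fluent = bigram_log_prob("the cat chased the mouse", unigrams, bigrams, vocab_size)
garbled = bigram_log_prob("mouse the chased cat the", unigrams, bigrams, vocab_size)

# The word order seen in the data scores higher than the scrambled one.
print(fluent > garbled)
```

A language model of this kind is what lets a translation system prefer a fluent candidate over a scrambled one. With more data, more of the bigrams have real counts and fewer depend on smoothing.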

There were some interesting questions in the Q and A session:

  • [37:15] Have you applied any of the theories used in stock markets to language processing?
  • [38:08] Are you working on any tools to assist writers?
  • [39:50] How far off are you from automated translation without disfluencies?
  • [41:58] 1) Is the GOOG-411 service actually used to gather a huge corpus of spoken data? 2) Are there any advances on data other than text?
  • [43:50] Would the techniques you described in your talk work in speech-to-text processing?
  • [44:50] Will there be any services for fighting comment and form spam?
  • [46:00] Do you also take information like which links users click into account when displaying search results?
  • [47:22] How do you measure the difference between someone finding something, and someone being satisfied with what they found?
  • [49:23] When doing machine translation, how can you tell that you’re not learning from a website which was already translated with another machine translation service?
  • [50:49] How do you take into account that one uses slang, the other does not, and does it affect your translation tools?
  • [51:40] Can you speak a little about methods in OCR (Optical Character Recognition)?

The question at 44:50 got me very interested. The person asked if Google was going to offer any services for fighting spam. Peter said that it was an interesting idea, but it was better to ask Matt Cutts.

Having a hacker’s mindset, I started thinking: what if someone emailed their comments through Gmail? If a comment was spam, Gmail’s spam system would detect it and label the message as spam. Otherwise the message would end up in the Inbox folder. All the messages in the Inbox folder could then be posted back to the website as good comments. If there were false positives, you could go through the spam folder and move the non-spam messages back to the Inbox. What do you think?

Have fun!

Post: Theorizing from Data by Peter Norvig (Video Lecture)

Security, Video Lectures 01 May 2008 03:55 pm

Here is something for all you hackers out there reading my blog: all the videos from the previous year’s biggest and greatest hacker conference, DefCon 15!

I found these videos via this post on Roy/SAC’s blog. He bought a full set of DVDs for several hundred dollars and uploaded them to Google Video! I sincerely appreciate his effort!

There are more than 200 videos in total!

For your convenience, here is the full DefCon 15 session listing:
Download Full DefCon 15 Session Listing (.pdf).

You’re welcome to comment here on lectures you found intriguing and liked the most!

Have fun!

Post: Videos from Defcon 15 Hacker Conference

Video Lectures 23 Apr 2008 11:25 pm

In my previous post about learning Python programming through video lectures I stopped at three lectures on Design Patterns. This time I continue from there.

If you don’t know what a Design Pattern is, think of it as a simple solution to a specific problem that occurs very frequently in software design.

For example, suppose you use a bunch of unrelated pieces of code. It is a nice idea to bring the unrelated pieces of code together in a unified interface. This design pattern is called Facade. There are a bunch of patterns like this one!
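
To make the Facade idea concrete, here is a minimal sketch. It is not taken from the lectures: the `SettingsFacade` class and its `save`/`load` methods are hypothetical names, and the three "unrelated pieces" are simply the standard library modules `tempfile`, `os` and `json`.

```python
import json
import os
import tempfile


class SettingsFacade:
    """One small interface over three unrelated pieces: tempfile, os and json."""

    def __init__(self, name="settings.json"):
        # tempfile picks a private directory, os.path builds the file path.
        self._path = os.path.join(tempfile.mkdtemp(), name)

    def save(self, settings):
        # json handles the serialization details.
        with open(self._path, "w") as handle:
            json.dump(settings, handle)

    def load(self):
        # A missing file is treated as "no settings yet".
        if not os.path.exists(self._path):
            return {}
        with open(self._path) as handle:
            return json.load(handle)


store = SettingsFacade()
store.save({"theme": "dark"})
print(store.load())
```

The caller only ever sees `save` and `load`. Where the file lives and how it is encoded can change without touching the calling code.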

The three lectures are given by Alex Martelli who works as “Über Tech Lead” for Google.

Python Design Patterns, Part I

Alex briefly covers the history and main principles of Design Patterns and quickly moves to discussing Structural and Behavioral DPs in Python.

Interesting ideas from the lecture:

  • [03:24] The name “Design Patterns” was first used by Christopher Alexander, an architect, who described constructing buildings from well-known patterns that can be applied to the same problem over and over again without ever doing it the same way twice.
  • [05:30] Design Patterns are mostly applied to Object Oriented programming because it’s the most widespread programming paradigm nowadays.
  • [08:36] Design Patterns are not invented, they are discovered.
  • [10:00] Alex says that the original book Design Patterns by the Gang of Four should be read only when you are a master of DPs.
  • [13:10] The three classical categories of DPs are Creational (deal with object instantiation), Structural (deal with composition of objects) and Behavioral (deal with interaction of objects).
  • [14:05] “Program to an interface, not to an implementation.”
  • [17:00] Use inheritance only when absolutely necessary, otherwise use the “hold or wrap” principle.
  • [18:30] Never have more than one dot - Law of Demeter.
  • [18:50] Inheritance cannot restrict, use wrapping to restrict.
  • [21:41] In most of the cases when you need a single instance of something in Python, use a module instead of a class.
  • [22:23] Otherwise, just make 1 instance (without enforcing one).
  • [22:59] Singleton is also called “Highlander”.
  • [24:50] There is basically no way to support subclassing well in Singleton.
  • [25:45] Monostate is also called “Borg”.
  • [27:00] Python’s data overriding helps in Monostate Design Pattern.
  • [29:00] Every Python type/class is essentially a factory.
  • [32:06] Python does a “two-phase object construction”.
  • [35:30] Adapter Design Pattern (it tweaks the interface to your needs).
  • [41:22] Facade Design Pattern (it provides a simple subset of a complex functionality).
  • [47:25] Bridge Design Pattern (it abstracts interface from the implementation).
  • [49:30] Decorator Design Pattern (it transparently modifies some functionality).
  • [50:24] Proxy Design Pattern (similar to Decorator, but used for access control).
  • [51:21] Q and A!
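
The Monostate (“Borg”) idea from the [25:45] and [27:00] moments is easier to see in code. This is a minimal sketch of the commonly circulated Python recipe, not a transcription of Alex’s slides. The class names `Config` and `Cache` are hypothetical.

```python
class Borg:
    """Monostate: many instances, one shared state."""

    _shared_state = {}

    def __init__(self):
        # Every instance uses the same dictionary for its attributes.
        self.__dict__ = self._shared_state


class Config(Borg):
    """Shares state with every other Config (and plain Borg) instance."""


class Cache(Borg):
    # Data override: Cache instances share their own, separate state.
    _shared_state = {}


a = Config()
b = Config()
a.debug = True

# Two distinct objects, yet the attribute set on one is visible on the other.
print(a is b, b.debug)

# Cache overrode the class-level data, so it does not see Config's attributes.
print(hasattr(Cache(), "debug"))
```

Unlike Singleton, nothing stops you from creating as many instances as you like, and a subclass gets its own shared state just by overriding one class attribute.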

Python Design Patterns, Part II

In this lecture Alex discusses behavioral patterns. Unlike the first part, he goes into depth on some of the patterns and explains how they can be implemented in Python.

Interesting ideas from the lecture:

  • [02:25] Template Method is a great pattern with a lousy name; a better name is “self-delegation”.
  • [03:43] Example of Template Method Design Pattern (text pagination).
  • [08:50] Template Method Rationale.
  • [09:45] The “Hollywood Principle” - “don’t call us, we’ll call you”
  • [12:05] In Python you can also override data.
  • [13:10] Example of Template Method in Queue.Queue.
  • [14:05] If you are a good Python programmer, use Queue in threaded applications.
  • [17:45] Customizing Queue.
  • [19:30] Example of Template Method in cmd.Cmd.cmdloop.
  • [21:22] Example of Template Method in asyncore.dispatcher.
  • [22:30] Variant of Template Method: Mixin (not presented in the Gang of Four book). It’s a class to be multiply-inherited from and supplies organizing methods only.
  • [25:50] Template Method in DictMixin class.
  • [26:45] Example of DictMixin usage.
  • [29:00] Hooks can be factored out into another class. Two examples of this from Python’s stdlib are HTML’s formatter vs. writer and SAX’s parser vs. handler.
  • [32:40] Hook method introspection example of cmd.Cmd.docmd.
  • [33:30] There are three kinds of Template Methods - plain, factored into separate classes, and introspective.
  • [34:35] Example of all three kinds of Template Methods used in unittest.TestCase.
  • [36:17] State and Strategy Design Patterns. They are very similar in what they do: both factor out an object’s behavior.
  • [40:40] Ring buffer example done via State Design Pattern.
  • [43:35] Q and A!
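
Here is a minimal sketch of the plain kind of Template Method, loosely modeled on the text pagination example mentioned at [03:43]. The code is my own illustration, not the code from the slides. `Paginator`, `NumberedPaginator` and the hook names are hypothetical.

```python
class Paginator:
    """Template Method: write_lines() fixes the algorithm, hooks fill the gaps."""

    def __init__(self, lines_per_page=2):
        self.lines_per_page = lines_per_page
        self.output = []

    def write_lines(self, lines):
        # The organizing method calls the hooks ("don't call us, we'll call you").
        for start in range(0, len(lines), self.lines_per_page):
            page_number = start // self.lines_per_page + 1
            self.start_page(page_number)
            for line in lines[start:start + self.lines_per_page]:
                self.emit_line(line)
            self.end_page(page_number)
        return self.output

    # Hook methods, meant to be overridden by subclasses.
    def start_page(self, page_number):
        pass

    def emit_line(self, line):
        self.output.append(line)

    def end_page(self, page_number):
        pass


class NumberedPaginator(Paginator):
    """Overrides a single hook; the pagination logic itself is inherited."""

    def end_page(self, page_number):
        self.output.append("-- page %d --" % page_number)


result = NumberedPaginator().write_lines(["a", "b", "c"])
print(result)
```

The subclass never calls the pagination loop itself. It only supplies the piece that differs, which is why “self-delegation” describes the pattern well.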

Python Design Patterns, A Recap

This video lecture was presented at Google Developer Day. It is a short version of the previous two video lectures. It starts with an example of the Facade Design Pattern, then moves on to the history and all the types of design patterns.

I did not write out the interesting moments from this lecture as it was a subset of the previous two lectures.

If you liked these lectures, check out this geek song about another commonly used design pattern - Model-View-Controller Song :)

Even though these were Python design patterns, to understand some of them I used the Perl Design Patterns website!

Were there any interesting points in the lectures that caught your attention?

Post: Learning Python Design Patterns Through Video Lectures

Video Lectures 15 Apr 2008 02:25 pm

If you have been following my blog, you might have noticed that almost all of my projects use the SQLite database engine.

My projects are relatively tiny, low traffic and data is mostly queried, not written. Such characteristics make SQLite the perfect database for my projects.

If you did not know, the SQLite database is self contained within a single file! There are no configuration woes, no network security to worry about, no hundreds of pages of documentation. It’s just a single file!

See the Distinctive Features of SQLite and Appropriate Uses for SQLite pages to find out when SQLite is a good fit and when it is not.

Here is the lecture by Richard Hipp, the author of SQLite:

Here are some interesting facts from the lecture:

  • [02:50] SQLite is designed to be embedded, it’s less than 250 KB in size.
  • [08:00] Uncommon SQLite uses (this got me most interested): stand-in for client-server DBMS during testing/debugging. Local database caching. Implementing complex data structures. Sorting large amounts of data. Configuration files. IPC via database. Application file formats.
  • [14:06] SQLite is very convenient to use as a tool to teach the basics of SQL, as it just works.
  • [19:32] Unusual features of SQLite: SQLite ignores data types for columns (you can store a string in an integer column, for example). SQLite applies type affinity to data inserted into columns. The table ‘sqlite_master’ stores information about tables. You can attach to multiple databases simultaneously via the ATTACH command. You can join or copy across multiple open databases (for example, to hot backup the database).
  • [24:40] Anatomy of an SQL database engine.
  • [27:00] SQLite compiles queries to byte code (can be viewed via EXPLAIN statement) to be executed in a virtual machine.
  • [28:20] Observations of SQLite: trouble with licensing. It is much easier to generate optimal code for a register-based virtual machine than for a stack-based VM. Dynamic typing in databases is a really good thing. Regression tests allow rewriting large parts of SQLite without minor version releases.
  • [36:30] Q and A!
  • [36:35] Is there ORM tool available for SQLite?
  • [39:30] How is dynamic typing better than static typing in databases?
  • [41:32] What did you mean by ‘complex data types’?
  • [43:15] Why is a register based virtual machine better than a stack based?
  • [44:22] Why does SQLite only parse foreign keys but not enforce them?
  • [46:08] Is SQLite an in-memory database?
  • [46:50] What’s the future of SQLite?
  • [48:10] My SQLite DB got corrupt, what do I do?
  • [49:30] When does the DB roll back in case of power failure?
  • [50:30] What happens if there is a second power failure while rolling back the queries from previous power failure?
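
Several of the unusual features from the [19:32] and [27:00] moments can be tried in a few lines. This is a minimal sketch using Python’s standard `sqlite3` module with in-memory databases. The table name `notes` and the schema name `copy_db` are made up for the example, and copying a table across is only a stand-in for a real hot backup.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, score INTEGER)")

# Type affinity: '42' is converted to an integer, while 'high' cannot be
# converted and is stored as text in the very same INTEGER column.
conn.execute("INSERT INTO notes (score) VALUES ('42')")
conn.execute("INSERT INTO notes (score) VALUES ('high')")
conn.commit()
types = [row[0] for row in
         conn.execute("SELECT typeof(score) FROM notes ORDER BY id")]
print(types)

# sqlite_master stores information about the tables in the database.
tables = [row[0] for row in
          conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")]
print(tables)

# ATTACH a second database and copy a table across the two open databases.
conn.execute("ATTACH DATABASE ':memory:' AS copy_db")
conn.execute("CREATE TABLE copy_db.notes AS SELECT * FROM main.notes")
copied = conn.execute("SELECT count(*) FROM copy_db.notes").fetchone()[0]
print(copied)

# EXPLAIN returns the byte code that the virtual machine would execute;
# the second column of each row is the opcode name.
opcodes = [row[1] for row in conn.execute("EXPLAIN SELECT score FROM notes")]
print(len(opcodes) > 0)

conn.close()
```

The exact opcodes printed by EXPLAIN differ between SQLite versions, so the sketch only checks that a program was produced.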

A few notes from me.

The usage of ‘manifest typing’ really confused me in this lecture, because I, and most of the people I have talked to, use this term for ‘static typing’. The author of SQLite uses it to mean ‘dynamic typing’. I don’t know why…

An SQLite database can be managed via the sqlite (or sqlite3) command line tool or a GUI tool such as SQLite Browser (primitive), SQLiteSpy (advanced) or SQLite Manager (a Firefox add-on).

Finally, here are a few articles you should read if you are interested in more advanced SQLite details:

I hope you enjoyed it and have fun using SQLite for your next project!

Post: Video Lecture On My Favorite DBMS - SQLite

Video Lectures 01 Apr 2008 10:10 am

I found a really exciting video lecture by Guy L. Steele that I’d like to share with you. The title of the lecture is “Growing a Language”.

The main thing Guy Steele asks during the lecture is “If I want to help other persons to write all sorts of programs, should I design a small programming language or a large one?” He answers that he should build neither a small, nor a big language. He needs to design a language that can grow. A main goal in designing a language should be to plan for growth. The language must start small, and the language must grow as the set of users grows.

As an example, he compares APL and Lisp. APL did not allow its users to grow the language in a “smooth” way. New primitives added by users did not look the same as built-in primitives, which made it hard for users to grow the language. In Lisp, on the other hand, new words defined by the user look like language primitives, and language primitives look like user defined words. This let language users easily extend the language, share their code, and grow the language.
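
To illustrate what it means for user-defined words to look like primitives, here is a small sketch in Python. Python is my choice for the example and is not a language discussed in the talk. The `Interval` class is a made-up user-defined type with overloaded operators, so that it can be used with `+`, `*` and even the built-in `sum` just like a built-in number.

```python
class Interval:
    """A user-defined 'word' that behaves like a built-in number under + and *."""

    def __init__(self, low, high):
        self.low, self.high = low, high

    def __add__(self, other):
        return Interval(self.low + other.low, self.high + other.high)

    def __mul__(self, other):
        # The bounds of a product are the extremes of all endpoint products.
        products = [a * b
                    for a in (self.low, self.high)
                    for b in (other.low, other.high)]
        return Interval(min(products), max(products))

    def __eq__(self, other):
        if not isinstance(other, Interval):
            return NotImplemented
        return (self.low, self.high) == (other.low, other.high)

    def __repr__(self):
        return "Interval(%r, %r)" % (self.low, self.high)


# User-defined values are written with the same operators as built-in ones.
print(Interval(1, 2) + Interval(3, 4))
print(Interval(-1, 2) * Interval(3, 4))

# Existing library code such as sum() works with the new type unchanged.
print(sum([Interval(1, 2), Interval(3, 4)], Interval(0, 0)))
```

Code that uses `Interval` reads the same as code that uses integers, which is the “smooth” kind of growth described above.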

Mr. Steele also prepared a PDF of his talk. Download it here (mirror, just in case: here).

He currently works at Sun Microsystems, where he is responsible for research in language design and implementation strategies. His bio page on the Sun Microsystems site says: “He has been praised for an especially clear and thorough writing style in explaining the details of programming languages.” This lecture really shows it.

I understood what he was up to from the very beginning of the lecture. Only after the first ten minutes did Guy reveal that “his firm rule for this talk is that if he needs to use a word of two or more syllables, he must define it.”

Another thing Guy Steele shows with this talk is how a small language restricts the expressiveness of your thoughts. First you must define a lot of new words to be able to express yourself clearly and quickly.

Should a programming language be small or large? A small programming language might take but a short time to learn. A large programming language may take a long, long time to learn, but then it is less hard to use, for we then have a lot of words at hand — or, I should say, at the tips of our tongues — to use at the drop of a hat. If we start with a small language, then in most cases we can not say much at the start. We must first define more words; then we can speak of the main thing that is on our mind. […] If you want to get far at all with a small language, you must first add to the small language to make a language that is more large.

He makes many more interesting points about how languages should be grown. Just watch the lecture!

He defined the following words during the lecture: woman, person, machine, other, other than, number, many, computer, vocabulary, language, define, program, definition, example, syllable, primitive, because, design, twenty, thirty, forty, hundred, million, eleven, thirteen, fourteen, sixteen, seven, fifty, ago, library, linux, operating system, cathedral, bazaar, pattern, datum, data, object, method, generic type, operator, overloaded, polymorphic, complex number, rational number, interval, vector, matrix, meta.

Post: Growing a Language by Guy Steele
