Paper: Question Asking as Program Generation (62/365)

I am very bored by language models that just pump the training data through something and then predict the word following a partially revealed sentence. I mean, I don’t learn anything from them apart from a lot of technical wizardry. So I am looking for papers that lean towards true comprehension.

This paper, “Question Asking as Program Generation”, is from NIPS 2017. The authors collected 605 questions that humans asked in the context of several partially revealed battleship game boards, with a view to gaining information about the positions of the hidden ships. Each of these questions therefore comes with a context (the game board). The authors aim to build a system that predicts the question given a context.

Questions as Programs

In order to generate questions we need something compositional, so the authors turn to programs, in this case Lisp programs. Every question is written as a Lisp program. For example, “How many tiles have the same colour as tile 2F?” is written as (setSize (colouredTiles (colour 2F))). These translations were done by hand.

Because the domain is limited to this battleship board, the allowed predicates and constructions form a small, fixed set. As a result, the authors can define a simple context-free grammar that covers all possible valid programs.
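To make this concrete, here is a toy sketch in Haskell of how a slice of that program space could be typed. The data types and constructor names are my own illustration, not the paper’s actual grammar.

> data Location   = Location Int Char          -- e.g. tile 2F ~ Location 2 'F'
> data ColourExpr = ColourOf Location          -- (colour 2F)
> data SetExpr    = ColouredTiles ColourExpr   -- (colouredTiles ...)
> data NumExpr    = SetSize SetExpr            -- (setSize ...)
>
> -- “How many tiles have the same colour as tile 2F?”
> example :: NumExpr
> example = SetSize (ColouredTiles (ColourOf (Location 2 'F')))

Each expression type corresponds to a type of answer (a number here), which comes up again below.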

Question Usefulness

In the context of the battleship game, the human subjects were asked to come up with ‘useful’ questions, and they were given plenty of training. It’s great to have a grammar that generates valid programs, but how many of those programs are actually useful? To that end, the authors define several features. I’ll describe these and skip over the probability model that is optimized.

The first is the expected information gain (EIG) of a question: how much uncertainty about the actual board state is reduced, on average over the possible answers, by asking that question. Questions which reveal more (on average) about the actual state get a higher score.
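In symbols, as a sketch only (this is the standard definition of expected information gain; the notation is mine and may not match the paper’s exactly): with h ranging over hidden board configurations and d over possible answers to a question x,

  EIG(x) = \sum_{d} p(d \mid x) \big[ H[p(h)] - H[p(h \mid d, x)] \big]

where H denotes Shannon entropy.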

The second is complexity. Favoring only the first feature can lead to very long questions, so this feature is thrown in to favor shorter ones: it measures the complexity of a question based on the grammar that generates it.

Another feature is the answer type: whether the question expects a yes/no answer, a location, or a colour.

Conclusion

What’s interesting is that they were able to generate questions (i.e. programs) given a context. It’s an appealing kind of problem to tackle, similar to, say, a robot learning language grounded in its senses. Learning can be very quick because the meaning of the language can be computed, unlike with unsupervised language models.


A History of Natural Language Processing

Last weekend, I gave a talk at the Bangalore Computer Vision Meetup group on “A History of Natural Language Processing”. I tried to put together a perspective on the subject that may be unfamiliar to many. Take a look here: BCVM | History and Evolution of NLP | Mr. Naren Sundaravaradan, Digital Aristotle


DSL for Generative Models – Next Steps (61/365)

I’m going to sketch out the next few things to plan and code for the DSL library. So far I have provided the ability to describe the network, but I haven’t yet provided a way to describe the distributions. For instance, in the HMM model we ought to be able to say what the distribution over Symbols is, and so on. I’ll start with the ability to pick between two distributions, Dirichlet and Multinomial; we can cover many models with just these two. Once I provide a way to specify what type of distribution each node has, I should be able to change the distributions at will without affecting the network, for instance using a continuous response instead of a discrete one in an HMM.
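As a rough illustration of the kind of declaration I have in mind (the names below are placeholders for this sketch, not the library’s actual types):

> -- Hypothetical sketch: the distribution attached to a node.
> data Dist
>   = Dirichlet Double    -- symmetric Dirichlet with a concentration parameter
>   | Multinomial Int     -- multinomial over a known number of outcomes
>
> -- A model description could then pair each label with its distribution,
> -- e.g. for the HMM something like [(Transition, Dirichlet 0.1), (Symbols, Dirichlet 0.01)].
> type DistSpec label = [(label, Dist)]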

After this, I will want to create a function in the Gibbs module that can take in a Reader and sample the distributions. In the case of HMM, this would mean sampling the Transition distributions and the Symbols distributions by reading the network to figure out their priors and support.

Finally, with the sampled distributions and a Reader I will write a sampler that produces a new Reader. In the case of HMM, this means sampling the new Topic variables. These steps cover the (uncollapsed) Gibbs sampling technique.
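The shapes of these two steps might look roughly like the following. This is only a sketch of my own; Network and Params are placeholder names, and the Reader type is the one from Statistics.GModeling.Gibbs (see the 58/365 post below).

> -- Placeholder types for this sketch only.
> data Network   -- the described network
> data Params    -- the sampled distributions
>
> -- Step 1: sample the distributions, reading priors and support off the network.
> sampleDists :: Network -> Reader k v -> IO Params
> sampleDists = undefined
>
> -- Step 2: with the sampled distributions, resample the latent variables
> -- (the Topics in the HMM), producing a new Reader.
> resampleLatents :: Network -> Params -> Reader k v -> IO (Reader k v)
> resampleLatents = undefined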

Looking ahead even further, I intend to write a method to compute the density of the observed variables (having marginalized out the latent variables). I will do this using the annealed importance sampling method described in the paper “Evaluation Methods for Topic Models” by Hanna M. Wallach et al. In the case of the HMM, this amounts to computing the probability of the observed Symbol variables given the Symbols and Transition distributions while marginalizing out the Topic variables.
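For reference, a bare-bones sketch of annealed importance sampling as I understand it (the standard formulation, nothing specific to this library): to estimate p(x) = ∫ p(x | z) p(z) dz, pick inverse temperatures 0 = β_0 < β_1 < … < β_S = 1 and intermediate distributions p_s(z) ∝ p(z) p(x | z)^{β_s}. Draw z_1 from the prior, obtain each z_{s+1} from z_s with a Markov transition that leaves p_s invariant, and accumulate the weight

  w = \prod_{s=1}^{S} p(x \mid z_s)^{\beta_s - \beta_{s-1}}

Then E[w] = p(x), so averaging the weights over several independent runs gives an estimate of the marginal likelihood.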


Lexicographical Ordering (60/365)

> import Data.Ord (comparing)

This is completely random but I thought it was a neat use of laziness. You are familiar with lexicographical ordering? Haskell’s compare of lists implements this.

ghci> [1,2] < [1,3]
  True

ghci> [1,3,4] < [1,3,4,5]
  True

ghci> [2,4] < [2,3]
  False

Note that this favors shorter lists: a proper prefix compares as smaller than its extensions.

ghci> [1,2] < [1,2,3]
  True

ghci> [1,3] < [1,3,4]
  True

For whatever reason, I wanted to favor longer lists. How can we do this? First, note that the ordering above is equivalent to doing the comparison after appending an infinite list of zeros to each operand (assuming the lists contain only positive numbers).

ghci> let aug xs = xs ++ cycle [0]
ghci> comparing aug [1,2] [1,3]
  LT

ghci> comparing aug [1,3,4] [1,3,4,5]
  LT

ghci> comparing aug [2,4] [2,3]
  GT

If, instead, I append an infinite list of a large number, I get what I want.

ghci> let aug xs = xs ++ cycle [9999999]
ghci> comparing aug [1,2] [1,2,3]
  GT

ghci> comparing aug [1,3] [1,3,4]
  GT
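
One caveat with the infinite-padding trick: comparing two equal lists never terminates, because no differing element is ever found. A direct comparator (a sketch of my own, not from any library) gives the same ordering and stays total:

> -- Like the usual lexicographic compare, except that a proper prefix sorts
> -- after its extensions, and equal lists return EQ instead of looping.
> favorLonger :: Ord a => [a] -> [a] -> Ordering
> favorLonger []     []     = EQ
> favorLonger []     (_:_)  = GT
> favorLonger (_:_)  []     = LT
> favorLonger (x:xs) (y:ys) = case compare x y of
>                               EQ    -> favorLonger xs ys
>                               other -> other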

Learning a Language (59/365)

I want to start another aside because this is something I spend a significant amount of time thinking about: how do you learn a new language when you are approaching 30? Living in India, I think it’s a shame that people communicate only in English at work (if you are in the tech industry), because India is one of the few countries that still speaks a vast number of tongues (though that number is dwindling at an alarming rate).

Auto drivers in Bangalore speak at least four languages with ease: Hindi, Kannada, Tamil, and Telugu. And English is a given. Everyone ought to speak as many languages as possible, without inhibitions, and should be encouraged to do so even if all you manage is a “good morning” in that language. There is just so much to gain, including new friends, perspectives, and traditions, and in the most beautiful way, through all these differences, it makes us aware of just how similar (not identical) we all are.

That brings me back to how one can learn a language quickly. I’ve always been fascinated by individual words in different languages. Part of that has to do with simply wanting to know the origin of a word: you end up tracing it back to different countries and sometimes recounting the history of two countries along the way. Ponder the words “algebra” or “philosophy”.

I studied Hindi in school fifteen years ago and I hated it. It was boring and tiresome; there was no joy in the learning process at all. Now I wish I could speak every language in the world. At this point I’ll point you to the wonderful book by Anthony Burgess, “Language Made Plain”. I want to start by making some random notes on the easy and difficult things I am facing as I learn Hindi once again.

  1. Learning a new alphabet is a problem. I really wish all languages used a common alphabet. For me, this means that if I have to properly learn Kannada, there is little point in my trying to read the script just yet. That will take time.

  2. Find as many audio/video resources as possible. Once I had refreshed the basics, I scoured the web for children’s stories and now try to listen to them while going to and coming back from work.

  3. I find listening the easiest, and I have improved hugely at it in a short time. Speaking is harder, mainly because I don’t get the chance to practice it; it takes time to construct a grammatically correct sentence. Forget writing for now.

  4. Learning and picking up vocabulary is the easiest part. I used to think this was the main problem, but it’s negligible compared to the problems of sentence structure, gender modifications, and, more importantly, learning the phrases in the zeitgeist.

  5. I need a way to measure my progress easily and to make sure improvement doesn’t stall. I haven’t yet found a good way. What I do now is write down words I don’t recognize while listening to some audio and then look them up later. But again, my main issue right now is figuring out how to improve my spoken form, given that I am unable to practice out in the open regularly.

  6. I like writing. My idea right now is to find a good, simple way to exploit this as an alternative to having to find people to practice with every day. What I want is a way to quickly exercise my ability to create short, grammatically correct sentences in various tenses.

Anyway, I’ll make a few posts now and then about which approaches do and do not work for me when learning a language.


DSL for Generative Models (58/365)

I’ve now cleaned up (in befb0f3cca0c212e368497e86f030aa96355be18) the Reader and Writer interfaces and added them to Statistics.GModeling.Gibbs. I’ve removed references to Support and simply parameterized over a key type k and a value type v.

> data Reader k v = Reader
>   {
>     -- | Number of available indices
>     size :: Int
>     -- | Read the value at the given key
>   , readn :: Int -> k -> v
>     -- | Create a copy for writing only
>   , copy :: IO (Writer k v)
>   }
> 
> data Writer k v = Writer
>   {
>     -- | Write the value at the given key
>     writen :: Int -> k -> v -> IO ()
>     -- | Create a read-only copy
>   , readOnly :: IO (Reader k v)
>   }
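
Just to illustrate how the two types fit together, here is a toy in-memory instantiation of my own (not part of the library), keyed on the index and the key; in a real module the imports would sit at the top of the file.

> import Data.IORef
> import qualified Data.Map as M
>
> -- A Reader over a pure snapshot; 'copy' hands back a mutable Writer.
> fromMap :: Ord k => Int -> M.Map (Int, k) v -> Reader k v
> fromMap n m = Reader
>   { size  = n
>   , readn = \i k -> m M.! (i, k)
>   , copy  = do
>       ref <- newIORef m
>       return (toWriter n ref)
>   }
>
> -- A Writer that mutates an IORef and can freeze back into a Reader.
> toWriter :: Ord k => Int -> IORef (M.Map (Int, k) v) -> Writer k v
> toWriter n ref = Writer
>   { writen   = \i k v -> modifyIORef' ref (M.insert (i, k) v)
>   , readOnly = fmap (fromMap n) (readIORef ref)
>   }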

I’ve also simplified the type of Indexed and added an implementation of Reader and Writer for HMM in Statistics.GModeling.Models.HMM.


DSL for Generative Models (57/365)

I am putting together what I have so far in a repository here. So far (as of 133e22dc979d988706aafe52a346cee004f70ca5), it contains:

  1. Statistics.GModeling.DSL
  2. Statistics.GModeling.Models.HMM
  3. Statistics.GModeling.Models.LDA
  4. Statistics.GModeling.Models.FixedTreeLDA

Will continue building the pieces in upcoming posts.
