a picture of me in front of a whiteboard...

Kui Tang

Machine Teacher

When you are perfectly free to feel stuck or not stuck, then you're unstuck.
Alan Watts

I am currently a second-year PhD student at Columbia University, where I am advised by Profs. David Blei and Tony Jebara. My recent work includes approximate inference in graphical models of both directed and undirected varieties, convex optimization, probabilistic word vector embeddings, and probabilistic models for matchings, permutations, and graph structures. Earlier in my undergrad, I worked on performance optimization of multithreaded code with Prof. Martha Kim.

Machine learning will grow increasingly fundamental to technology and society as the quantity and complexity of our data, and our ambition to make use thereof, increase without bound. The field requires algorithms that scale to big data, are practical and easy to use outside the academy, and are grounded in firm mathematical principles, so that their accuracy and speed can be depended upon for mission-critical applications. I aim to develop such algorithms for complex, structured models and data, motivated by challenging real-world problems.

My collaborators include Nicholas Ruozzi, David Belanger, David Sontag, Krzysztof Choromanski, Adrian Weller, and Melanie Kambadur.


See also DBLP and Google Scholar. Email me if a link is broken or for a copy of anything not linked.

  1. K. Tang, N. Ruozzi, D. Belanger, T. Jebara. "Bethe Learning of Conditional Random Fields via MAP Decoding". Artificial Intelligence and Statistics (AISTATS) 2016. [pdf] [bib] [code]
  2. A. Weller, K. Tang, D. Sontag, T. Jebara. "Approximating the Bethe Partition Functions". Uncertainty in Artificial Intelligence (UAI) 2014. [pdf] [code] [bib]
  3. M. Kambadur, K. Tang, M. Kim. "ParaShares: Finding the Important Basic Blocks in Multithreaded Programs." Euro-Par 2014. [pdf] [bib]
  4. K. Tang, A. Weller and T. Jebara. "Network Ranking with Bethe Pseudomarginals". NIPS Workshop on Discrete Optimization in Machine Learning 2013. [pdf] [bib]
  5. K. Choromanski, T. Jebara and K. Tang. "Adaptive Anonymity via b-Matching". Neural Information Processing Systems (NIPS) 2013. [pdf] [supp] [code] [bib]
  6. M. Kambadur, K. Tang, J. Lopez, and M. Kim. "Parallel scaling properties from a basic block view". International Conference on Measurement and Modeling of Computer Systems (SIGMETRICS) 2013. [pdf] [bib]
  7. M. Kambadur, K. Tang, and M. Kim. "Parallel Block Vectors: Collection, Analysis, and Uses". IEEE Micro, 33(3):86-94, 2013. [pdf] [code] [bib]
  8. M. Kambadur, K. Tang, and M. Kim. "Harmony: Collection and Analysis of Parallel Block Vectors". International Symposium for Computer Architecture (ISCA), Jun. 2012. [pdf] [code] [bib]


See also my GitHub page for projects not listed here.
mexcpp: Easier MATLAB and C++ Integration
This library makes it easier to write MEX extensions by providing a natural C++ syntax to read and write MATLAB structures with no runtime penalty thanks to template metaprogramming. Arays, cells, and structures are all supported. [github]
Polynomial Time Inference of Bethe Partition Function and Marginals
Code to minimize the Bethe free energy up to additive error via discrete optimization, as well as continuous methods to minimize the Bethe free energy over various polytopes (but with no guarantees). See our UAI 2014 paper. [tarball]
Adaptive Anonymity via b-Matching
Safely release a sensitive database for analysis by censoring selected entries. We compute the censored entries by b-matching, which results in more released data compared to the state of the art, while maintaining the same privacy guarantee. See our NIPS 2013 paper. [tarball] [instructions] [b-match solver documentation]
Profile multithreaded programs: measure the number of times each line of code was run and how many threads ran it at the same time. This can be used to detect concurrency bottlenecks and analyze global program efficiency. See our ISCA and Micro papers. [homepage]


Below are notes on various mathematical topics, course projects, and unpublished manuscripts.

Review Notes
To review for finals, I like to condense the content of a course into 10-20 pages of notes. The following may be useful references:
Quasi Monte Carlo Inference for Log Gaussian Cox Processes
(With Jessica Forde and Liam Paninski, 2013) We give a self-contained introduction to quasi Monte Carlo, a deterministic substitute to Monte Carlo, and then use it for a Bayesian inference for a computationally difficult problem, including embedding quasi Monte Carlo samples into a Metropolis-Adjusted Langevin algorithm. Encouraging convergence rate speedup results are presented. [paper] [slides]
Inferring Direct and Indirect Functional Connectivity Between Neurons From Multiple Neural Spike Train Data
(With Ben Shababo and Frank Wood, 2012) We aim to model the functional connectivity of neural microcircuits by accounting for (1) unobserved neurons and (2) long-run connections, two problems not yet adequately addressed in the literature. Encouraging simulation and lab-data results are analyzed. [paper]
Personalized Emotion Classification with Latent Dirichlet Allocation
(2012) Work on sentiment analysis of text has focused on scalar values of sentiment. We instead represent emotions multidimensionally as topic vectors and learn a personalized emotion lexicon. We take a semi-supervised approach with latent Dirichlet allocation (LDA) and a derivative, SubjLDA, initializing our model with a psychologically validated seed corpus. We demonstrate significant improvements to the baseline. [paper]

Invited Talks

Some talks I have given, with video if available.

Adaptive Anonymity via b-Matching
(UMass Amherst Machine Learning and Friends Lunch, February 2014) [slides]
Statistical Machine Learning with Bayesian Networks
(Columbia Data Science Society, November 2013; hackNY Masters, New York University, September 2013.) I teach the fundamentals of Bayesian networks, MCMC inference, and probabilistic programming to undergraduates and masters students with little previous exposure to statistics or machine learning. [IPython notebook] [github]
The Computational Complexity of Probabilistic Inference Using Bayesian Belief Networks
(Machine Learning Reading Group, Columbia University, June 2012.) I walk through the proof by G.F. Cooper that posterior inference in Bayesian networks is NP-hard. The proof reduces from 3SAT using an elegant gadget graph. [slides]