For decades, researchers have developed advanced methods for analyzing quantitative data (think surveys) in sophisticated ways, from clustering and factor models to predictive Bayesian analysis. Perhaps the largest impact AI is already having on market research is enabling these battle-tested quantitative methods to be used on data that is qualitative in nature: namely video, audio, and text.
While there have been crude methods for quantifying these types of data in the past, a new generation of deep learning algorithms is capable of doing this with dramatically increased accuracy and speed. At the heart of these algorithms is the idea of “encoding”: that is, the act of converting qualitative data into a quantitative vector.
A simple way to conceptualize “encoding” is this: Imagine we start with an open-ended question. Now imagine we construct a set of 200 binary ‘quant’ questions which aim to surface the same information as the open-ended question. Given a set of responses to the open-ended question, we can think of “encoding” as doing two things. First, identifying the best 200 quant ‘questions’ to capture all of the qualitative information found in the responses. Then, for each qualitative response, computing what the answers to those 200 quantitative questions would have been. The numerical vector containing these 200 ‘answers’ is the ‘encoded’ response.
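To make the idea concrete, here is a toy sketch in Python. It is only an illustration under heavy assumptions: it uses four ‘questions’ instead of 200, the questions and their keyword lists are hypothetical and hand-written, and each ‘answer’ comes from a simple keyword check. A real encoder would learn both the questions and the answers from data.

```python
from typing import List, Sequence, Tuple

# Hypothetical binary 'quant' questions, each backed by a keyword list.
# These are illustrative assumptions, not questions a real encoder learned.
QUESTIONS: Sequence[Tuple[str, Tuple[str, ...]]] = (
    ("mentions price", ("price", "expensive", "cheap")),
    ("mentions speed", ("fast", "slow", "quick")),
    ("mentions support", ("support", "service", "help")),
    ("mentions quality", ("quality", "broke", "durable")),
)


def encode(response: str) -> List[int]:
    """Return one 0/1 'answer' per question for an open-ended response."""
    text = response.lower()
    return [
        int(any(keyword in text for keyword in keywords))
        for _, keywords in QUESTIONS
    ]


if __name__ == "__main__":
    vector = encode("Too expensive, but delivery was fast.")
    for (question, _), answer in zip(QUESTIONS, vector):
        print(f"{question}: {answer}")
```

The list returned by `encode` plays the role of the ‘encoded’ response: once every response is a vector of the same length, standard quantitative tools can be applied to the whole set.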
In this conceptualization, the open-ended responses could just as easily be video, audio, or text, because encoding is possible regardless of data type. The main dividing line among current approaches to encoding is between supervised and unsupervised approaches.
Supervised approaches start by having a human decide what the ‘questions’ are ahead of time and then assemble a ‘training set’ of example pairs of qualitative data and human-specified answers to the corresponding ‘questions.’ A good example of this is encoding facial expressions into the emotions they express. In this case, humans identify the ‘questions’ (which emotions a person’s face might express), create a ‘training set’ containing pairs of faces and the emotions they express, then use it to train a model which encodes a picture of a face into the emotions it expresses.
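A minimal sketch of the supervised recipe might look like the following. It rests on several assumptions: each face is already reduced to two hypothetical numeric features (say, mouth curvature and brow raise), the training pairs are invented for illustration, and a simple nearest-centroid rule stands in for the deep networks that real facial-expression systems would use on raw images.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

# The human-chosen 'questions': is the face happy / sad / surprised?
EMOTIONS = ["happy", "sad", "surprised"]

# Invented training pairs: (hypothetical face features, human label).
TRAINING_SET: List[Tuple[List[float], str]] = [
    ([0.9, 0.1], "happy"),
    ([0.8, 0.2], "happy"),
    ([-0.7, 0.0], "sad"),
    ([-0.9, 0.1], "sad"),
    ([0.1, 0.9], "surprised"),
    ([0.0, 0.8], "surprised"),
]


def train(examples: Sequence[Tuple[Sequence[float], str]]) -> Dict[str, List[float]]:
    """Learn one centroid (mean feature vector) per human-specified label."""
    grouped = defaultdict(list)
    for features, label in examples:
        grouped[label].append(features)
    return {
        label: [sum(column) / len(rows) for column in zip(*rows)]
        for label, rows in grouped.items()
    }


def encode(features: Sequence[float], centroids: Dict[str, List[float]]) -> List[int]:
    """Encode a face as a one-hot vector over EMOTIONS (nearest centroid wins)."""
    def squared_distance(label: str) -> float:
        return sum((a - b) ** 2 for a, b in zip(features, centroids[label]))

    nearest = min(centroids, key=squared_distance)
    return [int(label == nearest) for label in EMOTIONS]


CENTROIDS = train(TRAINING_SET)

if __name__ == "__main__":
    print(encode([0.7, 0.3], CENTROIDS))
```

Note that the model can only ever answer the three ‘questions’ a human picked in advance; a face expressing an emotion outside `EMOTIONS` would still be forced into one of those slots.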
Supervised encoding models have the advantage of being relatively easy to build. However, they require human labor to develop training sets, and they can only encode in the way they were trained.
Unsupervised models do not require a human to identify the ‘questions’ or tag a training set; the only data they need in order to learn is the ‘qualitative’ data which they aim to encode. Unsupervised models simultaneously learn both the ‘questions’ which best capture the qualitative information and what the answers to those ‘questions’ should be for a given input (like a video or sentence). While crude unsupervised models have been around for some time (like LDA for topic analysis and k-means for simple clustering), deep-learning-based approaches (like autoencoders) are enabling meaningful encoding of qualitative data previously thought impossible.
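As a rough illustration of the unsupervised idea, the sketch below implements a bare-bones k-means, one of the older methods mentioned above, rather than a deep autoencoder. It assumes the inputs are already small numeric vectors (the data points are invented), and it starts the centroids at the first k points purely to keep the example deterministic. No labels are supplied: the ‘questions’ it learns are simply ‘does this input belong to cluster i?’

```python
from typing import List, Sequence

# Invented, unlabeled data: two loose groups of 2-D points.
POINTS: List[List[float]] = [
    [0.0, 0.1], [5.0, 5.1], [0.2, 0.0],
    [5.2, 4.9], [0.1, 0.2], [4.9, 5.0],
]


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point: Sequence[float], centroids: Sequence[Sequence[float]]) -> int:
    """Index of the centroid closest to the point."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(point, centroids[i]))


def fit_kmeans(points: Sequence[Sequence[float]], k: int,
               iterations: int = 20) -> List[List[float]]:
    """Learn k centroids from unlabeled points (simplified k-means)."""
    # Simplification: start from the first k points instead of a random draw.
    centroids = [list(p) for p in points[:k]]
    for _ in range(iterations):
        clusters: List[List[Sequence[float]]] = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Recompute each centroid as the mean of its members;
        # an empty cluster keeps its previous centroid.
        updated = [
            [sum(col) / len(members) for col in zip(*members)]
            if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids


def encode(point: Sequence[float], centroids: Sequence[Sequence[float]]) -> List[int]:
    """Encode an input as a one-hot vector of cluster membership."""
    winner = nearest(point, centroids)
    return [int(i == winner) for i in range(len(centroids))]


CENTROIDS = fit_kmeans(POINTS, k=2)

if __name__ == "__main__":
    print(encode([0.3, 0.3], CENTROIDS))
```

An autoencoder follows the same pattern of learning from unlabeled inputs alone, but it produces a dense vector of learned features rather than a single cluster assignment, which is what allows for the richer encodings described above.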
These new capabilities for the deep quantification of qualitative data are already allowing researchers to bring the methodological rigor of quantitative methods into the world of qualitative research, and I am confident this is only the beginning.