Yong-Yeol Ahn, An implicit statistical bias of word2vec model and why it may be a good thing
Tuesday, August 17, 2021 @ 10:30 AM - 12:00 PM KST
Abstract:
Neural language models have revolutionized how we model text data, as well as a broad range of machine learning methods beyond natural language processing. One of the first, simplest, and most widely used of these models is the skip-gram negative sampling model, or simply word2vec, which allows us to obtain vector representations of elements (e.g., nodes in a graph) from sequences of those elements (e.g., random walk trajectories on a graph). A lesser-known property of word2vec is that it is a biased estimator and does not optimize the “correct” language model likelihood function. In this talk, I will explain this bias and its surprising implications for representation learning on graphs and mobility trajectories.
Bio:
Yong-Yeol (YY) Ahn is an Associate Professor at the Indiana University School of Informatics, Computing, and Engineering and a Visiting Professor at MIT. After earning his PhD in Statistical Physics from KAIST in 2008 and before joining Indiana University, he worked as a postdoctoral research associate at the Center for Complex Network Research at Northeastern University and as a visiting researcher at the Center for Cancer Systems Biology at Dana-Farber Cancer Institute. His research focuses on developing and applying network science and machine learning methods to a wide range of complex social and biological systems, including large-scale social phenomena, health, inequality, neuroscience, culture, and the science of science. He is a recipient of several awards, including the Microsoft Research Faculty Fellowship and the LinkedIn Economic Graph Challenge.