David Ken

I am currently a Data Scientist at Capital One, where I am a technical lead focused on training and deploying LLMs for customer call understanding. Recently, I have led publications in COLING and EMNLP. Previously, I worked as a Lead Data Scientist at Wells Fargo and a Data Science Fellow at Insight Data Science. Prior to working as a Data Scientist in industry, I was a Lab Manager and Postdoctoral Associate in the Physics Department at Duke. I received a PhD from Caltech in Materials Science, where I studied glassy materials and their mechanics. I love complex networks: from wooden puzzles and grains of sand, to glasses and neural networks. What these diverse subjects have in common is their ability to remember. Piles of sand, just like our brains and neural networks, can retain a memory of the past through their complex fabric of contacts!

I believe that these networks will transform how we understand the world. Outside of research, I use machine learning to build models that recommend games to groups, predict NBA player performance, and learn what perfumes smell like. In my free time, I am an avid baller, currently repping a pair of the Westbrook Why Not?'s. Outside of basketball, I also enjoy taking pictures and traveling.

P.S. If you're wondering why my name appears as "Chen" and "Ken" in various places, it is because I recently got married and my wife and I decided to combine our names into a new one!

Current projects

Online fragrance recommendation engine based on a distributed memory model

The NLP model is trained using "paragraph vectors" (see: https://cs.stanford.edu/~quocle/paragraph_vector.pdf) that represent perfumes/colognes, along with a word corpus of over 140k user reviews scraped from basenotes.com (pulled on 4/18). The model was implemented in gensim. The code and a tutorial/background are here: https://github.com/dzchen314/deep-perfumes-potion-app. Try it out at potionfinder.com!

Online game recommendation engine based on implicit feedback and Bayesian personalized ranking

The matrix factorization model is trained on Steam data from over 100 million users, using game ownership as implicit feedback. The model is optimized with a Bayesian personalized ranking loss and achieves an AUC of 98% on a dataset that is >99.6% sparse. Try it out at ready-player2.com and use your Steam ID. Or try it with mine: snowmen314
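As a rough illustration of how Bayesian personalized ranking works on implicit feedback, here is a small pure-Python sketch. It is not the code behind the site: the toy ownership sets, the function names `train_bpr` and `mean_auc`, and all hyperparameters are assumptions chosen for readability. Each step samples a user, one owned game, and one unowned game, then nudges the factors so the owned game scores higher.

```python
import math
import random


def train_bpr(owned, n_items, n_factors=8, lr=0.05, reg=0.01,
              n_steps=20000, seed=0):
    """Fit user/item factors with stochastic gradient steps on the BPR loss."""
    rng = random.Random(seed)
    n_users = len(owned)
    U = [[rng.gauss(0, 0.1) for _ in range(n_factors)] for _ in range(n_users)]
    V = [[rng.gauss(0, 0.1) for _ in range(n_factors)] for _ in range(n_items)]
    for _ in range(n_steps):
        u = rng.randrange(n_users)
        pos = owned[u]
        if not pos or len(pos) == n_items:
            continue  # no (owned, unowned) pair can be sampled for this user
        i = rng.choice(sorted(pos))      # an owned game
        j = rng.randrange(n_items)       # resample until we hit an unowned game
        while j in pos:
            j = rng.randrange(n_items)
        # Score difference x_uij = <U_u, V_i - V_j>
        x = sum(U[u][f] * (V[i][f] - V[j][f]) for f in range(n_factors))
        g = 1.0 / (1.0 + math.exp(x))    # sigmoid(-x), gradient of ln sigmoid(x)
        for f in range(n_factors):
            uf, vif, vjf = U[u][f], V[i][f], V[j][f]
            U[u][f] += lr * (g * (vif - vjf) - reg * uf)
            V[i][f] += lr * (g * uf - reg * vif)
            V[j][f] += lr * (-g * uf - reg * vjf)
    return U, V


def mean_auc(U, V, owned, n_items):
    """Average per-user fraction of (owned, unowned) pairs ranked correctly."""
    per_user = []
    for u, pos in enumerate(owned):
        neg = [j for j in range(n_items) if j not in pos]
        if not pos or not neg:
            continue
        score = [sum(a * b for a, b in zip(U[u], V[i])) for i in range(n_items)]
        correct = sum(score[i] > score[j] for i in pos for j in neg)
        per_user.append(correct / (len(pos) * len(neg)))
    return sum(per_user) / len(per_user)


# Hypothetical ownership data: users 0-2 own games 0-2, users 3-5 own games 3-5.
owned = [{0, 1}, {1, 2}, {0, 2}, {3, 4}, {4, 5}, {3, 5}]
U, V = train_bpr(owned, n_items=6)
train_auc = mean_auc(U, V, owned, n_items=6)
print(f"training AUC on toy data: {train_auc:.3f}")
```

Note that this AUC is computed on the same toy pairs used for training, so it says nothing about the 98% figure quoted above, which would need held-out data to reproduce.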

Research interests

3D imaging the shear jamming process in hydrogel spheres

3D laser optical fluorescence tomography. 3D contact networks, e.g. fabric tensors. Image processing and contact mechanics for statistical contact force analysis.

Rheology and rate-stiffening of suspensions and granular composites

Discontinuous shear thickening of cornstarch and water. Novel granular composites with discontinuous rate-stiffening properties.

Force chains in granular materials

Rheology of grains around an intruder. Dynamics of grains during shear/impact. Statistical nature of granular force networks and contact fabrics. Machine learning approach to particle contact detection.

Fundamental deformation mechanisms in glasses

Size effect of ductility in metallic glasses. Mechanisms for plasticity in nano-metallic glasses. Molecular dynamics simulations of deformation.

Fabrication and testing of nanostructures

Electroplating and sputtering of nanoscale/hierarchical structures and laminates for tensile/compressive mechanical testing at room and cryogenic temperatures.

Atomic structure of glasses

In-situ tomography and diffraction of atomic neighborhoods in metallic glasses under hydrostatic pressure. Molecular dynamics simulations of structure and fractal properties in glasses.

Recent publications in peer-reviewed journals

Contact: dzchen314 at gmail dot com

Charlotte NC