David Chen

I am currently a Senior Data Scientist at Wells Fargo, where I lead a team of data scientists applying algorithms and NLP/NLU models to understand customer complaints and automate their handling, with a particular interest in phone transcript data, explainable AI, and knowledge bases. Previously, I worked as a Data Scientist Fellow at Insight Data Science. Before that, I was a Lab Manager and Postdoctoral Associate in the Physics Department at Duke. I received a PhD in Materials Science from Caltech, where I studied glassy materials and their mechanics. I love complex networks: from wooden puzzles and grains of sand to glasses and neural networks. What these diverse subjects have in common is their ability to remember. Piles of sand, just like our brains, can hold a memory of the past in their complex fabric of contacts!

I believe that these networks will transform how we understand the world. Outside of research, I use machine learning to build models that try to recommend games to groups, predict NBA player performance, and learn what perfumes smell like. In my free time, I am an avid baller, currently repping a pair of the Westbrook Why Not?'s. Outside of basketball, I also enjoy taking pictures and traveling.

Current projects

Online fragrance recommendation engine based on a distributed memory model

The NLP model is trained using "paragraph vectors" (see: https://cs.stanford.edu/~quocle/paragraph_vector.pdf) that represent perfumes/colognes, along with a word corpus of over 140k user reviews scraped from basenotes.com (pulled on 4/18). The model was implemented in gensim. The code and a tutorial/background are here: https://github.com/dzchen314/deep-perfumes-potion-app. Try it out at potionfinder.com!
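Once each perfume has a learned vector, one common way to turn those vectors into recommendations is a nearest-neighbor lookup by cosine similarity. The sketch below illustrates only that lookup step in plain Python. The perfume names, the three-dimensional vectors, and the helper names cosine and most_similar are all made up for illustration; whether the live app ranks results exactly this way is an assumption, and real paragraph vectors would come from the trained gensim model.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def most_similar(query, vectors, topn=3):
    """Return the topn (name, similarity) pairs closest to the query vector."""
    scored = [(name, cosine(query, vec)) for name, vec in vectors.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:topn]

# Hypothetical, hand-written vectors standing in for learned paragraph vectors.
perfume_vectors = {
    "perfume_a": [0.9, 0.1, 0.0],
    "perfume_b": [0.8, 0.3, 0.1],
    "perfume_c": [0.0, 0.2, 0.9],
}

# Hypothetical query vector, e.g. one inferred from a user's description.
query = [1.0, 0.2, 0.0]
ranked = most_similar(query, perfume_vectors, topn=2)
```

Cosine similarity ignores vector length and compares direction only, which is why it is a common choice for comparing embeddings.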

Online game recommendation engine based on implicit feedback and Bayesian personalized ranking

The matrix factorization model is trained on Steam data from over 100 million users, using game ownership as implicit feedback. The loss function is optimized with Bayesian personalized ranking, which achieves an AUC of 98% on a dataset that is >99.6% sparse. Try it out at ready-player2.com and use your Steam ID. Or try it with mine: snowmen314
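For readers unfamiliar with Bayesian personalized ranking, here is a minimal plain-Python sketch of the idea: sample a user, a game they own, and a game they do not own, then nudge the factors so the owned game scores higher. This is not the code behind the site. The toy ownership data, the hyperparameters, and the function names train_bpr and auc are assumptions for illustration, and the AUC here is measured on the training pairs themselves rather than on held-out data.

```python
import math
import random

def train_bpr(owned, n_items, n_factors=8, lr=0.05, reg=0.01,
              n_steps=20000, seed=0):
    """SGD on the BPR objective. `owned` maps user -> set of owned item ids.
    Every user must own at least one item and lack at least one."""
    rng = random.Random(seed)
    users = sorted(owned)
    owned_lists = {u: sorted(owned[u]) for u in users}
    U = {u: [rng.gauss(0, 0.1) for _ in range(n_factors)] for u in users}
    V = [[rng.gauss(0, 0.1) for _ in range(n_factors)] for _ in range(n_items)]
    for _ in range(n_steps):
        u = rng.choice(users)
        i = rng.choice(owned_lists[u])      # positive: an owned game
        j = rng.randrange(n_items)          # negative: a game not owned
        while j in owned[u]:
            j = rng.randrange(n_items)
        pu, qi, qj = U[u], V[i], V[j]
        # x = score(u, i) - score(u, j); BPR maximizes log sigmoid(x).
        x = sum(a * (b - c) for a, b, c in zip(pu, qi, qj))
        g = 1.0 / (1.0 + math.exp(x))       # sigmoid(-x), the gradient weight
        for f in range(n_factors):
            a, b, c = pu[f], qi[f], qj[f]
            pu[f] += lr * (g * (b - c) - reg * a)
            qi[f] += lr * (g * a - reg * b)
            qj[f] += lr * (-g * a - reg * c)
    return U, V

def auc(U, V, owned, n_items):
    """Fraction of (owned, not-owned) pairs ranked in the right order."""
    correct = total = 0
    for u, items in owned.items():
        scores = [sum(a * b for a, b in zip(U[u], V[i])) for i in range(n_items)]
        for i in items:
            for j in range(n_items):
                if j not in items:
                    total += 1
                    correct += scores[i] > scores[j]
    return correct / total

# Hypothetical toy data: six users, six games, two loose taste clusters.
n_items = 6
owned = {0: {0, 1}, 1: {1, 2}, 2: {0, 2}, 3: {3, 4}, 4: {4, 5}, 5: {3, 5}}
U, V = train_bpr(owned, n_items)
train_auc = auc(U, V, owned, n_items)
```

Because the loss only compares pairs of items for the same user, it needs no explicit ratings, which is what makes it a natural fit for ownership-style implicit feedback.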

Research interests

3D imaging the shear jamming process in hydrogel spheres

3D laser optical fluorescence tomography. 3D contact networks, e.g. fabric tensors. Image processing and contact mechanics for statistical contact force analysis.

Rheology and rate-stiffening of suspensions and granular composites

Discontinuous shear thickening of cornstarch and water. Novel granular composites with discontinuous rate-stiffening properties.

Force chains in granular materials

Rheology of grains around an intruder. Dynamics of grains during shear/impact. Statistical nature of granular force networks and contact fabrics. Machine learning approach to particle contact detection.

Fundamental deformation mechanisms in glasses

Size effect of ductility in metallic glasses. Mechanisms for plasticity in nano-metallic glasses. Molecular dynamics simulations of deformation.

Fabrication and testing of nanostructures

Electroplating and sputtering of nanoscale/hierarchical structures and laminates for tensile/compressive mechanical testing at room and cryogenic temperatures.

Atomic structure of glasses

In-situ tomography and diffraction of atomic neighborhoods in metallic glasses under hydrostatic pressure. Molecular dynamics simulations of structure and fractal properties in glasses.

Recent publications in peer-reviewed journals

Contact: dzchen314 at gmail dot com

Charlotte NC