Daniel (Lin-Kit) Wong

Hello! I am a sixth-year PhD student in the Computer Science Department at Carnegie Mellon University.
I am advised by Professor Greg Ganger and am a member of the Parallel Data Laboratory.

I am a systems builder and hacker, focused on distributed systems and systems design.

I now work on ML for caching. I spent my initial years at CMU building machine learning systems. In Summer ‘19, I interned at Google, working on scheduling for model parallelism in TensorFlow. Before my PhD, I worked on everything from graph clustering for bioinformatics to systems security, and interned at Dropbox and Facebook.

Fall ‘22: I'm always keen to talk about problems in my areas of interest (distributed systems, ML for Systems, large-scale storage, cloud computing, Systems for ML, security, neuroscience). Reach out if you have problems, insights, or data to share!

Résumé (Oct ‘22) | Publications (9 papers, 2 patents)
Research during my PhD at CMU
  • Ongoing (Fall ‘22):
    • ML-driven admission and prefetching for flash caches. Spring ‘20 - Present

      Daniel Wong, Carson Molder, Daiyaan Arfeen, Daniel Berger, Nathan Beckmann, Greg Ganger
      (and Meta collaborators including Hao Wu, Qing Zhao, Jimmy Lu, and Sathya Gunasekar)

  • Keen to explore:
    • Applications of clustering & dimensionality reduction for time series and graphs.

      I'm keen to explore interpretable machine learning methods that find correlations in time series and graphs, with a particular interest in visualization, interpretability, and causality. This applies to both systems and neuroscience datasets.

    • Areas I have a soft spot for / past background: neuroscience, systems security, HCI, psychology.
  • Past projects:
    • CANDStore: High availability in cheap distributed key-value storage. Spring ‘19 - Summer ‘20

      Thomas Kim, Daniel Lin-Kit Wong, Gregory R. Ganger, Michael Kaminsky, David G. Andersen. ACM SoCC 2020 [article].

    • 10-708 (PGM) project: Dimensionality reduction on neuroscience datasets. Spring ‘20
      Stitching neural population recordings (electrophysiological) from different days.
      Adam Smoulder, Sami Horn, Daniel Wong
    • Co-optimizing scheduling and device placement in TensorFlow with deep RL for automatic model parallelism.
      Google Summer ‘19 intern, Fall ‘19 Student Researcher

      Daniel Wong, Peter Ma^, Sudip Roy*, Yanqi Zhou*
      ^Google Platforms Performance, *Google Brain (ML for Systems)

      Related papers accepted to NeurIPS [pdf] and IEEE Micro. 2 patents.

    • Selective-Backpropagation: Accelerating Deep Learning by Focusing on the Biggest Losers. Fall ‘18 - Fall ‘19

      Angela H. Jiang, Daniel L.-K. Wong, Giulio Zhou, David G. Andersen, Jeffrey Dean, Gregory R. Ganger, Gauri Joshi, Michael Kaminsky, Michael Kozuch, Zachary C. Lipton, Padmanabhan Pillai [preprint]

    • Mainstream: Dynamic Stem-Sharing for Multi-Tenant Video Processing. Fall ‘17 - Spring ‘18

      Angela Jiang, Daniel Lin-Kit Wong, Christopher Canel, Ishan Misra, Michael Kaminsky, Michael A. Kozuch, Padmanabhan Pillai, David G. Andersen, Gregory R. Ganger. USENIX ATC 2018. [PDF]
      Part of the Intel Science and Technology Center for Visual Cloud Systems (ISTC-VCS).
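
      The "Biggest Losers" idea named in the Selective-Backpropagation project title above can be sketched roughly as follows. This is my paraphrase of the premise (prioritize the batch examples with the highest loss), not the project's actual implementation, and the function name and parameters are my own invention:

      ```python
      def select_biggest_losers(losses, keep_fraction=0.5):
          """Return indices of the highest-loss examples in a batch.

          Only these examples would be backpropagated, cutting the
          cost of the backward pass roughly by keep_fraction.
          """
          k = max(1, int(len(losses) * keep_fraction))
          # Sort batch indices by loss, largest first, and keep the top k.
          order = sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)
          return order[:k]

      batch_losses = [0.1, 2.3, 0.05, 1.7, 0.4, 0.9]
      selected = select_biggest_losers(batch_losses, keep_fraction=0.5)
      print(selected)  # [1, 3, 5]: the three highest-loss examples
      ```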

  • Past explorations (keen to revisit if things change)
    • Transient failures (grey failures). Fall ‘19

      How can we balance initiating recovery quickly against overreacting to transient failures?

    • Affordable robustness to failures in distributed storage. Spring ‘19 - Fall ‘19

      3-way cross-region replication is expensive and slow. It helps mitigate rare risks like a hurricane taking out a data center, but why pay that price for common events like equipment failures? Can we detect and predict correlated failures?

      Outcome: I performed simulations based on theoretical modeling and presented a poster on transient failures at PDL Retreat 2019. Although there was strong interest from industry practitioners grappling with the same problem, the project was put on hold for lack of real-world data to model the failures. I would be keen to revisit it. Hit me up if you can offer any datasets!

    • Speeding up evolutionary neural architecture search with adaptive weight sharing. Summer ‘18 - Fall ‘18
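
As a rough illustration of the replication tradeoff described in the exploration above: under independent node failures, all three replicas of an item rarely fail together, so the residual risk is dominated by correlated, shared-fate events. The probabilities below are my own toy numbers, not data from the project:

```python
def p_all_replicas_fail_independent(p_node, replicas=3):
    """Independent node failures: every copy must fail at once."""
    return p_node ** replicas

def p_all_replicas_fail_correlated(p_node, p_region_event, replicas=3):
    """Add a shared-fate event (e.g. a regional disaster) that takes
    out every replica together, on top of independent node failures."""
    independent = p_node ** replicas
    return p_region_event + (1 - p_region_event) * independent

p_node = 1e-3    # assumed per-node failure probability in some window
p_region = 1e-6  # assumed regional-disaster probability in that window

print(p_all_replicas_fail_independent(p_node))            # roughly 1e-9
print(p_all_replicas_fail_correlated(p_node, p_region))   # roughly 1e-6
```

With these toy numbers the correlated term dominates by three orders of magnitude, which is the intuition behind the question: 3-way cross-region replication is priced for the rare correlated event, yet most of what it actually absorbs day-to-day is independent equipment failure.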
Teaching & Coursework at CMU
Highlights from life before starting my PhD

I'm a tinkerer at heart, and am always on the lookout for novel challenges to work on. In seeking opportunities, I aim to optimize for learning and to do meaningful, impactful work. I bask in the energy of synergistic collaborations and the opportunity they give me to wade into new domains and learn from cool people.

I'm a software engineer and have a relentless urge to automate and optimize all parts of my work process.

I enjoy musicals, singing and karaoke (car karaoke setup), cooking, Singaporean food, skiing & snowboarding, gliding, long scenic drives (and walks), waterfalls, baking, rock climbing, ice skating, computer games, scuba diving, and last but not least, good nigiri. I did my undergraduate studies at the University of Cambridge and am a member of Churchill College. I grew up in Singapore, am a 华中子弟 (a Hwa Chong kid), and am a proud alumnus of my high school computer club EC3 (where I learnt to code and hack stuff together).

Get in touch: [same username]@cmu.edu | LinkedIn | Facebook | Twitter | Mastodon | Keybase | PGP key

My stuff: Quora | GitHub | Google Scholar

More about me: Publications | Biography