The Five Best Things: Dec 5, 2020

Trees, Tigers, Transformers

I’m back! It was good to take a break for Thanksgiving, but it made the current week feel twice as long. Anyone else feel the same way? Add it to the list of weird time dilations this year.

Building on the attention mechanism I explained last time, let’s dig into transformer models today. This week’s news provides the perfect development to illustrate how important transformers are to the state of the art in machine learning. I truly think the Attention Is All You Need paper is going to be the most important research work of the 21st century.

Next week I will explore how these transformer models have enabled the development of very large language models like OpenAI’s GPT-3 and Google’s Meena.

The Five Best Things

  1. DeepMind: AlphaFold - a solution to a 50-year-old grand challenge in biology

    • Earlier this week, Google’s DeepMind research lab published the results from AlphaFold 2 at the Critical Assessment of protein Structure Prediction (CASP) conference. AlphaFold 2 is a machine learning model that predicts a protein’s 3D structure. Until now, predicting protein structure has relied on bespoke experimental approaches. With AlphaFold 2, the target prediction accuracy of 90% is achieved much faster (in hours vs days), which lets dependent research in a variety of fields proceed dramatically faster. And the breakthrough comes from the use of the attention mechanism.

    • I am not a biologist. My layperson understanding is that this is a big frickin’ deal that will likely speed up advances in a whole host of biological research - understanding cell behavior, drug development, therapeutics, etc. We have a ton of genomic data sitting around that we have been unable to do much with so far; now we can. Exciting!

    • How did the attention mechanism help? Because multiple attention blocks can be computed in parallel, transformer models can reach the desired training accuracy an order of magnitude faster. Architectures built on Recurrent Neural Networks (RNNs) and Long Short-Term Memory (LSTM) blocks, by contrast, pay a penalty for their sequential dependencies: each timestep has to wait for the previous one. The parallelism exposed by attention blocks is precisely why transformers run so well on chip architectures designed for embarrassingly parallel workloads, such as GPUs, TPUs, and WSEs. The practical limits are the amount of on-chip memory and, once that runs out, the bandwidth between chips. The short sketch below makes the contrast concrete.

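To make that contrast concrete, here is a minimal, illustrative NumPy sketch (toy shapes and random weights of my own choosing, not anything from the AlphaFold or transformer papers): self-attention processes every position of a sequence with a few matrix multiplies that parallelize trivially, while a vanilla RNN has to walk the sequence one step at a time.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a whole sequence at once.
    Every position attends to every other position via dense matrix
    multiplies, so the work spreads naturally across GPU/TPU cores."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv                  # (seq_len, d) each
    scores = Q @ K.T / np.sqrt(K.shape[-1])           # (seq_len, seq_len)
    scores -= scores.max(axis=-1, keepdims=True)      # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)    # softmax over keys
    return weights @ V                                # (seq_len, d)

def rnn_forward(X, Wx, Wh):
    """Vanilla RNN: step t cannot begin until step t-1 has finished,
    so the sequence dimension is inherently serial."""
    h = np.zeros(Wh.shape[0])
    for x_t in X:                                     # sequential dependency
        h = np.tanh(Wx @ x_t + Wh @ h)
    return h

# Toy usage with made-up sizes
rng = np.random.default_rng(0)
seq_len, d = 8, 4
X = rng.standard_normal((seq_len, d))
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))
attn_out = self_attention(X, Wq, Wk, Wv)              # all 8 positions at once
h_final = rnn_forward(X, rng.standard_normal((d, d)), rng.standard_normal((d, d)))
```
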
  2. a16z: Emerging Architectures for Modern Data Infrastructure

  3. Stratechery: An Interview with BuzzFeed CEO Jonah Peretti

    • In this post, Ben Thompson interviews BuzzFeed CEO Jonah Peretti, just after BuzzFeed’s acquisition of the Huffington Post was announced. It’s a wide-ranging interview with someone who has been in the trenches of American media-social-political-celebrity culture for a long, long time. It covers a broad set of topics, including how BuzzFeed tinkered with and landed on what it believes is a winning business model, how the acquisition plays into this, how the media industry can navigate commoditization and digitization, and coastal elites’ domination of media. It also touches on how workable business models seem to have formed for freely available content (ads) and for longform investigative pieces (subscriptions), but this leaves a hole: critical investigative journalism that should be widely and freely available goes unsupported.

    • Peretti is obviously super knowledgeable about the digital media industry and comes across as humble despite his many years of experience. I was especially elated when he made the analogy of BuzzFeed as the Taiwan Semiconductor of journalism: it provides the platform and monetization tools to a whole bunch of content creators, while relying on its brand value to get around aggregators like Facebook and Google and go directly to consumers. Amazing!

  4. Axios Deep Dive on Philanthropy

    • Axios published this special edition today on philanthropy - how the industry is shifting from paternalism toward giving grantees some self-determination over how grants are used, how some of the uber-wealthy use philanthropy for PR, and Dolly Parton’s role in funding the Moderna vaccine (among many other awesome things!).

    • The holidays are firmly upon us, and there is a lot of suffering in the world right now as lockdowns are re-imposed in many places. If you have the means, please consider donating to a charity of your choosing; Google matches employee donations 1-to-1 and I will be happy to look up if your charity of choice is eligible for this program. Please reply to my email if interested!

  5. A Song of Ice and Fire Subreddit: Weirwood.net is real

    • This post and video are from two years ago, and offer a 3-minute summary of this NYT article (The Social Life of Forests) that is making the rounds this week. Simplified explanation: research has shown that the roots of trees in a forest form a network, with fungi acting as the relays. This network allows trees in a forest to communicate and share resources with one another - a cooperative “underground economy”.

    • I don’t know why the NYT decided this was the week to popularize this research; it has been around long enough that George R.R. Martin made it a central plot point of the A Song of Ice and Fire books. I’ll take any excuse to spotlight the work of Dr. Suzanne Simard though!

Honorable Mentions

Disclaimer: The views and opinions expressed in this post are my own and do not represent my employer.