I’m back! It was good to take a break for Thanksgiving, but it made the current week feel twice as long. Anyone else feel the same way? Add to the list of weird time dilations this year.
Building upon the attention mechanism I explained last time, let’s dig into transformer models today. I have the perfect development to explain how important transformers are to the state of the art in machine learning. I truly think that the “Attention Is All You Need” paper is going to be the most important research work of the 21st century.
Next week I will explore how these transformer models have enabled the development of very large language models like OpenAI’s GPT-3 and Google’s Meena.
The Five Best Things
Earlier this week, Google’s DeepMind research lab published the results from AlphaFold 2 at the Critical Assessment of protein Structure Prediction (CASP) conference. AlphaFold 2 is a machine learning model that predicts a protein’s 3D structure. Until now, predicting protein structure has relied on bespoke experimental approaches. With AlphaFold 2, the desired prediction accuracy of 90% is achieved much faster (in hours vs days). This lets dependent research in a variety of fields proceed much faster. And the breakthrough comes from the use of the attention mechanism.
I am not a biologist. My layperson understanding is that this is a big frickin’ deal that will likely speed up advances in a whole host of biological research: understanding cell behavior, drug development, therapeutics, etc. We have a ton of genomic data sitting around that we have been unable to do much with so far; now we can. Exciting!
How did the attention mechanism help? Because multiple attention blocks can be computed in parallel, transformer models can reach the desired training accuracy an order of magnitude faster. Architectures that rely on Recurrent Neural Networks (RNNs) and Long Short-Term Memory (LSTM) blocks, by contrast, pay a penalty for their sequential dependencies. The parallelism exposed by attention blocks is precisely why transformers work so well on chip architectures such as GPUs, TPUs and WSEs, which are designed for embarrassingly parallel workloads. The only limits are the amount of on-chip memory and, once you run out of it, the bandwidth between chips.
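For the code-inclined, here is a toy sketch of that difference in plain Python. It is my own illustration, not code from the paper or from AlphaFold 2. The function names (`attend`, `self_attention`, `rnn_states`), the `decay` parameter and the tiny `tokens` input are all made up for this example, and I skip the learned projections and multiple heads a real transformer would have. The point is only that each attention output depends on the inputs alone, so the positions can be computed in any order (or all at once on a GPU), while the recurrent loop has to wait for the previous step.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(i, x):
    """Scaled dot-product attention output for position i.

    Toy assumption: queries, keys and values are all the raw inputs x.
    Note that this only reads x, never the output of another position.
    """
    d = len(x[i])
    scores = [sum(q * k for q, k in zip(x[i], key)) / math.sqrt(d) for key in x]
    weights = softmax(scores)
    return [sum(w * v[j] for w, v in zip(weights, x)) for j in range(d)]


def self_attention(x, order=None):
    """Compute every position, in whatever order the caller likes."""
    out = [None] * len(x)
    for i in (order if order is not None else range(len(x))):
        out[i] = attend(i, x)
    return out


def rnn_states(x, decay=0.5):
    """Toy recurrence h_t = tanh(decay * h_{t-1} + x_t).

    Step t cannot start until step t-1 has finished.
    """
    h = [0.0] * len(x[0])
    states = []
    for x_t in x:
        h = [math.tanh(decay * h_j + x_j) for h_j, x_j in zip(h, x_t)]
        states.append(h)
    return states


tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
forward = self_attention(tokens)
backward = self_attention(tokens, order=reversed(range(len(tokens))))
print(forward == backward)  # the order of computation does not matter
print(rnn_states(tokens))   # each state here was built on the previous one
```

Running the attention positions backwards gives the same answer as running them forwards, which is exactly the property parallel hardware exploits. There is no equivalent trick for `rnn_states`: reorder the loop and you get different states.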
An extensive survey conducted by Andreessen Horowitz resulted in a reference architecture for what a highly performant data science/data engineering stack should look like. The article then dives into specific use cases such as AI/ML, business analytics and data processing.
This article is very dense, but it is an excellent summary of the state of the market in data science tools and data infrastructure. You can read the first part till the unified architecture is explained, then jump to the reference designs for the specific data use case you’re interested in. Simply fantastic work. I find a16z’s work on AI/Data Science to be the most accessible for a technically literate + business focused audience.
In this post, Ben Thompson interviews BuzzFeed CEO Jonah Peretti, just after BuzzFeed’s acquisition of Huffington Post was announced. It’s a wide-ranging interview with someone who has been in the trenches of American media-social-political-celebrity culture for a long, long time. It covers a broad set of topics, including how BuzzFeed tinkered with and landed on what it believes is a winning business model, how the acquisition plays into this, how the media industry can navigate commoditization and digitization, and coastal elites’ domination of media. It touches upon how successful business models appear to have formed for freely available content (ads) and for longform investigative pieces (subscriptions), but this leaves a hole: critical investigative pieces that should be widely and freely available are unsupported.
Peretti is obviously super knowledgeable about the digital media industry and comes across as humble despite his many years of experience. I was especially elated when he described BuzzFeed as the Taiwan Semiconductor of journalism: it provides the platform and monetization tools to a whole bunch of content creators, while relying on its brand value to get around aggregators like Facebook and Google and go directly to consumers. Amazing!
Axios published this special edition today on philanthropy: how the industry is changing from paternalism to giving grantees some self-determination over grant usage, how some of the uber-wealthy use philanthropy to gain PR, and Dolly Parton’s role in funding the Moderna vaccine (among many other awesome things!).
The holidays are firmly upon us, and there is a lot of suffering in the world right now as lockdowns are re-imposed in many places. If you have the means, please consider donating to a charity of your choosing; Google matches employee donations 1-to-1 and I will be happy to look up if your charity of choice is eligible for this program. Please reply to my email if interested!
This post and video are from two years ago, and work as a 3-minute summary of this NYT article (The Social Life of Forests) that is making the rounds this week. Simplistic explanation: research has shown that the roots of trees in a forest form a network, with fungi as relay mechanisms. This network allows trees in a forest to communicate and share resources with one another, forming a cooperative “underground economy”.
I don’t know why the NYT decided this was the week to popularize this research, which has been around for a while, to the point that George R.R. Martin made it a central plot point of the Game of Thrones book series. I’ll take any excuse to spotlight the work of Dr. Suzanne Simard though!
The Print: MDH owner Mahashay Dharampal Gulati, the tongawala who turned crorepati, dies at 97 If you have ever bought Indian spices, this dude’s face stared back at you from the spice aisle. Hard to overstate how much of a public health and consumer goods gamechanger packaged spices were in India and the world. Before these spices were packaged, you would have to buy them from your local roadside vendor with zero quality control. In 2017, the Economic Times reported that the then 94-year-old was the consumer goods sector’s highest paid CEO. He was also a massive philanthropist. RIP.
TechRepublic: Amazon reveals reason for last week's major AWS outage AWS had a major outage the Wednesday before Thanksgiving Day, which took down a whole host of cloud-hosted services such as 1Password, Acorns, Adobe Spark, Anchor, Autodesk, CapitalGazette, Coinbase, DataCamp, Getaround, Glassdoor, Flickr, iRobot, RadioLab, Roku, RSS Podcasting, Vonage, The Washington Post, and WNYC. AWS announced in a post-mortem that the cause was servers in the fleet exceeding the maximum number of threads allowed by the operating system.
Kevin Tsai: Serving Machine Learning Models in Google Cloud A great 4-part overview on at-scale model serving by my coworker! I just noticed the first picture is of spices, which is completely coincidental.
Spotify Design Blog: Designing Data Science Tools at Spotify A great practical guide on mapping out data science workflows and turning them into tangible products to meet users where they are. Loved it!
Harper’s Bazaar: The Lessons I Learned From My Bubbie, RBG Clara Spera, RBG’s granddaughter, wrote her a moving elegy. The biggest takeaway is how equitable sharing of domestic duties was key to RBG’s success, a pattern I have seen repeatedly in stories of trailblazing women.
John Ratcliffe in the WSJ: China Is National Security Threat No. 1 I have nothing to add here, other than to encourage folks to read this and make up their own minds.
This Cecelia Rouse bio made me very excited for evidence-based policies to come out of the Council of Economic Advisers in the Biden White House.
Microsoft’s Windows Ugly Holiday Sweater Shop When Taco Bell starts its own boutique hotel, you know that Microsoft is somehow going to take corporate meme-fication too far. Then again these sold out in an instant, so what do I know?
Emily Oster: Vaccines Emily Oster gives a very accessible explanation of the mRNA-based Moderna and Pfizer Covid-19 vaccines, along with unfortunate news for pregnant women and kids: trials have not yet adequately covered them, so they will be the last to receive a vaccine. If you are pregnant, please consider registering for the Stony Brook University COPE study. Details below.
Disclaimer: The views and opinions expressed in this post are my own and do not represent my employer.