Hi all, Happy New Year! I hope your 2021 is better than 2020, and that you were able to catch a breath the last two weeks. I got thrown off the writing schedule a bit; my kids decided they would not let me sleep for a stretch of a few days.
An all-GPT-3 edition today. If you recall, we’ve talked about the attention mechanism underpinning most large natural language models, which tend to be based on the transformer architecture. An example of such a large model is OpenAI’s GPT-2. Today, let’s dig into its successor, GPT-3, which was released in 2020 to a limited set of folks and caused a huge splash when the demos built on it rolled out in June and July.
The Five Best Things
GPT-3: What's Hype, What's Real
This podcast with Sonal Chokshi and Frank Chen is the best layperson overview of GPT-3 and its implications that I’ve come across. It’s only ~30 minutes, covers the technology and the demos released so far, and discusses whether GPT-3 passes the Turing test and whether it’s going to be a job killer.
Some terminology: GPT-3 and other transformer models are called few-shot or zero-shot learners because they can be applied to a new problem after seeing only a few examples (“shots”) in the prompt, or none at all, without any additional training.
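To make the terminology concrete, here’s a minimal sketch of what a few-shot prompt looks like. As I mention below, I don’t have API access, so treat the exact client call as an assumption based on OpenAI’s published Python examples; the point is that the “training” happens entirely inside the prompt, with no gradient updates.

```python
# Illustrative few-shot prompt: the task (sentiment classification) is
# demonstrated with two examples directly in the prompt -- no retraining.
# The client call is an assumption based on OpenAI's published examples.
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

few_shot_prompt = """Classify the sentiment of each review.

Review: The diapers leaked everywhere.
Sentiment: Negative

Review: Softest diapers we have ever used!
Sentiment: Positive

Review: Arrived two weeks late and the box was crushed.
Sentiment:"""

response = openai.Completion.create(
    engine="davinci",        # the base GPT-3 model exposed in the beta
    prompt=few_shot_prompt,
    max_tokens=1,            # we only need one word back
    temperature=0,           # keep the output (mostly) deterministic
    stop="\n",
)
print(response.choices[0].text.strip())  # expected: "Negative"
```

Drop the two worked examples from the prompt and you’re doing zero-shot: the model has to infer the task from the instruction alone.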
Prior to general release, OpenAI handpicked a few folks to play around with the model and build applications on top of it. These demos were released at the end of June in a fairly orchestrated (IMHO) manner, displaying mind-blowing and diverse capabilities. There was a Figma design builder, a LaTeX equation generator, fiction writing, chatbots, love letter writing, etc. Several more demos are cataloged here.
I applied for API access but the powers that be at OpenAI didn’t pick me :) I did play around with some of the existing demos and generated a few outputs; a couple of them are below (diapers were on my mind at the time). As you can see, without some thoughtful prompting and tuning, the output text is banal, full of misspellings and incorrect context, or downright ridiculous (Pamela Anderson’s Diapers??!). Right now, I feel pretty confident that this isn’t going to take away a competent copywriter’s job. Will it at some point in the future?
This one is from Taglines
This one is from CopyAI, which generates ad copy
Ben Dickson: The GPT-3 Economy
After rolling out the API in June, OpenAI turned off the spigot in October, and Microsoft declared it would be exclusively licensing GPT-3 as part of its ongoing $1B partnership with OpenAI. The pricing model is quite interesting, and Ben Dickson discusses its implications here. Instead of open sourcing the model, OpenAI will charge for it on a subscription basis, and only to companies whose purposes it deems ethical and non-harmful. It’s unclear how much of this curation control rests with Microsoft. A few demos have already folded because they were unable to afford the costs.
Lambda Labs estimated that a GPT-3-like model would require $4.6M to train once; the model likely went through several rounds of hyperparameter tuning, putting the total training cost at 5x that amount or more. Post-training, the model also has to be served, at an estimated $100,000-$150,000/year. Add in the research lab’s staff salaries, and you’re easily looking at a $30M-$100M annual burn rate. Another estimate pegs OpenAI’s margins at 60x cloud operating costs.
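For the curious, here’s the back-of-the-envelope arithmetic behind that range, using the public estimates quoted above; the headcount and salary figures are my own assumptions for illustration, not reported numbers.

```python
# Rough annual burn-rate estimate for a GPT-3-scale lab.
# Training and serving figures are the public estimates cited above;
# headcount and salary are assumptions, not reported numbers.
single_training_run = 4.6e6      # Lambda Labs' estimate for one training run
tuning_rounds = 5                # assume ~5 rounds of hyperparameter tuning
training_cost = single_training_run * tuning_rounds    # ~$23M

serving_cost = 150_000           # upper end of the yearly serving estimate
staff_cost = 120 * 300_000       # assumed ~120 staff, fully loaded at $300K each

annual_burn = training_cost + serving_cost + staff_cost
print(f"Rough annual burn: ${annual_burn / 1e6:.0f}M")  # ~$59M, inside the $30M-$100M range
```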
Andrew Mayne’s blog covers ways companies can reduce their GPT-3 costs. Ben’s follow-up post presents some interesting potential outcomes of OpenAI’s partnership with Microsoft.
Renee DiResta: The Supply of Disinformation Will Soon Be Infinite
Renee DiResta, research manager at the Stanford Internet Observatory, wrote a compelling piece on how GPT-3 and copycat technologies will let content farms and dis- and misinformation spreaders go even further into overdrive in the near future. When the marginal cost of generating malicious content trends toward zero, it removes any and all friction for bad actors. This position is corroborated by the Middlebury Institute of International Studies.
Renee is a complete badass and I highly recommend you follow her work. Her warnings were prescient: a GPT-3-based bot was found posting content and engaging with commenters on Reddit as if it were a real person.
Could identity verification on the internet finally be THE killer use case for Blockchain?
Page Street Labs: GPT-3 and A Typology of Hype
An excellent framework from Delip Rao at Page Street Labs for assessing emerging technologies with lots of hype surrounding them. I especially encourage you to read the summary!
If you’re in the mood for an even longer discussion about GPT-3, I suggest Gwern’s May 2020 newsletter. Yann LeCun also gives a measured opinion here.
Honorable Mentions
Stanford botched the rollout of its vaccines, prioritizing older administrators over younger frontline staff. They blamed the “algorithm,” as if the algorithm were something super complex and hard to parse. Karen Hao from MIT Tech Review got a hold of it here - it’s just a set of rules that a middle schooler could probably code up. I am not looking forward to an era of “the algorithm made me do it” excuses for bad leadership decisions.
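To underline how simple it is, here’s a purely hypothetical rules-based scorer in the same spirit. This is not Stanford’s actual formula (Karen Hao’s article has that); it’s just an illustration of how little code a “set of rules” like this requires.

```python
# Hypothetical illustration only -- NOT Stanford's actual vaccine algorithm.
# A handful of if-statements is all an algorithm like this amounts to.
def vaccine_priority_score(age, works_in_covid_unit, patient_facing):
    score = 0
    if age >= 65:
        score += 2           # older staff get a bump
    if works_in_covid_unit:
        score += 3           # direct COVID exposure
    if patient_facing:
        score += 1           # any clinical contact
    return score

# A frontline resident vs. a senior administrator working from home:
print(vaccine_priority_score(age=29, works_in_covid_unit=True, patient_facing=True))   # 4
print(vaccine_priority_score(age=67, works_in_covid_unit=False, patient_facing=False)) # 2
```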
NBER: How to Talk When a Machine is Listening: Corporate Disclosure in the Age of AI Interesting working paper which studies how algorithmic trading has impacted the way companies and leaders talk in public, to engineer certain outcomes.
Topbots: 2020’s Top AI & Machine Learning Research Papers A very well-written article, accessible at many levels. You can jump to the core idea of every paper to understand why it was deemed a “Top” paper.
WSJ: Far From Washington, Americans Are Finding Local Solutions Very heartening read and one of the bright things to come out of civic life this year.
WSJ: ‘There Was a Piece Missing—We Were All White’: One Bank Targets Racial Inequity Boston-based Eastern Bank demonstrates the long and sustained effort it takes to achieve true diversity and community leadership. A brilliant case study.
Vogue: Zoom for the Holidays: How the World Celebrated Online in 2020 German photographer Thomas Dworzak reached out to various communities across the world and asked if he could join their Zoom Christmas calls, resulting in this photo essay.
Bloomberg: How a Homeless High School Dropout Became CEO of a $1 Billion Company Chronicles the story of Taihei Kobayashi, a Japanese programmer and founder of Sun* - a company he built over 20 years, from the streets to operations in Vietnam and Japan. It listed on the Tokyo Stock Exchange in July and hit a $1.4B market cap.
Mymodernmet: Couple Finds Over 60 Bottles of Smuggled Whiskey From Prohibition Times Hidden in Their Walls
VanityFair: A Few Things That Actually Went Right in 2020 Parasite’s Oscar win, the DNC roll call, and Megxit were among the awesome things that happened this year!
GatesNotes: These breakthroughs will make 2021 better than 2020 A really hopeful coda to the year from Bill Gates.
Disclaimer: The views and opinions expressed in this post are my own and do not represent my employer.