A deep-dive on reinforcement learning (RL) and some very cool vision transformer stuff out of Facebook AI Research (FAIR) today.
The Five Best Things
The Sequence Edge: Reinforcement Learning
The Sequence is a substack I came across recently, which covers ML news, funding announcements and deep dives on specific topics. I really enjoy it and suggest you subscribe if you are interested in more-than-superficial awareness of ML.
The edition I linked to is the kickoff of their reinforcement learning (RL) series. RL is a branch of machine learning in which agents develop “intelligence” by exploring, sensing and subsequently mastering their environments, usually by focusing on a very small or narrow task. It is almost like a baby learning how to sit up, then crawl, then walk.
From an architecture standpoint, a reinforcement learning model is based on four fundamental components:
Agent: The intelligent program trying to learn a new task.
Environment: The programmatic world that the agent interacts with to execute the target tasks.
Rewards: A function that produces a score quantifying how the agent performs with respect to the environment.
Policy: An algorithm used by the agent to decide the next actions to take.
RL methods can broadly be categorized as 1) model-based, where the agent is given (or learns) a model of its environment and uses it to plan its next actions, and 2) model-free, where no such model is available and the agent instead learns directly by sampling experience from the environment. An example of a model-based method is AlphaZero, which learned to play Go at a superhuman level.
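To make the four components concrete, here is a minimal sketch of a model-free agent. It is my own toy illustration, not something from The Sequence: the five-cell corridor environment, the reward values and the hyperparameters are all assumptions, and I use tabular Q-learning (a lookup table rather than a deep network) because it fits in a few lines.

```python
import random

# Environment (hypothetical toy): a corridor of 5 cells.
# The agent starts in cell 0 and the episode ends when it reaches cell 4.
N_STATES = 5
GOAL = N_STATES - 1
ACTIONS = (-1, +1)  # step left, step right


def step(state, action):
    """Environment dynamics: return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), GOAL)
    done = next_state == GOAL
    # Rewards: +1 for reaching the goal, a small penalty for every other step.
    reward = 1.0 if done else -0.01
    return next_state, reward, done


def greedy_action(q, state):
    """Index of the action with the highest estimated value in this state."""
    return max(range(len(ACTIONS)), key=lambda a: q[state][a])


def train(episodes=300, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Agent: tabular Q-learning, which never sees the rules inside step()."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Policy: epsilon-greedy. Mostly exploit, sometimes explore.
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))
            else:
                a = greedy_action(q, state)
            next_state, reward, done = step(state, ACTIONS[a])
            # Move the estimate toward reward + discounted best future value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


q_table = train()
learned_policy = [ACTIONS[greedy_action(q_table, s)] for s in range(GOAL)]
print(learned_policy)
```

After training, the greedy policy should step right (+1) from every cell. The agent only ever calls `step` and observes the result, which is what makes this model-free; a model-based agent would instead be able to look ahead using the rules themselves.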
The next section of this post talks about the hide-and-seek RL agents developed by OpenAI; as simulation time went on, the agents demonstrated emergent behavior - such as fort building and using props for defense - completely organically.
Facebook AI: Vision transformers, DINO and PAWS
I’ve written extensively about the Transformer model architecture and its ability to pay “attention” to the most important parts of an input sequence in order to perform well on the task at hand. While this architecture has been the basis of breakthroughs in natural language so far, we are increasingly seeing it applied to vision, i.e. videos and images. Facebook AI demonstrated a method called DINO, which trains a vision transformer with self-supervision. Self-supervision means the training data carries no human-provided labels, so the model has to derive its own training signal from the structure of the data itself.
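For readers who want to see what “attention” boils down to, here is a rough sketch of scaled dot-product attention in plain Python. This is not DINO or Facebook's code; the three two-dimensional “patch” vectors are made up, and I skip the learned query, key and value projections that a real transformer applies first.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    d = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(dimension).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # Output is a weighted average of the value vectors.
        out = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(len(values[0]))]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


# Hypothetical toy input: patches 0 and 1 look alike, patch 2 is different.
patches = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
outputs, weights = attention(patches, patches, patches)
for row in weights:
    print([round(w, 3) for w in row])
```

Each row of `weights` says how much one patch attends to every other patch. In this toy input, patch 0 puts more weight on itself and its look-alike neighbor than on the dissimilar patch 2, which is the basic mechanism behind those segmentation-like attention maps.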
Mike Schroepfer, CTO of Facebook, shared a jaw-dropping video of the image segmentation (identifying lines, curves, objects) capabilities.
Why is this important? Labeling and annotation of videos and images is painstaking and very human-intensive; many human labeling tasks have been outsourced to call-center-like facilities in lower-income countries, which has resulted in human-rights outcries. Automation with DINO may alleviate these problems, but the flip side is mass unemployment for the human labeling industry.
Inflation is here! The US Consumer Price Index (CPI) for April jumped 4.2% year-over-year, driven by rising used auto prices, airfares and hotel room rates, and higher wages for crop, oil and transportation workers. What is measured is managed, and what’s managed gets gamed; here’s an amusing look at how the CPI is manipulated, sliced and diced.
We are at a turning point globally: central banks seem unbothered by, or unwilling to acknowledge, inflation as a threat; populism in politics has taken hold; balancing budgets is out of fashion; manufacturing at home rather than in China is gaining steam; and global population appears to have peaked, leaving fewer workers who are now empowered to put upward pressure on wages.
I’m far from a crypto bull, but the run-up in cryptocurrencies is driven in some part by the loss of trust in central banks. Squint and it is ALL make-believe currency.
Verdad Capital: Do Treasuries Still Work?
While inflation is here, the Federal Reserve insists it’s transitory and is not willing to raise interest rates just yet. If investors expect that inflation will reduce the value of currency in the future, they demand higher rates to compensate themselves for the risk. Expectations of higher rates in the future drive today’s Treasury returns down.
The two drivers of Treasury return, therefore, are growth, which impacts the real rate, and inflation. The real Treasury rate is the opportunity cost of not investing in the domestic economy. If economic growth improves, then the real rates rise and bonds do relatively poorly. If, however, growth falters, then real rates fall and bonds do well. Similarly, if inflation expectations rise, investors will demand a higher premium over real rates, driving Treasury yield up and returns down. When inflation expectations fall, this premium falls and bonds can do well.
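The “yield up, returns down” mechanics are easy to check with a little arithmetic. The sketch below is my own illustration rather than anything from the Verdad piece: it prices a hypothetical 10-year zero-coupon bond, and the 1.6% and 2.6% yields are made-up round numbers, not market data.

```python
def zero_coupon_price(face_value, annual_yield, years):
    """Present value of a single payment of face_value made `years` from now."""
    return face_value / (1.0 + annual_yield) ** years


# Hypothetical scenario: inflation expectations push the yield up by one point.
price_before = zero_coupon_price(100.0, 0.016, 10)
price_after = zero_coupon_price(100.0, 0.026, 10)
change_pct = (price_after / price_before - 1.0) * 100.0

print(f"price at 1.6% yield: {price_before:.2f}")
print(f"price at 2.6% yield: {price_after:.2f}")
print(f"change for an existing holder: {change_pct:.2f}%")
```

Under these assumptions, a one-point rise in yield knocks a bit over 9% off the price of the bond, which is why rising rate expectations hurt anyone already holding Treasuries. Real Treasuries pay coupons, so the actual sensitivity would be somewhat lower.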
This article insists that we must shift from thinking about inflation to worrying about economic growth.
We are arguing that investors in Treasurys should shift their attention from worrying about inflation to worrying about growth, and trading Treasurys based on the direction of the economy. When spreads widen and growth expectations fall, Treasurys provide attractive countercyclical return potential. And when spreads tighten and growth expectations rise, investors should reduce or eliminate Treasury holdings. We believe this paradigm should hold for as long as growth and inflation are positively correlated. The key thing to watch for is signs of this post-1980 paradigm shifting, of inflation and growth decoupling, which would create the conditions for a longer-term bear market in Treasurys and a longer-term bull market in real assets.
We hit the 10-year anniversary of Osama Bin Laden’s death. This article is a gripping oral history of the multi-agency effort to capture and kill public enemy #1. Funnily and fatefully enough, the White House and President Obama were also preparing his speech for the annual White House Correspondents’ Dinner, where he decided to take jabs at Donald Trump; many claim this was the event that spurred Trump into running for office in 2016…
I was on the edge of my seat the whole time I read this, and it does a good job of weaving in humor and the personalities involved to break the tension.
This guy in Abbottabad was the first civilian to pick up that something was odd; he converted his tweet into an NFT that auctioned for $525 for the anniversary.
Berkshire Hathaway’s class A shares are approaching the highest price that NASDAQ’s 32-bit price format can represent; Buffett’s famous refusal to split Berkshire stock means NASDAQ blinked first.
Here’s the trouble: Nasdaq and some other market operators record stock prices in a compact computer format that uses 32 bits, or ones and zeros. The biggest number possible is two to the 32nd power minus one, or 4,294,967,295. Stock prices are frequently stored using four decimal places, so the highest possible price is $429,496.7295.
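The arithmetic in that quote is easy to reproduce. Below is a small sketch, assuming prices are stored as unsigned 32-bit integers counting ten-thousandths of a dollar, as the quote describes; the `encode_price` helper and the two sample prices are hypothetical and only there to show where the overflow happens.

```python
from decimal import Decimal

MAX_UINT32 = 2**32 - 1   # largest value an unsigned 32-bit integer can hold
PRICE_SCALE = 10_000     # four implied decimal places

max_price = Decimal(MAX_UINT32) / PRICE_SCALE
print(MAX_UINT32)  # 4294967295
print(max_price)   # 429496.7295


def encode_price(price):
    """Hypothetical helper: convert a dollar price to the 32-bit fixed-point form."""
    raw = int(Decimal(price) * PRICE_SCALE)
    if not 0 <= raw <= MAX_UINT32:
        raise OverflowError(f"{price} does not fit in 32 bits at 4 decimal places")
    return raw


print(encode_price("421000.00"))  # fits: 4210000000
try:
    encode_price("430000.00")     # just past the ceiling
except OverflowError as err:
    print("overflow:", err)
```

I use `Decimal` rather than floats so the four decimal places stay exact. Anything above $429,496.7295 simply has no representation in this format, hence the system upgrade.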
Audio and video of the Ingenuity helicopter flying on Mars, captured by the Perseverance rover; this was supremely cool.
Huawei researchers announced that they had trained a Chinese-language equivalent of GPT-3 with 200 billion parameters; the model is called Pangu. With 25 billion more parameters than GPT-3’s 175 billion, the model was trained on 1.1 terabytes of Chinese-language ebooks, encyclopedias, news, social media, and web pages.
David Swensen, Yale University’s endowment manager, passed away this week at 67. He famously advocated for more private equity and venture capital in university endowment holdings, and legitimized the VC industry. He did so upon realizing that endowments do not die, i.e., they don’t have to plan for retirement or liquidate their holdings on a set date. Snippets from a profile of Swensen in the Intelligent Investor this week -
Y’all just give women real pockets on clothing already. I’m a simple gal. I just want what every woman wants. Respect, a pack of trained wolves, death to the patriarchy, and real pockets in every dress and pair of pants.
Disclaimer: The views and opinions expressed in this post are my own and do not represent my employer.