2021 Will Be The Year of AI Hardware

Hardware powering AI is expensive, cumbersome and emits tons of carbon dioxide. There is plenty of room for innovation.

Jan 04, 2021

It’s been a while since I’ve published one of these, partly because, well, 2020, and partly because my topic of choice for the newsletter felt too narrow and restrictive. I’m taking a different approach in 2021. Instead of finding time to write, I want to make time for it. Instead of only focusing on AI ethics (which will continue to feature prominently), I will also write about AI/ML startups, my reflections on building and researching machine learning systems, how AI can help us combat climate change and much more. I hope you will enjoy this renewed and recharged version of Fairly Deep.

In this issue:

I discuss hardware specialized for AI/ML applications and opportunities in that space.
I give an overview of how Nvidia achieved a near-monopoly on ML hardware and why few have successfully challenged it.

Hardware touches every part of the machine learning stack and in the past 10 years there’s been an explosion of research into hardware specialized for ML applications. Such hardware can speed up both training and inference, lower latency, lower power cost and reduce the retail cost of these devices. The current go-to ML hardware solution is the Nvidia GPU, which propelled Nvidia to dominate the market and grew its valuation to be greater than Intel’s. While more and more promising research emerged, Nvidia continued to dominate the space by selling GPUs with its proprietary CUDA toolkit. However, I see four factors that will challenge Nvidia’s dominance and change the ML hardware landscape as soon as this year and certainly in the next 2-3 years:

Academic research in this area is going mainstream.
Moore’s law is dead. With its demise, “technological and market forces are now pushing computing in the opposite direction, making computer processors less general purpose and more specialized.” [source]
Investors and founders alike are realizing that AI can not only break new ground but also their budgets.
AI’s carbon footprint is large and getting larger. We need to make computation more energy efficient.

Background

This is what a typical machine learning pipeline looks like:

General-purpose chips like CPUs are sufficient for most data science workflows until it comes time to train and deploy large models. For “deep learning,” which involves neural network architectures for tasks like vision and natural language processing, a GPU is almost always necessary. Lambda Labs, a company that provides GPU workstations for deep learning, approximated the cost of training GPT-3 to be around $4.6 million, involving a cluster of Nvidia’s top of the line GPUs.

The primary advantage of using a GPU over the conventional CPU is that it can run computations in parallel with greater data throughput. At its core, the computational part of machine learning is matrix multiplication, which can be greatly sped up when run in parallel. Nvidia’s proprietary CUDA provides both an API and a tool for developers to take advantage of this parallelization. It’s abstracted away by popular libraries like TensorFlow and PyTorch, where one line of code will automatically detect your GPU and then leverage the CUDA backend. If you are designing a brand new algorithm or library and need to take advantage of parallel computation, CUDA provides the tools to make that easier.
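To make the parallelism point concrete, here is a minimal sketch in plain Python (not CUDA, and not how TensorFlow or PyTorch do it internally). It only illustrates why matrix multiplication parallelizes so well: every output row can be computed independently of the others. The function names are my own, and Python threads will not deliver a real speedup here; a GPU applies the same idea across thousands of hardware cores.

```python
from concurrent.futures import ThreadPoolExecutor


def dot_row(row, b_columns):
    """Compute one output row; it depends only on `row` and B, never on other rows."""
    return [sum(x * y for x, y in zip(row, col)) for col in b_columns]


def parallel_matmul(a, b, workers=4):
    """Multiply a (n x k) by b (k x m), submitting one independent task per output row."""
    b_columns = list(zip(*b))  # transpose b once so its columns are easy to iterate
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so rows come back in the right place
        return list(pool.map(lambda row: dot_row(row, b_columns), a))


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
print(parallel_matmul(a, b))  # [[19, 22], [43, 50]]
```

Because no task needs another task's result, the work can be split across as many workers as the hardware offers, which is exactly the property GPUs exploit.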

Nvidia started in the early 1990s as a video gaming company, wanting to provide graphical chips that could quickly render 3D graphics. It was successful in that business, consistently building some of the most powerful GPUs in a constant back-and-forth with AMD, another graphics card manufacturer. Coincidentally, the same graphics hardware turned out to be indispensable for deep learning to take off. CUDA gave Nvidia an advantage over other GPUs.

Nvidia’s CEO Jensen Huang (photo courtesy of Nvidia)

Nvidia first released its CUDA toolkit in 2006, providing an API that made it dramatically easier to work with GPUs. Three years later in 2009, Andrew Ng, an AI professor at Stanford, and his collaborators published a paper showing that large-scale deep learning could be made possible if GPUs were used during training. A year later, Ng and Sebastian Thrun, another Stanford professor and later cofounder of Google X, pitched Larry Page the idea for a deep learning research group at Google, which later became Google Brain. With the rise of Google Brain and “the ImageNet moment,” Nvidia’s GPUs became the de facto computing standard for the AI/ML industry. See this article for more.

TLDR: The Status Quo

Nvidia dominates the deep learning hardware landscape with its GPUs, largely due to CUDA. “In May 2019, the top four clouds deployed NVIDIA GPUs in 97.4% of Infrastructure-as-a-Service (IaaS) compute instance types with dedicated accelerators,” according to Forbes. It’s not sitting still in the face of competition either.
Google developed the TPU, an AI accelerator chip specifically made for neural networks, back in 2015. In their narrow use case as domain-specific accelerators, TPUs are faster and cheaper than GPUs, but they’re walled off within Google’s GCP ecosystem and are supported only by TensorFlow and PyTorch (other libraries need to write their own TPU compilers).
AWS is making a bet on its own silicon, particularly for machine learning. So far, the AWS Inferentia chip appears to be the most successful. Much depends on how easy it will be for developers to switch from CUDA to Amazon’s toolkit for Inferentia and other chips.
In December 2019, Intel bought Habana Labs, an Israeli company that makes chips and hardware accelerators for both training and inference workloads, for $2 billion. Intel’s investment appears to be paying off – last month, AWS announced it will be offering new EC2 instances running Habana’s chips that “deliver up to 40% better price performance than current GPU-based EC2 instances for machine learning workloads.” Intel also has a new line of Xeon CPUs that it thinks can compete with Nvidia’s GPUs.
Xilinx, a publicly traded company that invented the FPGA and recently got into AI accelerator chips, was acquired by AMD in October 2020.
Demand for AI-purposed computing power is accelerating

Changes and Opportunities

As I mentioned in the introduction, my hypothesis is that Nvidia’s dominance will be increasingly challenged and eroded in 2021 and beyond. There are four factors contributing to this:

#1. Academic research turning into real products

A number of startups founded by academic and industry researchers are already working on ML-specific hardware and there is room for more to emerge. Papers published in this space are not just suggesting theoretical guarantees but are showing real hardware prototypes that achieve better metrics than commercially available options. [example 1, example 2 and example 3]

There are many types of chips and hardware accelerators, each of them with its own flourishing research community. To briefly list a few:

Application-Specific Integrated Circuit (ASIC). Examples of ASICs include the Google TPU and AWS Inferentia. It can cost upwards of $50 million to research and produce one, but the marginal cost to produce copies is typically low. ASICs can be designed to have low power consumption without compromising too much on performance.
Field-programmable Gate Array (FPGA). FPGAs are old news to high-frequency traders, but in machine learning examples include Microsoft Brainwave and Intel Arria. It’s cheap to produce one FPGA, but the marginal cost to produce many is higher than that of ASICs.
Neuromorphic computing. This field tries to model the biological structure of the human brain and translate it into hardware. Despite the fact that neuromorphic ideas go back to the 1980s, the field is still in its infancy. There is a good overview paper in Nature.
See this survey paper and pay attention to ISCAS for more
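The ASIC versus FPGA trade-off above is a fixed-cost versus marginal-cost question, so it can be sketched as a break-even calculation. In the sketch below, only the $50 million ASIC figure comes from the text; the per-unit costs and the FPGA up-front cost are hypothetical placeholders I picked for illustration, not real prices.

```python
import math


def total_cost(fixed_cost, unit_cost, units):
    """Total cost of producing `units` chips: one-off cost plus per-chip cost."""
    return fixed_cost + unit_cost * units


def break_even_units(fixed_a, unit_a, fixed_b, unit_b):
    """Smallest volume at which option A (high fixed, low unit cost) is no more expensive than B."""
    if unit_b <= unit_a:
        raise ValueError("option A never catches up unless its unit cost is lower")
    return max(0, math.ceil((fixed_a - fixed_b) / (unit_b - unit_a)))


ASIC_FIXED = 50_000_000  # research and production cost mentioned in the text
ASIC_UNIT = 100          # hypothetical marginal cost per ASIC
FPGA_FIXED = 1_000_000   # hypothetical up-front engineering cost for an FPGA design
FPGA_UNIT = 2_000        # hypothetical marginal cost per FPGA

units = break_even_units(ASIC_FIXED, ASIC_UNIT, FPGA_FIXED, FPGA_UNIT)
print(units)  # 25790
print(total_cost(ASIC_FIXED, ASIC_UNIT, units) <= total_cost(FPGA_FIXED, FPGA_UNIT, units))  # True
```

Under these made-up numbers the ASIC only pays off past roughly 26 thousand units, which is one way to see why ASICs tend to come from companies with very large deployments.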

Some promising startups leveraging aforementioned research:

Blaize emerged from stealth in 2019, claiming they developed a fully programmable processor that is low-power and achieves 10x lower latency and “up to 60% greater systems efficiency.”
SambaNova Systems, a startup founded by Stanford professors and former Oracle executives, raised from GV and Intel Capital. It just announced a new product offering that is “a complete, integrated software and hardware systems platform optimized for dataflow from algorithms to silicon.”
Graphcore, a British startup that raised rounds led by Sequoia, Microsoft, BMW and DeepMind’s founders.

#2. Moore’s law is dead. For better or worse, specialized hardware is the future.

Graphic made by Max Roser/OurWorldinData

Moore’s law predicts that the number of transistors on integrated circuits will double every two years. This has been empirically true since the 1970s and is synonymous with the technological progress we’ve seen since then: the personal computing revolution, improvements in sensors and cameras, the rise of mobile, enough resources to power AI, you name it. The only problem is, Moore’s law is coming to an end, if it hasn’t already. It is no secret that “shrinking chips is getting harder, and the benefits of doing so are not what they were. Last year Jensen Huang, Nvidia’s founder, opined bluntly that ‘Moore’s law isn’t possible any more’,” writes the Economist.
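As a quick worked example of what "doubling every two years" implies, here is a small sketch. The starting point is my own choice for illustration: Intel's 4004 from 1971, which had roughly 2,300 transistors. The function name and the fixed two-year doubling period are simplifying assumptions, not a precise model of the industry.

```python
def projected_transistors(base_count, base_year, target_year, doubling_years=2.0):
    """Project a transistor count forward, assuming one doubling every `doubling_years`."""
    doublings = (target_year - base_year) / doubling_years
    return base_count * 2 ** doublings


# Roughly 2,300 transistors in 1971, projected 50 years (25 doublings) forward.
print(f"{projected_transistors(2300, 1971, 2021):.3g}")  # 7.72e+10
```

Twenty-five doublings turn a few thousand transistors into tens of billions, which is why even a modest slowdown in the doubling period matters so much.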

In the MIT Technology Review, MIT economist Neil Thompson explained that “rather than ‘lifting all boats,’ as Moore’s Law has, by offering ever faster and cheaper chips that were universally available, advances in software and specialized architecture will now start to selectively target specific problems and business opportunities, favoring those with sufficient money and resources.” Some, including Thompson, argue that this is a negative development because computing hardware will start to fragment into “'fast lane' applications that get powerful customized chips and 'slow lane' applications that get stuck using general purpose chips whose progress fades.”

Distributed computing is often used as a way around this problem: let’s use less powerful and cheaper resources but lots of them. Yet, even this option is getting increasingly expensive (not to mention cumbersome when it comes to the complexity of distributed gradient descent algorithms).

So, what happens next? In 2018, researchers at CMU published a paper arguing that the private sector’s short-term profitability focus is making it hard to find a general-purpose successor to Moore’s law. They call for a public-private collaboration to create the future of computing hardware.

Tim Cook presenting Apple Silicon. Photo courtesy of Apple.

While I’m not opposed to public-private collaborations (more power to them), I think the future of computing hardware is an ensemble of specialized chips that, when working in unison, account for general-purpose tasks even better than CPUs today. I believe Apple’s transition to its own silicon is a step in this direction – proof that hardware-software integrated systems will outperform traditional chips. Tesla also uses its own hardware for the autopilot. What we need is an influx of new players into the hardware ecosystem so the benefits of specialized chips can be democratized and distributed beyond expensive laptops, cloud servers and cars. (Dare I say... It’s Time to Build?)

#3. Founders and investors are concerned about rising costs

Martin Casado and Matt Bornstein of Andreessen Horowitz published an essay at the beginning of last year in which they argued that the business of AI is different from traditional software. At the end of the day, it’s all about the margins: “cloud infrastructure is a substantial – and sometimes hidden – cost for AI companies.” As I mentioned, training AI models can cost thousands of dollars (or millions if you are OpenAI) but the costs don’t stop there. AI systems have to be consistently monitored and improved. If your model was trained “offline,” it is prone to concept drift, which is when the real world data distribution changes over time from the one you trained on. This can happen naturally or adversarially, such as when users try to trick a credit-worthiness algorithm. When that happens, the model has to be retrained.
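For a feel of what monitoring for concept drift can look like, here is a deliberately simple sketch: compare the mean of one input feature in recent production data against the mean seen at training time, and flag drift when the gap exceeds a few standard errors. The function name, the threshold of 3 and the sample values are all hypothetical, and real systems typically use richer statistical tests across many features.

```python
from statistics import mean, stdev


def drift_detected(reference, recent, threshold=3.0):
    """Flag drift when the recent mean is more than `threshold` standard errors from the reference mean."""
    ref_mean = mean(reference)
    std_error = stdev(reference) / len(recent) ** 0.5
    if std_error == 0:
        # A constant reference feature: any change in the mean counts as drift
        return mean(recent) != ref_mean
    z_score = abs(mean(recent) - ref_mean) / std_error
    return z_score > threshold


training_values = [48, 52, 50, 47, 53, 51, 49, 50]  # hypothetical feature values at training time
stable_batch = [49, 51, 50, 52]                     # hypothetical recent data, same distribution
shifted_batch = [60, 62, 59, 61]                    # hypothetical recent data after a shift

print(drift_detected(training_values, stable_batch))   # False
print(drift_detected(training_values, shifted_batch))  # True
```

A check like this is cheap to run, but every alarm it raises can trigger a retraining job, which is where the recurring compute bill comes from.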

There’s active research on mitigating concept drift and creating smaller models with the same performance guarantees as existing ones, but that’s content for another post. In the meantime, the industry is moving ahead with larger models and greater spending on compute. Cheaper, specialized AI chips can most certainly lower these costs.

#4. Training large models contributes to climate change

A study from UMass Amherst found that training one off-the-shelf natural language processing model generates carbon emissions comparable to those of a flight from San Francisco to New York. Among the big three cloud providers, only Google gets more than 50% of its data center energy from renewable sources.

I don’t think I need to enumerate the reasons why we might want to decrease AI’s carbon footprint. What I will say is that existing chips are exceedingly power hungry and research shows that other types of hardware accelerators, like FPGAs and ultra low energy chips (e.g. Google’s Edge TPU), can be more energy efficient for machine learning and other tasks.

Even geography matters when it comes to carbon emissions from AI. Stanford researchers estimated that “running a session in Estonia, which relies overwhelmingly on shale oil, will produce 30 times the volume of carbon as the same session would in Quebec, which relies primarily on hydroelectricity.”
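The arithmetic behind that comparison is simple: emissions are the energy a job consumes multiplied by the carbon intensity of the grid supplying it. Here is a rough sketch. The energy figure and both grid intensities are hypothetical placeholders, chosen only so that the ratio matches the 30x quoted above; they are not measured values for Estonia or Quebec.

```python
def training_emissions_kg(energy_kwh, grid_intensity_g_per_kwh):
    """Emissions in kg of CO2: energy used times the grid's carbon intensity."""
    return energy_kwh * grid_intensity_g_per_kwh / 1000.0


ENERGY_KWH = 1_000  # hypothetical energy used by one training session

# Hypothetical grid carbon intensities in grams of CO2 per kWh
GRID_INTENSITY = {"carbon_heavy_grid": 900, "hydro_heavy_grid": 30}

emissions = {
    name: training_emissions_kg(ENERGY_KWH, intensity)
    for name, intensity in GRID_INTENSITY.items()
}
ratio = emissions["carbon_heavy_grid"] / emissions["hydro_heavy_grid"]

print(emissions)  # {'carbon_heavy_grid': 900.0, 'hydro_heavy_grid': 30.0}
print(ratio)      # 30.0
```

The same session and the same hardware give very different footprints, so choosing where a job runs is a lever that is independent of making the chips themselves more efficient.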

On the horizon

I’ve mentioned hardware for AI, but how about AI for hardware? Google recently filed a patent for “a method for determining a placement for machine learning model operations across multiple hardware devices” using reinforcement learning. One of the researchers behind this patent is Azalia Mirhoseini, who is in charge of ML hardware/systems moonshots at Google Brain.

Thanks for reading. See you soon!

Issue #3: Promise & Responsibility

Andrei Kozyrev

Feb 03, 2020

The last couple of weeks have been eventful. January 21st through 24th was the Davos World Economic Forum, and this past Thursday was the closing day of the FAT* conference, now renamed to FAccT (Fairness, Accountability and Transparency). The conference was established in 2018 to bring together “researchers and practitioners interested in fairness, accountability, and transparency in socio-technical systems” and over the last two years grew into an ACM-affiliated, international conference. This year it was hosted in Barcelona, Spain. The organizers stole some of my thunder this week because I have been meaning to write about the conference’s name and suggest that the “FAT” acronym may not be the best idea. Though the new acronym is not the most aesthetically appealing, I do find it meaningful that in an age when facts are too often ignored and diminished, a conference dedicated to fairness, accountability and transparency has “fact” in its name.

The other topic of today’s newsletter is promise and responsibility in the context of tech corporations. Right before Davos, Microsoft’s CEO Satya Nadella wrote a post titled “Achieving more for the world,” in which he outlined Microsoft’s four goals for “the decade for urgent action”:

Power broad economic growth through tech intensity
Ensure that this economic growth is inclusive
Build trust in technology and its use
Commit to a sustainable future

The language of the four commitments is relatively standard, but the phrase “tech intensity” seemed unusual. Nadella defined the term as “adopting best-in-class digital tools and platforms for the purpose of building new, proprietary products and services.” This principle is not just for Microsoft to take advantage of, he says, but is meant to be broadly utilized by “companies, communities and countries.” A certain section of the statement should be particularly relevant to the readers of this newsletter:

we [will] build AI responsibly, taking a principled approach and asking difficult questions, like not what computers can do, but what computers should do? Fairness, reliability and safety, privacy and security, inclusiveness, transparency and accountability are the ethical principles that guide our work.

At Davos, Nadella spoke in detail about the promises and dangers of AI, calling for government regulation in the space. His call for regulation was reiterated by IBM CEO Ginni Rometty, whose company issued a policy paper calling for “precision regulation of AI,” and Alphabet’s Sundar Pichai, who argued that AI will prove to be a more important invention than electricity and that “you can’t get [AI] safety by having one country or a set of countries working on it. You need a global framework.” But some observers see tech’s increased self-awareness and AI ethics bent as a diversion tactic from the thornier issues of antitrust and broad regulation of the tech sector. “Warning the business elite about the dangers of AI has meant little time has been spent at Davos on recurring problems, notably a series of revelations about how much privacy users are sacrificing to use tech products,” reports Bloomberg. The same article also mentions that despite flashy and (truly) commendable public statements about taking on more corporate responsibility, “Facebook, Amazon, Apple and Microsoft all increased the amount they spent on lobbying in Washington last year, with some of those funds going to pushing industry-friendly privacy bills.” Of course, it would be naive to think that a company like Microsoft would suddenly cease lobbying for itself in Washington (don’t hate the player, hate the game), but a healthy dose of skepticism about grand promises is encouraged. With that said, I think that if Satya Nadella and Microsoft deliver on even half of their promises for this decade, then we will all be far better off.

Best wishes,

Andrei

Speaking of Sustainability 🌍

BlackRock will emphasize environmental sustainability as a core investment goal, CEO Larry Fink announced in his annual letter to the CEOs of the world's largest corporations. BlackRock will also "introduce new funds that shun fossil fuel-oriented stocks, move more aggressively to vote against management teams that are not making progress on sustainability" and more. The annual letter is considered to be among the most influential documents in the world of finance, and CEOs of some of the most important companies around the world tend to pay close attention because BlackRock often has outsize influence on their Boards of Directors. [Letter]

Microsoft announced plans to become “carbon negative” by 2030, as part of its other promises regarding environmental sustainability, AI safety and corporate responsibility in this new decade. Being carbon negative would mean that the company will not only have net zero atmospheric emissions, but will actively remove extra carbon dioxide as well. [Announcement]

ML Education 📚

In the previous issue of Fairly Deep, I wrote a bit about shoddy online AI education. Having taken a good number of online ML courses (and a number of in-person classes at Cornell) and having fallen victim to BS resources a few times, I decided to compile a spreadsheet of free online courses that I found to be worthwhile and educational: https://docs.google.com/spreadsheets/d/1QVSmUNUqT80Hh49dVmVVfpJoZEVz-elKXq2TxbpkkgY/edit?usp=sharing

It is by no means an exhaustive list of good resources, so please feel free to add more!

👂All Ears: a selection of memorable podcast episodes

Timnit Gebru, one of the most well-known researchers in ML ethics and the lead of Google’s ethical AI team, recently discussed latest trends in fairness and ML ethics on the TWIML podcast:

Cristos Goodrow, head of Search and Discovery at YouTube, talked about the YouTube algorithm, clickbait videos, personalization and much more in this excellent episode of the Artificial Intelligence Podcast. If you’re interested in more news about the YouTube algorithm, check out the previous issue of Fairly Deep :)

Last but not least, I recently discovered a wonderful podcast that discusses both technical and non-technical trade-offs in tech careers. The episodes’ topics range from “Bootcamp vs. Computer Science Degree” to “AWS vs. GCP” and the podcast is hosted by Mayuko Inoue, who is a popular tech YouTuber and an iOS engineer working in Silicon Valley. [Link]

In other News 📰

London police will begin using facial-recognition cameras, while privacy advocates and civil liberties groups are sounding alarm bells. The European Union, which the UK officially (Br)exited this past week, is considering a blanket ban on using the technology for law enforcement purposes (similar to what San Francisco did last year). [article]

Another AI Winter? 🥶 The BBC reported that a number of ML researchers are concerned that overblown promises of Artificial General Intelligence (AGI) and “a general feeling of plateau” are increasing the possibility of another “AI Winter.” The term has been used to describe periods of low funding and low trust in AI research in the 1970s and the late-1980s. In the past two decades that has been replaced by an exuberantly optimistic outlook and a number of truly remarkable breakthroughs (and certainly no lack of funding). But, “by the end of the decade there was a growing realisation that current techniques can only carry us so far.” What do you think, will there be another “AI winter”? [article]

Visa acquired fintech startup Plaid for $5.3 billion. Plaid’s technology allows apps to access bank accounts of their users and facilitate transactions, acting as a middleman between the two endpoints. WSJ reports that Visa and other card issuers are increasingly concerned that consumers are starting to avoid credit cards in favor of direct bank transfers. “Bank-account payments also offer a way into business-to-business payments, a sector in which card companies have been trying to play a bigger role because it is viewed as untapped compared with consumer payments.” [article]

Issue #2: AI snake oil

Andrei Kozyrev

Jan 20, 2020

Like any rapidly growing scientific(-ish) field that captures the public’s imagination, ML is susceptible to its fair share of snake oil salesmen and dubious claims of magical achievements. This issue really crystallized for me with the Siraj Raval scandal last year. Raval is a YouTube personality and self-proclaimed data scientist who makes “Artificial Intelligence education” videos for nearly 700,000 subscribers. His work has been praised and retweeted by Elon Musk and DeepMind’s Demis Hassabis, and his list of followers on Twitter used to include the likes of Elon Musk, Marc Andreessen, Jeff Dean, Ian Goodfellow and many other credible, well-known personalities not just in ML but in tech broadly. While his videos have always had that unmistakable YouTube clickbait-ness to them, Raval’s videos were entertaining and relatively informative for beginners (except for predicting stock prices with 10 lines of Keras code, please don’t do that). All was going well until September 2019, when a Reddit post on r/MachineLearning accused Raval of scamming students through his paid course called “Make Money with Machine Learning,” which charged $199 for access. Many students, some of whom were experienced data scientists, found the course to be poorly made, full of mistakes and abandoned by Raval. When students began requesting refunds, Raval quietly edited the course’s refund policy to be “within 14 days of registration” (it was 4 weeks past registration date at that point) [article]. As the ML course fiasco was still in full swing, researcher Andrew M. Webb wrote a Twitter thread showing that Raval almost fully plagiarized a “Neural Qubit” paper and presented it as his own work (the original paper is called “Continuous-variable quantum neural networks”).

Three weeks ago, Raval posted an apology video on YouTube and is now back to making “educational” content. While I believe that people deserve a second chance, there are allegations that Raval still has not fully refunded some students.

What I find interesting about the whole Siraj Raval ordeal is not so much that there are AI snake oil salesmen. Scamming people with a shoddy ML course and plagiarizing papers is appalling, but I don’t think that happens often, especially with such prominent personalities. What does happen on a regular basis is excessive hype around AI/ML (this acronym is an example of that), irresponsible and downright unethical use of ML, and the use of the subject’s technical complexity to obfuscate flaws in research papers. I’m going to break down these statements into examples below, but my takeaway from all of this is that we need to take on more responsibility in countering misleading claims that capitalize on ML’s popularity and complexity. As a first practical step, I am going to follow Vicki Boykis’ 4 excellent suggestions that she wrote about in her newsletter Normcore Tech (one of my favorite newsletters on the tech industry – highly recommend):

We need to provide people with the right tools and content to evaluate what they’re watching education-wise for tech topics
We need to end the hype cycle around AI and ML pronto. I’ve been pretty vocal about this myself, and one of the reasons I started this newsletter - to moderate the gushing fountain of both tech enthusiasm and tech negativity from the mainstream media. [Vicki’s newsletter]
We need to be critical of what we watch and read (easier said than done!)
We need to help people around us if we see they need technical boost, and to make ourselves available. We need to learn to offer constructive criticism outside the parameters of a sarcastic tweet or a one-off YouTube comment. In today’s internet, ultra-engineered for attention and outrage to push all our worst buttons as humans, it can be really hard on a daily basis, but we have to try.
[source]

I’d add one thing to these great suggestions: we need to improve how we promote credible, solid research to make it as appealing to the media and general public as OpenAI press releases. This is hard for many reasons, such as lack of funding, absence of massive PR departments with said funding and, more often than you’d think, reluctance to engage in “marketing” because of a false perception that it is “beneath” serious research (i.e. my work speaks for itself). First, it goes without saying that great marketing is a science and an art. Second, a marketing lens helps to distill the essence of one’s work into familiar and approachable language, far better than even an abstract could do. But, making your voice heard can be difficult, so it’s important to build relationships with the media and get to know the journalists that cover tech. One way to do this is to make yourself available, like Vicki says in #4, to offer constructive criticism and be one of those experts quoted in a NYTimes article about a new ML breakthrough.

I’d love to hear your thoughts on the issues mentioned above and any advice for how to deal with them. If you disagree with anything I wrote, I’d love to have a discussion about that too.

P.S. On Monday, I am starting my last semester of undergrad at Cornell. While I still don’t know what I’ll be doing after graduation, I’m excited for what’s to come, especially in terms of growing and improving Fairly Deep.

Thank you for reading ⚡️

The Horsemen of the Credibility Apocalypse

Right before New Year’s, a pre-print of a paper made the rounds on Twitter because it claimed that YouTube’s new recommendation algorithm has a de-radicalizing influence on users:

Mark Ledwich @mark_ledwich

1. I worked with Anna Zaitsev (Berkely postdoc) to study YouTube recommendation radicalization. We painstakingly collected and grouped channels (768) and recommendations (23M) and found that the algo has a deradicalizing influence. Pre-print: arxiv.org/abs/1912.11211 🧵

arxiv.org: Algorithmic Extremism: Examining YouTube’s Rabbit Hole of Radicalization. The role that YouTube and its behind-the-scenes recommendation algorithm plays in encouraging online radicalization has been suggested by both journalists and academics alike. This study directly quantifies these claims by examining the role that YouTube’s algorithm plays in suggesting radicalized c…

Several prominent VCs even proclaimed that the study was an example of a “narrative violation.” It also seemed like one of the authors had an axe to grind with the mainstream media:

Mark Ledwich @mark_ledwich

4. My new article explains in detail. It takes aim at the NYT (in particular, @kevinroose) who have been on myth-filled crusade vs social media. We should start questioning the authoritative status of outlets that have soiled themselves with agendas.

medium.comAlgorithmic Radicalization — The Making of a New York Times MythThe New York Times and other “Authoritative” sources tell us about algorithmic radicalisation of YouTube. They are wrong and untrustworthy.

Almost immediately, several acclaimed researchers called into question the methodology of the study. Among them was Princeton’s Arvind Narayanan, a privacy and AI ethics expert, who published a thread examining many of the flaws of this (irreproducible) study, such as this:

Arvind Narayanan @random_walker

The key is that the user’s beliefs, preferences, and behavior shift over time, and the algorithm both learns and encourages this, nudging the user gradually. But this study didn’t analyze real users. So the crucial question becomes: what model of user behavior did they use?

Arvind Narayanan @random_walker

The answer: they didn’t! They reached their sweeping conclusions by analyzing YouTube *without logging in*, based on sidebar recommendations for a sample of channels (not even the user’s home page because, again, there’s no user). Whatever they measured, it’s not radicalization.

Though it’s impossible to tell how many people who saw Ledwich’s thread and paper also saw the criticism, I thought this was a great example of the ML community coming together to refute a clearly flawed paper.

Another interesting case is that of a far more credible and serious paper published by Google Health and DeepMind in collaboration with a number of top research hospitals. The paper presents a new ML system that outperforms doctors in detecting breast cancer from mammograms and “reduces false positives by 5.7 percent for US women” and by 1.2 percent in the UK. It also reduced false negatives by 9.4 percent and 2.7 percent in the US and the UK respectively. These are wonderful results and ML clearly has a lot of potential to improve healthcare outcomes, yet media coverage of the study was fairly balanced and included cautionary remarks that these AI/ML systems are not the be-all and end-all.

From the NYTimes:

in some instances, A.I. missed a cancer that all six radiologists found — and vice versa.
Dr. Lehman [director of breast imaging at the Massachusetts General Hospital], who is also developing A.I. for mammograms, said the Nature report was strong, but she had some concerns about the methods, noting that the patients studied might not be a true reflection of the general population. A higher proportion had cancer, and the racial makeup was not specified. She also said that “reader” analyses involving a small number of radiologists — this study used six — were not always reliable. [article]

In Wired, award-winning science journalist Christie Aschwanden cautioned that ML systems designed for controversial healthcare practices can exacerbate the problem of “bad medicine”:

In a sense, that’s what happened with the recent Google paper. It’s trying to replicate, and then exceed, human performance on what is at its core a deeply flawed medical intervention. In case you haven’t been following the decades-long controversy over cancer screening, it boils down to this: When you subject symptom-free people to mammograms and the like, you’ll end up finding a lot of things that look like cancer but will never threaten anyone’s life. As the science of cancer biology has advanced and screening has become widespread, researchers have learned that not every tumor is destined to become deadly. In fact, many people harbor indolent forms of cancer that do not actually pose a risk to their health. Unfortunately, standard screening tests have proven most adept at finding precisely the latter—the slower-growing ones that would better be ignored. [article]

👂All Ears: a selection of memorable podcast episodes

Recode’s Kara Swisher needs no introduction. She is one of the most prolific and well-known tech journalists and has been active in the space for over 20 years. She knows everyone in the industry and once made Mark Zuckerberg sweat so much in an interview that he had to take off his hoodie. Her most recent interview with Ben Silbermann, the CEO of Pinterest, offers an excellent look into the company’s efforts to deal with misinformation on the platform and how Pinterest is surviving in competition with the likes of Facebook, Google, Snapchat and others. [link]

Just For Fun

Bayes 😐 or Bae-yes 🥰?

Before the holidays, the ML community on Twitter and Reddit was consumed by a debate on what should and shouldn’t be considered “Deep Learning.” Now that this is last year’s news, the community is fervently engaged in debating the merits of Bayesian Deep Learning, which is inevitably growing into a frequentist vs. Bayesian reasoning debate. I have no opinion on this because I just heard about Bayesian Deep Learning for the first time when I saw this debate on Twitter, but I do share this sentiment:

Thank you for reading and see you next time!

Introducing Fairly Deep 🔬

Andrei Kozyrev

Jan 05, 2020

Wassily Kandinsky, Joyous Assent. Color lithograph. 1923. UCLA Hammer Museum

What does it mean to be "fair"? How would you define fairness? Why?

I kept asking myself these questions when, in early September, I joined a research group at Cornell looking to define how ranking algorithms can be fair to both users and items being ranked. I found this topic fascinating because we encounter these algorithms multiple times a day as a natural way of helping us organize information. Whether I'm searching for a new coffee machine on Amazon, or I'm a recruiter looking for software engineering candidates on LinkedIn, these platforms have ranking algorithms that optimize for relevant items that I, the user, should hopefully see at the top of my search results. And it's no surprise that the first items in any ranking get disproportionately more exposure than the rest (when was the last time you went to the second page of a Google search?).

But, how do you produce a ranking that is not just relevant to the user but also fair to the people (or items) being ranked? What does it mean to apply fairness to algorithms in the first place?

Having spent the entire semester looking into these issues I came out with more questions than answers. At the same time, I found a wealth of research, blogs, articles and podcasts that were scattered across various corners of the internet.

In this newsletter, I will try to gather the latest developments, news, and trends in the rapidly growing area of machine learning fairness, transparency and ethics. However, I can only hope to capture a tiny snapshot of this field in an email. So, I’m asking all my readers (yes, all 10 of my friends whom I’m spamming with this) to send any and all material that you think should be highlighted in the newsletter. Last but not least, this newsletter is intended for non-technical and technical audiences alike, and if you think that I’m not striking the right balance, please let me know. Having always been passionate about writing, I’m excited to take on this challenge.

Without further ado – welcome to Fairly Deep, a bi-weekly newsletter about fairness, privacy, transparency and ethics in machine learning.

Thank you for reading and Happy New Year 💫

Fair Enough

December was an important month for all things ML because of NeurIPS, the largest annual AI conference, which brings together thousands of researchers, data scientists and others. NeurIPS 2019 had by far the largest number of attendees in its 33-year history, attracting 13,000 registrations and 6,743 paper submissions, of which 1,428 were accepted (~21% acceptance rate).

🧬 At the Fair ML for Health workshop, Stanford’s Sharad Goel talked about the challenges of formalizing ideas about fairness in algorithmic decision making. He covered the limitations of many of the fairness definitions that have been proposed in recent years to deal with issues of discrimination and bias. For example, he discussed classification parity, a notion of fairness that requires a particular metric to be the same across groups, in the context of the false positive rate. [video of the talk | slides | paper]

P.S. for an excellent in-depth explanation of the talk and Goel’s associated paper, check out these slides.
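
To make classification parity on the false positive rate concrete, here is a minimal sketch. It is not taken from Goel's talk or paper: the function names and the toy data are made up for illustration. It only shows what "the same metric across groups" means when the metric is the false positive rate (the share of true negatives that the classifier wrongly flags as positive).

```python
from collections import defaultdict


def false_positive_rates(y_true, y_pred, groups):
    """Return {group: FPR}, where FPR = false positives / all true negatives."""
    false_positives = defaultdict(int)
    negatives = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        if truth == 0:  # only true negatives can produce false positives
            negatives[group] += 1
            if pred == 1:
                false_positives[group] += 1
    # Groups with no true negatives have an undefined FPR and are skipped.
    return {g: false_positives[g] / n for g, n in negatives.items() if n > 0}


def fpr_parity_gap(y_true, y_pred, groups):
    """Largest difference in FPR between any two groups (0 means exact parity)."""
    rates = false_positive_rates(y_true, y_pred, groups)
    return max(rates.values()) - min(rates.values())


# Hypothetical toy data: 1 = positive label/prediction, 0 = negative.
y_true = [0, 0, 0, 0, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 0, 0, 1, 1, 1, 0, 0, 1]
groups = ["a", "a", "a", "a", "a", "b", "b", "b", "b", "b"]

rates = false_positive_rates(y_true, y_pred, groups)
gap = fpr_parity_gap(y_true, y_pred, groups)
print(rates, gap)
```

In this toy example group "b" is wrongly flagged twice as often as group "a", so classification parity on the false positive rate does not hold. Whether equalizing this number is the right goal is exactly the kind of question the talk examines.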

If you are interested in the intersection of ML and healthcare, do check out other talks from the Fair ML for Health workshop as well as the ML4H (Machine Learning for Health) workshop.

🏹 Speaking of formalizing fairness, a group of researchers from UMass Amherst and Stanford introduced Robinhood, an offline contextual bandit algorithm that can satisfy fairness constraints and “achieve performance competitive with other offline and online contextual bandit algorithms.” The algorithm relies on users to specify their own fairness rule, which needs to be mathematically expressed, and plug it into the algorithm. For example, one can specify that men and women should have an equal chance of approval for a loan. The researchers tested the algorithm on three tasks: an interactive tutoring system, a loan approval system and a criminal recidivism predictor based on ProPublica’s notable investigation. They report that “in each of our experiments, Robinhood is able to return fair solutions with high probability given a reasonable amount of data.” Some of the fairness constraints discussed by the authors are also analyzed in Goel’s talk. [Paper]
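
As a rough illustration of what a "mathematically expressed" fairness rule could look like, here is a small sketch of the loan example. This is not Robinhood's actual interface: the function names, the tolerance `epsilon` and the toy data are all assumptions. It also checks the rule with a plain point estimate, whereas the paper aims for guarantees that hold with high probability.

```python
def approval_rate(decisions, genders, gender):
    """Fraction of applicants of the given gender who were approved."""
    picked = [d for d, g in zip(decisions, genders) if g == gender]
    return sum(picked) / len(picked)


def equal_approval_constraint(decisions, genders, epsilon=0.05):
    """User-specified rule written as g(...) <= 0.

    The rule is satisfied when approval rates for men and women differ by
    at most `epsilon` (a hypothetical tolerance chosen for this sketch).
    """
    gap = abs(
        approval_rate(decisions, genders, "m")
        - approval_rate(decisions, genders, "f")
    )
    return gap - epsilon


def is_fair(constraint, decisions, genders):
    """Plug any constraint function in and test it on observed decisions."""
    return constraint(decisions, genders) <= 0


# Hypothetical toy data: 1 = loan approved, 0 = loan denied.
genders = ["m", "m", "m", "m", "f", "f", "f", "f"]
policy_a = [1, 0, 1, 1, 0, 1, 1, 0]  # approves 3 of 4 men, 2 of 4 women
policy_b = [1, 0, 1, 0, 0, 1, 1, 0]  # approves 2 of 4 men, 2 of 4 women

print(is_fair(equal_approval_constraint, policy_a, genders))
print(is_fair(equal_approval_constraint, policy_b, genders))
```

The point of the design is that `is_fair` does not care which rule it receives. Swapping in a different definition of fairness means writing a different constraint function, not a different algorithm.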

🧮 The idea of “plug-and-play” fairness constraints is also present in a paper by Singh and Joachims that proposes both a framework and a learning-to-rank algorithm that not only optimizes for utility of the ranked items but also enforces fairness constraints with respect to the items. Concretely, the authors claim that algorithms that rank items (say, ranking relevant products for your Amazon search) are usually blind to how this ranking impacts the items themselves. In other words, the ranking can maximize utility to the user (and there are many ways to define utility too) while being unfair to the items being ranked, causing real-world harm to those stakeholders. The authors introduce a learning-to-rank framework that allows one to optimize for a variety of utility metrics while “satisfying fairness of exposure constraints with respect to the items,” where the definition of fairness can be specified by the user. [Paper]

[Full Disclosure: I worked with the authors of this paper on my research project this past semester. I will share more about my research in the upcoming newsletters]
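
To give a feel for "exposure" in a ranking, here is a minimal sketch. It is not the authors' implementation. It assumes a logarithmic position discount of 1 / log2(1 + rank), a common way to model how quickly attention drops off down a list, and the item names and group labels are made up.

```python
import math


def position_exposure(rank):
    """Assumed attention given to a 1-based rank: 1 / log2(1 + rank)."""
    return 1.0 / math.log2(1 + rank)


def group_average_exposure(ranking, group_of):
    """Average exposure per item for each group in a ranked list."""
    totals, counts = {}, {}
    for rank, item in enumerate(ranking, start=1):
        group = group_of[item]
        totals[group] = totals.get(group, 0.0) + position_exposure(rank)
        counts[group] = counts.get(group, 0) + 1
    return {g: totals[g] / counts[g] for g in totals}


def exposure_gap(ranking, group_of):
    """Difference between the best- and worst-exposed groups."""
    averages = group_average_exposure(ranking, group_of)
    return max(averages.values()) - min(averages.values())


# Hypothetical items from two groups of sellers, A and B.
group_of = {"a1": "A", "a2": "A", "b1": "B", "b2": "B"}
blocked = ["a1", "a2", "b1", "b2"]      # all of group A ranked first
interleaved = ["a1", "b1", "b2", "a2"]  # groups mixed across positions

print(group_average_exposure(blocked, group_of))
print(group_average_exposure(interleaved, group_of))
```

Mixing the groups across positions narrows the exposure gap between A and B compared with ranking all of A first. A fairness-of-exposure constraint would bound a gap like this one (or tie exposure to some measure of merit) while the ranker still optimizes for usefulness to the user.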

Not Kidd-ing

Dr. Celeste Kidd, Professor of Psychology at UC Berkeley

One of the most talked about presentations at this year’s NeurIPS was Professor Celeste Kidd’s How To Know, an exploration of how people come to believe what they believe, “why do people sometimes believe things that aren’t true,” and how machine learning researchers and practitioners can be better aware of human belief formation when they design systems that can affect people’s perception of things. Dr. Kidd was also Time Person of the Year in 2017, sharing the title with numerous other women who were the trailblazers of the #MeToo movement.

Closer to the end of her remarks, she addressed the men in the audience to talk about a belief that she said was common among her male colleagues – that in the age of #MeToo, the slightest misstep or “misinterpretation” can ruin a male researcher’s career over allegations of sexual misconduct. “You have been misled,” she said, taking a pause before a predominantly male audience. “The truth is, it takes an incredible, a truly immense amount of mistreatment before women complain. No woman in tech wants to file a complaint, because they know the consequence of doing so.” Among those consequences is workplace retaliation, such as being overlooked for a promotion or being sidelined entirely. She argued that if one hears about a public case of mistreatment, then chances are the case involved particularly egregious circumstances. She also cautioned her listeners not to fall for the smokescreens that offenders try to put up by apologizing for “a minor infraction, while omitting the many more serious and severe behaviors they should be remorseful for – lying by omission.”

Dr. Kidd’s full remarks are available on YouTube (the #MeToo remarks start around 27:00)

👂All Ears: a selection of memorable podcast episodes

If you are interested in hearing more about Prof. Kidd’s background and research beyond her invited talk at NeurIPS, I highly recommend her recent interview on the TWIML podcast (the show as a whole is worth a listen too).

Michael Kearns, a well-known researcher in algorithmic game theory and computer science generally, recently published a general-audience book called The Ethical Algorithm. Fun fact: Kearns and Leslie Valiant posed the famous weak learnability question that inspired the development of AdaBoost.

p.s. these podcasts are available on all major platforms

In Other News

In December, Dr. Rediet Abebe became the first black woman to receive a Ph.D. in computer science from Cornell University. She successfully defended her dissertation, titled “Designing Algorithms for Social Good,” and is now a Junior Fellow at the prestigious Harvard Society of Fellows. [Article]

Intel acquired Israel-based Habana Labs for $2 billion. The company specializes in developing computer chips designed for AI and machine learning applications. [Article]

Following Intel’s high-profile acquisition, WSJ reported that “even before Intel’s latest purchase, AI-related deals globally had surged to $35 billion in value through early November, topping the previous high of $32 billion two years ago and $11 billion in transactions last year.” Top tech companies are rushing to hire and train talent for their AI/ML divisions, often poaching people from top universities and in the process obliterating their CS departments: “Last year, private industry also lured away 60% of AI doctoral graduates.” [Article]
