By Eric Topol
Jun 13, 2026
A New Path to Preventing Cancer
In the journal Cell, a team of >80 researchers from 4 continents reported the discovery of a 14-protein blood test that paves the way for predicting and preventing lung cancer more than 5 years before it would be diagnosed. Current efforts in cancer are largely directed to therapy and early detection, with only a very limited foray into prevention. In a recent Ground Truths, I reviewed the emerging potential for preventive cancer vaccines in people with a known pathogenic mutation (such as Lynch syndrome or BRCA). In contrast, this extraordinary new report used machine learning of high-throughput proteomics, with validation of the 14-protein signature in 8 different cohorts and in a study of people with lung cancer who never smoked. Beyond that, there was extensive work to understand the role of air pollution (via particulate matter, PM), drawing on wild-type and mutant mice, lung organoids, single-cell biology, and healthy tissue adjacent to the tumor microenvironment.

The study was covered on the front page of the NY Times in an easy-to-understand way, although the coverage glossed over the study’s unique features, many important details, and the implications. The infographic below that I made with the help of NotebookLM is also reductionist but conveys the main thrust of the work. In this post, I’ll take you through the findings and the ramifications they have for preventing cancer in the future. It’s a massive amount of work, so I will not get into the abundant technical details, but instead hit the high points.

Let’s first go back to the CANTOS trial that set this up. Back in 2017, CANTOS was published: a large randomized clinical trial that assessed an anti-inflammatory medication (canakinumab, an interleukin-1β antibody) in more than 10,000 participants with a prior heart attack for reduction of subsequent cardiovascular events.
While the results were not compelling for this cardiovascular indication (small benefit and a risk of fatal infections), there was an unexpected outcome: a reduction of lung cancer and fatal lung cancer during 5-year follow-up, both with a dose-response (Figure below for lung cancer incidence). At the highest dose of the drug, there was nearly an 80% reduction of fatal lung cancer. But the number needed to treat (NNT) with the antibody to prevent 1 person’s lung cancer was >1,000, which reflects that there was no way to know who would benefit from the drug. Wouldn’t it be great if we could understand this better and identify whose lung cancer could be prevented with the interleukin-1β antibody?

The first step was to do high-throughput plasma proteomics in over 48,000 UK Biobank participants (schematic below). This cohort had nothing to do with the CANTOS trial. The work involved assessing nearly 3,000 different proteins from the blood (O-link, ThermoFisher) and using machine learning of the data, along with age and history of smoking and lung disease, to find the proteins that predicted lung cancer. There were 14 proteins that fulfilled this objective. They fell into 4 major categories of function indicative of deep lung cell stress: inflammation, lung surfactant production, epithelial cell secretion, and matrix remodeling (more on the latter 2 categories later). In UK Biobank participants who developed lung cancer, the proteins were present on average 5.6 years before the diagnosis.

These 14 proteins (Figure below with risk of lung cancer) were assessed in 8 different cohorts with either the same (O-link) high-throughput proteomic platform or with Somalogic. The proteins were also confirmed in a Taiwanese cohort (comprised of 81% women, 62% adenocarcinomas) of whom 93% never smoked. Prediction of lung cancer incorporating the 14 proteins outperformed 2 prior lung cancer models based on demographics and smoking history (known as LCRAT and LLVP3).
The 14-protein signature was also found in some people with idiopathic pulmonary fibrosis and with chronic obstructive pulmonary disease.

Figure: Replication of the 14-protein signature in 8 different cohorts (known by their acronyms, like ARIC, EPIC, CKB, etc.).

Important results from the TRACERx clinical study (Tracking Cancer Evolution through therapy [Rx]) had to be folded in to help identify the source of the 14 proteins. That study tracked lung cancer evolution and post-surgical results. There was no correlation of the 14 proteins with more advanced lung cancer (compared with early cancers) and no reduction of the protein signature after resection of the lung tumor. So, surprisingly, the 14 proteins were not coming directly from the tumor cells. What was the source of these proteins?

The next steps were to introduce the EGFR mutation into 4 different types of lung epithelial cells (basal, club, neuroendocrine, and AT2) in the mouse model to develop lung adenocarcinomas, and subsequently to test the impact of air pollution (particulate matter exposure), the EGFR mutation, and the combination. Single-nucleus sequencing of ~37,000 cells helped to get at the gene expression underlying the epithelial progression to cancer. These different types of lung epithelial cells converged on the pre-cancerous KAC state (KAC stands for KRT8+ alveolar intermediate cells; K is for keratin). Notably, the 14 proteins were not secreted by these cells, which had lost their identity with malignant transformation. The source turned out to be healthy, wild-type bystander cells (cells without the mutation) sensing stress from the precancerous process, the formation of the KAC state. Particulate matter activated macrophages to release interleukin-1β, which drove production of the 14 proteins. The combination of interleukin-1β from PM exposure and the EGFR mutation took this 14-protein production to the next level.
This was confirmed in human lung organoids with exposure to interleukin-1β, incriminating this factor as key to pushing dormant mutant cells into cancer and driving the 14-protein release.

With proof of the pivotal role of interleukin-1β, it was time to go back to the CANTOS trial participants and see how the antibody to it correlated with the development of lung cancer. Among 2,325 participants in that trial, there was a more than doubling of risk of lung cancer in those who had the 14-protein signature (high-risk, top graph below). And the high-risk group were the people who derived marked benefit from the interleukin-1β antibody (canakinumab, graph below, bottom), with an approximate halving of risk of lung cancer. The number needed to treat to prevent one case of lung cancer, partitioned by the 14 proteins, went from 1,500 in the low-risk group to 50 in the high-risk group.

Here is a diagram to pull all of the above together (made with help from Gemini).

We don’t have any protein biomarkers that are validated for primary prevention of cancer, although there are some used as markers to help in early detection, like CA-125 (ovarian cancer), CA19-9 (pancreatic cancer) and CEA (carcinoembryonic antigen, gastrointestinal cancer). The new study takes us to an unprecedented position: identifying and thoroughly validating a 14-protein signature for lung cancer. This work took advantage of new methods that have only become available in recent years. Chief among them was the use of high-throughput proteomics to assay ~3,000 plasma proteins and the use of machine learning to nail down the 14 significantly linked to lung cancer. That is akin to many recent studies that started with 6,000 to 11,000 proteins in the blood and used AI to partition these into organ clocks and cell clocks, tracking their pace of aging. In each case, this comes down to pinpointing the small number of proteins in the blood that correlate with a clinical metric or outcome.
In the current study, showing that these same 14 proteins found in the UK Biobank were predictive of lung cancer in 8 additional cohorts and in non-smokers strongly reinforced this as a seminal finding. But this body of work went on to dissect the mechanism of the 14 proteins and what accounted for their production, through 4 different types of lung epithelial cells, laborious mouse model work with knock-in EGFR mutations, and an extension into the known environmental hit of particulate matter from air pollution, which is strongly associated with lung cancer. We would have assumed the 14 proteins were coming from the cancer cells, but they weren’t. They were, challenging dogma, an indirect siren emitted from healthy cells sensing the stress in cells transitioning to a precancerous state. Added to that was the confirmation of a double hit (particulate matter and an EGFR mutation) that led to the highest production of the 14 proteins.

Imagine if we could identify a group of proteins that were predictive of each cancer 5+ years in advance and understand precisely the basis of the protein cluster! That’s where this is ultimately headed. And we’re just talking about the biomarker here.

The other dimension of this landmark study stemmed from serendipity. CANTOS was a trial directed at reducing heart disease, a secondary prevention for people who already had a heart attack. But the researchers backed into the finding of reduced lung cancer and fatal lung cancer with the innate immunity intervention of blocking interleukin-1β. While interesting, there was no practical way to take that to the clinic with a number needed to treat (NNT) exceeding 1,000 (to prevent 1 lung cancer) and the hazard of serious infections with the antibody. But CANTOS took the 14-protein biomarker story to a new level, by testing whether the antibody would work in the high-risk group. And it did, lowering the NNT to 50 (figure below, made with Claude) in the 20% of participants who had the biomarkers.
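To make the NNT arithmetic concrete, here is a minimal Python sketch. It uses the standard definition (NNT is the reciprocal of the absolute risk reduction, i.e. untreated risk minus treated risk). The baseline risks below are hypothetical round numbers chosen for illustration, not figures reported by the study, and the function names are my own.

```python
def treated_risk(baseline_risk: float, relative_risk_reduction: float) -> float:
    """Risk on treatment, given the untreated risk and a relative risk reduction."""
    return baseline_risk * (1.0 - relative_risk_reduction)


def number_needed_to_treat(baseline_risk: float, relative_risk_reduction: float) -> float:
    """NNT = 1 / ARR, where ARR = untreated risk - treated risk."""
    arr = baseline_risk - treated_risk(baseline_risk, relative_risk_reduction)
    if arr <= 0:
        raise ValueError("no absolute risk reduction, so NNT is undefined")
    return 1.0 / arr


if __name__ == "__main__":
    # Hypothetical baseline risks over follow-up (illustrative, NOT study data).
    scenarios = {
        "high-risk group (hypothetical 4% baseline)": 0.04,
        "lower-risk group (hypothetical 0.2% baseline)": 0.002,
    }
    for label, risk in scenarios.items():
        # Assume the same halving of risk in both groups, purely for comparison.
        nnt = number_needed_to_treat(risk, relative_risk_reduction=0.5)
        print(f"{label}: NNT is about {round(nnt)}")
```

With a halving of risk, a hypothetical 4% baseline works out to an NNT of about 50, while a hypothetical 0.2% baseline works out to about 1,000. The relative benefit is identical, but the number of people treated per cancer prevented is very different, which is why a biomarker that concentrates risk matters so much.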
An NNT of 50 is a remarkably low number of people to treat to prevent one deadly cancer. Now this, of course, is retrospective. We don’t yet have a clinical trial to prove that lung cancer can be reduced by 50% with the 14-protein biomarkers and the antibody treatment. That is the next step along the way. If it is proven prospectively, that would cinch a “prevmed”, what I call a prevention medicine, here for preventing cancer.

But if it weren’t for CANTOS, the hunt for the biomarker would likely not have proceeded, let alone the identification of a highly promising interception, a primary prevention for the most common form of cancer diagnosed globally. And if we didn’t have the UK Biobank resource with almost 50,000 participants who had high-throughput proteomics and >17-year follow-up for outcomes, the 14-protein biomarkers would not have been found. Just think that now all 550,000 UK Biobank participants have had O-link plasma proteomics, greatly enriching this resource for finding biomarkers and candidate prevmeds for many other types of cancer. Both the large CANTOS trial of >10,000 participants and the UK Biobank played an instrumental role in enabling the new study, as did the many other cohorts with proteomics and outcomes used for independent replication. All of this highlights how critical such clinical resources are for making pivotal discoveries that can impact medical practice and the lives of so many patients whose cancer could be prevented.

The new proteomic-based work that elevated the potential for preventing cancer is complementary to the vaccine approach that revs up the immune system in a person at high risk. These 2 primary prevention strategies could be combined, and could have an even greater effect than either alone in high-risk people. What we have in this new study and context is the beginning of a roadmap, a template, to achieve primary prevention of a common and all too frequently lethal cancer.
Once we can accurately predict cancer 5+ years ahead, the era of primary prevention will go into high gear.
Source: Ground Truths | Eric Topol