Original Article| Volume 15, ISSUE 6, P948-961, June 2020

Identification of Novel CD74-NRG2α Fusion From Comprehensive Profiling of Lung Adenocarcinoma in Japanese Never or Light Smokers

Open Access. Published: February 06, 2020. DOI: https://doi.org/10.1016/j.jtho.2020.01.021

      Abstract

      Introduction

      Studies are yet to characterize the differences in molecular profiles of lung adenocarcinoma (LUAD) among divergent ethnic groups. Herein, we conducted comprehensive molecular profiling of LUAD in never or light smokers from Asia to discover novel targetable mutations and prognostic biomarkers of this distinct disease entity.

      Methods

      We analyzed 996 cases of Japanese LUAD and performed whole-exome sequencing and RNA-seq in 125 cases of Japanese LUAD negative for the driver oncogenes defined by conventional laboratory testing. We also investigated the clinical and pathologic characteristics among the 996 cases.

      Results

      Driver oncogenes were identified in 88 cases (70.4%) with specific hotspot mutations differing from those in The Cancer Genome Atlas study. Two actionable novel fusions of FGFR2 and NRG2α were also identified. Clustering on the basis of mRNA expression profiles, but not mutational profiles, could predict patient prognosis. The risk score generated by the expression of a three-gene set was a strong prognostic marker for overall survival and progression-free survival in our cohort, and was further validated using The Cancer Genome Atlas cohort. Among the 996 cases, each driver alteration was distributed across all histologic subtypes. Driver mutations were identified in adenocarcinoma in situ, suggesting that these alterations are early events in the pathogenesis of LUAD. ERBB2 mutations were over-represented in young adults.

      Conclusions

      This study indicates the value of applying gene expression profiling for predicting the prognosis after a surgical operation, and that the identification of actionable mutations is important for optimizing targeted drugs in Japanese LUAD.

      Introduction

      Globally, it has been reported that lung cancer is the most common cause of cancer-related mortality, being associated with over a million deaths annually; the most common histologic type of lung cancer is lung adenocarcinoma (LUAD).
      • Siegel R.L.
      • Miller K.D.
      • Jemal A.
      Cancer statistics, 2017.
      Dramatic responses in subsets of patients with LUAD harboring activating genomic alterations in the corresponding kinase genes, including EGFR,
      • Lynch T.J.
      • Bell D.W.
      • Sordella R.
      • et al.
      Activating mutations in the epidermal growth factor receptor underlying responsiveness of non-small-cell lung cancer to gefitinib.
      • Paez J.G.
      • Janne P.A.
      • Lee J.C.
      • et al.
      EGFR mutations in lung cancer: correlation with clinical response to gefitinib therapy.
      • Rosell R.
      • Carcereny E.
      • Gervais R.
      • et al.
      Erlotinib versus standard chemotherapy as first-line treatment for European patients with advanced EGFR mutation-positive non-small-cell lung cancer (EURTAC): a multicentre, open-label, randomised phase 3 trial.
      ALK,
      • Kwak E.L.
      • Bang Y.J.
      • Camidge D.R.
      • et al.
      Anaplastic lymphoma kinase inhibition in non-small-cell lung cancer.
      • Shaw A.T.
      • Kim D.W.
      • Nakagawa K.
      • et al.
      Crizotinib versus chemotherapy in advanced ALK-positive lung cancer.
      • Solomon B.J.
      • Mok T.
      • Kim D.W.
      • et al.
      First-line crizotinib versus chemotherapy in ALK-positive lung cancer.
      and ROS1,
      • Shaw A.T.
      • Ou S.H.
      • Bang Y.J.
      • et al.
      Crizotinib in ROS1-rearranged non-small-cell lung cancer.
      have been achieved by molecularly targeted therapies directed against receptor tyrosine kinases (RTKs). Studies are also under way on other targeted therapies directed against activating alterations in KRAS, ERBB2, BRAF, MET, RET, NTRK1, and NTRK2.
      • Paik P.K.
      • Drilon A.
      • Fan P.D.
      • et al.
      Response to MET inhibitors in patients with stage IV lung adenocarcinomas harboring MET mutations causing exon 14 skipping.
      • Drilon A.
      • Rekhtman N.
      • Arcila M.
      • et al.
      Cabozantinib in patients with advanced RET-rearranged non-small-cell lung cancer: an open-label, single-centre, phase 2, single-arm trial.
      • Drilon A.
      • Siena S.
      • Ou S.I.
      • et al.
      Safety and antitumor activity of the multitargeted pan-TRK, ROS1, and ALK inhibitor Entrectinib: combined results from two Phase I trials (ALKA-372-001 and STARTRK-1).
      • Planchard D.
      • Kim T.M.
      • Mazieres J.
      • et al.
      Dabrafenib in patients with BRAF(V600E)-positive advanced non-small-cell lung cancer: a single-arm, multicentre, open-label, phase 2 trial.
      • Kris M.G.
      • Camidge D.R.
      • Giaccone G.
      • et al.
      Targeting HER2 aberrations as actionable drivers in lung cancers: phase II trial of the pan-HER tyrosine kinase inhibitor dacomitinib in patients with HER2-mutant or amplified tumors.
      • Mazieres J.
      • Barlesi F.
      • Filleron T.
      • et al.
      Lung cancer patients with HER2 mutations treated with chemotherapy and HER2-targeted drugs: results from the European EUHER2 cohort.
      • Planchard D.
      • Smit E.F.
      • Groen H.J.M.
      • et al.
      Dabrafenib plus trametinib in patients with previously untreated BRAF(V600E)-mutant metastatic non-small-cell lung cancer: an open-label, phase 2 trial.
      • Frampton G.M.
      • Ali S.M.
      • Rosenzweig M.
      • et al.
      Activation of MET via diverse exon 14 splicing alterations occurs in multiple tumor types and confers clinical sensitivity to MET inhibitors.
      In recent studies, researchers have focused on achieving comprehensive characterization of the changes found in the genome, epigenome, transcriptome, and proteome of cancer specimens to discover new driver genes against which clinically appropriate measures could be implemented.
      • Campbell J.D.
      • Alexandrov A.
      • Kim J.
      • et al.
      Distinct patterns of somatic genome alterations in lung adenocarcinomas and squamous cell carcinomas.
      ,
      Cancer Genome Atlas Research Network
      Comprehensive molecular profiling of lung adenocarcinoma.
      However, to the best of our knowledge, no studies have yet comprehensively determined the difference in mutational distribution or transcriptional profile among different ethnic groups. So far, studies have only analyzed the impact of ethnicity on genomic alterations in a few driver oncogenes and tumor suppressors.
      As lung cancer in smokers is relatively common in Caucasians compared with Asians,
      • Mitsudomi T.
      Molecular epidemiology of lung cancer and geographic variations with special reference to EGFR mutations.
      in The Cancer Genome Atlas (TCGA) study, only limited data about smoking-unrelated lung cancer were obtained. Smoking is known to be the major cause of LUAD but, as smoking rates decrease, proportionally more cases are arising in never or light smokers.
      • Pelosof L.
      • Ahn C.
      • Gao A.
      • et al.
      Proportion of never-smoker non-small cell lung cancer patients at three diverse institutions.
      This has increasingly made it clear that lung cancer in never smokers represents a unique disease entity separate from smoking-related lung cancer,
      • Sun S.
      • Schiller J.H.
      • Gazdar A.F.
      Lung cancer in never smokers–a different disease.
      highlighting the need to investigate and discover novel genetic factors influencing survival in this population. Studies reported to date have shown that never smokers have lower rates of mutation in the KRAS and TP53 genes than smokers
      • Vahakangas K.H.
      • Bennett W.P.
      • Castren K.
      • et al.
      p53 and K-ras mutations in lung cancers from former and never-smoking women.
      ,
      • Slebos R.J.
      • Hruban R.H.
      • Dalesio O.
      • Mooi W.J.
      • Offerhaus G.J.
      • Rodenhuis S.
      Relationship between K-ras oncogene activation and smoking in adenocarcinoma of the human lung.
      whereas never smokers show a greater tendency to develop EGFR mutations.
      • Pao W.
      • Miller V.
      • Zakowski M.
      • et al.
      EGF receptor gene mutations are common in lung cancers from “never smokers” and are associated with sensitivity of tumors to gefitinib and erlotinib.
      With regard to transcriptomic analysis in this field, considerable research on gene expression-based prognostic signatures using microarrays
      • Kratz J.R.
      • He J.
      • Van Den Eeden S.K.
      • et al.
      A practical molecular assay to predict survival in resected non-squamous, non-small-cell lung cancer: development and international validation studies.
      • Wistuba I.I.
      • Behrens C.
      • Lombardi F.
      • et al.
      Validation of a proliferation-based expression signature as prognostic marker in early stage lung adenocarcinoma.
      • Beer D.G.
      • Kardia S.L.
      • Huang C.C.
      • et al.
      Gene-expression profiles predict survival of patients with lung adenocarcinoma.
      • Shedden K.
      • Taylor J.M.
      • et al.
      Director's Challenge Consortium for the Molecular Classification of Lung Adenocarcinoma
      Gene expression-based survival prediction in lung adenocarcinoma: a multi-site, blinded validation study.
      • Chen H.Y.
      • Yu S.L.
      • Chen C.H.
      • et al.
      A five-gene signature and clinical outcome in non-small-cell lung cancer.
      has led to the commercialization of two biomarkers in LUAD (myPlan Lung Cancer and Pervenio Lung Risk Score), but their accuracy for survival estimation remains limited. Although there are certain biases and limitations associated with microarray data, these can be ameliorated through RNA-seq, particularly in the detection of transcripts present at low levels.
      • Febbo P.G.
      • Kantoff P.W.
      Noise and bias in microarray analysis of tumor specimens.
      ,
      • Robinson D.G.
      • Wang J.Y.
      • Storey J.D.
      A nested parallel experiment demonstrates differences in intensity-dependence between RNA-seq and microarrays.
      Although a recent study featured RNA-seq prognostic analysis in LUAD using TCGA data,
      • Shukla S.
      • Evans J.R.
      • Malik R.
      • et al.
      Development of a RNA-seq based prognostic signature in lung adenocarcinoma.
      further large-scale prospective validation is needed.
      In this study, we conducted comprehensive molecular profiling of LUAD in never or light smokers from Asia to discover novel targetable mutations and prognostic biomarkers of this distinct disease entity.

      Materials and Methods

      Study Design and Patient Specimens

      The study cohort consisted of 996 primary LUADs from 920 patients who underwent surgical resection at Juntendo University between 2010 and 2014. Two board-certified pathologists (TH and TS) reviewed the histologic features of the LUADs according to the criteria of the current WHO classification of lung carcinomas.

      Travis WD, Brambilla E, Nicholson AG, et al. The 2015 World Health Organization classification of lung tumors: impact of genetic, clinical and radiologic advances since the 2004 classification. J Thorac Oncol. 10:1243–1260.

      Fresh frozen tumor samples and formalin-fixed and paraffin-embedded tissue blocks were obtained from all patients. Approval for this study was obtained from the Ethics Committee of The University of Tokyo (No. G3546) and Juntendo University School of Medicine (No. 2014176). Written informed consent was obtained from all patients involved in the present study.

      Analyses of Mitogenic Driver Alterations

      First, analyses of driver oncogene alterations in a total of 996 primary LUADs were performed in accordance with previously reported methods. Briefly, EGFR mutations were analyzed using the peptide nucleic acid-locked nucleic acid polymerase chain reaction (PCR) clamp method; KRAS mutations using the peptide nucleic acid-mediated PCR clamping method; ALK fusions using break-apart fluorescence in situ hybridization (FISH) and the intercalated antibody-enhanced polymer method; and ROS1 and RET fusions using break-apart FISH.
      • Takamochi K.
      • Takahashi F.
      • Suehara Y.
      • et al.
      DNA mismatch repair deficiency in surgically resected lung adenocarcinoma: microsatellite instability analysis using the Promega panel.
      Samples without mitogenic driver alterations were subsequently analyzed by whole-exome sequencing (WES) and whole-transcriptome sequencing.

      WES Including Mutation Call, Copy Number Analysis, and Signature Analysis

      Genomic DNA was isolated from fresh frozen samples using QIAamp DNA Mini Kit (Qiagen, Hilden, Germany), and 500 ng of each sample was subjected to target fragment enrichment using an Agilent Exome Kit (v6) (Agilent Technologies, Santa Clara, CA). Massively parallel sequencing of isolated fragments was performed with a HiSeq2500 (Illumina) using the paired-end option. Paired-end WES reads were independently aligned to the human reference genome (hg38) using BWA,
      • Li H.
      • Durbin R.
      Fast and accurate short read alignment with Burrows-Wheeler transform.
      Bowtie2 (http://bowtie-bio.sourceforge.net/bowtie2/index.shtml), and NovoAlign (http://www.novocraft.com/products/novoalign/). Somatic mutations were called using MuTect (http://www.broadinstitute.org/cancer/cga/mutect), SomaticIndelDetector (http://www.broadinstitute.org/cancer/cga/node/87), and VarScan (http://varscan.sourceforge.net). Mutations were discarded if any of the following applied: (1) the read depth was less than 20 or the variant allele frequency (VAF) was less than 0.1, (2) they were supported by only one strand of the genome, or (3) they were present in normal human genomes in either the 1000 Genomes Project dataset (http://www.internationalgenome.org/) or our in-house database. Gene mutations were annotated by SnpEff (http://snpeff.sourceforge.net). Copy number status was analyzed using our in-house pipeline, which determines the logR ratio (LRR) as follows: (1) we selected single nucleotide polymorphism positions in the 1000 Genomes Project database that were in a homozygous state (VAF ≤ 0.05 or ≥ 0.95) or a heterozygous state (VAF = 0.4–0.6) in the genomes of respective normal samples, (2) normal and tumor read depths at the selected position were adjusted on the basis of G+C percentage of a 100-base pair window flanking the position,
      • Yoon S.
      • Xuan Z.
      • Makarov V.
      • Yem K.
      • Sebat J.
      Sensitive and accurate detection of copy number variants using read depth of coverage.
      (3) we calculated the LRR as $\log_2(t_i/n_i)$, where $n_i$ and $t_i$ are the adjusted normal and tumor depths at position $i$, and (4) each representative LRR was determined as the median of a moving window (1 megabase) centered at position $i$. The LRR values for the copy number of both alleles, that of the major allele, and that of the minor allele were determined for every region of the genome. The p values for gain or loss of respective genomic regions were determined from the LRRs with a permutation test (100,000 iterations) following the algorithm used in Genomic Identification of Significant Targets in Cancer
      • Beroukhim R.
      • Getz G.
      • Nghiemphu L.
      • et al.
      Assessing the significance of chromosomal aberrations in cancer: methodology and application to glioma.
      ,
      • Mermel C.H.
      • Schumacher S.E.
      • Hill B.
      • Meyerson M.L.
      • Beroukhim R.
      • Getz G.
      GISTIC2.0 facilitates sensitive and confident localization of the targets of focal somatic copy-number alteration in human cancers.
      The q values were calculated from the p values using the R package qvalue (http://github.com/jdstorey/qvalue). Mutational signatures were analyzed using the Wellcome Trust Sanger Institute Mutational Signature Framework (http://jp.mathworks.com/matlabcentral/fileexchange/38724-wtsi-mutationalsignature-framework). The optimal number of signatures was determined in accordance with the signature stabilities and average Frobenius reconstruction errors.
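      The discard rules and the LRR calculation described above can be illustrated with a minimal Python sketch. It is not the in-house pipeline: the function names, the per-strand read counts used to test strand support, and the assumption that depths are already G+C-adjusted and nonzero are all introduced here for illustration only.

```python
import math
from statistics import median


def passes_filters(depth, vaf, fwd_alt_reads, rev_alt_reads, in_normal_db):
    """Return True if a candidate mutation survives the three discard rules.

    fwd_alt_reads / rev_alt_reads are hypothetical per-strand counts of
    variant-supporting reads; in_normal_db flags presence in a normal-genome
    database (1000 Genomes or an in-house set).
    """
    if depth < 20 or vaf < 0.1:                    # rule 1: low depth or low VAF
        return False
    if fwd_alt_reads == 0 or rev_alt_reads == 0:   # rule 2: one strand only
        return False
    if in_normal_db:                               # rule 3: seen in normal genomes
        return False
    return True


def log_r_ratios(positions, tumor_depth, normal_depth, window=1_000_000):
    """Compute LRR_i = log2(t_i / n_i) and a moving-median smoothed LRR.

    Depths are assumed to be G+C-adjusted already. The smoothed value at
    position i is the median of raw LRRs within window/2 bp on either side.
    """
    raw = [math.log2(t / n) for t, n in zip(tumor_depth, normal_depth)]
    half = window / 2
    smoothed = []
    for p in positions:
        in_window = [r for q, r in zip(positions, raw) if abs(q - p) <= half]
        smoothed.append(median(in_window))
    return raw, smoothed
```

      The quadratic window scan keeps the sketch short; a production pipeline would more likely use a sliding window over sorted positions per chromosome.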

      Transcriptome Sequencing, Expression Analysis, and Detection of Fusion Genes and Exon Skipping

      Total RNA was extracted from fresh frozen samples using RNA-Bee (Tel-Test Inc., Gainesville, FL), followed by treatment with DNase I (Thermo Fisher Scientific, Waltham, MA) and then poly(A)-RNA selection before cDNA synthesis. The library used for RNA-seq was prepared with a NEBNext Ultra Directional RNA Library Prep Kit (NEB, Ipswich, MA), in accordance with the manufacturer’s protocol. Sequencing was conducted from both ends of each cluster using a HiSeq 2500 or NextSeq platform (Illumina, San Diego, CA). RNA-seq reads were aligned to hg19 using TopHat (v2.0.9; https://ccb.jhu.edu/software/tophat/index.shtml). The expression level of each gene was calculated using Cufflinks (v2.1.1; http://cole-trapnell-lab.github.io/cufflinks), and gene fusions were detected using the deFuse pipeline (https://bitbucket.org/dranew/defuse). Exon skipping was analyzed using an in-house pipeline as follows: (1) RNA-seq reads were aligned to hg38 and the National Center for Biotechnology Information reference sequence (RefSeq) using Burrows–Wheeler Aligner and Bowtie2, (2) skipped exons were detected from the mapped RefSeq data, (3) virtual transcriptome sequences were created dynamically, (4) RNA-seq reads were aligned to the candidate transcriptome sequences, and (5) exon skipping candidates were identified on the basis of reads with a breakpoint.
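      The final step of the exon-skipping analysis, calling candidates from reads that span a breakpoint, can be sketched as follows. This is a simplified illustration and not the in-house pipeline: the input format (one tuple of gene, donor exon, and acceptor exon per split read) and the `min_reads` threshold are hypothetical.

```python
from collections import Counter


def find_exon_skipping(junction_reads, min_reads=2):
    """Return junctions that jump over at least one exon.

    junction_reads: iterable of (gene, donor_exon, acceptor_exon) tuples,
    one per split read, with exons numbered along the transcript.
    min_reads: hypothetical minimum number of supporting reads.
    """
    counts = Counter(junction_reads)
    return {
        key: n
        for key, n in counts.items()
        if key[2] - key[1] > 1 and n >= min_reads
    }


# Toy example: three reads join MET exon 13 directly to exon 15.
reads = (
    [("MET", 13, 15)] * 3
    + [("MET", 13, 14)] * 5
    + [("MET", 14, 15)] * 4
    + [("EGFR", 2, 4)]
)
print(find_exon_skipping(reads))
```

      In this toy input only the MET exon 13 to exon 15 junction is reported; the single EGFR read falls below the assumed support threshold.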

      Signature Generation and Statistical Analysis

      Survival data for the cohort were collected at Juntendo University. TCGA clinical data were downloaded from the TCGA data portal and manually curated. The duration of overall survival (OS) was defined as the time between the date of surgical intervention and the date of either death or the last follow-up. The duration of recurrence-free survival (RFS) was defined as the time between the date of surgical intervention and the date of recurrence, death from any cause, or the last follow-up. Univariate Cox regression analysis was used to evaluate the correlation between the expression level of each gene and OS in our cohorts. Only genes with q values less than 0.003 and SD greater than 1 were considered as candidate genes for the correlation analysis, and those genes were used to construct the predictive model. The candidate genes were then fitted in stepwise multivariate Cox regression analysis to assess the relative contribution of each gene to survival prediction in our cohort. The genes that correlated with survival were included in the prognostic signature. According to the estimated regression coefficients in multivariate Cox regression analysis, a prognostic risk score for predicting OS was then calculated as follows:
      $$\text{Risk score} = \sum_{i=1}^{n} \mathrm{exp}_i \, \beta_i$$


      where $n$ is the number of prognostic genes, $\mathrm{exp}_i$ is the expression level of prognostic gene $i$, and $\beta_i$ is the regression coefficient of gene $i$.
      All statistical analyses were performed with R (version 3.5.1; https://www.r-project.org/) and its packages. Survival analysis and Cox regression analyses were performed using the “survival” (v2.44.1.1) package. OS and RFS were analyzed using the Kaplan-Meier method, and curve differences were evaluated using the log-rank test according to either the risk score or the driver mutation subtypes. The gene set enrichment analysis (GSEA) was performed using Java GSEA software (http://software.broadinstitute.org/gsea/index.jsp) (v2.2.4).
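      As a worked illustration of the risk score defined above, the following Python sketch sums expression level times Cox coefficient over the signature genes. The gene names and coefficient values are placeholders, not the signature reported in this study, and the median split into high- and low-risk groups is an assumption made here for illustration; the text does not state the cutoff used.

```python
from statistics import median


def risk_score(expression, coefficients):
    """Sum of expression level x regression coefficient over signature genes."""
    return sum(expression[gene] * beta for gene, beta in coefficients.items())


def split_by_median(scores):
    """Label each sample 'high' or 'low' risk relative to the cohort median
    (an assumed cutoff, used here only for illustration)."""
    cutoff = median(scores.values())
    return {s: ("high" if v > cutoff else "low") for s, v in scores.items()}


# Placeholder coefficients and expression values (hypothetical).
coefficients = {"GENE_A": 0.5, "GENE_B": -0.25, "GENE_C": 1.0}
sample = {"GENE_A": 2.0, "GENE_B": 4.0, "GENE_C": 1.0}
print(risk_score(sample, coefficients))  # 2.0*0.5 + 4.0*(-0.25) + 1.0*1.0 = 1.0
```

      A positive coefficient raises the score as expression increases, and a negative coefficient lowers it, so the sign of each coefficient indicates whether the gene is associated with worse or better survival in the fitted model.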

      Cell Lines

      Human embryonic kidney (HEK) 293T cells and mouse 3T3 fibroblasts were obtained from the American Type Culture Collection and maintained in Dulbecco’s modified Eagle’s medium-F12 (DMEM-F12) supplemented with 10% fetal bovine serum (both from Thermo Fisher Scientific). Ba/F3 cells were cultured in RPMI 1640 medium (Thermo Fisher Scientific) supplemented with 10% fetal bovine serum and mouse IL-3 (20 U/mL; Sigma-Aldrich).

      Preparation of Retrovirus and Transduction of Cell Lines

      The recombinant plasmids were introduced together with packaging plasmids (Takara Bio) into human embryonic kidney 293T cells to obtain recombinant retroviral particles. For the focus formation assay, 3T3 cells were infected with ecotropic recombinant retroviruses using 4 μg/mL polybrene (Sigma-Aldrich) for 24 hours. They were then subjected to further culture for up to 2 weeks in Dulbecco's Modified Eagle Medium-F12 supplemented with 5% calf serum. Cell transformation was assessed through either phase-contrast microscopy or staining with Giemsa solution.

      Alamar Blue Cell Viability Assay

      Cells were incubated in 96-well plates (100 μL of culture medium per well), 10 μL of Alamar Blue (Thermo Fisher Scientific) was added, and fluorescence was measured using a microplate reader (2030 ARVO X3; PerkinElmer, Waltham, MA) (excitation 530 nm, emission 590 nm) at the indicated times. Wells without cells were assayed as negative controls. The fluorescence gain for every well was adjusted against the well with the maximum fluorescence intensity.

      Xenograft Tumor Assays

      All animal studies were conducted in accordance with the protocols approved by the Animal Ethics Committee of the National Cancer Research Center, Tokyo, Japan. Before injection, 3T3 cells (1.0 × 10⁶) were mixed in PBS with Matrigel (BD Biosciences, Franklin Lakes, NJ) at a 1:1 ratio. The cell suspension was injected subcutaneously (200 μL/mouse) into 6-week-old female BALB/c nude mice (CLEA Japan, Tokyo, Japan). Once the tumors had reached a size of approximately 100 to 150 mm³, the mice were treated twice a week with an intraperitoneal injection of either trametinib (10 mg/kg body weight) or vehicle control. The average tumor volume in each group is expressed in cubic millimeters and was calculated using the following formula: π/6 × (largest diameter) × (smallest diameter)². Tumor injections and volume measurements were performed in a manner blinded to the constructs expressed by the cells used for injection. The mice were killed after 6 weeks of treatment and their solid tumors were resected.
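      The tumor volume calculation can be written as a short Python helper. This is a sketch that assumes the common ellipsoid approximation in which the smallest diameter is squared; the function name and the example diameters are illustrative only.

```python
import math


def tumor_volume_mm3(largest_mm, smallest_mm):
    """Ellipsoid approximation: pi/6 x largest diameter x smallest diameter^2."""
    return math.pi / 6 * largest_mm * smallest_mm ** 2


# Illustrative diameters: an 8 mm x 6 mm tumor gives 48*pi, about 150.8 mm^3.
print(round(tumor_volume_mm3(8, 6), 1))
```

      Under this approximation, a tumor of roughly 8 mm by 6 mm sits at the upper end of the 100 to 150 mm³ range at which treatment was started.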

      Sanger Sequencing

      For capillary sequencing with a 3130xl Genetic Analyzer (Thermo Fisher Scientific), CD74-NRG2α and FGFR2-MBIP were amplified from 10 ng of template cDNA using GoTaq G2 Hot Start Master Mix Green (Promega, Madison, WI), in accordance with the manufacturer’s instructions, with the following primers: 5′-CACCTTAAGAACACCATGGAGACC-3′ and 5′-ATTTGATGCGAATGTCTCGGCTGC-3′ for CD74-NRG2α, and 5′-ACATGATGATGAGGGACTGTTGGC-3′ and 5′-GCTTTTCTTCCTCTTGTAGGTCGC-3′ for FGFR2-MBIP.

      TaqMan Real-Time PCR Assay

      Quantitative real-time PCR was performed using TaqMan assays (20× Primer Probe mix; Thermo Fisher Scientific) corresponding to MAP2K1 (Assay ID AHI16ER for p.E102_I103del, AHKA4KZ for P105_A106del, AHLJ2Q7 for K57N) and GAPDH (Assay ID Hs02758991_g1). All PCR reactions were performed with TaqMan Genotyping Master Mix (Thermo Fisher Scientific) on an Applied Biosystems 7900HT Fast Real-Time PCR System, in accordance with the standard protocols. Cycle threshold values were calculated using the built-in data collection software, and samples with a cycle threshold less than or equal to 37 were considered positive. All assays were performed in triplicate.

      Immunohistochemical Analysis

      Formalin-fixed paraffin-embedded tissues were sectioned and stained with hematoxylin and eosin. Immunohistochemistry was performed on the sections using anti–phospho-EGFR (Tyr1068) antibody (Cell Signaling Technology, Danvers, MA), anti–phospho-HER2 (Tyr1221/1222) antibody (Cell Signaling Technology), anti–phospho-HER3 (Tyr1289) antibody (Cell Signaling Technology), and anti–phospho-HER4 (Tyr1162) antibody (Abcam, Cambridge, United Kingdom) following the manufacturer’s recommendations.

      Data Availability

      We have deposited the raw sequencing data in the Japanese Genotype-Phenotype Archive (http://trace.ddbj.nig.ac.jp/jga), which is hosted by the DNA Data Bank of Japan, under accession number JGAS00000000215.

      Results

      Whole-Transcriptome Sequencing and WES on LUAD of Never or Light Smokers With Unknown Driver Oncogenes

      The study cohort was composed of 996 primary LUADs from 920 patients who underwent surgical resection at Juntendo University between 2010 and 2014. Of the cohort, 373 patients were heavy smokers (smoking index ≥ 401, where smoking index = cigarettes smoked per day × years of cigarette use), 104 patients were light smokers (101 ≤ smoking index ≤ 400), 510 patients were never smokers (smoking index ≤ 100), and the smoking history of the other nine patients was unknown. The LUAD subtypes included 280 lepidic adenocarcinomas, 268 acinar adenocarcinomas, 110 papillary adenocarcinomas, 103 adenocarcinomas in situ (AIS), 94 minimally invasive adenocarcinomas, 91 solid adenocarcinomas, 42 invasive mucinous adenocarcinomas, five micropapillary adenocarcinomas, two enteric adenocarcinomas, and one fetal adenocarcinoma.
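      The smoking categories used for the cohort can be expressed as a small Python helper. The thresholds come from the definitions above; the function names and the handling of missing history as "unknown" are illustrative assumptions.

```python
def smoking_index(cigarettes_per_day, years):
    """Smoking index = cigarettes smoked per day x years of cigarette use."""
    return cigarettes_per_day * years


def smoking_category(index):
    """Map a smoking index to the cohort's categories.

    never: index <= 100; light: 101 to 400; heavy: >= 401.
    A missing index (None) is reported as 'unknown'.
    """
    if index is None:
        return "unknown"
    if index <= 100:
        return "never"
    if index <= 400:
        return "light"
    return "heavy"


# 20 cigarettes per day for 30 years gives an index of 600.
print(smoking_category(smoking_index(20, 30)))
```

      Note that under these definitions a patient with a smoking index of up to 100 is still grouped with never smokers.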
      Among 987 LUADs with known smoking history, EGFR mutations, KRAS mutations, ALK fusions, RET fusions, or ROS1 fusions were identified in 435, 121, 22, 10, and 10 patients, respectively, by conventional methods. KRAS G12C, a variant for which several covalent inhibitors have been recently developed, was found in 47 cases (4.8% of the total), and it was more common in heavy smokers (9.7%) compared with never or light smokers (1.8%) (Supplementary Fig. 1). ALK, RET, and ROS1 fusions were determined using either FISH or immunohistochemistry, and fusion partners were confirmed by either Sanger sequencing or next-generation sequencing (NGS) (Supplementary Table 1). A total of 389 patients were negative for all of these driver mutations, among whom 201 had never or only lightly smoked (Fig. 1A).
      Figure 1. Summary of mutations in lung adenocarcinoma (LUAD) of Japanese who have never or only lightly smoked. (A) Summary of driver oncogenic mutations in LUAD. This shows the number of cases identified as being positive for driver mutations before this study; (B) here, the 13 frequently mutated genes with color coding of their alteration status for each tumor are indicated. The sex and smoking status are shown at the top; (C) schematic diagram depicting the NRG1/2 fusions. The CD74 gene (NM_001025159 at 5q32) is disrupted downstream of exon 6 and is subsequently ligated to a position upstream of either exon 2 of NRG2α (NM_004883 at 5q31) or exon 6 of NRG1 (NM_013956 at 8q12). NRG1/2 fusions identified by RNA-seq are shown with their functional domains. The EGF-like domain is maintained in all fusions identified; (D) transcript variants in NRG1 fusions. The exon junction reads of NRG1 variants supporting specific exons for NRG1α (TMc_α) or NRG1β (TMc_β and En_β) were counted. The ratio of junction reads which corresponds to the ratio of NRG1β/NRG1α was calculated by the formula shown. Exon α and β represent the specific exon of NRG1α and NRG1β, respectively. The exon En represents the downstream exons of NRG1β. The exon TMc represents the downstream exons of exon α and exon En; (E) representative photographs of NRG1/2-positive LUAD. The histology of LUAD with CD74-NRG2α (left), CD74-NRG1 (middle), and SDC4-NRG1 (right) is shown with hematoxylin-eosin staining (top panels). Immunohistochemical stainings for p-EGFR and p-HER2/3/4 are shown in the lower panels. ND, not determined. TM, transmembrane domain; EGF, epidermal growth factor-like domain.
      We performed whole-transcriptome sequencing on 126 NSCLC samples without an identified driver oncogene (124 LUADs from never or light smokers, with one heavy-smoker LUAD and one heavy-smoker squamous cell carcinoma as controls), 83 of which also underwent WES. We identified 26 cases with uncommon EGFR mutations, 16 cases with ERBB2 mutations, 17 cases with BRAF mutations, eight cases with MAP2K1 mutations, and three cases with NRG1 fusions (Fig. 1B). A total of 13 cases of MET exon 14 skipping were identified by RNA-seq, with split reads supporting the ligation of MET exon 13 to exon 15. Overexpression of EGFR, ERBB2, MET, and MAP2K1 was each identified in a single case. One case of CD74-NRG2 fusion and one case of FGFR2-MBIP fusion were identified, neither of which had been reported before (Fig. 1C and Supplementary Fig. 2). The RNA-seq data suggested that the transcript variant of NRG2 constituting CD74-NRG2 was NRG2α (NM_004883), whereas NRG1 fusions were composed of different transcript variants including NRG1α and NRG1β (Fig. 1D and Supplementary Fig. 3). The EGF motif was encoded by exon 4 and exon 5 of NRG2α and was included in the CD74-NRG2α fusion. In total, driver oncogenes were identified in 88 cases (70.4%) among 125 LUADs.

      Clinicopathologic Characteristics of NRG1/2-Fusion-Positive LUAD

      All four patients with NRG1/2-fusion-positive LUAD were women in their 60s or 70s, and they all underwent surgical resection (Table 1). The patient with CD74-NRG2α was treated with erlotinib after tumor recurrence. However, as the patient had a severe skin rash, the physician discontinued erlotinib after 1 month. The two patients with CD74-NRG1 are still alive, with no recurrence, whereas the patient with SDC4-NRG1 died of empyema after operation. The histologic diagnosis of CD74-NRG2α-positive LUAD was acinar adenocarcinoma consisting of round- to oval-shaped atypical glands, whereas CD74-NRG1-positive cases of LUAD were invasive mucinous adenocarcinomas consisting of columnar cells with abundant intracytoplasmic mucin and basally oriented nuclei. The SDC4-NRG1-positive tumor was a solid adenocarcinoma composed mainly of polygonal tumor cells forming sheets (Fig. 1E).
      Table 1. Clinicopathologic Characteristics of Patients With NRG1 or NRG2 Rearrangements
      Sample ID | Fusion | Age | Sex | Smoking index | Pathologic stage | Stage | Histology | Treatment modality | Outcome (weeks)
      LUAD_085 | CD74-NRG1 | 70 | Female | 100 | pT1aN0M0 | IA | Invasive mucinous adenocarcinoma | Surgical resection | Alive with no recurrence (130)
      LUAD_086 | CD74-NRG1 | 63 | Female | 0 | pT1aN0M0 | IA | Invasive mucinous adenocarcinoma | Surgical resection | Alive with no recurrence (110)
      LUAD_087 | SDC4-NRG1 | 71 | Female | 0 | pT3N0M0 | IIB | Solid adenocarcinoma | Surgical resection | Dead of empyema (22)
      LUAD_088 | CD74-NRG2 | 70 | Female | 0 | pT2aN2M0 | IIIA | Acinar adenocarcinoma | Surgical resection and erlotinib for recurrent tumor | Dead of disease (32)
      LUAD, lung adenocarcinoma.
      The oncoproteins of NRG1 fusions are known to bind the ERBB2/3 heterodimer and activate downstream signaling (Drilon A., Somwar R., Mangatt B.P., et al., Response to ERBB3-directed targeted therapy in NRG1-rearranged cancers), whereas NRG2α was reported to have moderate affinity for the ERBB2/4 heterodimer (Jones J.T., Akita R.W., Sliwkowski M.X., Binding specificities and affinities of egf domains for ErbB receptors). As phosphorylation of ERBB2/3/4 can be a surrogate marker for pathway activation, immunohistochemical analysis of p-EGFR and p-HER2/3/4 was performed. Tumor cells in the CD74-NRG2α-positive case were moderately positive for p-HER4 but negative for p-EGFR, p-HER2, and p-HER3, whereas tumor cells of the NRG1 fusion-positive cases were positive for phosphorylation of all HER family members (Fig. 1E).

      Mutational Signatures in LUAD of Never or Light Smokers

      Various carcinogenic and cancer-related processes contribute to the mutational patterns observed in tumor cells (Alexandrov L.B., Nik-Zainal S., Wedge D.C., et al., Signatures of mutational processes in human cancer). Previous large-scale studies of lung cancer genomes have identified signatures associated with patients who do or do not smoke (Cancer Genome Atlas Research Network, Comprehensive molecular profiling of lung adenocarcinoma; Imielinski M., Berger A.H., Hammerman P.S., et al., Mapping the hallmarks of lung adenocarcinoma with massively parallel sequencing; Govindan R., Ding L., Griffith M., et al., Genomic landscape of non-small cell lung cancer in smokers and never-smokers). Using the Wellcome Trust Sanger Institute Mutational Signature Framework, we identified four mutational signatures in this cohort, many of which are strongly correlated with previously defined signatures in the Catalogue of Somatic Mutations in Cancer database (COSMIC, https://cancer.sanger.ac.uk/cosmic) (Supplementary Fig. 4A and B). These include an apolipoprotein B mRNA-editing enzyme-catalytic polypeptide-like (APOBEC)-related signature of a C to G or C to T change at a TCT or TCA site (COSMIC signature 13, abbreviated SI13), a mismatch-repair signature of a C to T change at a GCG site (SI6), a smoking-related signature of a C to A transversion (SI4), and a signature with a moderate correlation to COSMIC signature 5 (SI5) with unknown cause (Supplementary Fig. 4C).
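Matching an extracted signature to a reference catalogue, as described above, can be sketched as follows. The sketch assumes cosine similarity as the comparison metric; the text states only that the signatures were correlated with COSMIC signatures, so the metric is an assumption. The six-channel profiles and reference names are toy values (real profiles use 96 trinucleotide channels).

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length mutation-count profiles."""
    if len(u) != len(v):
        raise ValueError("profiles must have the same number of channels")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def best_match(extracted, references):
    """Return (name, similarity) of the reference closest to `extracted`."""
    scores = {name: cosine_similarity(extracted, ref)
              for name, ref in references.items()}
    name = max(scores, key=scores.get)
    return name, scores[name]


# Toy six-channel profiles (C>A, C>G, C>T, T>A, T>C, T>G); values are made up
references = {
    "ref_C_to_A_rich": [0.70, 0.10, 0.05, 0.05, 0.05, 0.05],
    "ref_C_to_T_rich": [0.05, 0.05, 0.70, 0.10, 0.05, 0.05],
}
extracted = [0.10, 0.10, 0.60, 0.10, 0.05, 0.05]
name, score = best_match(extracted, references)
```

A similarity close to 1 would correspond to a strong match, whereas intermediate values would correspond to the moderate correlation reported for SI5.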

      Copy Number Analysis on LUAD of Never or Light Smokers

      Chromosomal copy number amplification was observed in chr1q, chr5p (encompassing the TERT locus), chr7p (EGFR), chr8q (MYC), chr12 (MDM2), chr16p, chr17q (ERBB2), and chr20q. Losses of copy number were observed in chr9 and chr17, including in CDKN2A, CDKN2B, and TP53 (Supplementary Fig. 4D). The copy number profile in our cohort is similar to that found in the TCGA study (Campbell J.D., Alexandrov A., Kim J., et al., Distinct patterns of somatic genome alterations in lung adenocarcinomas and squamous cell carcinomas; Cancer Genome Atlas Research Network, Comprehensive molecular profiling of lung adenocarcinoma).

      Somatic Mutations of Driver Oncogenes Detected by NGS Analyses

      The mutational profile of our cohort was compared with that of the TCGA cohort, which includes only a limited number of Asian patients. The mean number of somatic mutations per tumor was markedly smaller in our cohort than in the TCGA study for both never smokers (21 versus 157) and light smokers (96 versus 299) (Supplementary Fig. 5A). There were also differences between our study and the TCGA study in the frequency of common mutations, such as TP53 (51.3% versus 13.6%), KEAP1 (24.3% versus 3.7%), KMT2C (21.6% versus 3.7%), FAT3 (20.3% versus 1.2%), and STK11 (17.6% versus 1.2%) (Supplementary Fig. 5B). In contrast, the two studies were similar in the frequencies of growth-promoting driver mutations, such as those in EGFR, BRAF, ERBB2, and MAP2K1.
      The somatic mutation profiles of EGFR, BRAF, ERBB2, and MAP2K1 in our study are shown in Supplementary Figure 5C. Here, the BRAF mutation hotspot was K601E, and the MAP2K1 mutation hotspot was E102_I103del, whereas those of the TCGA study or the Genomics Evidence Neoplasia Information Exchange project were G469A/L/R/S/V or V600E in BRAF and K57N/T in MAP2K1 (Supplementary Fig. 6). Even in a cohort of 302 never smokers with LUAD from the Memorial Sloan Kettering Cancer Center, the BRAF and MAP2K1 mutation hotspots differed from those in our cohort (Jordan E.J., Kim H.R., Arcila M.E., et al., Prospective comprehensive molecular characterization of lung adenocarcinomas for efficient patient matching to approved and emerging therapies).
      We identified seven cases with MAP2K1 mutations, six of which were MAP2K1 p.Glu102_Ile103del and the other was p.P105_A106del (Supplementary Fig. 7A). Another 67 never or light-smoker samples and 181 heavy-smoker samples of this LUAD cohort without known driver oncogenes were tested for MAP2K1 mutations by TaqMan single nucleotide polymorphism Genotyping Assays. The findings revealed p.P105_A106del in two never or light smoker samples (Supplementary Fig. 7B). Notably, no MAP2K1 mutation was identified among 181 heavy smokers, suggesting that these MAP2K1 exon 3 deletions are specific for tumors in those who have never or only lightly smoked.
      To investigate the transforming potential of these MAP2K1 mutations, we performed focus formation assays in which each mutant was transduced into the mouse fibroblast cell line NIH/3T3 (3T3). Transformed foci were observed in cells expressing the E102_I103del, P105_A106del, or K57N mutant, but not in mock-transfected cells or in cells expressing wild-type MAP2K1 or kinase-dead MAP2K1(K97M) (Supplementary Fig. 8A). To determine the effects of the MAP2K1 mutations on MAPK signaling, we investigated the ability of the mutants to induce ERK phosphorylation in 293T cells. Western blot analyses showed increased kinase activity in the E102_I103del, P105_A106del, and K57N mutants (Supplementary Fig. 8A).
      MAP2K1 mutants were transduced into Ba/F3, a murine interleukin-3 (IL-3)–dependent pro–B-cell line, to assess the ability of the cells to grow independently of IL-3. Cells expressing the E102_I103del, P105_A106del, or K57N mutant could grow even without IL-3, whereas cells expressing wild-type MAP2K1 could not. Treatment with a MEK1/2 inhibitor suppressed the growth of Ba/F3 cells expressing the MAP2K1 mutants but not that of parental Ba/F3 cells supplemented with IL-3 (Supplementary Fig. 8B). Furthermore, the 3T3 cells expressing the E102_I103del mutant formed subcutaneous tumors in nude mice, and tumor growth was significantly (p < 0.01) inhibited in vivo by treatment with trametinib (Supplementary Fig. 8C).

      Whole-Transcriptome Analysis of LUAD

      To identify the gene expression profiles associated with clinicopathologic features or gene mutational profiles, we conducted k-means clustering analysis using RNA-seq data. We used the top 100 genes with the most variation among the samples to divide the cohort into two groups by this clustering approach (Fig. 2A).
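The clustering step described above (select the most variable genes, then partition samples into two groups) can be sketched in plain Python. The gene names, expression values, and the farthest-first initialization are illustrative assumptions; the text does not specify the normalization or the initialization used, and the study selected the top 100 genes rather than the top 2 used in this toy example.

```python
from statistics import pvariance


def top_variable_genes(expr, n_top):
    """expr maps gene -> list of expression values (one per sample)."""
    return sorted(expr, key=lambda g: pvariance(expr[g]), reverse=True)[:n_top]


def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def farthest_first(points, k):
    """Deterministic initialization: start at the first point, then add the
    point farthest from the centroids chosen so far."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids


def kmeans(points, k=2, n_iter=100):
    """Lloyd's algorithm; returns one cluster label per point."""
    centroids = farthest_first(points, k)
    labels = None
    for _ in range(n_iter):
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy matrix: 4 hypothetical genes x 6 samples (values are made up)
expr = {
    "GENE_A": [1.0, 1.2, 0.8, 9.0, 9.5, 8.7],
    "GENE_B": [5.0, 5.1, 4.9, 0.5, 0.4, 0.6],
    "GENE_C": [3.0, 3.0, 3.1, 3.0, 2.9, 3.0],
    "GENE_D": [7.0, 7.1, 7.0, 6.9, 7.0, 7.0],
}
selected = top_variable_genes(expr, n_top=2)
samples = [[expr[g][i] for g in selected] for i in range(6)]
labels = kmeans(samples, k=2)
```

Restricting the input to high-variance genes keeps the distance calculation from being dominated by genes that barely change across samples.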
      Figure 2. Gene expression profile of lung adenocarcinoma in Japanese who have never or only lightly smoked. (A) The k-means clustering analysis was conducted using RNA-seq data. The clinical information (sex, age, smoking index, pathologic stage) and driver mutation profile are shown in the upper part; (B) Fisher’s exact test was performed to identify the factors associated with either group stratified by k-means clustering. Indicated factors were compared between the left major cluster (cluster 1) and right major cluster (cluster 2) in the heat map of (A); (C) Kaplan-Meier curves of OS and progression-free survival in our cohort stratified by k-means clustering as clusters 1 and 2. Univariate Cox analysis was used to calculate hazard ratio (HR). HR, 95% confidence interval, and p value are shown.
      We merged the clinical information and mutational profile with the gene expression data and performed Fisher’s exact test to determine the factors related to either group: the left cluster (cluster 1) and the right cluster (cluster 2). The results showed that the proportion of higher-stage (stage ≥IIA) patients in cluster 2 was significantly greater than that in cluster 1 (p = 0.003, Fig. 2B).
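For reference, a two-sided Fisher's exact test on a 2×2 table (for example, cluster membership versus stage group) can be computed directly from the hypergeometric distribution. The sketch below is a generic implementation; the counts in the example are hypothetical and are not the cohort's data.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the probabilities of all tables with the same margins that are
    no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x counts in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    tol = 1 + 1e-9  # guards against floating-point ties
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * tol)
    return min(1.0, p)


# Hypothetical counts: rows = cluster 1 / cluster 2, columns = stage I / stage >= II
p_value = fisher_exact_two_sided(3, 1, 1, 3)
```

The exact test is preferred over a chi-square approximation when some cell counts are small, as is often the case for rare alterations.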
      OS and RFS of cluster 2 were significantly worse than those of cluster 1 (OS hazard ratio [HR] = 6.82, 95% confidence interval [CI]: 2.50–18.7, p < 1 × 10^−5; RFS HR = 8.97, 95% CI: 3.36–23.9, p < 1 × 10^−7). Because higher-stage cancer is enriched in cluster 2, the gene expression profile of cluster 2 appears to be associated with advanced cancer (Fig. 2C).
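Survival comparisons of this kind are displayed as Kaplan-Meier curves (Fig. 2C). As a minimal sketch of how such a curve is estimated, the product-limit estimator can be written as follows; the follow-up times and event flags in the example are made up, and the function name is illustrative.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times: follow-up time for each patient.
    events: 1 if the event (e.g., death) was observed, 0 if censored.
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing the same follow-up time
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving  # events and censored patients leave the risk set
    return curve


# Hypothetical follow-up data (weeks) for five patients
curve = kaplan_meier([5, 8, 8, 12, 20], [1, 1, 0, 1, 0])
```

Censored patients contribute to the risk set until their last follow-up, which is why the curve drops only at observed event times.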
      GSEA revealed that the gene set of “E2F_TARGETS” was enriched in cluster 2 and in stage II or higher cancers (Supplementary Fig. 9 and Supplementary Table 2). There was no significant difference in OS and RFS between the driver mutation-positive and -negative cancer groups (OS HR = 1.31, 95% CI = 0.54–3.17, p = 0.5; RFS HR = 1.27, 95% CI = 0.56–2.83, p = 0.6) (Supplementary Fig. 10).
      There is a substantial risk for recurrence and death in patients with early-stage LUAD, even after complete surgical resection. The use of adjuvant therapy in LUAD at early stages, particularly stage I, remains controversial because no consistent survival benefit was reported in previous randomized trials. Reliable prognostic biomarkers are critically needed to select patients who are at high risk of recurrence and who might benefit from additional systemic therapies.
      We analyzed approximately 13,000 genes whose expression, measured in fragments per kilobase of exon per million reads mapped, had an SD greater than 1.0, to ensure adequate variance. Univariate Cox proportional hazards regression analysis showed that 192 genes were significantly correlated with OS (p ≤ 1 × 10^−4), although genes with lower statistical significance may also be important.
      The 14 genes with a false discovery rate of less than or equal to 0.003 were used for prognostic signature building using the forward conditional stepwise regression with multivariable Cox analysis in our cohort. This procedure selected a prognostic model with three genes: CCL8, MIS18A, and C1orf131.
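The text does not state how the false discovery rate was computed. Assuming the widely used Benjamini-Hochberg procedure, per-gene p values from the univariate analysis could be converted to adjusted values as in the following sketch; the p values shown are made up.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p values (q values), in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


# Hypothetical p values for four genes
qvalues = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
# Genes passing an illustrative cutoff
selected = [i for i, q in enumerate(qvalues) if q <= 0.03]
```

Genes whose adjusted value falls below the chosen cutoff (0.003 in the study) would then be passed to the stepwise multivariable model.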
      We constructed a risk score with the regression coefficients from this model and manually selected a threshold at the 75th percentile (Fig. 3A). High-risk patients, as defined by the three-gene signature-based risk score, had significantly worse OS for all stages (HR = 14.3, 95% CI: 5.03–40.7, p = 1 × 10^−10) and for stage I patients (HR = 9.83, 95% CI: 2.45–39.4, p = 7 × 10^−5) in our cohort, independent of age, sex, smoking index, stage, and gene mutations (Fig. 3B).
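A risk score of this form is a weighted sum of gene expression values, with the Cox regression coefficients as weights. The sketch below illustrates the calculation and the percentile-based stratification. The coefficients and expression values are hypothetical placeholders, because the fitted coefficients are not given in the text, and treating scores above the threshold as high risk is an assumption.

```python
from statistics import quantiles

# Hypothetical coefficients; the fitted values are not reported in the text
COEFFICIENTS = {"CCL8": 0.5, "MIS18A": 1.0, "C1orf131": 1.5}


def risk_score(expression, coefficients=COEFFICIENTS):
    """Linear predictor: sum of coefficient x expression over the gene set."""
    return sum(coefficients[g] * expression[g] for g in coefficients)


def percentile_threshold(scores, pct=75):
    """Cut point at the given percentile (inclusive interpolation)."""
    return quantiles(scores, n=100, method="inclusive")[pct - 1]


def stratify(scores, threshold):
    """Label each score as high or low risk relative to the threshold."""
    return ["high" if s > threshold else "low" for s in scores]


# One hypothetical patient
patient = {"CCL8": 2.0, "MIS18A": 1.0, "C1orf131": 2.0}
score = risk_score(patient)

# Hypothetical cohort scores
cohort_scores = [1, 2, 3, 4, 5, 6, 7, 8]
threshold = percentile_threshold(cohort_scores)
groups = stratify(cohort_scores, threshold)
```

Fixing the threshold in the discovery cohort and reusing it unchanged in an external cohort is what allows the signature to be validated without refitting.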
      Figure 3. Three-gene prognostic signature in lung adenocarcinoma in Japanese who have never or only lightly smoked. (A) Three-gene expression and risk score distribution in our cohort by z-score. Here, red indicates higher expression, whereas light blue indicates lower expression. The risk scores for all patients are plotted in ascending order and marked as low risk (blue) or high risk (red), as divided by the threshold (vertical black line). The risk score threshold is 4.86; (B) Kaplan-Meier curves of OS and progression-free survival for all stages (upper) or for stage I (lower) in our cohort stratified by three-gene prognostic signature into those at high and low risk. Univariate Cox analysis was used to calculate the hazard ratio (HR). HR, 95% confidence interval, p value, and median survival are shown; (C) heatmap of the top 200 genes differentially expressed between those at high and low risk, with red indicating higher expression and blue indicating lower expression; (D) statistically significant gene sets identified by GSEA to be differentially overexpressed in high-risk tumors. presents the full GSEA results.
      To understand the biology underpinning high-risk tumors, we identified the top 100 genes overexpressed and the top 100 genes underexpressed in high-risk tumors (Fig. 3C). Fusion-positive cases were significantly (p < 0.05) enriched in the high-risk group, whereas for cases with other driver genes, no difference was identified between the low- and high-risk groups (Supplementary Table 3). The high-risk tumors were significantly (p < 0.01) enriched for gene sets related to cancer biology, including E2F targets, MYC targets, and G2M checkpoint (Fig. 3D and Supplementary Table 2).
      EGFR status could be a strong confounding factor for the three-gene prognostic signature when patients are treated with EGFR tyrosine kinase inhibitors (EGFR TKIs). To exclude this possibility, we analyzed the performance of the three-gene prognostic signature in patient subsets with wild-type or mutant EGFR. The three-gene prognostic signature risk group provided significant OS stratification in the EGFR wild-type patients (103/125, 82.4%) (HR = 14.1, 95% CI: 4.38–45.7, p = 2 × 10^−8) and the EGFR-mutant patients (22/125, 17.6%) (HR = 10.5, 95% CI: 1.08–101, p = 0.01) (Supplementary Fig. 11).
      Finally, in a multivariable Cox analysis that included EGFR and ALK alteration status, the risk score remained statistically significant (HR = 12.61, p = 1.81 × 10^−5) (Table 2). None of the mutation statuses was statistically significant in the multivariable analysis.
      Table 2. Cox Proportional Hazard Models in Japanese Lung Adenocarcinoma Cohort
      Factor | Univariate HR | Univariate p* | Multivariable HR | Multivariable p*
      Risk score | 14.31 | 2.05E–08 | 12.61 | 1.81E–05
      Sex | 0.82 | 0.72 | 1.15 | 0.82
      Age | 1.01 | 0.99 | 1.45 | 0.64
      Smoking index | 2.38 | 0.11 | 1 |
      Stage | 6.6 | 5.38E–05 | 4.14 | 0.01
      Fusion | 4.98 | 0.08 | 0.59 | 0.54
      MET | 0.41 | 0.32 | 0.17 | 0.1
      BRAF | 0.32 | 0.18 | 0.59 | 0.63
      EGFR | 1.04 | 0.95 | 0.76 | 0.65
      ERBB2 | 1.21 | 0.76 | 1.44 | 0.6
      MAP2K1 | 3.76E–08 | 0.11 | 9.88E–08 | 0.99
      HR, hazard ratio.
      * Two-sided likelihood ratio test. Age, <50 versus ≥50; Smoking index, never smoker versus light smoker + heavy smoker; stage, stage I versus stage ≥II.
      Using the same risk score threshold as selected in our cohort, we found that the three-gene prognostic signature risk group significantly stratified the TCGA cohort for OS (HR = 1.58, 95% CI: 1.18–2.10, p = 2 × 10^−3), which was independent of age, sex, and stage (Supplementary Fig. 12 and Supplementary Tables 4 and 5).

      Clinicopathologic and Genomic Characterization of LUAD

      Figure 4A presents the clinical and pathologic characteristics of the patients with oncogenic drivers. A total of 679 patients were shown to have driver oncogenes (68.2%). Overall, each activating alteration was found to be distributed throughout the histologic subtypes such as preinvasive, minimally invasive, and invasive adenocarcinomas.
      Figure 4. Clinicopathologic characterization and genomic status of lung adenocarcinoma. (A) The types of driver alterations are indicated for each pathologic classification of lung adenocarcinoma; and below (B) the types of driver alterations are indicated for each age group. The right pie chart indicates the types of driver mutations found in patients aged 40 years and below (young adults).
      We also identified mutant EGFR, KRAS, ERBB2, BRAF, and MAP2K1, as well as rearranged ALK and ROS1, in AIS, supporting the idea that these alterations are early pathogenic events in LUAD. Interestingly, BRAF was the second most frequently mutated gene (9%) in AIS, followed by MAP2K1 (4%) and KRAS (4%).
      We next compared the frequency of oncogenic alterations in patients aged 40 years or younger (young adults) with that in patients aged 41 years or older. Among the young adults, there was a high proportion of patients with activating EGFR alterations (46.7% versus 25.0%; p = 0.0538), whereas the frequency of ERBB2 alterations differed significantly between the two age groups (20.0% versus 1.12%; p < 0.0001). In addition, no apparent oncogenic drivers were identified in nine young adults (45%) (Fig. 4B and Supplementary Table 6). With respect to rearranged ALK, the patients ranged in age from 41 to 85 years (median 65.5 y).

      Discussion

      In this study, we performed mutational profiling on LUAD in Japanese patients who had never smoked or only smoked lightly. We identified a distinct mutational profile compared with that in the TCGA study, the data of which were mainly from a non-Asian population. Compared with never smokers or light smokers in the TCGA cohort, there were fewer mutations per tumor in total in our cohort. This suggests that the mechanism underlying the generation of driver mutations might differ between the two cohorts. This difference could be explained by inherited genetic variations or environmental stress such as second-hand tobacco smoke, viral infections, or hazardous chemicals.
      In this study, it was revealed that more than half of the patients previously diagnosed as driver oncogene-negative (i.e., negative for EGFR major mutations, ALK fusions, RET fusions, and ROS1 fusions) still had actionable mutations within EGFR, BRAF, MAP2K1, ERBB2, or MET or had fusion oncogenes of NRG1/2 or FGFR2. This reveals the importance of NGS-based clinical sequencing for identifying driver mutations in individual cancers. In particular, RNA sequencing identified CD74-NRG2α, FGFR2-MBIP, and CD74-NRG1, suggesting the clinical utility of RNA-seq that may offer patients the opportunity to enroll in clinical trials that are specific for genomic alterations.
      A key finding of this study is the identification of the novel CD74-NRG2α fusion. Several global trials are now targeting NRG1 fusions in solid malignancies (e.g., Merus’ bispecific HER2/HER3 antibody MCLA-128). Thus, it is possible that such treatment could be extended to NRG2α fusions and, theoretically, to NRG3 and NRG4 fusions, although a HER4 antibody would have to be employed in certain cases.
      As it has recently been shown that NRG1 fusions are involved in various solid tumors, including breast, head and neck, renal, lung, ovarian, pancreatic, prostate, and uterine cancers (Drilon A., Somwar R., Mangatt B.P., et al., Response to ERBB3-directed targeted therapy in NRG1-rearranged cancers; Heining C., Horak P., Uhrig S., et al., NRG1 fusions in KRAS wild-type pancreatic cancer), it would also be interesting to investigate the prevalence of NRG2α fusions among solid tumors. The fact that the CD74-NRG2α fusion could not be identified in TCGA RNA-seq data of various cancer types (Gao Q., Liang W.W., Foltz S.M., et al., Driver fusions and their implications in the development and treatment of human cancers; Yoshihara K., Wang Q., Torres-Garcia W., et al., The landscape and therapeutic relevance of cancer-associated transcript fusions) suggests that it may be very rare or more specific to LUAD in Asians. In contrast, no NTRK1/2 fusion was identified in the 996 cases by immunohistochemistry for pan-TRK, FISH, or RNA-seq, suggesting that NTRK fusions in LUAD may not be as common in Asians as in Caucasians.
      Immunohistochemically, tumor cells of the CD74-NRG2α-positive case were negative for p-HER3, whereas tumor cells of the NRG1-fusion-positive cases were focally positive for p-HER3. These data suggest that the NRG2α fusion may exert its oncogenic potential through activation of other HER family members.
      Our cohort revealed a difference in the distribution of driver oncogenes compared with those in TCGA, the Genomics Evidence Neoplasia Information Exchange project, or the MSK-IMPACT study. For example, in our study, the mutation hotspots for BRAF and MAP2K1 are p.K601E and p.E102_I103del, respectively, but those of the TCGA study were p.G469A/L/R/S/V or p.V600E for BRAF and p.K57N/T for MAP2K1. This might be explained by the difference in smoking status between the two cohorts. The MAP2K1 exon 3 deletions found in this study were only identified in nonsmokers, whereas K57N/T has been reported to be associated with smoking in a previous study (Arcila M.E., Drilon A., Sylvester B.E., et al., MAP2K1 (MEK1) mutations define a distinct subset of lung adenocarcinoma associated with smoking).
      Interestingly, in contrast to oncogenic mutations, k-means clustering of our expression data separated patients into groups with favorable and poor outcomes. Even in stage I cancers, the prognosis of which is generally favorable, the k-means clustering could still identify patients with a poor prognosis. This motivated us to identify the optimal gene set associated with prognosis, especially in stage I.
      We defined a three-gene set for predicting the aggressive type of LUAD. This gene set might be used as a biomarker to identify high-risk patients who require careful follow-up after surgical resection of their tumors. The three-gene set was further confirmed to be useful in predicting the prognosis of LUAD in the TCGA cohort. However, its clinical utility was less clear in the TCGA cohort, in which the HR for OS/RFS was 1.6, than in the Japanese cohort. This discrepancy in the robustness of the three-gene set between the two cohorts might be because the TCGA study includes patients with known driver mutations who had probably been treated with the corresponding TKIs. In that case, genomic status could be a confounding factor for patient outcome. Therefore, a large-scale cohort study should be performed to evaluate whether the utility of this gene set as a prognostic biomarker for stage I LUAD is limited to Japanese patients or applicable to other populations. Although current standard clinical practice does not support the routine use of genomic/transcriptomic testing in early-stage disease, reliable prognostic biomarkers may help to select patients who are at high risk of recurrence and who might benefit from additional systemic therapies. We have indeed included the three genes in the clinical RNA sequencing panel that we designed (Kohsaka S., Tatsuno K., Ueno T., et al., Comprehensive assay for the molecular profiling of cancer by target enrichment from formalin-fixed paraffin-embedded specimens) and plan to investigate their utility in guiding decisions on adjuvant systemic therapy in a prospective LUAD cohort.
      One limitation of this study is that co-mutations or co-alterations could have been missed because we did not perform WES when RNA-seq identified a driver oncogene alteration, such as an ERBB2 mutation or a MET exon 14 skipping variant. Another limitation is that the three-gene set was evaluated only in tumors obtained at the initial surgical resection. The prognostic value of the marker for patients with recurrent cancer should be investigated in future studies.
      The discoveries made here can be easily applied in a clinical setting. Our data indicate the value of applying gene expression profiling for predicting the prognosis after a surgical operation, and that the identification of actionable mutations is important for optimizing targeted drugs. CD74-NRG2α and FGFR2-MBIP are novel actionable fusions and could be targeted by several TKIs currently under development. We believe that our genomic and transcriptomic analyses highlight the importance of precise tumor profiling to provide the best possible care to patients.

      Acknowledgments

      This study was financially supported in part through grants from the Program for Integrated Database of Clinical and Genomic Information under grant number JP18kk0205003, the Leading Advanced Projects for Medical Innovation under grant number JP18am0001001, the Practical Research for Innovative Cancer Control under grant number JP18ck0106252, and the Project for Cancer Research And Therapeutic Evolution under grant number JP18cm0106502 from the Japan Agency for Medical Research and Development. This work was also supported in part by a grant from Eisai Co., Ltd. The authors would like to thank A. Maruyama and H. Tomita for technical assistance.

      References

        • Siegel R.L.
        • Miller K.D.
        • Jemal A.
        Cancer statistics, 2017.
        CA Cancer J Clin. 2017; 67: 7-30
        • Lynch T.J.
        • Bell D.W.
        • Sordella R.
        • et al.
        Activating mutations in the epidermal growth factor receptor underlying responsiveness of non-small-cell lung cancer to gefitinib.
        N Engl J Med. 2004; 350: 2129-2139
        • Paez J.G.
        • Janne P.A.
        • Lee J.C.
        • et al.
        EGFR mutations in lung cancer: correlation with clinical response to gefitinib therapy.
        Science. 2004; 304: 1497-1500
        • Rosell R.
        • Carcereny E.
        • Gervais R.
        • et al.
        Erlotinib versus standard chemotherapy as first-line treatment for European patients with advanced EGFR mutation-positive non-small-cell lung cancer (EURTAC): a multicentre, open-label, randomised phase 3 trial.
        Lancet Oncol. 2012; 13: 239-246
        • Kwak E.L.
        • Bang Y.J.
        • Camidge D.R.
        • et al.
        Anaplastic lymphoma kinase inhibition in non-small-cell lung cancer.
        N Engl J Med. 2010; 363: 1693-1703
        • Shaw A.T.
        • Kim D.W.
        • Nakagawa K.
        • et al.
        Crizotinib versus chemotherapy in advanced ALK-positive lung cancer.
        N Engl J Med. 2013; 368: 2385-2394
        • Solomon B.J.
        • Mok T.
        • Kim D.W.
        • et al.
        First-line crizotinib versus chemotherapy in ALK-positive lung cancer.
        N Engl J Med. 2014; 371: 2167-2177
        • Shaw A.T.
        • Ou S.H.
        • Bang Y.J.
        • et al.
        Crizotinib in ROS1-rearranged non-small-cell lung cancer.
        N Engl J Med. 2014; 371: 1963-1971
        • Frampton G.M.
        • Ali S.M.
        • Rosenzweig M.
        • et al.
        Activation of MET via diverse exon 14 splicing alterations occurs in multiple tumor types and confers clinical sensitivity to MET inhibitors.
        Cancer Discov. 2015; 5: 850-859
        • Paik P.K.
        • Drilon A.
        • Fan P.D.
        • et al.
        Response to MET inhibitors in patients with stage IV lung adenocarcinomas harboring MET mutations causing exon 14 skipping.
        Cancer Discov. 2015; 5: 842-849
        • Drilon A.
        • Rekhtman N.
        • Arcila M.
        • et al.
        Cabozantinib in patients with advanced RET-rearranged non-small-cell lung cancer: an open-label, single-centre, phase 2, single-arm trial.
        Lancet Oncol. 2016; 17: 1653-1660
        • Drilon A.
        • Siena S.
        • Ou S.I.
        • et al.
        Safety and antitumor activity of the multitargeted pan-TRK, ROS1, and ALK inhibitor Entrectinib: combined results from two Phase I trials (ALKA-372-001 and STARTRK-1).
        Cancer Discov. 2017; 7: 400-409
        • Planchard D.
        • Kim T.M.
        • Mazieres J.
        • et al.
        Dabrafenib in patients with BRAF(V600E)-positive advanced non-small-cell lung cancer: a single-arm, multicentre, open-label, phase 2 trial.
        Lancet Oncol. 2016; 17: 642-650
        • Kris M.G.
        • Camidge D.R.
        • Giaccone G.
        • et al.
        Targeting HER2 aberrations as actionable drivers in lung cancers: phase II trial of the pan-HER tyrosine kinase inhibitor dacomitinib in patients with HER2-mutant or amplified tumors.
        Ann Oncol. 2015; 26: 1421-1427
        • Mazieres J.
        • Barlesi F.
        • Filleron T.
        • et al.
        Lung cancer patients with HER2 mutations treated with chemotherapy and HER2-targeted drugs: results from the European EUHER2 cohort.
        Ann Oncol. 2016; 27: 281-286
        • Planchard D.
        • Smit E.F.
        • Groen H.J.M.
        • et al.
        Dabrafenib plus trametinib in patients with previously untreated BRAF(V600E)-mutant metastatic non-small-cell lung cancer: an open-label, phase 2 trial.
        Lancet Oncol. 2017; 18: 1307-1316
        • Campbell J.D.
        • Alexandrov A.
        • Kim J.
        • et al.
        Distinct patterns of somatic genome alterations in lung adenocarcinomas and squamous cell carcinomas.
        Nat Genet. 2016; 48: 607-616
        • Cancer Genome Atlas Research Network
        Comprehensive molecular profiling of lung adenocarcinoma.
        Nature. 2014; 511: 543-550
        • Mitsudomi T.
        Molecular epidemiology of lung cancer and geographic variations with special reference to EGFR mutations.
        Transl Lung Cancer Res. 2014; 3: 205-211
        • Pelosof L.
        • Ahn C.
        • Gao A.
        • et al.
        Proportion of never-smoker non-small cell lung cancer patients at three diverse institutions.
        J Natl Cancer Inst. 2017; 109
        • Sun S.
        • Schiller J.H.
        • Gazdar A.F.
        Lung cancer in never smokers–a different disease.
        Nat Rev Cancer. 2007; 7: 778-790
        • Vahakangas K.H.
        • Bennett W.P.
        • Castren K.
        • et al.
        p53 and K-ras mutations in lung cancers from former and never-smoking women.
        Cancer Res. 2001; 61: 4350-4356
        • Slebos R.J.
        • Hruban R.H.
        • Dalesio O.
        • Mooi W.J.
        • Offerhaus G.J.
        • Rodenhuis S.
        Relationship between K-ras oncogene activation and smoking in adenocarcinoma of the human lung.
        J Natl Cancer Inst. 1991; 83: 1024-1027
        • Pao W.
        • Miller V.
        • Zakowski M.
        • et al.
        EGF receptor gene mutations are common in lung cancers from “never smokers” and are associated with sensitivity of tumors to gefitinib and erlotinib.
        Proc Natl Acad Sci U S A. 2004; 101: 13306-13311
        • Kratz J.R.
        • He J.
        • Van Den Eeden S.K.
        • et al.
        A practical molecular assay to predict survival in resected non-squamous, non-small-cell lung cancer: development and international validation studies.
        Lancet. 2012; 379: 823-832
        • Wistuba I.I.
        • Behrens C.
        • Lombardi F.
        • et al.
        Validation of a proliferation-based expression signature as prognostic marker in early stage lung adenocarcinoma.
        Clin Cancer Res. 2013; 19: 6261-6271
        • Beer D.G.
        • Kardia S.L.
        • Huang C.C.
        • et al.
        Gene-expression profiles predict survival of patients with lung adenocarcinoma.
        Nat Med. 2002; 8: 816-824
        • Shedden K.
        • Taylor J.M.
        • et al.
        • Director's Challenge Consortium for the Molecular Classification of Lung Adenocarcinoma
        Gene expression-based survival prediction in lung adenocarcinoma: a multi-site, blinded validation study.
        Nat Med. 2008; 14: 822-827
        • Chen H.Y.
        • Yu S.L.
        • Chen C.H.
        • et al.
        A five-gene signature and clinical outcome in non-small-cell lung cancer.
        N Engl J Med. 2007; 356: 11-20
        • Febbo P.G.
        • Kantoff P.W.
        Noise and bias in microarray analysis of tumor specimens.
        J Clin Oncol. 2006; 24: 3719-3721
        • Robinson D.G.
        • Wang J.Y.
        • Storey J.D.
        A nested parallel experiment demonstrates differences in intensity-dependence between RNA-seq and microarrays.
        Nucleic Acids Res. 2015; 43: e131
        • Shukla S.
        • Evans J.R.
        • Malik R.
        • et al.
        Development of a RNA-seq based prognostic signature in lung adenocarcinoma.
        J Natl Cancer Inst. 2017; 109
        • Travis W.D.
        • Brambilla E.
        • Nicholson A.G.
        • et al.
        The 2015 World Health Organization classification of lung tumors: impact of genetic, clinical and radiologic advances since the 2004 classification.
        J Thorac Oncol. 2015; 10: 1243-1260

        • Takamochi K.
        • Takahashi F.
        • Suehara Y.
        • et al.
        DNA mismatch repair deficiency in surgically resected lung adenocarcinoma: microsatellite instability analysis using the Promega panel.
        Lung Cancer. 2017; 110: 26-31
        • Li H.
        • Durbin R.
        Fast and accurate short read alignment with Burrows-Wheeler transform.
        Bioinformatics. 2009; 25: 1754-1760
        • Yoon S.
        • Xuan Z.
        • Makarov V.
        • Yem K.
        • Sebat J.
        Sensitive and accurate detection of copy number variants using read depth of coverage.
        Genome Res. 2009; 19: 1586-1592
        • Beroukhim R.
        • Getz G.
        • Nghiemphu L.
        • et al.
        Assessing the significance of chromosomal aberrations in cancer: methodology and application to glioma.
        Proc Natl Acad Sci U S A. 2007; 104: 20007-20012
        • Mermel C.H.
        • Schumacher S.E.
        • Hill B.
        • Meyerson M.L.
        • Beroukhim R.
        • Getz G.
        GISTIC2.0 facilitates sensitive and confident localization of the targets of focal somatic copy-number alteration in human cancers.
        Genome Biol. 2011; 12: R41
        • Drilon A.
        • Somwar R.
        • Mangatt B.P.
        • et al.
        Response to ERBB3-directed targeted therapy in NRG1-rearranged cancers.
        Cancer Discov. 2018; 8: 686-695
        • Jones J.T.
        • Akita R.W.
        • Sliwkowski M.X.
        Binding specificities and affinities of egf domains for ErbB receptors.
        FEBS Lett. 1999; 447: 227-231
        • Alexandrov L.B.
        • Nik-Zainal S.
        • Wedge D.C.
        • et al.
        Signatures of mutational processes in human cancer.
        Nature. 2013; 500: 415-421
        • Imielinski M.
        • Berger A.H.
        • Hammerman P.S.
        • et al.
        Mapping the hallmarks of lung adenocarcinoma with massively parallel sequencing.
        Cell. 2012; 150: 1107-1120
        • Govindan R.
        • Ding L.
        • Griffith M.
        • et al.
        Genomic landscape of non-small cell lung cancer in smokers and never-smokers.
        Cell. 2012; 150: 1121-1134
        • Jordan E.J.
        • Kim H.R.
        • Arcila M.E.
        • et al.
        Prospective comprehensive molecular characterization of lung adenocarcinomas for efficient patient matching to approved and emerging therapies.
        Cancer Discov. 2017; 7: 596-609
        • Heining C.
        • Horak P.
        • Uhrig S.
        • et al.
        NRG1 fusions in KRAS wild-type pancreatic cancer.
        Cancer Discov. 2018; 8: 1087-1095
        • Gao Q.
        • Liang W.W.
        • Foltz S.M.
        • et al.
        Driver fusions and their implications in the development and treatment of human cancers.
        Cell Rep. 2018; 23: 227-238
        • Yoshihara K.
        • Wang Q.
        • Torres-Garcia W.
        • et al.
        The landscape and therapeutic relevance of cancer-associated transcript fusions.
        Oncogene. 2015; 34: 4845-4854
        • Arcila M.E.
        • Drilon A.
        • Sylvester B.E.
        • et al.
        MAP2K1 (MEK1) mutations define a distinct subset of lung adenocarcinoma associated with smoking.
        Clin Cancer Res. 2015; 21: 1935-1943
        • Kohsaka S.
        • Tatsuno K.
        • Ueno T.
        • et al.
        Comprehensive assay for the molecular profiling of cancer by target enrichment from formalin-fixed paraffin-embedded specimens.
        Cancer Sci. 2019; 110: 1464-1479

      Linked Article

      • Is NRG2α Fusion a “Doppelgänger” to NRG1α/β Fusions in Oncology?
        Journal of Thoracic Oncology, Vol. 15, Issue 6
        • Preview
          Neuregulins (NRGs) are cellular signaling proteins that contain epidermal growth factor (EGF)–like domains and play important roles in the development of the nervous and cardiovascular systems. NRG1 and NRG2 are two of six distinct NRG genes (NRG1, NRG2, NRG3, NRG4, NRG5 [tomoregulin-1, transmembrane protein with EGF-like and two follistatin-like domains 1 {TMEFF1}], and NRG6 [neuroglycan-C, chondroitin sulfate proteoglycan 5 {CSPG5}, chicken acidic leucine-rich EGF-like domain-containing brain protein {CALEB}]) that share an EGF-like domain.