Abstract Diagnosing lung cancer at a curable stage offers the opportunity for a favorable prognosis. Epigenomic analysis of plasma cell-free DNA (cfDNA), including 5-methylcytosine (5mC) and 5-hydroxymethylcytosine (5hmC) modifications, has emerged as a promising approach for identifying lung cancer. Moreover, integrating 5mC biomarkers with chest computed tomography (CT) image features could optimize the diagnosis of lung cancer, exceeding the performance of models built on a single feature type. However, the clinical applicability of integrated markers might be limited by the risk of overfitting due to small sample sizes. Hence, we prospectively collected peripheral blood samples and paired chest CT images from 2032 patients with indeterminate pulmonary nodules across 5 centers, and constructed a large-scale, multi-institutional, multiomics database that encompasses CT imaging data and plasma cfDNA fragmentomics in 5mC- and 5hmC-enriched regions. To the best of our knowledge, this is the first radio-epigenomic dataset and the largest in sample size; it provides multi-dimensional insights for the early diagnosis of lung cancer and facilitates individualized management of the disease.