Abstract

Background
Drug development is lengthy and costly, making drug repurposing an attractive alternative. Identifying repurposing candidates from the vast biomedical literature is challenging, and natural language processing (NLP) offers potential for literature-based discovery. We present and evaluate a novel, accessible NLP-based method that uses the Word2Vec algorithm to identify, test, and validate candidate medications for repurposing, demonstrated by seeking treatments for psychotic disorders.

Results
A Word2Vec model trained on 2.3 million PubMed abstracts (2000–2023) identified potential repurposing candidates based on cosine similarity to a known antipsychotic drug. We tested one candidate, a cephalosporin antibiotic, across independent datasets: MIMIC-IV, CRIS, and BRATECA. In MIMIC-IV, cephalosporin antibiotics were associated with a reduced adjusted hazard ratio (aHR) for psychosis hospitalisation overall (0.94, 95% CI: 0.90–0.99) and, more substantially, for severe mental illness (0.52, 95% CI: 0.45–0.60). However, CRIS showed an increased risk (HR: 3.56, 95% CI: 2.66–4.77). BRATECA lacked suitable diagnostic data for analysis.

Conclusions
This methodological framework demonstrates the potential of machine learning approaches to systematically identify drug repurposing candidates, while highlighting population-specific variations in therapeutic effectiveness that warrant caution in translational applications. Our findings should not be interpreted as a recommendation for or against the use of cephalosporins to treat psychosis. They only lend validity to the application of our repurposing methodology, and should serve as a foundation for further investigation rather than direct clinical application.
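
The candidate-identification step described in the Results, ranking terms by cosine similarity to a known antipsychotic, can be illustrated with a minimal sketch. This is not the authors' implementation: the embedding vectors below are made-up toy values, the term names (`drug_a` to `drug_d`) are hypothetical placeholders, and the helper functions `cosine_similarity` and `rank_candidates` are introduced here for illustration only. In practice the vectors would come from a trained Word2Vec model.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention: a zero vector is similar to nothing
    return dot / (norm_u * norm_v)


def rank_candidates(embeddings, query, top_n=3):
    """Rank every term except the query by cosine similarity to the query."""
    query_vec = embeddings[query]
    scores = [
        (term, cosine_similarity(query_vec, vec))
        for term, vec in embeddings.items()
        if term != query
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_n]


# Toy 3-dimensional embeddings (assumed values, not from any real model).
toy_embeddings = {
    "drug_a": [1.0, 0.0, 0.0],   # stands in for the known antipsychotic
    "drug_b": [0.9, 0.1, 0.0],   # points in nearly the same direction
    "drug_c": [0.0, 1.0, 0.0],   # orthogonal to the query
    "drug_d": [-1.0, 0.0, 0.0],  # points in the opposite direction
}

ranked = rank_candidates(toy_embeddings, "drug_a")
for term, score in ranked:
    print(f"{term}: {score:.3f}")
```

Terms whose vectors point in nearly the same direction as the query rank first. A high similarity reflects only shared textual context in the training corpus, which is why a candidate selected this way still needs testing against independent clinical datasets.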