There remain many unknowns regarding the onset and clinical course of the ongoing COVID-19 pandemic. We used a combination of classic epidemiological methods, natural language processing (NLP), and machine learning (for predictive modeling) to analyse the electronic health records (EHRs) of patients with COVID-19. We explored the unstructured free text in the EHRs within the SESCAM Healthcare Network (Castilla-La Mancha, Spain) from the entire population with available EHRs (1,364,924 patients) from January 1st to March 29th, 2020. We extracted clinical information on diagnosis, progression, and outcome for all COVID-19 cases, focusing on those requiring ICU admission. A total of 10,504 patients with a clinical or PCR-confirmed diagnosis of COVID-19 were identified; 52.5% were male, and the mean age was 58.2 years (S.D. 19.7). Upon admission, the most common symptoms were cough, fever, and dyspnoea, but each was present in less than half of cases. Overall, 6% of hospitalized patients required ICU admission. Using a machine-learning, data-driven algorithm, we identified that a combination of age, fever, and tachypnoea was the most parsimonious predictor of ICU admission: patients younger than 56 years, without tachypnoea, and with a temperature <39 °C (or >39 °C without respiratory crackles) did not require ICU admission. Conversely, COVID-19 patients aged 40 to 79 years were likely to be admitted to the ICU if they had tachypnoea and delayed their visit to the ER after being seen in primary care. Our results show that a combination of easily obtainable clinical variables (age, fever, and tachypnoea with or without respiratory crackles) predicts which COVID-19 patients require ICU admission.
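As a rough illustration, the low-risk profile described in the abstract can be written as a small rule. This is a sketch of one reading of that sentence, not the authors' model: the class `Presentation`, the function `in_low_risk_group`, and the field names are hypothetical, and treating a temperature of exactly 39 °C like the >39 °C branch is an assumption, since the abstract leaves that boundary unspecified.

```python
from dataclasses import dataclass


@dataclass
class Presentation:
    """Hypothetical container for the clinical variables named in the abstract."""
    age_years: float
    tachypnoea: bool
    temperature_c: float
    respiratory_crackles: bool


def in_low_risk_group(p: Presentation) -> bool:
    """Return True if the patient matches the profile the abstract
    describes as not requiring ICU admission."""
    # The profile only covers patients younger than 56 without tachypnoea.
    if p.age_years >= 56 or p.tachypnoea:
        return False
    # Temperature below 39 C: low risk regardless of crackles.
    if p.temperature_c < 39.0:
        return True
    # Assumption: exactly 39 C is handled like the >39 C branch,
    # which is low risk only when respiratory crackles are absent.
    return not p.respiratory_crackles


# Example: a 45-year-old without tachypnoea and with a temperature of 38.2 C.
print(in_low_risk_group(Presentation(45, False, 38.2, False)))  # prints True
```

A `False` result only means the patient falls outside the low-risk profile. The abstract's separate high-risk pattern (age 40 to 79 with tachypnoea and a delayed ER visit) is not encoded here.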