I am a data scientist with a background in quantitative social science and product management. At Nesta, I use natural language processing and machine learning to understand how skill demand is changing, drawing on millions of job adverts.
There is no publicly available data on the skills commonly required in UK online job adverts, even though this information is useful for a range of use cases. To address this, we have built an open-source skills extraction Python library using spaCy and Hugging Face. Our approach is twofold: we train a named entity recognition model to extract skill entities from job adverts, then map them onto any standardised skills taxonomy. By applying this algorithm to a dataset of scraped online job adverts, we are then able to find skill similarities amongst occupations and regional differences in skill requirements.
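To make the two-step shape concrete, here is a minimal sketch in plain Python. It is not the library's API. A simple phrase lookup stands in for the trained named entity recognition model, and character-level string similarity stands in for the taxonomy mapping. The taxonomy, the phrase list, the function names and the threshold are all hypothetical and chosen only for illustration.

```python
from __future__ import annotations

from difflib import SequenceMatcher

# Hypothetical toy taxonomy: code -> standardised skill name (not a real taxonomy).
TAXONOMY = {
    "S1": "python programming",
    "S2": "data analysis",
    "S3": "project management",
}

# Hypothetical phrases the stand-in extractor recognises. A trained NER model
# would instead predict skill spans from the surrounding context.
KNOWN_PHRASES = ["python", "analysing data", "managing projects"]


def extract_skills(advert: str) -> list[str]:
    """Step 1 (stand-in for NER): return the known phrases found in the advert."""
    text = advert.lower()
    return [phrase for phrase in KNOWN_PHRASES if phrase in text]


def map_to_taxonomy(skill: str, threshold: float = 0.4) -> str | None:
    """Step 2: return the code of the most similar taxonomy entry, or None
    if no entry reaches the (assumed) similarity threshold."""
    best_code, best_score = None, 0.0
    for code, name in TAXONOMY.items():
        score = SequenceMatcher(None, skill, name).ratio()
        if score > best_score:
            best_code, best_score = code, score
    return best_code if best_score >= threshold else None


advert = "We need Python and experience analysing data."
for skill in extract_skills(advert):
    print(skill, "->", map_to_taxonomy(skill))
```

Keeping extraction and mapping as separate functions mirrors the claim that the extracted entities can be mapped onto any standardised taxonomy: under this sketch's assumptions, swapping `TAXONOMY` changes the second step only.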