Jack Vines
Jack Vines is Lead Data Engineer at Nesta. He is interested in socially impactful applications of data science and open data. His work centres on large-scale data collection and pipelines, and has included building infrastructure to collect over 3 million online job adverts per year and efficiently applying large natural language models to enrich them.
Sessions
There is no publicly available data on the skills commonly required in UK online job adverts, even though this information is useful for a range of use cases. To address this, we have built an open-source skills extraction Python library using spaCy and Hugging Face. Our approach has two steps: we train a named entity recognition model to extract skill entities from job adverts, then map them onto any standardised skills taxonomy. By applying this algorithm to a dataset of scraped online job adverts, we are able to find skill similarities amongst occupations and regional differences in skill requirements.
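As a rough illustration of the extract-then-map idea, here is a minimal sketch in plain Python. It is not the library's API: a simple phrase lookup stands in for the trained named entity recognition model, token-overlap (Jaccard) similarity stands in for whatever matching the library uses, and the taxonomy, identifiers, phrases and occupations are all made up for the example.

```python
# Hypothetical toy taxonomy: standardised skill label -> made-up identifier.
TAXONOMY = {
    "data analysis": "S1",
    "project management": "S2",
    "communication": "S3",
}


def jaccard(a, b):
    """Jaccard similarity of two collections (0.0 when both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def extract_skill_spans(advert, phrases):
    """Stand-in for the NER step: return the given phrases found in the advert."""
    text = advert.lower()
    return [p for p in phrases if p in text]


def map_to_taxonomy(span, taxonomy, threshold=0.5):
    """Return the id of the taxonomy label sharing most tokens with the span,
    or None if the best overlap falls below the threshold."""
    tokens = span.lower().split()
    best = max(taxonomy, key=lambda label: jaccard(tokens, label.split()))
    if jaccard(tokens, best.split()) < threshold:
        return None
    return taxonomy[best]


advert = "You will need strong data analysis skills and clear written communication."
phrases = ["data analysis skills", "written communication", "forklift driving"]

spans = extract_skill_spans(advert, phrases)
mapped = [map_to_taxonomy(s, TAXONOMY) for s in spans]

# Once adverts are mapped to taxonomy ids, occupations can be compared by the
# overlap of their skill sets (hypothetical occupations and skills).
occupation_skills = {
    "analyst": {"S1", "S3"},
    "manager": {"S2", "S3"},
}
similarity = jaccard(occupation_skills["analyst"], occupation_skills["manager"])
```

Mapping to taxonomy identifiers rather than keeping raw text is what makes the later comparison possible: different wordings of the same skill collapse to one id, so skill sets can be compared across occupations or regions.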