You can find the CORD-19 dataset on Kaggle. If you want to replicate this work, I recommend forking this notebook, which has similar code. Thanks, and happy coding!
https://www.kaggle.com/dgunning/browsing-research-papers-with-a-bm25-search-engine