LLMGeneLinker (LGL)

LLMGeneLinker uses a domain-specific transformer (SciBERT) fine-tuned on the AllenAI drug dataset and the BC5CDR disease, NCBI disease, DrugProt, and GeneTAG datasets. The resulting SciBERT model performs Named Entity Recognition (NER) to tag drugs, proteins, genes, and diseases in input text. The sentence embedding from SciBERT is then fed into BERT. This was made during the LLMs for Bio Hackathon organised by 4Catalyzer and SGInnovate.
Made by Team GeneLink (Nicholas, Yew Chong, Ting Wei, Brendan)


Note: Performance is poorer on genes, acronyms, and receptors (named entities that may be targets for drugs or genes).
Original notebook adapted from jsylee/scibert_scivocab_uncased-finetuned-ner
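
The NER step described above produces one tag per token, and those tags then need to be merged into whole entities. The sketch below shows one common way to do that, assuming the model emits BIO-style tags (B- for the first token of an entity, I- for a continuation, O for everything else). The function name group_bio_tags and the label names DRUG, GENE, and DISEASE are hypothetical and chosen for illustration; the actual label set of the fine-tuned model may differ.

```python
from typing import List, Optional, Tuple


def group_bio_tags(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Merge BIO-tagged tokens into (entity_text, entity_type) pairs.

    Assumes one tag per token. An I- tag that does not continue the
    current entity type is treated like an O tag and dropped.
    """
    entities: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_type: Optional[str] = None

    def flush() -> None:
        # Close the entity being built, if any, and reset the state.
        nonlocal current_tokens, current_type
        if current_tokens and current_type is not None:
            entities.append((" ".join(current_tokens), current_type))
        current_tokens, current_type = [], None

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            flush()  # a new entity starts, so close the previous one
            current_tokens, current_type = [token], tag[2:]
        elif tag.startswith("I-") and current_type == tag[2:]:
            current_tokens.append(token)  # continuation of the same entity
        else:
            flush()  # O tag or inconsistent I- tag ends the entity
    flush()
    return entities


# Hypothetical tokens and tags, not real model output.
example_tokens = ["Aspirin", "inhibits", "cyclooxygenase", "2",
                  "in", "colorectal", "cancer"]
example_tags = ["B-DRUG", "O", "B-GENE", "I-GENE",
                "O", "B-DISEASE", "I-DISEASE"]
example_entities = group_bio_tags(example_tokens, example_tags)
```

With the hypothetical input above, example_entities holds three pairs: one drug, one gene, and one disease. Acronyms and receptor names, which the note above flags as weak spots, are the cases most likely to come out split or mislabelled at this stage.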
