IBM Collaborates With NASA for Large Language Model Toolkit to Support Scientific Research

IBM and NASA’s Interagency Implementation and Advanced Concepts Team have jointly developed a suite of large language models that gives researchers access to curated scientific data from diverse sources.

Called INDUS, the toolkit supports research in five scientific domains, including Earth science and astrophysics. The LLM suite also improves access to data from peer-reviewed studies in the biological and physical sciences and in planetary science.

With the vast data sources that INDUS unlocks, researchers can grasp complicated scientific concepts and explore new research ideas more efficiently, NASA said Tuesday.

The toolkit comprises two types of models: encoders and sentence transformers. The encoders convert natural language text into numeric coding that the LLM can process. Bishwaranjan Bhattacharjee, an IBM researcher, said INDUS has achieved “superior performance” with its custom vocabulary and a good encoder model training strategy.
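To illustrate what converting text into numeric coding means, here is a minimal Python sketch. It is a toy, not the INDUS implementation: the eight-word vocabulary and the function names `encode`, `sentence_vector` and `cosine_similarity` are assumptions made up for this example, and simple word counts stand in for what a trained sentence transformer would learn.

```python
import math
import re

# Hypothetical toy vocabulary (an assumption for illustration only).
# A real custom vocabulary is far larger and learned from scientific text.
VOCAB = ["[UNK]", "solar", "wind", "plasma", "ocean", "temperature", "sea", "surface"]
TOKEN_TO_ID = {token: idx for idx, token in enumerate(VOCAB)}


def encode(text):
    """Map each lowercase word to its vocabulary ID (0 for unknown words)."""
    words = re.findall(r"[a-z]+", text.lower())
    return [TOKEN_TO_ID.get(word, 0) for word in words]


def sentence_vector(text):
    """Count how often each vocabulary ID occurs, giving one fixed-length vector."""
    vector = [0] * len(VOCAB)
    for token_id in encode(text):
        vector[token_id] += 1
    return vector


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Text becomes a list of integer IDs.
    print(encode("Solar wind plasma"))
    # Sentences that share vocabulary score higher than unrelated ones.
    query = sentence_vector("sea surface temperature")
    print(cosine_similarity(query, sentence_vector("ocean surface temperature")))
    print(cosine_similarity(query, sentence_vector("solar wind plasma")))
```

In this sketch, the integer IDs play the role of the encoder's numeric coding, and comparing sentence vectors hints at how sentence-level representations can support tasks such as information retrieval.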

NASA’s Goddard Earth Sciences Data and Information Services Center fine-tuned INDUS to categorize sources, using labeled data provided by domain experts.

Researchers can access the INDUS models on the Hugging Face website. They can also watch for the NASA-IBM team’s release of benchmark datasets for additional tasks, such as Earth science extractive question answering and multi-domain information retrieval.

Category: Space