The National Institute for Materials Science (NIMS) has developed Research Data Express (RDE), a data management system designed to automate the processing of experimental data into AI-ready formats for materials research. Described in a paper published in Science and Technology of Advanced Materials: Methods, the system addresses significant bottlenecks in data-driven materials science by transforming manufacturer-specific formats and inconsistent terminology into standardized, reusable datasets.
Materials research generates vast amounts of data that traditionally require considerable manual effort for format conversion, metadata assignment, and extraction of key characteristics. These time-consuming tasks have historically discouraged data sharing and hindered the advancement of data-driven work, particularly as the field increasingly relies on AI-driven materials discovery that demands high-quality datasets. RDE automatically interprets experimental data from raw files and manually entered measurements, then restructures and stores this information in more readable formats.
The system's core innovation is its Dataset Template approach, which defines and directs how data from different types of experiments should be processed. Unlike similar systems that rigidly define data formats, RDE allows researchers to configure templates to interpret data from various sources, such as spreadsheets of X-ray measurements from different instruments. The system then automatically performs advanced analyses and creates visualizations to provide immediate overviews of the data. Multiple templates can be prepared for different materials research themes, and individual researchers can easily create custom templates when necessary.
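The template idea described above can be illustrated with a minimal Python sketch. This is not RDE's implementation and does not use the RDEToolKit API; the class `DatasetTemplate`, its fields, the instrument names, and the column headers are all hypothetical, chosen only to show how per-instrument templates could map differently labeled X-ray spreadsheets onto one shared structure.

```python
import csv
import io
from dataclasses import dataclass, field


@dataclass
class DatasetTemplate:
    """Hypothetical template (not the RDE or RDEToolKit API).

    Maps one instrument's column headers to standardized names and
    attaches fixed metadata to every structured record.
    """
    name: str
    column_map: dict  # instrument-specific header -> standardized header
    metadata: dict = field(default_factory=dict)

    def structure(self, raw_text: str) -> dict:
        """Parse raw CSV text and return a record with standardized keys."""
        reader = csv.DictReader(io.StringIO(raw_text))
        rows = [
            {std: float(row[src]) for src, std in self.column_map.items()}
            for row in reader
        ]
        return {"template": self.name, "metadata": dict(self.metadata), "rows": rows}


# Two made-up instruments that export the same measurement with different headers.
TEMPLATE_A = DatasetTemplate(
    name="xrd_vendor_a",
    column_map={"2Theta": "two_theta_deg", "Counts": "intensity_counts"},
    metadata={"method": "XRD"},
)
TEMPLATE_B = DatasetTemplate(
    name="xrd_vendor_b",
    column_map={"Angle(deg)": "two_theta_deg", "I": "intensity_counts"},
    metadata={"method": "XRD"},
)

# Illustrative sample data, not real measurements.
RAW_A = "2Theta,Counts\n10.0,120\n10.5,135\n"
RAW_B = "Angle(deg),I\n10.0,118\n10.5,140\n"

record_a = TEMPLATE_A.structure(RAW_A)
record_b = TEMPLATE_B.structure(RAW_B)
```

In this sketch both records end up with the same keys (`two_theta_deg`, `intensity_counts`), so downstream analysis or visualization code would not need to know which instrument produced the file. Supporting a new instrument means writing a new template rather than changing the processing code.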
Jun Fujima, corresponding author and researcher at NIMS's Materials Data Platform, explains that RDE significantly reduces the burden of routine data processing for researchers and enhances data findability, interoperability, reusability, and traceability. The system's approach allows researchers to freely define data structures tailored to their instruments while enabling automatic, large-scale data structuring and metadata extraction. Since its launch in January 2023, RDE has scaled to over 5,000 users, more than 1,900 Dataset Templates for various experimental methods, over 16,000 datasets created, and more than three million data files accumulated.
The system serves as data infrastructure for major national initiatives, including the Materials Research DX Platform initiative promoted by Japan's Ministry of Education, Culture, Sports, Science and Technology. To encourage broader adoption, the NIMS team has released an open-source software toolkit called RDEToolKit. The research paper detailing the system is available at https://doi.org/10.1080/27660400.2025.2597702, and additional information about the journal can be found at https://www.tandfonline.com/STAM-M.
For business and technology leaders, RDE represents a significant advancement in research infrastructure that could accelerate materials discovery across industries including semiconductors, batteries, pharmaceuticals, and advanced manufacturing. By automating data processing and creating standardized, AI-ready datasets, the system addresses a critical bottleneck in materials innovation pipelines. Its widespread adoption within Japan's research community suggests it could be implemented globally, which may reduce time-to-discovery for new materials and enable more efficient collaboration across research institutions and industrial R&D departments.


