Massively Parallel Processing of Full Text Articles using DEISA
|Research Area||Data and Text Mining, Parallel Computing|
|Principal Investigator(s)||Dr. Firat Tekiner|
This project combines expertise in text mining and high performance computing to enable massively parallel text mining applications that scale beyond thousands of processors. There is an urgent need for workable solutions to the data deluge facing large-scale text mining applications. The motivation is to process large text datasets from multiple scientific domains within a reasonable time. Processing full text articles instead of abstracts will allow researchers across the world to find relationships within the text that were not previously known. This will only be possible with a system that exploits the storage capabilities and the parallel nature of high performance computing platforms, built by porting a number of advanced text mining techniques to the DEISA platform.