Research Statement
My primary research interest lies in biostatistics and time series analysis. A considerable portion of my applied and collaborative work addresses statistical learning methods for analyzing high-dimensional data, which routinely arise in survival analysis, genetics, and health-related problems. I began my research career by analyzing financial time series and survival data, in particular measuring child mortality, fitting mixed ARCH and GARCH models, and fitting ARIMA models to agricultural data. My current research focuses on identifying clinical risk factors for tumor recurrence among breast cancer patients in the presence of genetic information. We obtained a list of significant genes from a regularized Cox proportional hazards model for time-to-event data, working with three Bioconductor packages: limma, DESeq2, and edgeR. We are currently estimating the combined effect of the significant genes and clinical variables on the time to a single event and on recurrent events. My motivation is to develop penalization methods for different recurrent event models and to apply them to other cancer data. As a continuation of my career, I see myself using my strong statistical background and advanced programming skills to develop statistical methods and software that tackle important scientific questions in data science. Below, I describe my scientific work and my plans for future research.

Current Research:
Besides my dissertation, we are involved in other research based on breast cancer data. Because network-based methods provide a natural framework for studying the relations among genes, we will try to integrate prior genetic information for network-based classification of breast cancer and for biological network inference, using penalization methods such as lasso and graphical lasso on a data-driven network model. We will also address some important questions related to breast cancer survivorship for patients with primary and metastatic tumors. In this study, we will examine the efficiency of several methods that are generally used to estimate survival functions in the presence of censored data. Moreover, we want to model breast cancer survival through graphical lasso and decision tree approaches, and compare the two, to assess the efficacy of breast cancer treatment. Finally, we will investigate various problems associated with breast cancer and lung cancer patients. In this study, the recurrence times of breast cancer patients who receive different treatments will be compared in an overall sense, taking histological type into account. We will also examine the relationship between relapse time and other variables, and predict relapse time, to understand recurrence further.
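
To make the survival-function estimation mentioned above concrete, the following is a minimal sketch of one standard estimator for right-censored data, the Kaplan-Meier product-limit estimator. It is an illustration only: the planned study may compare other estimators, and the function name `kaplan_meier` and the toy inputs are hypothetical, not taken from the breast cancer data.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate of the survival function.

    times  -- observed follow-up times
    events -- 1 if the event was observed at that time, 0 if censored
    Returns a list of (time, survival probability) pairs, one per
    distinct time at which at least one event occurred.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all subjects sharing the same observed time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            # Product-limit step: multiply by the conditional survival at t.
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        # Events and censored subjects both leave the risk set after t.
        n_at_risk -= leaving
    return curve


# Hypothetical toy data: the subject observed at time 2 is censored.
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
```

Censored subjects contribute to the risk set up to their censoring time but never trigger a drop in the curve, which is what distinguishes this estimator from a naive empirical survival proportion.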

Other Collaborative Work:
In addition to my work in statistical learning and survival analysis, I have worked on other collaborative projects dealing with many types of data. With Dr. Kazi Shah Nawaz Ripon from the Computer Science Discipline at Khulna University, I worked on cleaning and preprocessing stock exchange data for a project that models and examines a hybridized bio-inspired system for forecasting the last trade price, the closing price, and the maximum/minimum price. Along with Dr. Md. Abdul Jabbar and Md. Tanvir Hossain from the Sociology Discipline at Khulna University, I performed statistical analysis to investigate the situation of aged women in water-logged areas, especially Kanaidia village under Jamalpur Union, Tala Upazila, Satkhira district in Bangladesh, and their needs and problems in surviving water logging. We also identified the harmful impacts of logged water on women’s physical and psychological health. In collaboration with the group supervised by Dr. Munni Begum at Ball State University, I have also analyzed raw accelerometer data to model and compare physical activities in older adults using bagging and random forest methods. In my second semester at Ball State University, I worked on a group research project and submitted a poster to a student symposium at Ball State University on “Classification of RNA-Seq Data by Supervised versus Unsupervised Learning Methods: A Case Study of the Human Pathogen Candida Albicans”. In this research, C. albicans was considered in two types, wild type and mutant, and each type received two treatments, room temperature and 37 °C. We then compared three computational pipelines (a combination of Python-based software, a combination of R packages, and the Rsubread package for R) for differential expression analysis of RNA-Seq data. Comparisons were made within and between the C. albicans types.

Previous Research:
My first paper, “Measuring Child Mortality Impact by Cox Proportional Hazard Model with Two Time Scales”, deals with two time scales (lifetime and calendar time) using a two-way Cox proportional hazards model in survival analysis. After that, we fitted ARIMA models to forecast the food grain requirement in Bangladesh. Next, we investigated the sensitivity of radial basis functions (general, Gaussian, multiquadric, and thin plate spline) in assessing share market risk. Subsequently, we measured different factors affecting early marriage. We then worked on several further problems: measuring and comparing the performance of various volatility models on the Dhaka Stock Exchange using the GARCH family; identifying significant factors in the growth of hyperglycemia incidence; applying an optimization technique to the Dickey-Fuller test to address sign and boundary problems of the parameters; and comparing SARIMA models with traditional models on weather variables of Bangladesh. Finally, I have submitted another paper in which we identify factors affecting breastfeeding duration by polytomous logistic regression modelling.

Future Research Direction:
I plan to expand my current work in the following directions:
Recurrent Event Modeling:
My dissertation research focuses on developing penalization for different recurrent event models and for the time-to-single-event model. In particular, it sets up lasso penalization on single event and recurrent event models to identify the combined effect of significant genes and clinical variables. In the future, I would like to extend penalization to other recurrent event models, such as frailty models with more than one frailty term, frailty models for data with censoring and truncation, correlated and joint modeling through frailty models, and frailty models with accelerated time.
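
As an illustration of what lasso penalization on a single-event model involves, the sketch below evaluates the L1-penalized negative log partial likelihood of a Cox model. It is a simplified sketch under stated assumptions, not the dissertation's implementation: event times are assumed to have no ties, the function name `penalized_neg_log_partial_likelihood` and the toy data are hypothetical, and a real analysis would minimize this objective over `beta` with a dedicated solver.

```python
import math


def penalized_neg_log_partial_likelihood(times, events, X, beta, lam):
    """Lasso-penalized negative log partial likelihood for a Cox model.

    times  -- observed times (assumed to contain no tied event times)
    events -- 1 if the event was observed, 0 if censored
    X      -- list of covariate rows, one per subject
    beta   -- coefficient vector
    lam    -- non-negative lasso tuning parameter
    """
    n = len(times)
    # Linear predictor x_i' beta for each subject.
    eta = [sum(b * x for b, x in zip(beta, row)) for row in X]
    nll = 0.0
    for i in range(n):
        if events[i] != 1:
            continue  # censored subjects only enter through risk sets
        # Risk set: subjects still under observation at time t_i.
        risk_total = sum(
            math.exp(eta[j]) for j in range(n) if times[j] >= times[i]
        )
        nll -= eta[i] - math.log(risk_total)
    # L1 penalty shrinks coefficients and sets some exactly to zero.
    penalty = lam * sum(abs(b) for b in beta)
    return nll + penalty


# Hypothetical toy data: four subjects, one covariate, no censoring.
X_toy = [[0.5], [1.0], [-0.3], [0.2]]
value_at_zero = penalized_neg_log_partial_likelihood(
    [1, 2, 3, 4], [1, 1, 1, 1], X_toy, [0.0], 0.1
)
```

Larger values of `lam` push more coefficients to zero, which is how the penalty selects a small set of genes and clinical variables from a high-dimensional candidate list.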

Missing Value Estimation:
New methods for situations where data are missing not at random (MNAR) are also essential in individualized medicine, as the missing completely at random and missing at random assumptions are rarely justified in clinical settings. As part of my work on recurrent event models for breast cancer, I have faced the problem of estimation under MNAR, which requires an approach that accommodates MNAR observations of the true cancer state. I plan to pursue this research further and develop this approach into a more general framework for broader application.

Learning Health Systems:
Individualized medicine is most completely realized in the context of a learning health system, that is, a system with the ability to continuously integrate patient data and medical knowledge to optimize patient care. As an example, my breast cancer analysis model can be adapted to incorporate advances in scientific understanding or newly available clinical measurements without discarding earlier (now incomplete) data. In future research, I would like to establish best practices for statistical modeling to support learning health systems.

Meta-Analysis for Cancer Data:
In cancer studies, many problems lend themselves to statistical meta-analysis, which synthesizes the results of previously conducted research, although such analyses raise procedural concerns. I will apply statistical meta-analysis to a large collection of results from individual studies in order to integrate their findings. To identify genes of possible importance in cancer, I would like to integrate multiple disparate microarray data sets, potentially covering multiple cancer types, to identify a meta-signature of cancer progression.
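
The simplest way to integrate findings across studies can be sketched as fixed-effect, inverse-variance pooling of per-study effect estimates. This is one common approach, shown here only as an assumption about how results might be combined; the planned work may instead use random-effects or other models. The function name `fixed_effect_pool` and the numbers in the usage line are hypothetical.

```python
import math


def fixed_effect_pool(estimates, variances):
    """Fixed-effect (inverse-variance) meta-analysis.

    estimates -- per-study effect estimates (e.g. log hazard ratios)
    variances -- per-study sampling variances of those estimates
    Returns the pooled estimate and its standard error.
    """
    # More precise studies (smaller variance) receive larger weights.
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total_weight
    standard_error = math.sqrt(1.0 / total_weight)
    return pooled, standard_error


# Hypothetical usage: two studies with equal precision.
pooled, se = fixed_effect_pool([1.0, 3.0], [1.0, 1.0])
```

A fixed-effect model assumes all studies estimate the same underlying effect; when data sets come from different cancer types, that assumption may not hold and a model allowing between-study heterogeneity would be more appropriate.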

Data Science in the Health Sector:
Governments, local governments, non-government organizations, and many other organizations such as the World Bank, the National Cancer Institute, and UK Biobank hold large amounts of data that are not used effectively. If we can run analytics on these historical data, these organizations and other beneficiaries can make better decisions. Many large organizations and countries have already started similar research. With access to a tool that performs these analytics, many small countries and NGOs could also benefit greatly. I see myself working with physicians and with researchers in medical informatics, social science, and other fields to create these analytics on different data sources.

To recap, I am very interested in a host of computational problems involving recurrent events, cancer data, and missing value estimation, as well as in applying big data analysis in academia to find solutions to the problems around us.

References
L. Ershad Ali, Masudul Islam, Ishita Shahid Sams and Afroza Polin, “Measuring Child Mortality Impact by Cox Proportional Hazard Model with Two Time Scales”, Khulna University Studies, vol.11 ; 12, issue. 1 ; 2, pp.193-202, 2013.

L. Ershad Ali, Masudul Islam, Rashed Kabir and Faruque Ahmmad, “Forecasting Production of food grains using ARIMA Model and its requirement in Bangladesh”, Journal of Mechanics Continua and Mathematical Sciences, vol.7, issue.2, pp.1056-1066, January 2013.

L. Ershad Ali, Masudul Islam and Shamima Sultana, “Mathematical modeling with local volatility surface by radial basic function approach”, Annals of Pure and Applied Mathematics, vol.9, issue.2, pp.191-200, 2015.

Md. Salauddin Khan, Masudul Islam, Md. Rasel Kabir and Lasker Ershad Ali, “A Comparative Study of Forecasting Agricultural Time Series: Some Selected Foodgrain in Bangladesh”, Journal of Mechanics Continua and Mathematical Sciences, vol.10, issue.2, pp.1500-1510, January 2016.

Md. Rasel Kabir, Masudul Islam, Md. Salauddin Khan, Raju Roy, “Factors Affecting Early Marriage in Bangladesh”, Global Journal of Science Frontier Research, vol.16, issue.1, pp.1-6, May 2016.

Masudul Islam, Sirajum Munira, Raju Roy and Md. Salauddin Khan, “Performance Measurement of Some Selected Company of Dhaka Stock Exchange by GARCH family”, Journal of Mathematics and Informatics, vol.5, pp.57-61, June 2016.

Imran Parvez, Md. Moyazzem Hossain and Masudul Islam, “Comparison of Different Volatility Model on Dhaka Stock Exchange”, Global Journal of Science Frontier Research, vol. 17, issue. 3(1), pp. 41-52, 2017.

Sohani Afroja, Masudul Islam, Mohammad Emran Hossen and Tapos Kumar Biswas, “Determines Hyperglycaemia Spreads in Generations with Multiple Complications that Imposing Towards Death”, International Journal of Systems Science and Applied Mathematics, vol. 3(2), pp. 16-23, July 13, 2018.

Masudul Islam, Afroza Akhtar, Sirajum Munira, Md. Salauddin Khan, Md Monzur Murshed, “Optimized Dickey-Fuller Test Refines Sign and Boundary Problems Compare to Traditional Dickey-Fuller Test”, International Journal of Statistics and Probability, vol. 7(5), pp. 19-27, September 2018.

Md. Salauddin Khan, Masudul Islam, Sajal Adhikary, Md. Murad Hossain, Sohani Afroja, “Analysis and Predictions of Seasonal Affected Weather Variables of Bangladesh: SARIMA Models vs. Traditional Models”, International Journal of Business and Management, vol. 13(11), November 2018.

Masudul Islam, Sohani Afroja, Animesh Biswas, Md. Salauddin Khan, Sara Khandker, “Polytomous Logistic Regression Modelling to Assess Affecting Breastfeeding Duration Factors in Bangladesh”, Family Medicine and Primary Care Review (accepted).