
Showing posts from May, 2019

Summer - Week 10

This week, I am working to get the dynamic time warping (DTW) functionality into my program. The process includes re-processing the features to include the time series, putting each series back together when we construct sequences, and then performing the DTW to generate a distance that will be used to compute the kNN of each sequence, which can then be used for predictions with the models. The processing time of these steps has gone up significantly, since we have been using five different metrics with each of the F phase datasets. I am returning to school next week. Once I've completed the DTW processing, in whatever time remains before we put together our second paper (the submission date for the research journal we would like to target is October 1), I am hoping to look again into Agglomerative Hierarchical Clustering, which I did not successfully complete when we explored it earlier in the summer before changing focus to the paper. We heard back
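The pipeline described above can be sketched roughly as follows. This is a minimal illustration, not the actual program: the function names are hypothetical, and the DTW distance here is the classic quadratic dynamic-programming formulation applied to 1-D series.

```python
import numpy as np

def dtw_distance(a, b):
    """Classic dynamic time warping distance between two 1-D series."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j],      # insertion
                                 cost[i, j - 1],      # deletion
                                 cost[i - 1, j - 1])  # match
    return cost[n, m]

def knn_by_dtw(query, sequences, k):
    """Indices of the k sequences nearest to `query` under DTW distance."""
    dists = [dtw_distance(query, s) for s in sequences]
    return np.argsort(dists)[:k]
```

The DTW distances play the role that Euclidean distance plays in an ordinary kNN: the `k` indices returned would be the pool of similar sequences fed into the prediction models.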

Week 40

This week, I want to spend some time looking back at what we’ve done thus far and regrouping as I prepare for the summer. I want to clean up my code, tie up loose ends, and see whether there are any unanswered questions that we’ve passed over but that I may have time to review over the summer. This semester, we’ve covered a lot of ground by way of data exploration, but not as much in terms of actual predictions. I want my work going forward to build primarily on what we’ve already done and to focus on obtaining actual results.

Week 39

This week, we are going to merge the “P” approach discussed last week with kNN grouping to generate sets of similar instances that can be used to create a custom regression model for an individual instance. This introduces several new variables: in addition to k and the set of attributes used as nearness indicators, as I discussed in Week 35, there is now the P value. Since we are now using kNN as a means of forming groups, we are able to try out different regressors to generate models from these groups. The models we will use in our experiments will be decision trees, support vector models, random forest models, Bayesian models, and linear regression models. We will use the packages in scikit-learn to easily switch between regressors. We will also vary the subgroupings used as the pool from which the kNN are selected, as we did with the original kNN model. These results will be compared to those obtained by generating a model with the same regressor, subgrouping, and P value.
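A rough sketch of the per-instance scheme described above, assuming a feature matrix `X_train` and targets `y_train` (the function and key names are hypothetical; the regressors shown are the scikit-learn classes closest to the families listed):

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors
from sklearn.tree import DecisionTreeRegressor
from sklearn.svm import SVR
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import BayesianRidge, LinearRegression

# Swappable regressor families (a hypothetical mapping for the experiments).
REGRESSORS = {
    "tree": DecisionTreeRegressor,
    "svm": SVR,
    "forest": RandomForestRegressor,
    "bayes": BayesianRidge,
    "linear": LinearRegression,
}

def predict_with_local_model(X_train, y_train, x_query, k, regressor="linear"):
    """Fit a regressor on the k nearest training instances, predict x_query."""
    nn = NearestNeighbors(n_neighbors=k).fit(X_train)
    _, idx = nn.kneighbors([x_query])          # indices of the k neighbors
    neighbors = idx[0]
    model = REGRESSORS[regressor]()            # fresh model per query instance
    model.fit(X_train[neighbors], y_train[neighbors])
    return model.predict([x_query])[0]
```

Because each regressor follows the same `fit`/`predict` interface, switching families is a one-key change, which is what makes scikit-learn convenient for this kind of comparison.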

Week 38

With our paper out of the way, I began looking at making predictions based on data from multiple days preceding an instance. This would be helpful because it would give us more features and take into account more than only the two periods preceding the period to predict. We can then compare the effectiveness of this approach to the initial one; if we find that epochs with more historical data generate more accurate results, we can conclude that we should look further into the patterns leading up to the period we want to predict. Additionally, I want to add a new feature to each epoch called “DAYS_SINCE_EVENT”, which contains the number of days that have passed since the patient had either a TBI or stroke. We will refer to the number of periods before an instance as P. So P = 1 will be a set of epochs containing only features from a nighttime period being used to predict the following daytime activity or features from a daytime period being used to predict
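Building epochs with P preceding periods and the DAYS_SINCE_EVENT feature could look something like this pandas sketch. The column names (`PATIENT_ID`, `EPOCH_DATE`, `EVENT_DATE`) and the function itself are assumptions for illustration, not the actual dataset schema:

```python
import pandas as pd

def add_lagged_features(df, feature_cols, P):
    """Append features from the P preceding periods to each epoch.

    Assumes df is sorted by patient and period; column names are hypothetical.
    """
    out = df.copy()
    for lag in range(1, P + 1):
        for col in feature_cols:
            # Shift within each patient so lags never cross patient boundaries.
            out[f"{col}_LAG{lag}"] = out.groupby("PATIENT_ID")[col].shift(lag)
    # Days elapsed since the patient's TBI or stroke, as a per-epoch feature.
    out["DAYS_SINCE_EVENT"] = (out["EPOCH_DATE"] - out["EVENT_DATE"]).dt.days
    # Epochs without a full P-period history are dropped.
    return out.dropna(subset=[f"{c}_LAG{P}" for c in feature_cols])
```

Raising P trades away the earliest epochs of each patient (which lack a full history) in exchange for a richer feature set per remaining epoch.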