
Summer - Week 10

This week, I am working to get the dynamic time warping (DTW) functionality into my program. This involves re-processing the features to include the time series, putting each series back together when we construct sequences, and then performing DTW to generate a distance that is used to find the k nearest neighbors (kNN) of each sequence, which can then be used for predictions with the models. The processing time of these steps has gone up significantly since we are now using five different metrics with each of the F phase datasets. I am returning to school next week, and the DTW processing is the last piece that remains before we put together our second paper (the deadline for the reach journal we would like to submit it to is October 1). Once it is done, I am hoping I will have time to look again into the Agglomerative Hierarchical Clustering concept, which I did not successfully complete when we explored it earlier in the summer before changing focus to the paper. We heard back
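The DTW-then-kNN step described above can be sketched roughly as follows. This is only a minimal illustration, not the actual pipeline: the function names `dtw_distance` and `knn_predict`, the absolute-difference local cost, and the averaging of neighbor targets are all assumptions made for the example.

```python
import math


def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) DTW distance between two 1-D series.

    Assumption: the local cost is the absolute difference between points.
    """
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # a advances, b stays on the same point
                cost[i][j - 1],      # b advances, a stays on the same point
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def knn_predict(query, train_series, train_targets, k=3):
    """Average the targets of the k training series closest to `query` under DTW."""
    ranked = sorted(
        range(len(train_series)),
        key=lambda i: dtw_distance(query, train_series[i]),
    )
    nearest = ranked[:k]
    return sum(train_targets[i] for i in nearest) / len(nearest)
```

Because every query is compared against every training series, and each comparison is quadratic in the series length, the cost grows quickly as more metrics and datasets are added, which is consistent with the slowdown noted above.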

Week 34

My exploration of inactive minutes vs. Dr. Skornyakov's sleep minutes yielded interesting results. For the overwhelming majority of patients, I found the correlation between the two features to be above 0.8, indicating a strong relationship. However, three patients' correlations fell significantly lower, with one being only 0.18. Sleep minutes and inactive minutes did not differ significantly in how strongly they correlated with the previous or next day's active minutes. In all but three cases, the inactive minute count was, on average, less than the number of sleep minutes.
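A per-patient comparison like the one above could be computed along these lines. This is a sketch under assumptions: the choice of Pearson's coefficient, the `records` layout (patient id mapped to paired lists of nightly inactive and sleep minutes), and the function names are hypothetical, and no real study data is shown.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def low_correlation_patients(records, threshold=0.8):
    """Return {patient_id: r} for patients whose correlation is below `threshold`.

    `records` maps a patient id to (inactive_minutes, sleep_minutes),
    two lists aligned by night (hypothetical layout).
    """
    flagged = {}
    for patient_id, (inactive, sleep) in records.items():
        r = pearson(inactive, sleep)
        if r < threshold:
            flagged[patient_id] = r
    return flagged
```

Flagging the patients below the threshold makes it easy to inspect the outliers individually rather than only looking at the overall average.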

Given these results, my plan is to keep using both features and see how the accuracy of predictions varies for each. As the semester comes to a close, I am going to be working on developing a kNN regression model to predict the number of active minutes for a day. I am going to calculate the accuracy of these predictions using combinations of: sleep minutes (previous night), active minutes (previous day), inactive minutes (previous night), transitions (previous night), longest sleep bout (previous night) and sleep onset latency (previous night). I will generate all possible attribute sets and remove those that contain both inactive minutes and sleep minutes. I will then find the average mean squared error for each of the attribute sets and record my findings on which attributes appear to be the most significant in predicting a day's activity.
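The attribute-set search described above might look something like the sketch below. The attribute key names, the Euclidean distance on unscaled features, and the leave-one-out evaluation are assumptions chosen for illustration; the actual evaluation scheme may differ.

```python
from itertools import combinations

# Hypothetical key names for the six attributes listed above.
ATTRIBUTES = [
    "sleep_minutes",
    "active_minutes_prev_day",
    "inactive_minutes",
    "transitions",
    "longest_sleep_bout",
    "sleep_onset_latency",
]


def candidate_attribute_sets(attributes=ATTRIBUTES):
    """All non-empty subsets, skipping any with both sleep and inactive minutes."""
    kept = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(attributes, size):
            if "sleep_minutes" in combo and "inactive_minutes" in combo:
                continue
            kept.append(combo)
    return kept


def knn_regress(train_rows, train_targets, query, attrs, k=3):
    """Predict by averaging the targets of the k nearest rows (Euclidean distance)."""
    def dist(row):
        return sum((row[a] - query[a]) ** 2 for a in attrs) ** 0.5

    order = sorted(range(len(train_rows)), key=lambda i: dist(train_rows[i]))
    nearest = order[:k]
    return sum(train_targets[i] for i in nearest) / len(nearest)


def loo_mse(rows, targets, attrs, k=3):
    """Leave-one-out mean squared error for one attribute set."""
    squared_errors = []
    for i in range(len(rows)):
        train_rows = rows[:i] + rows[i + 1:]
        train_targets = targets[:i] + targets[i + 1:]
        pred = knn_regress(train_rows, train_targets, rows[i], attrs, k)
        squared_errors.append((pred - targets[i]) ** 2)
    return sum(squared_errors) / len(squared_errors)
```

With six attributes there are 63 non-empty subsets, 16 of which contain both sleep minutes and inactive minutes, so 47 candidate sets remain. Since the attributes are on different scales, normalizing each one before computing distances would likely be worthwhile in practice.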

Time permitting, I am also going to look at the difference between how accurately each light group (red light, blue light) was classified.
