Editing Finding relations between variables in time series
Latest revision | Your text | ||
Line 2: | Line 2: | ||
Most personal science projects require finding relationships between different variables of the type 'time series'<ref>Core-Guide_Longitudinal-Data-Analysis_10-05-17.pdf (duke.edu)</ref>. An example could be the question "does my daily chocolate consumption correlate with my daily focus score?". | Most personal science projects require finding relationships between different variables of the type 'time series'<ref>Core-Guide_Longitudinal-Data-Analysis_10-05-17.pdf (duke.edu)</ref>. An example could be the question "does my daily chocolate consumption correlate with my daily focus score?". | ||
− | You could do experiments if you control everything rigidly or if the effects are strong and quick, like less than a week. Old data may be useable as Baseline | + | You could do experiments if you control everything rigidly or if the effects are strong and quick, like less than a week. Old data may be useable as Baseline. |
Finding more complicated relationships requires better statistical tests and algorithms, and data science skills. Apps that would do this automatically, or at least easily, are not yet available. See below. Most internet resources treat time series as regular cyclical series, which is not useful here because most tracked variables have irregular patterns and no regularly cyclical component at all. | Finding more complicated relationships requires better statistical tests and algorithms, and data science skills. Apps that would do this automatically, or at least easily, are not yet available. See below. Most internet resources treat time series as regular cyclical series, which is not useful here because most tracked variables have irregular patterns and no regularly cyclical component at all. | ||
Line 18: | Line 18: | ||
==== Curedao ==== | ==== Curedao ==== | ||
− | + | github.com/curedao/decentralized-fda Correlation over bins and lags selecting the biggest effect. | |
==== Data Flexor ==== | ==== Data Flexor ==== | ||
Line 43: | Line 43: | ||
==== young.ai and [http://www.aging.ai/ aging.ai] ==== | ==== young.ai and [http://www.aging.ai/ aging.ai] ==== | ||
− | + | aging.ai is a deep learning predictor of age based on human blood tests, and young.ai makes recommendations. | |
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
[[Gyroscope]] | [[Gyroscope]] | ||
Line 64: | Line 58: | ||
Wellness FX | Wellness FX | ||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
== List of very technical tools == | == List of very technical tools == | ||
− | Some people do all the data science by themselves, by using programming languages such as R and Python in notebooks or apps. Coding platforms such as the notebooks on [[Open Humans]], Kaggle | + | Some people do all the data science by themselves, by using programming languages such as R and Python in notebooks or apps. Coding platforms such as the notebooks on [[Open Humans]], Kaggle or GitHub can help. |
Programming languages for statistics: MATLAB, R, Python, Julia. | Programming languages for statistics: MATLAB, R, Python, Julia. | ||
Line 85: | Line 71: | ||
[https://forum.quantifiedself.com/t/my-baseline-network-physiology-10-days-of-eeg-egg-ekg-cgm-temperature-activity-and-food-logs/5671/19 Wavelet coherence] is one potential solution. | [https://forum.quantifiedself.com/t/my-baseline-network-physiology-10-days-of-eeg-egg-ekg-cgm-temperature-activity-and-food-logs/5671/19 Wavelet coherence] is one potential solution. | ||
− | + | [http://www.tylervigen.com/spurious-correlations Spurious Correlations] mostly shows that if two things are both trending in one direction and are checked for correlation, they will show a very significant correlation. The practice effect is one instance of this. Another is when one instance of an event increases the chances of the same event happening soon after. Economists suggest testing for a unit root. | |
− | |||
− | [http://www.tylervigen.com/spurious-correlations Spurious Correlations] mostly shows that if two things are trending in one direction and are checked for correlation they will show a very significant correlation. Practice effect is a subset. Another is one instance of an event | ||
− | |||
− | |||
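The trending problem described above can be illustrated with a small sketch. It uses first differences (day-to-day changes) as one common remedy; that choice, the helper names <code>pearson</code> and <code>diff</code>, and the two made-up series are assumptions for illustration, not something this page prescribes.

```python
from math import sqrt

def pearson(xs, ys):
    """Plain Pearson correlation; returns 0.0 if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)

def diff(xs):
    """First differences: the change from each sample to the next."""
    return [b - a for a, b in zip(xs, xs[1:])]

# Two made-up series that share nothing except an upward trend.
a = [t + (1 if t % 2 == 0 else -1) for t in range(20)]
b = [2 * t + (1 if t % 3 == 0 else -1) for t in range(20)]

raw = pearson(a, b)                    # dominated by the shared trend
detrended = pearson(diff(a), diff(b))  # trend removed, little is left

print(f"raw: {raw:.2f}, after differencing: {detrended:.2f}")
```

The raw correlation comes out high only because both series rise over time; once the trend is removed by differencing, the correlation is close to zero.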
Lag. What if eating pizza on one day causes heartburn the next? | Lag. What if eating pizza on one day causes heartburn the next? | ||
− | Build up. What if it takes two days of eating pizza to cause heartburn? | + | Build up. What if it takes two days of eating pizza to cause heartburn?.. |
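A minimal sketch of the lag and build-up ideas, using the pizza and heartburn example. The data are invented, and the rule used to generate them (heartburn the day after two consecutive pizza days) is an assumption made only so the effect is visible; <code>shift_pairs</code> and <code>rolling_sum</code> are hypothetical helper names.

```python
from math import sqrt

def pearson(xs, ys):
    """Plain Pearson correlation; returns 0.0 if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)

def shift_pairs(x, y, lag):
    """Pair x[t] with y[t + lag], so x may precede y by `lag` days."""
    if lag == 0:
        return list(x), list(y)
    return list(x[:-lag]), list(y[lag:])

def rolling_sum(x, window):
    """Sum of x over the current day and the (window - 1) days before it."""
    return [sum(x[max(0, t - window + 1): t + 1]) for t in range(len(x))]

# Invented daily log: 1 = ate pizza that day.
pizza = [1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0]
# Assumed generating rule: heartburn on the day after two pizza days in a row.
heartburn = [1 if t >= 2 and pizza[t - 1] and pizza[t - 2] else 0
             for t in range(len(pizza))]

same_day = pearson(pizza, heartburn)
next_day = pearson(*shift_pairs(pizza, heartburn, 1))

# Build-up feature: did the last two days (including today) both have pizza?
two_day = [1 if s == 2 else 0 for s in rolling_sum(pizza, 2)]
build_up = pearson(*shift_pairs(two_day, heartburn, 1))

print(same_day, next_day, build_up)
```

In this toy data the same-day correlation is negative, the one-day lag is positive, and only the two-day build-up feature with a one-day lag matches the heartburn days exactly.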
− | + | Few positive instances, but they are important. For example: went to a specific restaurant twice and got sick soon after both times, having only ever been sick with similar symptoms five times. Or two large, rare humps happen almost one after the other; this is similar to the previous example if the humps are treated as events, with the added fact that many samples show that their shapes are similar too. | |
− | |||
− | |||
− | + | Series with different sampling rates need to be interpolated before they can be compared. Window. Removing the effects of other variables makes the effect of the variable of interest stand out, which calls for machine learning. A common approach is to bin predictor variables in multiple ways: by time before the effect being checked, by aggregation method (mean or another aggregator), and by the window over which the aggregator is applied. | |
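The binning approach can be sketched roughly as follows. The function name <code>window_feature</code>, the caffeine log, and the particular lags, windows and aggregators are all hypothetical choices for illustration; the resulting features would then be handed to whatever model is used, which is not shown here.

```python
from statistics import mean

def window_feature(samples, effect_time, lag, window, agg):
    """Aggregate predictor samples that fall in the time bin
    [effect_time - lag - window, effect_time - lag).
    Returns None if the bin contains no samples."""
    start = effect_time - lag - window
    end = effect_time - lag
    values = [v for t, v in samples if start <= t < end]
    return agg(values) if values else None

# Invented predictor log: (hour since start of tracking, caffeine in mg).
caffeine = [(8, 100), (13, 50), (32, 150), (37, 0), (56, 200)]
effect_time = 60  # hour at which the effect (e.g. a focus score) was recorded

# Bin the same predictor several ways: by lag, window length and aggregator.
features = {}
for lag in (0, 12):
    for window in (12, 24):
        for name, agg in (("mean", mean), ("sum", sum), ("max", max)):
            features[(lag, window, name)] = window_feature(
                caffeine, effect_time, lag, window, agg)

for key, value in features.items():
    print(key, value)
```

Each (lag, window, aggregator) combination becomes one candidate predictor column, so a single tracked variable turns into a small family of features.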
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | + | Machine learning also has limits on the kinds of patterns it can detect. | |
− | + | Types of data. [Exercised] is an event with a specific start time and duration, while [tired] is a vaguer value that a user might use to describe how they felt over the past 4 hours. | |
=== What to expect from the complete analysis tool === | === What to expect from the complete analysis tool === | ||
− | A user without experience in statistical analysis will not be able to tell the difference between correctly computed correlations and poorly computed ones. However, a genuinely complete analysis produces | + | A user without experience in statistical analysis will not be able to tell the difference between correctly computed correlations and poorly computed ones. However, a genuinely complete analysis produces graphs, which should include most of the following: | |
Interpolation for irregular time series. | Interpolation for irregular time series. | ||
Line 124: | Line 98: | ||
Cycles decomposition using a model like ARIMA. Ex. kayak season is in the summer or lunch is at exactly 1pm. | Cycles decomposition using a model like ARIMA. Ex. kayak season is in the summer or lunch is at exactly 1pm. | ||
− | Detection of repeated shapes implying similar events that are not cyclical; like dinner is anywhere between 4pm and 10pm and causes a particular 2 hour spike in glucose. | + | Detection of repeated shapes implying similar events that are not cyclical; like dinner is anywhere between 4pm and 10pm and causes a particular 2 hour spike in glucose. |
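As an illustration of the first item in the list above (interpolation for irregular time series), here is a minimal sketch of linear interpolation onto a regular grid. The function name <code>interpolate</code> and the glucose readings are invented, and linear interpolation is only one possible choice; it may not suit every kind of tracked variable.

```python
from bisect import bisect_left

def interpolate(samples, grid):
    """Linearly interpolate irregular (time, value) samples, sorted by time,
    at each time in `grid`. Grid points outside the sampled range give None."""
    times = [t for t, _ in samples]
    out = []
    for g in grid:
        if g < times[0] or g > times[-1]:
            out.append(None)  # do not extrapolate beyond the data
            continue
        i = bisect_left(times, g)  # first sample at or after g
        if times[i] == g:
            out.append(samples[i][1])
            continue
        (t0, v0), (t1, v1) = samples[i - 1], samples[i]
        out.append(v0 + (v1 - v0) * (g - t0) / (t1 - t0))
    return out

# Invented glucose readings taken at irregular times (hour, mg/dL).
glucose = [(0, 90), (1.5, 120), (4, 100), (6, 95)]
hourly = interpolate(glucose, [0, 1, 2, 3, 4, 5, 6, 7])
print(hourly)
```

Once two variables are on the same grid, they can be compared sample by sample, which is what the correlation and binning steps need.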
== References == | == References == | ||
[[Category:Data analysis]] | [[Category:Data analysis]] |