Talk:Finding relations between variables in time series
Categorization
Currently this page is sorted as a tool, though it's rather a meta-article. I wonder whether it would make more sense to file it under Topic instead, but would love to hear second opinions on this! - Gedankenstuecke (talk) 12:53, 30 November 2021 (UTC)
- If you think that's better then do it. DG (talk)
- I've moved it to a topic page and also restructured/renamed the page slightly to fit into the topic dimension. - Gedankenstuecke (talk) 08:56, 2 December 2021 (UTC)
todo suggestion
mp April 7th at 1:51 PM: Oh, an additional thought post-chat. I think a "fast up / slow down" pattern would be reflected in a Markov probability distribution where any given number has a low probability of later numbers being higher than it. E.g. if the prior value was 7, then 7-or-lower are likely (gradual decline), but 8+ very unlikely. I was struggling to remember the language here, but I think a "first-order" Markov model is one where the probability distribution at any given step depends only on the previous step (with no further "memory" in the system). A "second-order" model is influenced by the two previous steps (a bit more memory).
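The "fast up / slow down" idea above can be sketched as a first-order Markov chain. This is a minimal illustration only: the ten discrete levels, the `up_weight` parameter, and the decline rule are made-up assumptions, not anything from the thread.

```python
import numpy as np

rng = np.random.default_rng(0)
states = np.arange(10)  # hypothetical discrete levels 0..9

def transition_probs(s, up_weight=0.05):
    """From state s, lower-or-equal states get high probability (gradual
    decline); higher states get a small probability (rare jumps up)."""
    w = np.where(states <= s, 1.0, up_weight)
    return w / w.sum()

def simulate(n_steps, start=7):
    """Walk the chain: each step depends only on the previous state."""
    path = [start]
    for _ in range(n_steps):
        path.append(int(rng.choice(states, p=transition_probs(path[-1]))))
    return path

path = simulate(50)
```

A second-order version would condition `transition_probs` on the last two states instead of one.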
First! Carefully compare variance within days to variance between days. If the within-day variance is too high, the variable has too poor "distinctness". Can also look for "stability" in derivatives, and between any measured timescales such as 12 hours or a month... or maybe flag sequences that are strongly unlikely under a Markov model? Also, maybe first check for multimodality, outliers, and anomalies.
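The within-day vs. between-day variance comparison might look like this rough sketch. The data is simulated; the 30-days-by-24-readings layout and the noise levels are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical data: 30 days x 24 hourly readings of one variable.
day_means = rng.normal(50, 5, size=30)[:, None]      # day-level means differ
data = day_means + rng.normal(0, 1, size=(30, 24))   # hourly noise within a day

within_day_var = data.var(axis=1, ddof=1).mean()     # average variance inside a day
between_day_var = data.mean(axis=1).var(ddof=1)      # variance of the daily means

# If within-day variance dominates, the daily value has poor "distinctness":
# day-to-day comparisons are mostly noise.
ratio = between_day_var / within_day_var
```

Here `ratio` comes out well above 1, so daily means carry real signal; real data where the ratio is near or below 1 would fail the "distinctness" check described above.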
potential sources of solutions
Convert the time series into a bunch of splines? Especially MARS! https://www.google.com/search?client=firefox-b-1-d&q=splines+time+series
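MARS itself needs a dedicated package, but a plain smoothing spline from SciPy illustrates the basic idea of replacing a noisy series with a smooth functional form. The sine-plus-noise series and the smoothing factor `s` are assumptions for illustration.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

rng = np.random.default_rng(2)
t = np.arange(100, dtype=float)                    # e.g. days
y = np.sin(t / 10) + rng.normal(0, 0.2, size=100)  # hypothetical noisy trend

# Smoothing spline; s trades off fidelity against smoothness
# (rule of thumb: number of points times the noise variance).
spline = UnivariateSpline(t, y, s=len(t) * 0.2**2)
smooth = spline(t)
residuals = y - smooth
```

The residuals (what the smooth trend does not explain) are often the more interesting object for anomaly or relationship hunting.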
https://stats.stackexchange.com/questions/tagged/time-series Wow, really so much I forgot! Anomalies and events and periodicity! Mentions QS: https://stats.stackexchange.com/questions/17623/how-to-detect-a-significant-change-in-time-series-data-due-to-a-policy-change/17661
https://hermandevries.nl/2020/09/23/relationships-between-hrv-sleep-and-physical-activity-in-personal-data/ suggested by Gedankenstuecke
http://beautifuldata.net/2015/01/how-to-analyze-smartphone-sensor-data-with-r-and-the-breakoutdetection-package/#comment-37605 for raw sensor data.
https://www.nature.com/articles/s41398-021-01445-0 personalized time series machine learning. Fairly commonly recommended procedures for data scientists. I suspect faults from not taking into account issues specific to time series; no mention of unit roots, for example. "Analytics code is available upon request from the corresponding author."
https://forum.quantifiedself.com/t/interventions-to-improve-sleep/9599/15 just a linear lasso, but lags and other issues are discussed.
https://github.com/fasiha/ebisu#the-math intense math for flashcard prediction and timing adjustment!
https://www.physiq.com/ "physIQ is the only company that uses FDA-cleared, AI-based analytics to “learn” and detect even the most subtle changes in an individual’s own unique physiology 24/7."
https://play.google.com/store/apps/details?id=edu.brown.selfe&hl=en_CA&referrer=utm_source%3Dgoogle%26utm_medium%3Dorganic%26utm_term%3D%22self-e%22+app formal self-experiment support app from Brown. Definitely not advanced analytics, but still.
Do not forget that https://en.wikipedia.org/wiki/Multimodal_distribution is also a source.
https://correlaid.org/en/ where to headhunt data scientists
other potential source
https://www.microsoft.com/en-us/research/group/alice/ just let microsoft do it. https://econml.azurewebsites.net/spec/motivation.html
https://github.com/gianlucatruda/quantified-sleep analysis before designing an intervention
https://ml4qs.org/ hoogendoorn and funke
To test algorithms, generate data: https://old.reddit.com/r/rstats/comments/nhenrm/recommend_r_packages_to_generate_data/ but is it time series? https://www.cs.cmu.edu/afs/cs/project/jair/pub/volume13/cheng00a-html/node15.html
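For actual time-series test data (as opposed to generic tabular data), a simple AR(1) generator is often enough to exercise an algorithm. This sketch is independent of the linked R packages; `phi` and the series length are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(5)

def ar1(n, phi=0.7, sigma=1.0):
    """Generate an AR(1) series: x[t] = phi * x[t-1] + noise."""
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + rng.normal(0, sigma)
    return x

series = ar1(365)  # a year of hypothetical daily values

# Sanity check: the lag-1 autocorrelation should land near phi.
acf1 = np.corrcoef(series[:-1], series[1:])[0, 1]
```

Because you know the generating process, you can verify that whatever relation-finding method you test recovers it before trusting the method on real data.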
extra... https://forum.quantifiedself.com/search?q=matlab https://www.google.com/search?q=non+independence+of+observations https://www.google.com/search?q=time+series+distributions https://www.google.com/search?q=time+series+kernel+binning
maybe ask here again https://old.reddit.com/r/AskStatistics/ maybe datasets https://old.reddit.com/r/datasets/search?q=time+health+subreddit%3Adatasets&include_over_18=on&sort=relevance&t=all
correlation analysis https://www.frontiersin.org/articles/10.3389/fdgth.2020.00003/full
https://www.sciencedaily.com/releases/2021/11/211124154126.htm What was their meta-analysis? It might be relevant for individual analysis, not just meta-analysis.
even more sources
From, and more at: https://arxiv-sanity-lite.com/?q=health+time&rank=search&tags=&pid=&time_filter=&svm_c=0.01&skip_have=no&page_number=3
https://arxiv.org/abs/2206.08178 just survival analysis https://arxiv.org/abs/2205.11680 also EHR
https://arxiv.org/abs/2206.09107 time series EHR rare binary features
https://arxiv.org/abs/2206.11505 possibly automatic generation
https://arxiv.org/abs/2207.06414 ER time series, "robustness of this approach", interpretable DEEP learning; irregular time intervals, EHR, long-term dependencies and short-term correlations
https://arxiv.org/abs/2206.12414 !!!? marked temporal point processes DEEP missing events
https://arxiv.org/abs/2207.04305 https://arxiv.org/abs/2207.04308 DEEP
https://arxiv.org/abs/2207.08159 !!!!! Gaussian mixture model, autoencoders, similarities among different time series, distance metric
https://arxiv.org/abs/2004.02319 ! anomaly detection
https://arxiv.org/abs/2107.03502 Time Series Imputation autoregressive models score-based diffusion models
https://arxiv.org/abs/2110.05357 !! irregular sampling, graph neural network, dynamics of sensors purely from observational data, classify time series, healthcare
https://arxiv.org/abs/2204.00961 ... ? LSTM DEEP REINFORCEMENT, recommends exercise routines for user-specific needs. https://arxiv.org/abs/2106.03211 extreme events, RNN, S&P 500 stocks. https://arxiv.org/abs/2107.05489 !!! ML for time series, LSTM... walk-forward algorithm that also calculates point-wise confidence intervals for the predictions
https://arxiv.org/abs/2108.13461 !!!!!!! healthcare predictive analytics, DEEP ?feature selection is not an issue? " feature engineering to capture the sequential nature of patient data, which may not adequately leverage the temporal patterns" " representations of key factors (e.g., medical concepts or patients) and their interactions from high-dimensional raw" summarises key research streams
https://arxiv.org/abs/2204.13451 EHR predicting "The common time-series representation is indirect in extracting such information from EHR because it focuses on detailed dependencies between values in successive observations, not cumulative information. "
https://arxiv.org/abs/2205.15598 !!! Disease prediction with ML. heterogeneity complex factors at the individual level. phase diagram
https://old.reddit.com/r/QuantifiedSelf/comments/wfuy03/personalized_digital_health_and_medicine_at_jsm/ "the g-formula (i.e., standardization, back-door adjustment) under serial interference. It estimates stable recurring effects, as is done in n-of-1 trials and single case experimental designs. We compare our approach to standard methods (with possible confounding) to show how to use causal inference to make better personalized recommendations for health behavior change, "
https://www.gwern.net/Replication on the replication crisis; an easy, nice read. Since this will eventually involve data dredging and many statistical tests, it is useful.
https://old.reddit.com/r/wearables/comments/xmn06r/using_wearables_and_apps_to_characterize_your_own/ "Well, the experimental design of n-of-1 trials and SCEDs actually checks for causation, not just correlation. This is why randomized controlled trials (RCTs) in clinical research are a gold-standard technique for figuring out if a new intervention or treatment actually works. “Flipping the coin” in a way balances everything else that might confuse or “confound” the way the treatment might impact the health-related outcome."
https://www.lesswrong.com/posts/9kNxhKWvixtKW5anS/you-are-not-measuring-what-you-think-you-are-measuring Two takeaways: you are not measuring what you think you are measuring; but with enough data sources and types of measurement, you may find out what you actually are measuring.
Also Eric J. Daza's papers.
funny on this stat analysis
Quick write I made here for later.
Collection is really just a matter of finding the right devices and taking the time to use them. Analysis beyond an immediate, obvious effect can become difficult:
- If the effect is subtle and drowned in other effects, or hard to measure.
- If the intervention is not something the user can easily reproduce, or wants to reproduce.
- If the effect takes a long time to build up, or is shifted in time from the intervention.
- If the successful effect only happens under several conditions, or several interventions together.
- If the spray-and-pray approach is dangerous.
- If the spray-and-pray approach only hits gold once in a while.
- The multiple comparison problem (see Wikipedia).
- If the user is bad at keeping records.
There are probably more. There are many, many apps that just do correlation and none that do anything more. Here is a list of both problems and apps.
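The multiple comparison problem mentioned above can be handled with, e.g., the Benjamini-Hochberg procedure. A minimal sketch, assuming a hypothetical screen of 20 pure-noise variables against one outcome; the sample sizes are arbitrary.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)

# Hypothetical screen: 20 unrelated variables tested against one outcome.
n_vars, n_obs = 20, 50
outcome = rng.normal(size=n_obs)
p_values = np.array([
    stats.pearsonr(rng.normal(size=n_obs), outcome)[1]
    for _ in range(n_vars)
])

def bh_reject(p, alpha=0.05):
    """Benjamini-Hochberg step-up: reject the k smallest p-values, where k is
    the largest rank with p_(k) <= (k / m) * alpha."""
    m = len(p)
    order = np.argsort(p)
    thresholds = (np.arange(1, m + 1) / m) * alpha
    passed = p[order] <= thresholds
    k = np.max(np.nonzero(passed)[0]) + 1 if passed.any() else 0
    reject = np.zeros(m, dtype=bool)
    reject[order[:k]] = True
    return reject

significant = bh_reject(p_values)
```

With pure noise, a raw p < 0.05 cutoff is expected to flag about one variable in twenty; the corrected procedure flags fewer, which is the point.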
new section: single variable validity
How to prove that what you are measuring really is what you are trying to measure, a.k.a. construct validation.
Quick way: compare to a scientifically validated standard.
Also consider en.wikipedia.org/wiki/Convergent_validity (many tests all agree more or less) and "divergent validity" (they do not correlate with things that they should not correlate with).
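A minimal numeric sketch of the convergent/divergent check, using simulated data (the "sleep hours" scenario, the two device noise levels, and the unrelated variable are all assumptions):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 100
true_sleep = rng.normal(7, 1, size=n)  # hypothetical nightly sleep hours

# Two devices measuring the same construct, plus an unrelated variable.
device_a = true_sleep + rng.normal(0, 0.3, size=n)
device_b = true_sleep + rng.normal(0, 0.5, size=n)
unrelated = rng.normal(size=n)

r_convergent = np.corrcoef(device_a, device_b)[0, 1]  # should be high
r_divergent = np.corrcoef(device_a, unrelated)[0, 1]  # should be near zero
```

High convergent correlation alone is not enough; a measure that also correlates strongly with things it should not is measuring something else too.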
en.wikipedia.org/wiki/Nomological_network: several constructs and their relationships to each other, such as ageing causing memory loss.