Changes

no edit summary
Line 1: Line 1: −
This page is mainly for DIYers with at least some knowledge of [[Excel]] or other spreadsheet software.
+
{{Topic Infobox}}
 +
This page is mainly for DIYers with at least some knowledge of [[Excel]] or other statistical analysis software. The previous step is choosing a [[:Category:Tools|Tool]] or [[Aggregators|Aggregator]]. Some tools give data straight to the user. This page covers the steps that follow once the data has been received. Check whether the device or app produces correct data soon after first use.
    
==== File Format ====
 
==== File Format ====
Line 5: Line 6:     
==== Structure ====
 
==== Structure ====
Statistical analysis of self tracking data is usually done on tabular data, like spreadsheets, with rows representing individual observations.<ref>https://en.wikipedia.org/wiki/Relational_database</ref> In all but a few cases this is sufficient structure.  
+
Statistical analysis of self tracking data is usually done on tabular data, like spreadsheets, with rows representing individual observations.<ref>https://r4ds.had.co.nz/tidy-data.html</ref><ref>https://en.wikipedia.org/wiki/Relational_database</ref> In all but a few cases this is sufficient structure. See also [[Dates and Times]].  
      Line 18: Line 19:       −
States that are written once and apply until changed to something else. For example, place of residence or whether a brace is being worn continuously. This structure is similar to a simple "event" with just a when-time and what though duration is calculated from replacement.  
+
States that are written once and apply until changed to something else. For example, place of residence or whether a brace is being worn continuously. This structure is similar to a simple "event" with just a 'when-time' and a 'what', though 'duration' is calculated from when the state is changed to something new.
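The duration calculation for states can be sketched as follows. This is a minimal illustration, not a reference implementation: the function name `state_durations`, the brace log, and the `end` cutoff are all hypothetical, and the last state is assumed to run until `end`.

```python
from datetime import datetime

def state_durations(changes, end):
    """Turn (when, what) state changes into (what, start, stop, duration) rows.

    Each state lasts until the next change; the last state is assumed
    to last until `end` (for example, the time of the data export).
    """
    ordered = sorted(changes, key=lambda change: change[0])
    rows = []
    for i, (start, state) in enumerate(ordered):
        stop = ordered[i + 1][0] if i + 1 < len(ordered) else end
        rows.append((state, start, stop, stop - start))
    return rows

# Hypothetical log: whether a brace is being worn.
changes = [
    (datetime(2022, 5, 1, 8, 0), "brace on"),
    (datetime(2022, 5, 3, 20, 0), "brace off"),
    (datetime(2022, 5, 10, 9, 0), "brace on"),
]
rows = state_durations(changes, end=datetime(2022, 5, 12, 9, 0))
for state, start, stop, duration in rows:
    print(state, start.isoformat(), duration)
```

Only the change times are stored; the durations are derived, so correcting one entry automatically corrects the neighbouring durations.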
      Line 24: Line 25:       −
[[Tools for journaling, thoughts and note taking|Journal]] entries and notes.  Often journal entries are written texts describing the day.  
+
[[Tools for journaling, thoughts and note taking|Journal]] entries and notes.  Often journal entries are written texts describing the day.
    
==== Variables ====
 
==== Variables ====
Line 32: Line 33:  
* A variable that depends on its own previous values is not independent; it is called autocorrelated and is often also non-stationary. For example, skill at playing the guitar.

* A variable that depends on its own previous values is not independent; it is called autocorrelated and is often also non-stationary. For example, skill at playing the guitar.
 
* Randomness of Missingness. Similar to independence, but it concerns not the value of the variable but whether other measured variables correlate with a higher incidence of missing values. For example, forgetting to charge the smart band because of tiredness and so spending a night without it on.

* Randomness of Missingness. Similar to independence, but it concerns not the value of the variable but whether other measured variables correlate with a higher incidence of missing values. For example, forgetting to charge the smart band because of tiredness and so spending a night without it on.
* Target. Level. Is this variable something you want to improve, or a variable likely to affect those or just an intermediary background variable measured because it was easy and provided context?
+
* Target. Level. Is this variable something you want to improve, a variable likely to affect such a target, or just an intermediary background variable measured because it was easy and provided context? If this is a target variable, mention the purpose of tracking, such as [[Life extension]], a doctor's recommendation based on [[Lab tests]], or improving performance in [[Sports]].
 
* Similarity. Proxy. Is this variable measuring something very similar to what another variable is measuring? The most common example is [[Tools for heart rate or pulse|heart rate]], as many wearables measure it and the avid self tracker always has a few.

* Similarity. Proxy. Is this variable measuring something very similar to what another variable is measuring? The most common example is [[Tools for heart rate or pulse|heart rate]], as many wearables measure it and the avid self tracker always has a few.
 +
* Sign. Positivity. If the variable is a target, are higher values better, or lower ones? Sometimes a middle value is best, as with BMI.
 +
* Scale and fact of [[Self assessment]]. Whether the variable is anchored to an objective standard, is subjective, or is even relative to the previous measurement. Also mention whether it is a self assessment.
 +
* Is this target variable a measure of a problem, like pain; an accomplishment, like playing guitar better; or both, like a scale of cleverness in conversation?
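Two of the properties above, dependence on previous values and randomness of missingness, can be checked with a few lines of code. The sketch below is only illustrative: the function names, the `tired` and `sleep_hr` fields, and all the numbers are made up, and a missing value is assumed to be stored as `None`.

```python
def lag1_autocorrelation(values):
    """Correlation of a series with itself shifted by one observation."""
    n = len(values)
    mean = sum(values) / n
    den = sum((v - mean) ** 2 for v in values)
    if den == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    num = sum((values[i] - mean) * (values[i + 1] - mean) for i in range(n - 1))
    return num / den

def missing_rate_by_group(records, group_key, value_key):
    """Share of records whose value is missing (None), split by another variable."""
    totals, missing = {}, {}
    for rec in records:
        group = rec[group_key]
        totals[group] = totals.get(group, 0) + 1
        if rec.get(value_key) is None:
            missing[group] = missing.get(group, 0) + 1
    return {group: missing.get(group, 0) / totals[group] for group in totals}

# Hypothetical weekly guitar skill ratings that build on earlier practice.
skill = [1, 2, 3, 4, 5, 6, 7, 8]
r_skill = lag1_autocorrelation(skill)

# Hypothetical nights: was the person tired, and did the band record heart rate?
nights = [
    {"tired": True, "sleep_hr": None},
    {"tired": True, "sleep_hr": None},
    {"tired": True, "sleep_hr": 55},
    {"tired": False, "sleep_hr": 52},
    {"tired": False, "sleep_hr": 54},
    {"tired": False, "sleep_hr": None},
    {"tired": False, "sleep_hr": 53},
]
rates = missing_rate_by_group(nights, "tired", "sleep_hr")
print(r_skill, rates)
```

A lag-1 value well above zero suggests the observations are not independent, and a clearly higher missing rate in one group suggests the values are not missing at random. With so few observations, neither number is conclusive.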
    
==== Data Cleaning ====
 
==== Data Cleaning ====
Check if the device or app produces correct data soon after first use. Correct, remove or impute outliers (very extreme values) produced by errors but not real events. In the rare case that the data is raw sensor like [[Accelerometry]], aggregate it into something more manageable. Consumer wearables make "steps per 10 minutes" for which open source script is likely available. Finally, compare against other data to remove errors like exercising in the middle of sleep. I have not seen a script for this yet. [[User:DG|DG]] ([[User talk:DG|talk]]) 02:34, 29 May 2022 (UTC)
+
Correct, remove, or impute outliers (very extreme values) that were produced by errors rather than real events. In the rare case that the data is raw sensor output, such as [[Accelerometry]], aggregate it into something more manageable. Consumer wearables produce aggregates such as "steps per 10 minutes", for which an open source script is likely available. Finally, compare against other data to remove errors such as exercise recorded in the middle of sleep. I have not seen a script for this yet. [[User:DG|DG]] ([[User talk:DG|talk]])
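As a rough illustration of the last two steps, the sketch below aggregates raw step timestamps into 10-minute bins and then flags bins that overlap a recorded sleep interval. It is not an existing script: the function names, the sample timestamps, and the `max_steps` threshold are assumptions, and flagged bins are meant for manual review, since a few steps at night can be a real event.

```python
from collections import Counter
from datetime import datetime, timedelta

BIN = timedelta(minutes=10)

def floor_to_bin(ts):
    """Round a timestamp down to the start of its 10-minute bin."""
    return ts.replace(minute=ts.minute - ts.minute % 10, second=0, microsecond=0)

def steps_per_bin(step_times):
    """Count step events per 10-minute bin, keyed by bin start time."""
    return Counter(floor_to_bin(ts) for ts in step_times)

def flag_conflicts(bins, sleep_intervals, max_steps=0):
    """Return starts of bins with more than max_steps that overlap sleep."""
    conflicts = []
    for start, count in sorted(bins.items()):
        end = start + BIN
        overlaps_sleep = any(start < wake and end > asleep
                             for asleep, wake in sleep_intervals)
        if count > max_steps and overlaps_sleep:
            conflicts.append(start)
    return conflicts

# Hypothetical raw step events and one recorded sleep interval.
step_times = [
    datetime(2022, 5, 1, 23, 5, 10),
    datetime(2022, 5, 1, 23, 7, 0),
    datetime(2022, 5, 2, 3, 12, 30),
    datetime(2022, 5, 2, 3, 14, 0),
    datetime(2022, 5, 2, 3, 15, 0),
    datetime(2022, 5, 2, 8, 1, 0),
]
sleep_intervals = [(datetime(2022, 5, 1, 23, 30), datetime(2022, 5, 2, 7, 0))]

bins = steps_per_bin(step_times)
conflicts = flag_conflicts(bins, sleep_intervals)
print(conflicts)
```

Raising `max_steps` makes the check tolerate short walks at night and flag only bins that look like sustained activity during sleep.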
    
== References ==
 
== References ==
 
<references />
 
<references />
{{Topic Queries}}
+
 
[[Category:Topics]]
+
[[Category:Data analysis]]
