Difference between revisions of "Dates and Times"

 
(14 intermediate revisions by 6 users not shown)

Latest revision as of 17:48, 28 February 2023

Linked pages on this wiki: Tools (1), Projects (1), People (0)

Virtually all data one might want to collect as part of doing personal science has some form of timestamp associated with it, that is, the moment when the data was collected. Unfortunately, handling dates and times isn't as straightforward as one might assume. The challenge of doing so correctly is often only half-jokingly called "one of computer science's hardest problems". Why are dates and times difficult to handle? There are many reasons (see https://www.zainrizvi.io/blog/falsehoods-programmers-believe-about-time-zones/), including traveling across time zones and daylight saving time.

Users interested in this topic (add your name to the list below!)

Gedankenstuecke (talk)

Common Date/Time issues

Time Zone Changes

Many issues with properly recording the dates and times of observations come down to changes of the clock. While such changes are commonly associated with travel, this is not the only cause, as many countries change their clocks twice a year to move to and from daylight saving time.

Unfortunately, many commercial tools used for personal science, including a lot of wearables, do not properly record the time zone in which data was recorded. Instead, they use one of two approaches, each with its own drawbacks:

  1. Saving all data in UTC
  2. Saving all data in the local time of recording

Data in UTC

A commonly used strategy is to save all observations in Coordinated Universal Time (UTC). UTC itself never switches to daylight saving time; it matches Greenwich Mean Time, the civil time used in the UK outside of daylight saving time. Typically the data is then converted to the current local time when displayed, e.g. in apps or data visualizations.

While this makes the storage and calculation of correct times relatively straightforward, it becomes impossible to reliably calculate the actual local time at which the data was collected once changes in time zones come into play.

For example, in the simplest scenario, data recorded during daylight saving time but displayed while standard time is in effect (and vice versa) will be off by one hour, which can affect roughly half of your data. Things get even trickier if you travel or move between time zones. In all of these cases, the only way to reliably calculate the local time is to know the geographic location of each observation. If you have reliable geolocation information for yourself, then consistently storing all data in UTC might be the easiest approach, as you don't have to handle different time zones when merging data.
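The one-hour shift described above can be illustrated with a minimal Python sketch. The date, the two fixed offsets (UTC+2 for daylight saving time and UTC+1 for standard time, as used in Central Europe) and all variable names are assumptions chosen for illustration; a real project would more likely look up the offset from a time zone database.

```python
from datetime import datetime, timedelta, timezone

# Assumed fixed offsets for a Central European location (illustrative only).
CEST = timezone(timedelta(hours=2), "CEST")  # daylight saving time
CET = timezone(timedelta(hours=1), "CET")    # standard time

# An observation stored in UTC, recorded on a summer morning.
stored_utc = datetime(2022, 7, 15, 6, 30, tzinfo=timezone.utc)

# Local wall-clock time at the moment the observation was made.
local_at_recording = stored_utc.astimezone(CEST)

# What is shown if the offset in effect at *display* time (winter) is applied.
local_as_displayed = stored_utc.astimezone(CET)

print(local_at_recording.strftime("%H:%M"))  # 08:30
print(local_as_displayed.strftime("%H:%M"))  # 07:30
```

Both values refer to the same instant, but the displayed wall-clock time is one hour earlier than the time that was actually experienced.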

Data in local time

The problem of changes in time zones can be reduced by storing each observation in the local time at which it was experienced. Many tools record the local time but unfortunately omit the actual time zone. This avoids most of the problematic cases that result from solely storing UTC data, but is still not ideal. A simple example: when switching from daylight saving time to standard time, one hour is effectively duplicated, and without time zone information there is no way to tell the two occurrences apart. The problem becomes even more pronounced when traveling through a larger number of time zones.
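The duplicated hour can be reproduced with a short Python sketch. It assumes the switch from daylight saving time to standard time in Central Europe on 30 October 2022 and uses fixed offsets; the variable names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumed fixed offsets for Central Europe (illustrative only).
CEST = timezone(timedelta(hours=2))  # daylight saving time
CET = timezone(timedelta(hours=1))   # standard time

# Two observations taken one hour apart, before and after the clock change.
first_local = datetime(2022, 10, 30, 0, 30, tzinfo=timezone.utc).astimezone(CEST)
second_local = datetime(2022, 10, 30, 1, 30, tzinfo=timezone.utc).astimezone(CET)

# Dropping the offset mimics a tool that stores local time only.
first_naive = first_local.replace(tzinfo=None)
second_naive = second_local.replace(tzinfo=None)

print(first_naive == second_naive)  # True: both read 02:30, indistinguishable
print(first_local.isoformat())      # 2022-10-30T02:30:00+02:00
print(second_local.isoformat())     # 2022-10-30T02:30:00+01:00
```

With the offset kept, the two timestamps remain distinct and their true distance of one hour can still be computed.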

Data in local time with timezone information

Both of the problems mentioned above can be avoided when observations are timestamped not only with the local time, but also with the offset from UTC, or the time zone, in effect when they were experienced. Some frequently used tools for personal science (such as the Apple Watch/Apple Health) store the data in this way, but this seems to be the exception rather than the rule.

If you are designing your own data collection protocol for your personal science project and you expect changes in time zones (either due to travel or daylight saving time), a good rule of thumb can be to collect the local time alongside the time zone information. One problem with this strategy is that the time zone information can be faulty or get lost along the way during data processing, so one needs to be cautious when using this approach.
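One possible way to follow this rule of thumb is sketched below in Python. The record layout (`value`, `timestamp`, `zone`), the function names and the example zone name are assumptions for illustration, not a format used by any particular tool. The timestamp is stored as an ISO 8601 string including the UTC offset, and both functions refuse timestamps whose offset is missing, which guards against time zone information silently getting lost.

```python
from datetime import datetime, timedelta, timezone


def make_record(value, local_time, zone_name):
    """Build a record that keeps local time, UTC offset and zone name."""
    if local_time.utcoffset() is None:
        raise ValueError("timestamp has no UTC offset")
    return {
        "value": value,
        "timestamp": local_time.isoformat(),  # e.g. 2022-07-15T08:30:00+02:00
        "zone": zone_name,                    # kept as plain text for reference
    }


def to_utc(record):
    """Recover the exact instant of a record as a UTC datetime."""
    parsed = datetime.fromisoformat(record["timestamp"])
    if parsed.utcoffset() is None:
        raise ValueError("stored timestamp lost its UTC offset")
    return parsed.astimezone(timezone.utc)


# Hypothetical heart rate reading taken at 08:30 local time, UTC+2.
local = datetime(2022, 7, 15, 8, 30, tzinfo=timezone(timedelta(hours=2)))
record = make_record(72, local, "Europe/Berlin")

print(record["timestamp"])  # 2022-07-15T08:30:00+02:00
print(to_utc(record))       # 2022-07-15 06:30:00+00:00
```

Keeping the offset inside the timestamp string means the local wall-clock time and the exact instant can both be recovered from a single field.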

What is a day?

Boundaries of days are another common issue when it comes to handling dates and times for personal science. While our calendar and clock progress to the next day every midnight, for many people this does not accurately reflect how they think of a single day. Instead, a day is experienced as the time period between waking up and going to sleep again. Despite this, many tools, in particular wearable devices (such as Apple Watch and Fitbits), enforce a day break at midnight, meaning that e.g. activity, heart rate and other variables recorded after midnight (but before going to sleep) are already attributed to the new day.

At the same time, not all devices default to the calendar-based standard: for example, the Oura Ring devices provide daily aggregate metrics which measure a day as the time period between sleep periods. This means one needs to be cautious when comparing daily summary statistics from different sources, as they might define a day in different ways.
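When aggregating raw data yourself, one simple way to approximate an experienced day is to move the day boundary from midnight to a fixed early-morning hour. The Python sketch below assumes a cutoff of 04:00 local time; the cutoff and the function name are hypothetical choices, and the approach does not reproduce how any particular device defines a day.

```python
from datetime import date, datetime, timedelta


def experienced_day(local_time: datetime, cutoff_hour: int = 4) -> date:
    """Assign a local timestamp to a day that starts at cutoff_hour."""
    # Shifting the clock back by the cutoff moves the boundary off midnight.
    return (local_time - timedelta(hours=cutoff_hour)).date()


late_night = datetime(2023, 3, 2, 0, 45)   # still awake shortly after midnight
next_morning = datetime(2023, 3, 2, 9, 0)  # after waking up

print(experienced_day(late_night))    # 2023-03-01
print(experienced_day(next_morning))  # 2023-03-02
```

A fixed cutoff is only a rough stand-in for the period between sleep periods; if sleep times are available, they could be used as boundaries instead.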

Which day is it?

Not only are the boundaries of days problematic: for activities which typically span day boundaries, one might take different approaches regarding which day the data should be associated with. Tracking sleep is a typical example of this: if you go to sleep before midnight on day X and wake up on day X+1, with which date do you want to associate the summary statistics of your night's sleep?

Some tools (e.g. Fitbit) will choose day X+1 as the day with which to associate the sleep, i.e. the day on which you wake up. Other tools (e.g. the Oura Ring) will choose day X as the day for the sleep summary statistics, i.e. the day you went to sleep. Interestingly, this even happens if you go to sleep after midnight.

It can be important to understand this distinction both when comparing different sleep metrics to each other, so that data from the correct dates are compared, and when trying to understand how sleep might affect other variables. Not knowing for which day the sleep is recorded can easily result in an off-by-one error.
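The two labelling conventions can be written down as a minimal Python sketch, which also shows the resulting one-day difference for the same night. The function names are hypothetical, and the sketch only restates the conventions described above rather than the actual logic of any device.

```python
from datetime import date, datetime, timedelta


def label_by_wake_day(sleep_end: datetime) -> date:
    """Convention 1: the night belongs to the day on which you wake up."""
    return sleep_end.date()


def label_by_previous_day(sleep_end: datetime) -> date:
    """Convention 2: the night belongs to the day before waking up."""
    return sleep_end.date() - timedelta(days=1)


# One night of sleep, from 23:15 on day X to 07:00 on day X+1.
sleep_end = datetime(2023, 3, 2, 7, 0)

print(label_by_wake_day(sleep_end))      # 2023-03-02
print(label_by_previous_day(sleep_end))  # 2023-03-01
```

Before merging sleep summaries from two sources that follow different conventions, the date labels of one source would need to be shifted by one day so that the same night ends up on the same row.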