Dates and Times
Virtually all data one might want to collect as part of doing personal science has some form of timestamp associated with it, that is, the moment when the data was collected. Unfortunately, handling dates and times isn't as straightforward as one might assume. The challenge of doing so correctly is often only half-jokingly called "one of computer science's hardest problems". Why are dates and times difficult to catalog? There are many reasons, including traveling across time zones and daylight saving time.
Common Date/Time issues
Time Zone Changes
Many issues around properly recording the dates & times for observations center on changes of the clock. While such changes are commonly associated with travel, travel is not the only cause, as many countries change their clocks twice a year to move from/to daylight saving time.
Unfortunately many commercial tools used for personal science, such as a lot of wearables, do not properly record the time zones in which data was recorded, but rather use one of two different approaches for this, which have their own drawbacks:
- Saving all data in UTC time
- Saving all data in local time of recording
Data in UTC
An often used strategy is to save all observations in Coordinated Universal Time (UTC), which for practical purposes matches Greenwich Mean Time, the time used in the UK outside of daylight saving time. Typically the data is then converted to the current local time when displayed in e.g. apps or data visualizations.
While this makes the storage and calculation of correct times relatively straightforward, it becomes impossible to reliably calculate the actual local time at which the data was collected once changes in time zones come into play.
In the simplest scenario, data recorded during daylight saving time but displayed during standard time (or vice versa) will be off by one hour, which can affect half of your data. Things get even more tricky if you travel or move between time zones. For all of these cases the only way to reliably calculate the local time is by knowing the geographic location for each observation. If you have reliable geolocation information for yourself, then consistently storing all data in UTC might be the easiest approach, as you don't have to handle different time zones when merging data.
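The one-hour shift can be reproduced with a minimal Python sketch. The two fixed offsets (`SUMMER`, `WINTER`) and the example reading are assumptions made for illustration; real code would usually look the offset up in a time zone database based on the location of each observation.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical fixed offsets for one location: UTC+2 in summer, UTC+1 in winter.
SUMMER = timezone(timedelta(hours=2))
WINTER = timezone(timedelta(hours=1))

# A reading taken at 08:00 local time in summer, stored as UTC (06:00).
stored_utc = datetime(2023, 7, 1, 8, 0, tzinfo=SUMMER).astimezone(timezone.utc)

# Converting with the offset in effect at recording time recovers 08:00.
shown_with_recording_offset = stored_utc.astimezone(SUMMER)

# Converting with the offset in effect when viewing in winter shows 07:00.
shown_with_viewing_offset = stored_utc.astimezone(WINTER)

print(shown_with_recording_offset.strftime("%H:%M"))  # 08:00
print(shown_with_viewing_offset.strftime("%H:%M"))    # 07:00
```

Both displayed values refer to the same instant; only the first matches the wall-clock time the reading was actually experienced at.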
Data in local time
The problem of changes in time zones can be reduced by storing each observation/timepoint in the local time at which it was experienced. Many tools record the local time, but unfortunately omit the actual time zone. This avoids many of the problematic cases that result from solely storing UTC data, but is still not ideal. A simple example: When switching from daylight saving time to standard time, one hour is effectively duplicated, and without time zone information there is no way to tell the two occurrences of that hour apart. The problem becomes even more pronounced when traveling through a larger number of time zones.
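The duplicated hour can be illustrated with a short Python sketch. The offsets and the date of the clock change are assumptions chosen for the example, and the final `replace(tzinfo=None)` step mimics a device that logs the wall-clock time only.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical offsets before and after an autumn clock change.
DST = timezone(timedelta(hours=2))
STANDARD = timezone(timedelta(hours=1))

# Two distinct instants, one hour apart in UTC.
first = datetime(2023, 10, 29, 0, 30, tzinfo=timezone.utc)
second = first + timedelta(hours=1)

# Wall-clock readings as logged without any zone or offset information.
first_local = first.astimezone(DST).replace(tzinfo=None)
second_local = second.astimezone(STANDARD).replace(tzinfo=None)

print(first_local, second_local)    # 2023-10-29 02:30:00 2023-10-29 02:30:00
print(first_local == second_local)  # True
```

Once the offset is dropped, the two observations carry identical timestamps even though they were taken an hour apart.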
Data in local time with timezone information
Both of the problems mentioned above can be avoided when observations are timestamped not only with the local time, but also with the offset from UTC or the time zone in which they are experienced. Some frequently used tools for personal science (such as the Apple Watch/Apple Health) store the data in this way, but this seems to be the exception rather than the rule.
If you are designing your own data collection protocol for your personal science project and you expect changes in time zones (either due to travel or daylight saving time), a good rule of thumb is to collect the local time alongside the time zone information. A problem that can occur with this strategy is that the time zone information may be faulty or get lost during data processing, so one needs to be cautious when using this approach.
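One common way to keep local time and offset together is an ISO 8601 timestamp that includes the UTC offset. The following Python sketch assumes a hypothetical observation at 08:00 local time with an offset of +02:00; the variable names are illustrative only.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical observation: local wall-clock time plus the UTC offset in effect.
local_offset = timezone(timedelta(hours=2))
observed_at = datetime(2023, 7, 1, 8, 0, tzinfo=local_offset)

# The ISO 8601 string carries the offset along with the local time.
serialized = observed_at.isoformat()
print(serialized)  # 2023-07-01T08:00:00+02:00

# Parsing restores the local time, the offset, and therefore the exact instant.
restored = datetime.fromisoformat(serialized)
as_utc = restored.astimezone(timezone.utc)
print(restored.hour, as_utc.hour)  # 8 6
```

Note that an offset alone does not identify a named time zone, so storing the zone name in a separate field can help if future clock changes at that location matter.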
What is a day?
Boundaries of days are another common issue when it comes to handling dates & times for personal science. While our calendar and clock progress to the next day every midnight, for many people this does not accurately reflect how they think of a single day. Instead, a day is experienced as the time period between waking up and going to sleep again. Despite this, many tools and in particular wearable devices (such as the Apple Watch and Fitbit) enforce a day break at midnight, meaning that e.g. activity, heart rate and other variables recorded after midnight (but before going to sleep) are already attributed to the new day.
At the same time, not all devices default to the calendar-based standard: E.g. the Oura Ring devices provide daily aggregate metrics which define a day as the time period between sleep periods. This means one needs to be cautious when comparing daily summary statistics from different sources, as they might use different ways to define a day.
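When working with raw timestamps, one simple approximation of an experienced day is to move the day boundary from midnight to a fixed early-morning hour. The Python sketch below assumes naive local timestamps; the function name `logical_day` and the 04:00 cutoff are hypothetical choices and not how any particular device defines a day.

```python
from datetime import date, datetime, timedelta


def logical_day(timestamp: datetime, day_start_hour: int = 4) -> date:
    """Return the day a local timestamp belongs to if days start at day_start_hour."""
    # Shifting the timestamp back moves the boundary from midnight to the cutoff.
    return (timestamp - timedelta(hours=day_start_hour)).date()


# Activity at 01:30 on 2 July still counts towards 1 July.
print(logical_day(datetime(2023, 7, 2, 1, 30)))  # 2023-07-01
# Activity at 09:00 on 2 July counts towards 2 July.
print(logical_day(datetime(2023, 7, 2, 9, 0)))   # 2023-07-02
```

A fixed cutoff is only a rough stand-in for actual sleep and wake times, which would be needed to reproduce a sleep-based definition of a day.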
Which day is it?
Not only are the boundaries of days problematic: For activities which typically span day boundaries, one might take different approaches regarding which day the data should be associated with. Tracking sleep is a typical example of this: If you go to sleep before midnight on day X and wake up on day X+1, with which date do you want to associate the summary statistics of your night's sleep?
Some tools (e.g. Fitbit) will choose day X+1 as the day to associate the sleep with, i.e. the day on which you wake up. Other tools (e.g. the Oura Ring) will choose day X as the day for the sleep summary statistics, i.e. the day you went to sleep. Interestingly, this even happens if you go to sleep after midnight.
Understanding this distinction is important both when comparing sleep metrics from different sources, so that data from the same night are compared, and when trying to understand how sleep might affect other variables. Not knowing which day the sleep is recorded for can easily result in an off-by-one error.
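The two conventions described above can be made explicit in code before merging sleep data from different sources. This is a minimal Python sketch; the function `sleep_date` and the convention labels `"wake_day"` and `"previous_day"` are hypothetical names introduced for illustration and do not correspond to any tool's export format.

```python
from datetime import date, datetime, timedelta


def sleep_date(wake_time: datetime, convention: str) -> date:
    """Return the date a night's sleep is filed under for a given convention."""
    if convention == "wake_day":
        # The night is attributed to the day of waking up (day X+1).
        return wake_time.date()
    if convention == "previous_day":
        # The night is attributed to the day before waking up (day X),
        # regardless of whether sleep started before or after midnight.
        return wake_time.date() - timedelta(days=1)
    raise ValueError(f"unknown convention: {convention}")


wake = datetime(2023, 7, 2, 7, 15)
print(sleep_date(wake, "wake_day"))      # 2023-07-02
print(sleep_date(wake, "previous_day"))  # 2023-07-01
```

Normalizing every source to a single convention before joining on the date column avoids comparing two different nights with each other.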