Dealing With Dates in Python - Part 1
Welcome back to Cameron’s Corner! This week, I want to get our hands on some code and talk about some of the approaches for dealing with datetimes in Python. Additionally, I want to discuss some common considerations you’ll need to keep in mind when working with dates and datetimes in your own code. Let’s dive in!
What is a datetime?
A datetime is a specific point-in-time: a single instant. As the name suggests, it typically contains both a date and a time component: the date is some combination of year, month, and day, and the time is some combination of hours, minutes, and seconds, down to some pre-defined level of specificity.
In Python, we can work with dates using the datetime module, which is a fairly “old battery” in the standard library. I point this out because datetime does not follow the PEP 8 naming convention: its classes are named in lower-case instead of CamelCase (what PEP 8 calls CapWords).
from datetime import datetime # lower-case class name!
# Construct a datetime
dt = datetime(
    year=2023, month=1, day=3, hour=5, minute=30, second=15, microsecond=10
)
print(
    f'{dt = }',
    f'{dt.year = }',  # access a part of the datetime
    sep='\n'
)
dt = datetime.datetime(2023, 1, 3, 5, 30, 15, 10)
dt.year = 2023
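To make the "level of specificity" mentioned above a bit more concrete, here is a small sketch using only attributes and methods from the standard library. It shows that the finest step datetime can represent is one microsecond, that unspecified time components default to zero, and one way to deliberately drop precision. The variable names are just for illustration.

```python
from datetime import datetime, timedelta

dt = datetime(2023, 1, 3, 5, 30, 15, 10)

# The smallest possible difference between two distinct datetimes
finest_step = datetime.resolution
print(f'{finest_step = }')

# Unspecified time components default to zero (midnight)
midnight = datetime(2023, 1, 3)
print(f'{midnight.hour = }', f'{midnight.microsecond = }', sep='\n')

# replace() returns a new datetime; here we discard the microseconds
to_the_second = dt.replace(microsecond=0)
print(f'{to_the_second = }')
```

Note that replace does not modify dt: datetime objects are immutable, so every "change" produces a new object.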
What is a datetime used for?
As I mentioned earlier, when using datetime, we typically want to represent a point-in-time, or a single instant. A common example of this is logging. If our running program emits a message, it can be very beneficial for us to record exactly when that message was emitted. We often use datetimes as a timestamp (which is why pandas calls its point-in-time object a Timestamp) for a given measurement or value.
from datetime import datetime, timedelta
from random import Random
from collections import namedtuple
from itertools import product, chain
from functools import reduce
from operator import mul
# Create some hierarchy for our message generation system:
# each option is paired with the probability of it being chosen
message_bank = {
    'level': [('info', .5), ('warning', .35), ('critical', .15)],
    'target': [('database', .25), ('microservice', .25), ('intranet', .25), ('application', .25)]
}
# Build every level:target combination with its joint probability
message_pool, weights = [], []
for lev, tar in product(*message_bank.values()):
    m, w = zip(lev, tar)
    message_pool.append(':'.join(m))
    weights.append(reduce(mul, w))
rnd = Random(0)
Record = namedtuple('Record', ['timestamp', 'level', 'target'])
timestamp = datetime(2023, 1, 18) # unspecified hours, mins, seconds = 0
for _ in range(10):
    message = rnd.choices(message_pool, weights=weights, k=1)[0]
    record = Record(timestamp, *message.split(':'))
    print(record)
    # Advance the clock by a random amount before the next message
    timestamp += timedelta(hours=rnd.randint(0, 24), minutes=rnd.randint(0, 60), seconds=rnd.randint(1, 60))
Record(timestamp=datetime.datetime(2023, 1, 18, 0, 0), level='warning', target='application')
Record(timestamp=datetime.datetime(2023, 1, 19, 0, 56, 27), level='info', target='database')
Record(timestamp=datetime.datetime(2023, 1, 19, 17, 27, 53), level='critical', target='microservice')
Record(timestamp=datetime.datetime(2023, 1, 20, 2, 58, 16), level='warning', target='database')
Record(timestamp=datetime.datetime(2023, 1, 20, 9, 30, 25), level='info', target='intranet')
Record(timestamp=datetime.datetime(2023, 1, 21, 9, 37, 5), level='warning', target='application')
Record(timestamp=datetime.datetime(2023, 1, 22, 3, 22, 57), level='warning', target='microservice')
Record(timestamp=datetime.datetime(2023, 1, 22, 7, 42, 4), level='warning', target='intranet')
Record(timestamp=datetime.datetime(2023, 1, 23, 5, 3, 35), level='warning', target='database')
Record(timestamp=datetime.datetime(2023, 1, 23, 16, 30, 56), level='warning', target='microservice')
In the above code, we created ten random messages of different levels about different
parts of an imaginary infrastructure. We paired these messages with an incrementing
datetime to provide a timestamp of when these messages were created.
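The timestamps above are simulated. In a running program we would usually stamp each message with the current time instead. Here is a minimal sketch of that idea. The helpers make_record and format_record are hypothetical names invented for this example (they are not part of any library), and datetime.now() returns a naive local time, which sidesteps timezones for now.

```python
from collections import namedtuple
from datetime import datetime

Record = namedtuple('Record', ['timestamp', 'level', 'target'])

def make_record(level, target, now=None):
    """Build a Record stamped with the current time (hypothetical helper)."""
    # Letting the caller inject a time keeps the function easy to test
    if now is None:
        now = datetime.now()  # naive local time, no timezone attached
    return Record(now, level, target)

def format_record(record):
    """Render a Record as a single log line (hypothetical helper)."""
    # isoformat() produces text that sorts in chronological order
    return f'{record.timestamp.isoformat()} [{record.level}] {record.target}'

fixed = make_record('warning', 'database', now=datetime(2023, 1, 18, 9, 15, 30))
line = format_record(fixed)
print(line)

live = make_record('info', 'intranet')  # stamped with the actual current time
print(format_record(live))
```

Passing now explicitly is a design choice for this sketch: it keeps the example reproducible, while real calls can simply omit it.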
A timedelta represents the difference between two datetime objects: the amount of time that separates them. It can be constructed from weeks, days, hours, etc., and is stored internally as days, seconds, and microseconds.
delta = datetime(year=2023, month=1, day=5) - datetime(year=2023, month=1, day=1)
print(
    f'{type(delta) = }',
    f'{delta = }',
    f'{delta.total_seconds() = }',
    sep='\n'
)
type(delta) = <class 'datetime.timedelta'>
delta = datetime.timedelta(days=4)
delta.total_seconds() = 345600.0
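A timedelta can also be built directly rather than by subtracting two datetimes. The short sketch below shows how the constructor's arguments are normalized: whatever units we pass in, only days, seconds, and microseconds are stored. The specific values are arbitrary.

```python
from datetime import timedelta

# 1 week + 36 hours + 90 minutes
delta = timedelta(weeks=1, hours=36, minutes=90)

# Only days, seconds, and microseconds are stored after normalization
print(f'{delta = }')
print(f'{delta.days = }', f'{delta.seconds = }', sep='\n')

# Dividing by another timedelta expresses the span in that unit
hours = delta / timedelta(hours=1)
print(f'{hours = }')
```

Here the 36 hours and 90 minutes roll over into one extra day plus 13.5 hours, so the result holds 8 days and 48600 seconds.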
The timedelta objects enable us to perform addition and subtraction with our datetimes!
This has all sorts of applications, but first and foremost it provides one
of the many ways to represent a span-of-time instead of a single point-in-time.
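As a rough illustration of both ideas, the sketch below adds a timedelta to a datetime, recovers it again by subtraction, and then treats a start plus a timedelta as a span that other datetimes can be tested against. The in_span function is a hypothetical helper written for this example, and the half-open interval is just one reasonable convention.

```python
from datetime import datetime, timedelta

start = datetime(2023, 1, 18)
span = timedelta(days=2, hours=12)

end = start + span      # datetime + timedelta -> datetime
elapsed = end - start   # datetime - datetime -> timedelta
print(f'{end = }', f'{elapsed = }', sep='\n')

def in_span(moment, start, span):
    """Return True if moment lies in the half-open interval [start, start + span)."""
    return start <= moment < start + span

inside = in_span(datetime(2023, 1, 19, 17, 27, 53), start, span)
outside = in_span(datetime(2023, 1, 21, 9, 37, 5), start, span)
print(f'{inside = }', f'{outside = }', sep='\n')
```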
If datetime exists, then why date or time?
In addition to the datetime object, the datetime module also defines separate
date and time objects that we can use:
from datetime import date, time
d = date(year=2022, month=1, day=3)
t = time(hour=5, minute=30, second=15, microsecond=10)
print(
    f'{d = }',
    f'{t = }',
    sep='\n'
)
d = datetime.date(2022, 1, 3)
t = datetime.time(5, 30, 15, 10)
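The three types are also convertible into one another. As a small sketch using the standard library's datetime.combine, a date and a time can be joined into a datetime, and a datetime can be split back into its two parts:

```python
from datetime import date, datetime, time

d = date(year=2022, month=1, day=3)
t = time(hour=5, minute=30, second=15, microsecond=10)

# Join a date and a time into a single point-in-time
dt = datetime.combine(d, t)
print(f'{dt = }')

# ...and split it back apart
print(f'{dt.date() = }', f'{dt.time() = }', sep='\n')
```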
While these objects (datetime, date, time) all appear fairly similar to one another,
they end up having very different use cases.
Whereas a datetime pins down a single instant, the date and time objects are often used to
represent a span-of-time (in a similar, but not identical, way to how we might use a timedelta).
For instance, it would not be very beneficial to log messages according to just
their date, as we would lose important fidelity. However, if we wanted to create a summary
report for a given day, then using a date to represent that information may be useful.
(I’ll save further discussion of dates and times for next week’s edition of Cameron’s Corner!)
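As a quick preview of the summary-report idea, here is one possible sketch: collapse each record's timestamp down to its date and count the messages per day. The records are a hand-copied subset of the simulated log from earlier, and Counter is just one of several ways to do the grouping.

```python
from collections import Counter, namedtuple
from datetime import date, datetime

Record = namedtuple('Record', ['timestamp', 'level', 'target'])

records = [
    Record(datetime(2023, 1, 19, 0, 56, 27), 'info', 'database'),
    Record(datetime(2023, 1, 19, 17, 27, 53), 'critical', 'microservice'),
    Record(datetime(2023, 1, 20, 2, 58, 16), 'warning', 'database'),
]

# .date() discards the time component, so same-day records share a key
per_day = Counter(rec.timestamp.date() for rec in records)

for day, count in sorted(per_day.items()):
    print(day.isoformat(), count)
```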
Wrap Up
This week, we discussed some key differences in the representation of time:
point-in-time
span-of-time
We also discussed how these are represented in Python. We can leverage the datetime module
to perform some convenient time-series analysis, but have only scratched the surface
of the topic.
Tune in next week for further discussion around span-of-time, as well as a dive into the subtleties of working with timezones and Daylight Saving Time (also known as the bane of working with datetimes).
Talk to you all next week!