I'm not sure how data storage is relevant here; I was focusing only on query optimization. Let's say most of the data is static (historical data). Objects can still change, though, so new revisions are added and the previous revisions are updated (their end_date is set). If you very often run queries that filter on end_date (in order to get the most recent revision of objects), it will be better to make this column a partition column instead of just having an index on it. That way, getting the most recent revisions of a specific object takes O(log m), where m is the number of most recent revisions, instead of O(log n), where n is the total number of revisions you have, and n is far bigger than m. Correct me if I'm wrong, this topic is quite interesting.
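
To put rough numbers on the log(m) vs. log(n) comparison, here is a minimal Python sketch. It assumes the lookup behaves like a binary search over sorted keys, which is only an approximation of how a real index works, and the row counts and the `approx_comparisons` helper are hypothetical, made up purely for illustration.

```python
import math


def approx_comparisons(rows: int) -> int:
    """Rough number of key comparisons for a binary search over `rows` sorted keys."""
    return max(1, math.ceil(math.log2(rows)))


# Hypothetical sizes, not taken from any real table.
n = 100_000_000  # all revisions (history + current)
m = 1_000_000    # most recent revisions only (the "current" partition)

full_index = approx_comparisons(n)      # search an index over every revision
partition_only = approx_comparisons(m)  # search only inside the current partition

print(f"index over all revisions: ~{full_index} comparisons")
print(f"current partition only:   ~{partition_only} comparisons")
```

Under these assumed sizes the partition-only lookup needs fewer comparisons per search; how much that matters in practice depends on the database engine and on the actual ratio between n and m.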