This document explains the data protection mechanism in BrickstorOS. It provides a high-level overview of the concepts needed to properly configure data protection snapshot and retention policies. For more specific questions or more in-depth knowledge, please reach out to our support.
When are snapshots taken?
When the data protection service is enabled on a dataset, snapshots are only taken when data has changed since the last snapshot was created. Depending on the retention schedule, this can lead to confusion: you may expect to see a snapshot, but none was created because the data did not change. For example, if the schedule calls for hourly snapshots yet fewer than 24 snapshots appear over the last 24 hours, remember that a snapshot is skipped unless there is a change to warrant it; the skipped snapshots would have been empty, so gaps appear in the timeline. An empty snapshot is still a point of reference, however, and it is possible to create snapshots manually on a filesystem that is not changing, which will result in one or more empty snapshots. This can only happen manually, and it is not considered desirable for data protection purposes.
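The skip-when-unchanged behavior can be illustrated with a minimal sketch. This is not BrickstorOS code; the `Dataset` class, the change counter, and the snapshot naming are assumptions made purely for illustration.

```python
# Illustrative sketch (not BrickstorOS source): a scheduler that skips
# snapshot creation when nothing has changed since the last snapshot.
from dataclasses import dataclass, field

@dataclass
class Dataset:
    name: str
    bytes_written_since_last_snapshot: int = 0
    snapshots: list = field(default_factory=list)

def take_scheduled_snapshot(ds: Dataset, timestamp: str) -> bool:
    """Create a snapshot only if the dataset changed; return True if taken."""
    if ds.bytes_written_since_last_snapshot == 0:
        return False  # no change -> skip; this is why schedules show gaps
    ds.snapshots.append(f"{ds.name}@{timestamp}")
    ds.bytes_written_since_last_snapshot = 0
    return True

ds = Dataset("tank/projects")
ds.bytes_written_since_last_snapshot = 4096
assert take_scheduled_snapshot(ds, "hourly-2024-01-01-0100") is True
# No writes during the next hour -> the hourly snapshot is skipped.
assert take_scheduled_snapshot(ds, "hourly-2024-01-01-0200") is False
assert ds.snapshots == ["tank/projects@hourly-2024-01-01-0100"]
```

The gap at the second hour is intentional: an empty snapshot would add nothing beyond the point of reference the previous snapshot already provides.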
In terms of recovery, this means the following. When you attempt to recover a file that was modified multiple times over the last 24 hours, you should be able to identify the modifications made during that period with a one-hour recovery point objective (RPO), because each hour's modifications will be captured by a snapshot, assuming hourly snapshots were enabled. However, if the last snapshot was taken some hours before you attempt to recover the data and there were no changes since then, you will not see any subsequent snapshots even with hourly snapshots enabled; the latest version of the file will be in the last snapshot. Daily and weekly snapshots should capture at least some of the changes captured by the hourly snapshots, but changes made between two intra-day snapshots may be lost.
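The lookup a recovery performs can be sketched as follows. This is an assumed model, not a BrickstorOS API: snapshots are represented simply as the hours at which they were taken, and recovery means finding the latest snapshot at or before the requested point in time.

```python
# Hypothetical sketch: given snapshot times (in hours), find the snapshot
# holding the most recent version of a file as of a recovery time.
def best_recovery_snapshot(snapshot_hours, recovery_hour):
    """Return the latest snapshot taken at or before recovery_hour, or None."""
    candidates = [h for h in snapshot_hours if h <= recovery_hour]
    return max(candidates) if candidates else None

# Hourly schedule with gaps at hours 3-5 because nothing changed then.
snaps = [0, 1, 2, 6, 7]
assert best_recovery_snapshot(snaps, 7) == 7
# Recovering "as of hour 5" falls back to hour 2 -- the last real change.
assert best_recovery_snapshot(snaps, 5) == 2
```

The fallback in the second case is harmless: since nothing changed between hours 2 and 5, the hour-2 snapshot already contains the latest version of every file.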
Good Retention Strategy
Of course, no two organizations are alike in how they use, update, and protect their data. Snapshots are a way of undoing mistaken changes, such as destruction of data or a modification that requires recovery to a previous state.
- It is a good idea for most organizations to have at least a daily snapshot, retained for some number of days.
- If data is considered disposable after some amount of time, it does not make sense to maintain snapshots any longer than that period.
- One may be concerned that too many snapshots will rapidly deplete available capacity. This is mainly a concern when the system is low on disk space, and data protection has intelligence built in to protect a system from running out of space, so administrators do not need to worry much about this.
- If data turns over heavily, which can happen when files represent VM disks, swap, temporary holds, and the like, it may not be possible to maintain as many snapshots as desired, because frequent changes consume storage rapidly. In this case, consider the length of time snapshots are retained. The oldest snapshots will lock in most of the change, therefore becoming the largest consumers of capacity. The user interface gives a visual display of how much data is being consumed by snapshots.
- Increasing the period between snapshots reduces change-capture granularity and may reduce capacity depletion. It will reduce capacity depletion if a lot of data is modified or deleted; if the working dataset only grows through additions, it will have no effect.
- Use some combination of daily, weekly, and monthly snapshots when the value of data does not rapidly degrade with time. For example, in the case of virtual machines it may not make sense to retain snapshots for long, simply because so much context may be lost if one restores data from weeks ago that the data becomes useless in the process.
- Related data should be on the same retention schedule. If data across datasets is related, such that restoring a subset from one dataset may also require restoring a subset from another, do not vary schedules between those datasets; doing so may make it impossible to recover data from both datasets from exactly the same point in time, and temporal differences may render the data useless.
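To make the mixed-schedule recommendation concrete, here is a small sketch of how tiered retention trades snapshot count against history depth. The retention numbers are example values chosen for illustration, not BrickstorOS defaults.

```python
# Illustrative sketch: count snapshots retained under a mixed schedule
# (the keep-counts below are example values, not BrickstorOS defaults).
def retained_counts(hourly_kept=24, daily_kept=7, weekly_kept=4, monthly_kept=12):
    """Return snapshots kept per tier and the total across all tiers."""
    return {
        "hourly": hourly_kept,    # ~1 day of fine-grained history
        "daily": daily_kept,      # ~1 week at daily granularity
        "weekly": weekly_kept,    # ~1 month at weekly granularity
        "monthly": monthly_kept,  # ~1 year at monthly granularity
        "total": hourly_kept + daily_kept + weekly_kept + monthly_kept,
    }

counts = retained_counts()
assert counts["total"] == 47  # 47 snapshots cover roughly a year of history
```

The point of the mix is that granularity degrades with age: recent history is captured hourly, while a year-old state is still recoverable, just only at monthly resolution.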
Changes to the snapshot frequency and retention policy are applied periodically throughout the day, not instantaneously across all datasets. Once the policy update is processed, your new snapshot and retention policies will take effect.
The expiration policy runs once per day. When it runs, it removes all expired snapshots except those carrying a hold tag. If you see an expired snapshot that has not yet been removed, expect it to be removed the next time the policy runs; no snapshot without a hold tag should remain on the system more than 24 hours past its expiration date and time.
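The expiration pass described above can be sketched as a simple filter. This models the assumed behavior only; the snapshot record layout and the `hold` field are hypothetical, not taken from BrickstorOS.

```python
# Sketch of the daily expiration pass (assumed behavior, not BrickstorOS
# source): remove every snapshot whose expiration time has passed, unless
# it carries a hold tag.
from datetime import datetime

def run_expiration(snapshots, now):
    """Return (kept, removed) name lists; held snapshots are never removed."""
    kept, removed = [], []
    for snap in snapshots:
        if snap["expires"] <= now and not snap.get("hold"):
            removed.append(snap["name"])
        else:
            kept.append(snap["name"])
    return kept, removed

now = datetime(2024, 6, 1)
snaps = [
    {"name": "tank@old", "expires": datetime(2024, 5, 1)},
    {"name": "tank@held", "expires": datetime(2024, 5, 1), "hold": "legal"},
    {"name": "tank@fresh", "expires": datetime(2024, 7, 1)},
]
kept, removed = run_expiration(snaps, now)
assert removed == ["tank@old"]
assert kept == ["tank@held", "tank@fresh"]
```

Note that `tank@held` survives despite being expired: a hold tag always wins over the expiration date, which is why stale held snapshots can accumulate until the hold is released.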
There is no single policy that addresses all problems. Mixing schedules allows a flexible approach to reducing granularity over time, and this is what we recommend under normal circumstances.