Datafy Snapshots

Note: Datafy Snapshots are supported from version 1.30.0. To restore volumes from Datafy Snapshots, make sure you have the correct version installed before attaching the new volumes.

Datafy Snapshot Lifecycle

  1. Create a Datafy Snapshot - use the Datafy create-snapshot API instead of the AWS CreateSnapshot API.

  2. Manage Retention - use your existing logic with the relevant volume tags to manage the retention of Datafy Snapshots.

  3. Restoration - use the Datafy create-volume-from-snapshot and attach endpoints to create and attach a new autoscaling volume.

Creating a Datafy Snapshot

To create a new Datafy Snapshot, call the create-snapshot endpoint with the volume ID of the original volume managed by Datafy (as it appears in the Datafy dashboard). You should call the Datafy API wherever you currently call the AWS CreateSnapshot API.

Note: Datafy snapshots need to be created directly, and cannot be created using AWS Data Lifecycle Manager (DLM).

Fast Datafy Snapshots will create a separate snapshot for each underlying Datafy volume associated with the autoscaling volume. These snapshots have tags that identify them as Datafy snapshots, and help track their lineage. You can use these tags to identify the original volume the snapshot is associated with, and to identify all of the snapshots that were created together and are associated with each other.

Datafy Snapshot Tags

| Tag | Description |
| --- | --- |
| Managed-By: Datafy.io | Identifies the snapshot as managed by Datafy. |
| datafy:source-volume:id | Volume ID of the original volume before autoscaling. The volume ID associated with the snapshot by AWS is the smaller volume after autoscaling. |
| datafy:snapshot:id | A snapshot ID generated by Datafy (designated with the prefix dsnap), shared by all of the related snapshots created when snapshotting an autoscaling volume. |
| datafy:snapshot:generation-id | Generation number of the Datafy snapshot. The first snapshot taken during migration is generation 0, and the generation increases by 1 with every following snapshot. |
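As a rough illustration of how these tags can be read, the sketch below parses them from a snapshot record shaped like the entries AWS returns when describing EBS snapshots (a `Tags` list of `Key`/`Value` pairs). It assumes `Managed-By` is the tag key and `Datafy.io` its value; the `DatafyTags` class, the function name, and all IDs in the example are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DatafyTags:
    """Hypothetical container for the Datafy tags described above."""
    source_volume_id: str
    datafy_snapshot_id: str
    generation: int


def parse_datafy_tags(snapshot: dict) -> Optional[DatafyTags]:
    """Return the Datafy tags of a snapshot, or None if it is not tagged as Datafy-managed."""
    tags = {t["Key"]: t["Value"] for t in snapshot.get("Tags", [])}
    # Assumption: the tag key is "Managed-By" and the value is "Datafy.io".
    if tags.get("Managed-By") != "Datafy.io":
        return None
    return DatafyTags(
        source_volume_id=tags["datafy:source-volume:id"],
        datafy_snapshot_id=tags["datafy:snapshot:id"],
        generation=int(tags["datafy:snapshot:generation-id"]),
    )


# Made-up example record; the IDs are placeholders, not real resources.
example_snapshot = {
    "SnapshotId": "snap-0example",
    "Tags": [
        {"Key": "Managed-By", "Value": "Datafy.io"},
        {"Key": "datafy:source-volume:id", "Value": "vol-0original"},
        {"Key": "datafy:snapshot:id", "Value": "dsnap-example"},
        {"Key": "datafy:snapshot:generation-id", "Value": "3"},
    ],
}

print(parse_datafy_tags(example_snapshot))
```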

Snapshot Retention

Snapshots created by Datafy are stored in your AWS account like regular EBS snapshots, and their retention should be handled by your existing retention tools and policies, with the following adjustments:

  • Use the datafy:source-volume:id tag to determine the volume ID of the original non-autoscaling volume associated with the snapshot.

If you use a backup management tool to manage snapshot retention as part of a larger backup object, check out how to integrate with backup management tools below.

Note: Be aware that AWS Data Lifecycle Manager (DLM) will not automatically manage snapshots created by third-party tools, including Datafy. You must apply your own logic to clean up these snapshots.
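One possible shape for such cleanup logic is sketched below. It groups Datafy snapshots by original volume and by `datafy:snapshot:id`, keeps the newest sets for each volume, and reports the rest for deletion. Treating each set as a unit is an assumption based on the tag descriptions above, and the "keep the last N sets" policy, the function names, and the sample data are all hypothetical. The input is assumed to have the shape of EC2 snapshot records (`SnapshotId`, `StartTime`, `Tags`); the actual deletion is left to your existing tooling.

```python
from collections import defaultdict
from datetime import datetime, timezone


def tags_of(snapshot):
    """Flatten the EC2 tag list ([{"Key": ..., "Value": ...}]) into a dict."""
    return {t["Key"]: t["Value"] for t in snapshot.get("Tags", [])}


def expired_snapshot_ids(snapshots, keep_last):
    """Return the SnapshotIds outside the newest `keep_last` Datafy sets per original volume."""
    # (original volume ID, Datafy snapshot ID) -> snapshots created together
    sets = defaultdict(list)
    for snap in snapshots:
        tags = tags_of(snap)
        if tags.get("Managed-By") != "Datafy.io":
            continue  # non-Datafy snapshots are left to other policies
        key = (tags["datafy:source-volume:id"], tags["datafy:snapshot:id"])
        sets[key].append(snap)

    by_volume = defaultdict(list)
    for (volume_id, _set_id), members in sets.items():
        by_volume[volume_id].append(members)

    expired = []
    for groups in by_volume.values():
        # Newest set first, using the latest StartTime within each set.
        groups.sort(key=lambda m: max(s["StartTime"] for s in m), reverse=True)
        for members in groups[keep_last:]:
            expired.extend(s["SnapshotId"] for s in members)
    return sorted(expired)


def sample(snapshot_id, set_id, day):
    """Build a made-up snapshot record for the example below."""
    return {
        "SnapshotId": snapshot_id,
        "StartTime": datetime(2024, 1, day, tzinfo=timezone.utc),
        "Tags": [
            {"Key": "Managed-By", "Value": "Datafy.io"},
            {"Key": "datafy:source-volume:id", "Value": "vol-0original"},
            {"Key": "datafy:snapshot:id", "Value": set_id},
        ],
    }


snapshots = [
    sample("snap-1a", "dsnap-1", 1),
    sample("snap-1b", "dsnap-1", 1),
    sample("snap-2a", "dsnap-2", 2),
    sample("snap-3a", "dsnap-3", 3),
]
print(expired_snapshot_ids(snapshots, keep_last=2))
```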

Restoring a Volume From a Datafy Snapshot

Datafy snapshots represent the underlying volumes created by Datafy when a volume is autoscaling. As such, they can be restored to autoscaling volumes with the following adjustments to your restoration flow:

  1. Use the tags datafy:source-volume:id and datafy:snapshot:id and the snapshot timestamp to find all of the snapshots for the volume and time you need.

  2. After selecting the desired snapshots and making sure you have all of the snapshots with the same datafy:snapshot:id, create a new autoscaling volume using the Datafy create-volume-from-snapshot endpoint.

  3. Attach the new autoscaling volume to an EC2 instance with Datafy AutoScaler installed using the Datafy attach endpoint. Mount the volume as you would any EBS volume.
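Step 1 of the flow above can be sketched as follows: given snapshot records for the account, pick the newest set of related snapshots for an original volume that was taken at or before a target time. The record shape (`SnapshotId`, `StartTime`, `Tags`) mirrors what AWS returns when describing snapshots; the function name and sample data are hypothetical. The calls to the Datafy create-volume-from-snapshot and attach endpoints are not shown, because their request format is not described here.

```python
from datetime import datetime, timezone


def select_restore_set(snapshots, source_volume_id, not_after):
    """Return (datafy_snapshot_id, [SnapshotId, ...]) for the newest set taken
    at or before `not_after`, or None if no set qualifies."""
    sets = {}
    for snap in snapshots:
        tags = {t["Key"]: t["Value"] for t in snap.get("Tags", [])}
        if tags.get("datafy:source-volume:id") != source_volume_id:
            continue
        set_id = tags.get("datafy:snapshot:id")
        if set_id is None:
            continue
        # Snapshots sharing a datafy:snapshot:id were created together.
        sets.setdefault(set_id, []).append(snap)

    best = None
    for set_id, members in sets.items():
        taken = max(s["StartTime"] for s in members)
        if taken > not_after:
            continue
        if best is None or taken > best[0]:
            best = (taken, set_id, members)

    if best is None:
        return None
    _, set_id, members = best
    return set_id, sorted(s["SnapshotId"] for s in members)


def sample(snapshot_id, set_id, day):
    """Build a made-up snapshot record for the example below."""
    return {
        "SnapshotId": snapshot_id,
        "StartTime": datetime(2024, 1, day, tzinfo=timezone.utc),
        "Tags": [
            {"Key": "datafy:source-volume:id", "Value": "vol-0original"},
            {"Key": "datafy:snapshot:id", "Value": set_id},
        ],
    }


snapshots = [
    sample("snap-1a", "dsnap-1", 1),
    sample("snap-1b", "dsnap-1", 1),
    sample("snap-2a", "dsnap-2", 5),
    sample("snap-2b", "dsnap-2", 5),
]
target = datetime(2024, 1, 3, tzinfo=timezone.utc)
print(select_restore_set(snapshots, "vol-0original", target))
```

The returned list contains every snapshot that shares the selected `datafy:snapshot:id`, which is the complete set the restoration step expects.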

Offline Restoration

Datafy snapshots are intended to be restored to an instance with Datafy AutoScaler installed and access to the Datafy API. For emergencies where this isn't possible, we provide local offline tools that enable access to the data saved in Datafy snapshots without any additional dependencies.

Offline Restoration - Autoscaling
  • For restoring Datafy snapshots when the Datafy API is unavailable

Offline Restoration - Non Autoscaling
  • For complete offline restoration of Datafy snapshots, without Datafy AutoScaler

Integration with Backup Managers

Datafy snapshots can be integrated into the workflows of your existing backup management tools. They can be used seamlessly with any backup tool that uses the K8s CSI, and with minimal integration with custom backup systems.

Note: Use a different type of backup manager? Ask us about integration with other tools!

Kubernetes CSI Integration

Datafy integrates directly with the Kubernetes CSI Snapshotter, allowing Kubernetes-native backup tools like Velero to seamlessly create snapshots of Datafy-managed volumes. The resulting snapshots are associated with specific PVCs, and can be used directly in the rest of the backup and restoration flow.

To enable this integration:

  1. Make sure your cluster is set up for taking snapshots through the CSI. You can test this by creating a snapshot directly through Kubernetes, by creating a VolumeSnapshot resource.

  2. Configure your backup tool to use the CSI Snapshotter (and not the AWS native snapshotter).

When a snapshot is triggered, the Datafy API is called in the background for autoscaling volumes.
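For the test of the CSI setup mentioned in the first step, a VolumeSnapshot resource can be as small as the one generated below. This is a minimal sketch: the fields follow the standard `snapshot.storage.k8s.io/v1` VolumeSnapshot API, while the resource name, namespace, PVC name, and snapshot class name are placeholders to replace with values from your cluster.

```python
import json


def volume_snapshot_manifest(name, namespace, pvc_name, snapshot_class):
    """Build a VolumeSnapshot manifest that snapshots an existing PVC."""
    return {
        "apiVersion": "snapshot.storage.k8s.io/v1",
        "kind": "VolumeSnapshot",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "volumeSnapshotClassName": snapshot_class,
            "source": {"persistentVolumeClaimName": pvc_name},
        },
    }


# All four values are placeholders, not names defined by Datafy.
manifest = volume_snapshot_manifest(
    name="test-snapshot",
    namespace="default",
    pvc_name="data-pvc",
    snapshot_class="example-snapshot-class",
)
print(json.dumps(manifest, indent=2))
```

The printed JSON can be saved to a file and applied with `kubectl apply -f`, since kubectl accepts JSON manifests as well as YAML.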

Custom Backup Systems

If you use a custom backup manager or automation script, adjust your logic to use Datafy snapshots at each part of the snapshot lifecycle.

  1. Snapshot creation - Use the Datafy API instead of the AWS snapshot API to create snapshots of autoscaling volumes.

  2. Snapshot retention - Retention policies that target all snapshots in the account will apply to Datafy snapshots as well. If you use targeted retention policies, make sure to apply them to snapshots with the tag Managed-By: Datafy.io.

  3. Restoration from snapshots - As detailed above, restoration of volumes from Datafy snapshots follows a similar process as restoring volumes from regular EBS snapshots. Your restoration logic should:

    1. Ensure that Datafy AutoScaler is installed on the instances the new volumes will be attached to.

    2. Identify the relevant snapshots using the datafy:source-volume:id and datafy:snapshot:id tags, and use the Datafy API instead of AWS to create new volumes from the snapshots and to attach them.
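For scripts that look up snapshots through the AWS API, the tag-based identification above maps onto EC2 tag filters. The helper below is a hypothetical sketch that builds a filter list in the `tag:<key>` form accepted by the EC2 DescribeSnapshots operation; it assumes `Managed-By` is the tag key and `Datafy.io` its value, and it only builds the filters without calling AWS.

```python
def datafy_snapshot_filters(source_volume_id=None, datafy_snapshot_id=None):
    """Build EC2 DescribeSnapshots filters that match Datafy snapshots."""
    # Assumption: the tag key is "Managed-By" and the value is "Datafy.io".
    filters = [{"Name": "tag:Managed-By", "Values": ["Datafy.io"]}]
    if source_volume_id is not None:
        filters.append(
            {"Name": "tag:datafy:source-volume:id", "Values": [source_volume_id]}
        )
    if datafy_snapshot_id is not None:
        filters.append(
            {"Name": "tag:datafy:snapshot:id", "Values": [datafy_snapshot_id]}
        )
    return filters


# The volume ID is a placeholder.
filters = datafy_snapshot_filters(source_volume_id="vol-0original")
print(filters)
```

With boto3, a list like this would typically be passed as `describe_snapshots(OwnerIds=["self"], Filters=filters)` on an EC2 client. Creating and attaching the restored volumes still goes through the Datafy API rather than AWS.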
