# Datafy Snapshots

{% hint style="info" %}
Datafy Snapshots are supported from version `1.30.0`!\
To restore volumes from Datafy Snapshots, make sure you have the correct version installed before attaching the new volumes.
{% endhint %}

## Datafy Snapshot Lifecycle

1. [**Create a Datafy Snapshot**](#creating-a-datafy-snapshot) - use the Datafy [`create-snapshot`](/resources/api.md#post-api-v1-volumes-volumeid-create-snapshot) API instead of the AWS `CreateSnapshot` API.
2. [**Manage Retention**](#snapshot-retention) - use your existing logic with the relevant volume tags to manage the retention of Datafy Snapshots.
3. [**Restoration**](#restoring-a-volume-from-a-datafy-snapshot) - use the Datafy [`create-volume-from-snapshot`](/resources/api.md#post-api-v1-volumes-create-from-snapshots) and [`attach`](/resources/api.md#post-api-v1-volumes-volumeid-attach) endpoints to create and attach a new autoscaling volume.

### Creating a Datafy Snapshot

To create a new Datafy Snapshot, call the [`create-snapshot` endpoint](/resources/api.md#post-api-v1-volumes-volumeid-create-snapshot) with the volume ID of the original volume managed by Datafy (as it appears in the Datafy dashboard). You should call the Datafy API wherever you currently call the AWS `CreateSnapshot` API.
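
As a rough illustration of swapping the AWS call for the Datafy one, the sketch below builds (but does not send) the request. The path is inferred from the endpoint name in the API reference; the base URL, the bearer-token header, and the empty JSON body are assumptions made for illustration only, so check the API reference for the real values.

```python
import urllib.request

# Assumption: placeholder base URL; replace with the one from the Datafy API reference.
DATAFY_API_BASE = "https://api.example.invalid"


def build_create_snapshot_request(volume_id: str, api_token: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for the create-snapshot endpoint.

    volume_id is the ID of the original volume as shown in the Datafy dashboard.
    """
    url = f"{DATAFY_API_BASE}/api/v1/volumes/{volume_id}/create-snapshot"
    return urllib.request.Request(
        url,
        data=b"{}",  # Assumption: empty JSON body; the real endpoint may expect fields.
        method="POST",
        headers={
            "Authorization": f"Bearer {api_token}",  # Assumption: bearer-token auth.
            "Content-Type": "application/json",
        },
    )


# Hypothetical volume ID and token, for illustration only.
request = build_create_snapshot_request("vol-0123456789abcdef0", "example-token")
print(request.get_method(), request.full_url)
```

Once the base URL and authentication scheme are confirmed, the request can be sent with `urllib.request.urlopen(request)` at the point in your automation where `CreateSnapshot` is called today.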

{% hint style="info" %}
Datafy snapshots need to be created directly, and cannot be created using AWS Data Lifecycle Manager (DLM).
{% endhint %}

When you snapshot an autoscaling volume, Datafy creates a separate snapshot for each underlying Datafy volume associated with it. These snapshots carry tags that identify them as Datafy snapshots and track their lineage. Use these tags to find the original volume a snapshot is associated with, and to find all of the snapshots that were created together and belong to the same set.

{% hint style="success" %}
Datafy snapshots work for any volume managed by Datafy, including unattached volumes
{% endhint %}

<details>

<summary>Datafy Snapshot Tags</summary>

<table><thead><tr><th width="270.40985107421875">Tag</th><th>Description</th></tr></thead><tbody><tr><td><code>Managed-By: Datafy.io</code></td><td>Identifies the snapshot as managed by Datafy</td></tr><tr><td><code>datafy:source-volume:id</code></td><td>Volume ID of the original volume before autoscaling. The volume ID associated with the snapshot by AWS is the smaller volume after autoscaling.</td></tr><tr><td><code>datafy:snapshot:id</code><br></td><td>A snapshot ID generated by Datafy (designated with the prefix <code>dsnap</code>), shared by all of the related snapshots created when snapshotting an autoscaling volume.</td></tr><tr><td><code>datafy:snapshot:generation-id</code></td><td>Generation number of the Datafy snapshot. The first snapshot taken during migration is generation 0, and the generation increases by 1 with every following snapshot.</td></tr></tbody></table>

</details>
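
To make the tag scheme concrete, here is a minimal sketch that groups EBS snapshots into Datafy snapshot sets by their `datafy:snapshot:id` tag. It assumes the snapshots are plain dicts shaped like the entries returned by the AWS `DescribeSnapshots` API (`SnapshotId` plus a `Tags` list of `Key`/`Value` pairs); the function names are hypothetical.

```python
from collections import defaultdict


def tags_of(snapshot: dict) -> dict:
    """Flatten the [{'Key': ..., 'Value': ...}] tag list into a plain dict."""
    return {tag["Key"]: tag["Value"] for tag in snapshot.get("Tags", [])}


def group_datafy_snapshots(snapshots: list) -> dict:
    """Map each datafy:snapshot:id to the EBS snapshot IDs that share it."""
    groups = defaultdict(list)
    for snapshot in snapshots:
        tags = tags_of(snapshot)
        if tags.get("Managed-By") != "Datafy.io":
            continue  # Not a Datafy snapshot; leave it to the regular flow.
        set_id = tags.get("datafy:snapshot:id")
        if set_id:
            groups[set_id].append(snapshot["SnapshotId"])
    return dict(groups)
```

Each resulting group holds the snapshots that were created together for one autoscaling volume and should be handled as a unit.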

### Snapshot Retention

Snapshots created by Datafy are stored in your AWS account like regular EBS snapshots, and their retention should be handled by your existing retention tools and policies, with the following adjustments:

* Use the `datafy:source-volume:id` tag to determine the volume ID of the original non-autoscaling volume associated with the snapshot.

If you use a backup management tool to manage snapshot retention as part of a larger backup object, check out how to integrate with backup management tools [below](#integration-with-backup-managers).

{% hint style="info" %}
Be aware that AWS Data Lifecycle Manager (DLM) will not automatically manage snapshots created by third-party tools, including Datafy. You must apply your own logic to clean up these snapshots.
{% endhint %}
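
One possible way to adapt a keep-the-newest-N retention policy is sketched below. It groups snapshots per original volume using `datafy:source-volume:id`, then prunes whole snapshot sets (snapshots sharing a `datafy:snapshot:id`) so that no set is left partially deleted. The function name, the `keep` parameter, and the policy itself are assumptions for illustration; the input dicts follow the shape of AWS `DescribeSnapshots` entries (`SnapshotId`, `StartTime`, `Tags`).

```python
def snapshots_to_prune(snapshots: list, keep: int = 3) -> list:
    """Return EBS snapshot IDs to delete, keeping the newest `keep` sets per source volume."""
    by_volume = {}  # source volume ID -> {snapshot set ID -> [snapshots]}
    for snapshot in snapshots:
        tags = {t["Key"]: t["Value"] for t in snapshot.get("Tags", [])}
        source = tags.get("datafy:source-volume:id")
        set_id = tags.get("datafy:snapshot:id")
        if tags.get("Managed-By") != "Datafy.io" or not source or not set_id:
            continue  # Only Datafy snapshots are handled here.
        by_volume.setdefault(source, {}).setdefault(set_id, []).append(snapshot)

    prune = []
    for sets in by_volume.values():
        # Order sets newest first, using the latest StartTime within each set.
        ordered = sorted(
            sets.values(),
            key=lambda members: max(m["StartTime"] for m in members),
            reverse=True,
        )
        for members in ordered[keep:]:
            prune.extend(m["SnapshotId"] for m in members)
    return sorted(prune)
```

The returned IDs can then be passed to whatever deletion step your retention tooling already uses for regular EBS snapshots.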

### Restoring a Volume From a Datafy Snapshot

Datafy snapshots represent the underlying volumes created by Datafy when a volume is autoscaling. As such, they can be restored to autoscaling volumes with the following adjustments to your restoration flow:

1. Use the tags `datafy:source-volume:id` and `datafy:snapshot:id` and the snapshot timestamp to find all of the snapshots for the volume and time you need.
2. After selecting the desired snapshots and making sure you have all of the snapshots with the same `datafy:snapshot:id`, create a new autoscaling volume using the Datafy [`create-volume-from-snapshot`](/resources/api.md#post-api-v1-volumes-create-from-snapshots) endpoint.
3. Attach the new autoscaling volume to an EC2 instance with Datafy AutoScaler installed using the Datafy [`attach`](/resources/api.md#post-api-v1-volumes-volumeid-attach) endpoint. Mount the volume as you would any EBS volume.
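
Step 1 above can be sketched as a small selection function: for a given original volume, pick the newest snapshot set in which every snapshot started at or before the target time. This is only an illustration under assumptions: the function name is hypothetical, and the input dicts follow the shape of AWS `DescribeSnapshots` entries (`SnapshotId`, `StartTime`, `Tags`). It cannot tell whether a set is complete, so confirm that separately before restoring.

```python
from datetime import datetime


def pick_snapshot_set(snapshots: list, source_volume_id: str, not_after: datetime):
    """Return (datafy:snapshot:id, [EBS snapshot IDs]) for the newest matching set, or None."""
    sets = {}  # snapshot set ID -> [snapshots]
    for snapshot in snapshots:
        tags = {t["Key"]: t["Value"] for t in snapshot.get("Tags", [])}
        if tags.get("datafy:source-volume:id") != source_volume_id:
            continue
        set_id = tags.get("datafy:snapshot:id")
        if set_id:
            sets.setdefault(set_id, []).append(snapshot)

    # Keep only sets whose snapshots all started at or before the target time.
    candidates = [
        (max(m["StartTime"] for m in members), set_id, members)
        for set_id, members in sets.items()
        if all(m["StartTime"] <= not_after for m in members)
    ]
    if not candidates:
        return None
    _, set_id, members = max(candidates, key=lambda candidate: candidate[0])
    return set_id, sorted(m["SnapshotId"] for m in members)
```

The snapshot IDs it returns are the ones to hand to the `create-volume-from-snapshot` endpoint in step 2; the exact request format is defined in the API reference.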

#### Offline Restoration

Datafy snapshots are intended to be restored to an instance with Datafy AutoScaler installed and access to the Datafy API. For emergencies where this isn't possible, we provide local offline tools that enable access to the data saved in Datafy snapshots without any additional dependencies.

{% columns %}
{% column %}
{% content-ref url="/pages/LRYOScr01Q3bvz6MaaBn" %}
[Offline Restoration - Autoscaling](/volume-lifecycle/datafy-snapshots/offline-restoration-autoscaling.md)
{% endcontent-ref %}

* For restoring Datafy snapshots when the Datafy API is unavailable
  {% endcolumn %}

{% column %}
{% content-ref url="/pages/myH4NbhfZafh0pDaZimO" %}
[Offline Restoration - Non Autoscaling](/volume-lifecycle/datafy-snapshots/offline-restoration-non-autoscaling.md)
{% endcontent-ref %}

* For complete offline restoration of Datafy snapshots, without Datafy Autoscaler
  {% endcolumn %}
  {% endcolumns %}

## Integration with Backup Managers

Datafy snapshots can be integrated into the workflows of your existing backup management tools. They work seamlessly with any backup tool that uses the Kubernetes CSI, and need only minimal integration work for custom backup systems.

{% hint style="info" %}
Use a different type of backup manager? Ask us about integration with other tools!
{% endhint %}

### Kubernetes CSI Integration

Datafy integrates directly with the [Kubernetes CSI Snapshotter](https://kubernetes.io/docs/concepts/storage/volume-snapshots/), allowing Kubernetes-native backup tools like Velero to seamlessly create snapshots of Datafy-managed volumes. For an overview of how Datafy extends the CSI driver in your cluster, see [AutoScaler on Kubernetes](/how-it-works/autoscaler-on-kubernetes.md). The resulting snapshots are associated with specific PVCs, and can be used directly in the rest of the backup and restoration flow.

To enable this integration:

{% stepper %}
{% step %}
Make sure your cluster is set up for taking snapshots through the CSI:

* The [`snapshot-controller`](https://docs.aws.amazon.com/eks/latest/userguide/csi-snapshot-controller.html) EKS add-on is installed on your cluster, and the [`csi-snapshotter`](https://kubernetes.io/docs/concepts/storage/volume-snapshots/) sidecar is installed on your cluster's CSI-controller pod
* An appropriate [`VolumeSnapshotClass`](https://kubernetes.io/docs/concepts/storage/volume-snapshot-classes/) resource is defined in the cluster

{% hint style="info" %}
You can test this setup by creating a `VolumeSnapshot` resource directly in Kubernetes.
{% endhint %}

<details>

<summary>VolumeSnapshot Examples</summary>

{% code title="Define a VolumeSnapshotClass" %}

```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: aws-ebs-snapshot-class
driver: ebs.csi.aws.com
deletionPolicy: Delete
```

{% endcode %}

{% code title="Create a Snapshot" %}

```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: ebs-volume-snapshot
spec:
  volumeSnapshotClassName: aws-ebs-snapshot-class
  source:
    persistentVolumeClaimName: aws-ebs-volume
```

{% endcode %}

</details>
{% endstep %}

{% step %}
Configure your backup tool to use the CSI Snapshotter (and not the AWS native snapshotter).

When a snapshot is triggered, it will call the Datafy API in the background for Autoscaling volumes.
{% endstep %}
{% endstepper %}

### Custom Backup Systems

If you use a custom backup manager or automation script, adjust your logic to use Datafy snapshots at each part of the [snapshot lifecycle](#datafy-snapshot-lifecycle).

1. **Snapshot creation** - Use the [Datafy API](/resources/api.md#post-api-v1-volumes-volumeid-create-snapshot) instead of the AWS snapshot API to create snapshots of AutoScaling volumes.
2. **Snapshot retention** - Retention policies that target all snapshots in the account will apply to Datafy snapshots as well. If you use targeted retention policies, make sure to apply them to snapshots with the tag `Managed-By: Datafy.io`.
3. **Restoration from snapshots** - As detailed [above](#restoring-a-volume-from-a-datafy-snapshot), restoring volumes from Datafy snapshots follows a process similar to restoring volumes from regular EBS snapshots. Your restoration logic should:
   1. Ensure that Datafy AutoScaler is installed on the instances the new volumes will be attached to.
   2. Identify the relevant snapshots using the `datafy:source-volume:id` and `datafy:snapshot:id` tags, and use the Datafy API instead of AWS to [create new volumes](/resources/api.md#post-api-v1-volumes-create-from-snapshots) from the snapshots and to [attach](/resources/api.md#post-api-v1-volumes-volumeid-attach) them.
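
For scripts that look up snapshots through the AWS API, the tag-based identification above can be expressed as `DescribeSnapshots` filters. The sketch below only builds the filter list, using the standard AWS `tag:<key>` filter syntax; the function name and parameters are hypothetical.

```python
def datafy_snapshot_filters(source_volume_id=None, snapshot_set_id=None) -> list:
    """Build DescribeSnapshots filters that match Datafy snapshots by tag."""
    filters = [{"Name": "tag:Managed-By", "Values": ["Datafy.io"]}]
    if source_volume_id:
        # Narrow to snapshots of one original (pre-autoscaling) volume.
        filters.append(
            {"Name": "tag:datafy:source-volume:id", "Values": [source_volume_id]}
        )
    if snapshot_set_id:
        # Narrow to one set of snapshots that were created together.
        filters.append({"Name": "tag:datafy:snapshot:id", "Values": [snapshot_set_id]})
    return filters


# Example use with boto3 (requires AWS credentials, so it is not executed here):
#   import boto3
#   ec2 = boto3.client("ec2")
#   response = ec2.describe_snapshots(
#       OwnerIds=["self"],
#       Filters=datafy_snapshot_filters(source_volume_id="vol-0123456789abcdef0"),
#   )
```

Filtering on the AWS side keeps retention and restoration scripts from having to page through every snapshot in the account.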


