Automating Backups of On-Premises File Shares to AWS S3 Glacier Deep Archive with AWS Storage Gateway

For many organizations, the journey to the cloud involves a hybrid approach, where existing on-premises infrastructure coexists with cloud services. A critical aspect of this hybrid strategy is data protection, specifically ensuring that valuable on-premises data is securely and cost-effectively backed up offsite. While traditional tape libraries or offsite data centers offer solutions, they often come with high operational costs, slow recovery times, and limited scalability.

This is where AWS Storage Gateway comes in, offering a bridge from your data center to the cloud for your backup needs. Combined with the very low-cost, durable storage of S3 Glacier Deep Archive, it lets you build an efficient hybrid cloud backup solution on AWS that automates your data protection strategy.

This article will guide you through the process of setting up AWS Storage Gateway to automate backups of your on-premises file shares directly to AWS S3 Glacier Deep Archive, providing a practical blueprint for achieving resilient, cost-optimized, and scalable data archival.

The Challenge of On-Premises Backups

Traditional on-premises backup solutions face several hurdles:

  • Cost: Hardware (tape drives, disks, servers), offsite storage facilities, power, cooling, and maintenance all contribute to significant CapEx and OpEx.
  • Scalability: Expanding storage capacity requires purchasing and integrating new hardware, a time-consuming process.
  • Durability and Resilience: Tape media can degrade, and physical offsite locations are vulnerable to regional disasters. Managing multiple copies across diverse locations adds complexity.
  • Recovery Time: Retrieving data from offsite tapes can take hours or even days.
  • Management Overhead: Manual processes for tape rotation, media management, and offsite transportation are labor-intensive and error-prone.

Introducing AWS Storage Gateway: Your Bridge to Cloud Storage

AWS Storage Gateway is a hybrid cloud storage service that connects an on-premises software appliance with cloud-based storage, so your on-premises applications can write to AWS storage over standard file, tape, and block protocols. For backups, its “File Gateway” and “Tape Gateway” types are particularly relevant.

  • File Gateway: Presents network file system (NFS or SMB) shares to your on-premises applications. Data written to these shares is asynchronously uploaded to Amazon S3. For long-term archival, you can configure lifecycle policies to transition objects to S3 Glacier Flexible Retrieval and then S3 Glacier Deep Archive.
  • Tape Gateway: Provides a virtual tape library (VTL) interface to your existing backup software (e.g., Veeam, Veritas NetBackup). Your backup application writes data to virtual tapes, which are then uploaded to Amazon S3 and can be seamlessly transitioned to S3 Glacier Deep Archive.

For file shares, the File Gateway is often the simplest and most direct way to back up with AWS Storage Gateway, because it does not change your existing file access patterns.

S3 Glacier Deep Archive for Long-Term Storage: The Ultra-Low-Cost Solution

When it comes to highly durable, long-term, and extremely cost-effective data archival, Amazon S3 Glacier Deep Archive stands out.

  • Extremely Low Cost: It’s the lowest-cost storage class in AWS, designed for data that is accessed infrequently (once or twice a year).
  • High Durability: Offers 99.999999999% (11 nines) durability over a given year, meaning your data is highly resilient against loss.
  • Retrieval Options: While designed for infrequent access, it offers two retrieval tiers: standard (typically within 12 hours) and bulk (typically within 48 hours). Expedited retrieval is not available for Deep Archive. This makes it suitable for disaster recovery archives where immediate access isn’t the primary concern.
  • Integration: Seamlessly integrates with S3 lifecycle policies, making it easy to transition data from other S3 storage classes (like S3 Standard or S3 Intelligent-Tiering).
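Getting an archived object back starts with a restore request. The sketch below builds the parameters for one, assuming the boto3 S3 client's restore_object call; the bucket name and object key are hypothetical placeholders, and the helper itself is illustrative rather than part of any AWS library.

```python
# Sketch: build the parameters for an S3 restore request on a Deep Archive object.
# Bucket and key names are hypothetical placeholders.

DEEP_ARCHIVE_TIERS = {"Standard", "Bulk"}  # retrieval tiers usable with Deep Archive


def build_restore_request(bucket, key, days_available=7, tier="Standard"):
    """Return keyword arguments for boto3's S3 client restore_object call."""
    if tier not in DEEP_ARCHIVE_TIERS:
        raise ValueError(f"tier must be one of {sorted(DEEP_ARCHIVE_TIERS)}")
    if days_available < 1:
        raise ValueError("days_available must be at least 1")
    return {
        "Bucket": bucket,
        "Key": key,
        "RestoreRequest": {
            "Days": days_available,  # how long the restored copy stays readable
            "GlacierJobParameters": {"Tier": tier},
        },
    }


request = build_restore_request("my-onprem-backups", "finance/2023/ledger.db", tier="Bulk")
print(request)
# With credentials configured, the call would be roughly:
#   import boto3
#   boto3.client("s3").restore_object(**request)
```

The restored copy is temporary; the object itself stays in Deep Archive once the requested number of days has passed.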

Together, Storage Gateway and Glacier Deep Archive make a compelling hybrid cloud backup solution on AWS.
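To get a feel for the cost difference, here is a rough back-of-the-envelope sketch. The per-GB prices are illustrative assumptions, not current AWS pricing, so check the S3 pricing page for your region. Retrieval fees and Deep Archive's 180-day minimum storage duration are not modeled.

```python
# Sketch: compare monthly storage cost across two storage classes.
# The per-GB prices below are illustrative placeholders, not current AWS pricing.

def monthly_storage_cost(size_gb, price_per_gb_month):
    """Monthly cost in dollars for storing size_gb at the given per-GB price."""
    return size_gb * price_per_gb_month


ASSUMED_STANDARD_PRICE = 0.023      # $/GB-month, assumption
ASSUMED_DEEP_ARCHIVE_PRICE = 0.001  # $/GB-month, assumption

archive_gb = 50 * 1024  # a hypothetical 50 TB archive, expressed in GB
standard = monthly_storage_cost(archive_gb, ASSUMED_STANDARD_PRICE)
deep = monthly_storage_cost(archive_gb, ASSUMED_DEEP_ARCHIVE_PRICE)
print(f"Standard: ${standard:,.2f}/month, Deep Archive: ${deep:,.2f}/month")
print(f"Ratio: {standard / deep:.0f}x")
```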

Building Your Hybrid Cloud Backup Solution on AWS (File Share Focus)

Let’s walk through the steps to set up AWS Storage Gateway (File Gateway) to back up on-premises file shares to S3 Glacier Deep Archive.

Step 1: Deploy and Activate AWS Storage Gateway (File Gateway)

  1. Choose a Host Platform: Deploy the Storage Gateway appliance on a supported platform in your data center. Options include VMware ESXi, Microsoft Hyper-V, Linux KVM, or an EC2 instance (for testing/hybrid cloud-to-cloud).
    • Tip: Ensure your host has sufficient CPU, RAM, and dedicated local disk for cache.
  2. Download and Deploy the VM/Image: From the AWS Storage Gateway console, select “File Gateway” and download the appropriate virtual appliance image. Deploy it on your chosen virtualization platform.
  3. Activate the Gateway: Once the VM is running, get its local IP address. Go back to the AWS console, enter the VM’s IP, and follow the activation steps. You’ll link it to your AWS account and choose a region.

Step 2: Configure Local Disk for Cache

During setup, you’ll allocate at least one local disk on your VM as cache storage. A File Gateway uses the cache for two things:

  • Recently accessed data: Keeps a local copy of frequently and recently used files, improving read performance.
  • Data pending upload: Holds data written to the file shares until it has been uploaded to Amazon S3, so writes stay fast even if network bandwidth to AWS is limited.

A separate upload buffer disk is required only for Volume Gateway and Tape Gateway, not for File Gateway.
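When sizing the local disk, it helps to estimate how much data piles up on the gateway while clients write faster than the uplink can drain. The following is a rough sketch of that arithmetic; the write rate, upload rate, and backup window are made-up inputs, and real gateways have additional overhead that this ignores.

```python
# Sketch: estimate how much local disk a backup window needs when
# clients write faster than the gateway can upload. All inputs are assumptions.

def pending_upload_gib(write_mbps, upload_mbps, window_hours):
    """GiB of data still waiting for upload at the end of the backup window."""
    surplus_mbps = max(write_mbps - upload_mbps, 0)  # megabits per second
    surplus_bytes = surplus_mbps * 1_000_000 / 8 * window_hours * 3600
    return surplus_bytes / 2**30


backlog = pending_upload_gib(write_mbps=1000, upload_mbps=200, window_hours=4)
print(f"Data waiting for upload after the window: {backlog:.0f} GiB")
```

If the result is larger than the disk you planned to allocate, either provision more local disk or spread the backup job over a longer window.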

Step 3: Create an S3 Bucket for Backups

This S3 bucket will be the destination for your file share data.

  1. Go to S3 Console: Navigate to Amazon S3.
  2. Create Bucket: Choose “Create bucket.”
    • Bucket name: Choose a unique, descriptive name (e.g., my-onprem-backups).
    • Region: Select the same region as your Storage Gateway.
    • Block Public Access: Keep this enabled (default) for security.
    • Versioning: Enable Versioning on the bucket. This is crucial for data protection and recovery of older versions or accidentally deleted files.
  3. Create Bucket: Click “Create bucket.”
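If you prefer to script these console choices, the same settings map onto three S3 API requests. Below is a minimal sketch that only builds the request parameters, assuming the boto3 S3 client methods create_bucket, put_public_access_block, and put_bucket_versioning; the bucket name and region are placeholders.

```python
# Sketch: the three S3 API requests that mirror the console choices above.
# The bucket name and region are hypothetical placeholders.

def build_bucket_requests(bucket, region):
    """Return kwargs for create_bucket, put_public_access_block, put_bucket_versioning."""
    create = {"Bucket": bucket}
    if region != "us-east-1":
        # us-east-1 is the default location and must not be passed as a constraint
        create["CreateBucketConfiguration"] = {"LocationConstraint": region}
    block_public = {
        "Bucket": bucket,
        "PublicAccessBlockConfiguration": {
            "BlockPublicAcls": True,
            "IgnorePublicAcls": True,
            "BlockPublicPolicy": True,
            "RestrictPublicBuckets": True,
        },
    }
    versioning = {"Bucket": bucket, "VersioningConfiguration": {"Status": "Enabled"}}
    return create, block_public, versioning


create, block_public, versioning = build_bucket_requests("my-onprem-backups", "eu-west-1")
print(create)
# With credentials configured, the calls would be roughly:
#   import boto3
#   s3 = boto3.client("s3", region_name="eu-west-1")
#   s3.create_bucket(**create)
#   s3.put_public_access_block(**block_public)
#   s3.put_bucket_versioning(**versioning)
```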

Step 4: Configure S3 Lifecycle Policy for Glacier Deep Archive

This policy will automatically transition data from S3 Standard to S3 Glacier Deep Archive after a specified period.

  1. Select Your Backup Bucket: In the S3 console, click on your newly created backup bucket.
  2. Go to “Management” tab.
  3. Create lifecycle rule: Click “Create lifecycle rule.”
    • Rule name: TransitionToDeepArchive
    • Scope: Apply to all objects in the bucket (or specific prefixes if needed).
    • Lifecycle rule actions:
      • Check “Transition current versions of objects between storage classes.”
      • Optionally add an intermediate transition first: S3 Intelligent-Tiering if access patterns are unknown, or S3 Glacier Instant Retrieval if you want faster recovery than Deep Archive at a still-low cost.
      • Add a transition to S3 Glacier Deep Archive a set number of days after object creation (e.g., 90, 180, or 365 days for a long-term archive). Once an object is in Deep Archive, it must be restored in S3 before it can be read through the file share.
      • Important: Consider also adding “Permanently delete noncurrent versions” and “Delete expired object delete markers” after a certain number of days to manage older versions and delete markers for cost optimization, but ensure your retention policy is met.
  4. Create rule: Click “Create rule.”
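The rule built in the console above can also be expressed as a lifecycle configuration document. Here is a minimal sketch, assuming a single direct transition to Deep Archive and the boto3 S3 client's put_bucket_lifecycle_configuration call; the 90-day and 365-day values are example choices, not recommendations.

```python
# Sketch: a lifecycle configuration equivalent to the console rule above.
# Day counts are example values; adjust them to your retention policy.
import json


def build_lifecycle_configuration(archive_after_days=90, noncurrent_expire_days=365, prefix=""):
    """Return a LifecycleConfiguration dict for put_bucket_lifecycle_configuration."""
    if archive_after_days < 0 or noncurrent_expire_days < 1:
        raise ValueError("day counts must be non-negative (and at least 1 for expiry)")
    rule = {
        "ID": "TransitionToDeepArchive",
        "Status": "Enabled",
        "Filter": {"Prefix": prefix},  # empty prefix applies to the whole bucket
        "Transitions": [
            {"Days": archive_after_days, "StorageClass": "DEEP_ARCHIVE"},
        ],
        # Clean up old versions and leftover delete markers in a versioned bucket
        "NoncurrentVersionExpiration": {"NoncurrentDays": noncurrent_expire_days},
        "Expiration": {"ExpiredObjectDeleteMarker": True},
    }
    return {"Rules": [rule]}


config = build_lifecycle_configuration(archive_after_days=90)
print(json.dumps(config, indent=2))
# With credentials configured, the call would be roughly:
#   import boto3
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="my-onprem-backups", LifecycleConfiguration=config)
```

Keeping the rule in code makes it easy to review alongside your retention policy before it is applied.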

Step 5: Create a File Share on Storage Gateway

Now, link your on-premises file share to the S3 bucket.

  1. Go to Storage Gateway Console: Navigate back to the AWS Storage Gateway console.
  2. Select Your Gateway: Click on your activated File Gateway.
  3. Create File Share: Click “Create file share.”
    • Amazon S3 bucket name: Select the S3 bucket you created (e.g., my-onprem-backups).
    • Access Point: Leave as default.
    • Access options: Choose NFS or SMB depending on your on-premises clients.
    • Allowed clients: Specify the IP ranges of your on-premises servers that will access this share.
    • Audit logs: (Optional) Enable CloudWatch logging for activity.
  4. Create File Share: Click “Create file share.”
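For repeatable setups, the file share can be created through the API instead of the console. The sketch below builds the request for an NFS share, assuming the boto3 Storage Gateway client's create_nfs_file_share call. The gateway ARN, the IAM role ARN (the role the gateway assumes to access the bucket), and the client ranges are all hypothetical values.

```python
# Sketch: build a create_nfs_file_share request and validate the allowed clients.
# All ARNs and IP ranges below are hypothetical placeholders.
import ipaddress
import uuid


def build_nfs_file_share_request(gateway_arn, role_arn, bucket, client_cidrs):
    """Return kwargs for the Storage Gateway client's create_nfs_file_share call."""
    # ip_network raises ValueError for malformed ranges or ranges with host bits set
    clients = [str(ipaddress.ip_network(cidr, strict=True)) for cidr in client_cidrs]
    return {
        "ClientToken": str(uuid.uuid4()),  # idempotency token
        "GatewayARN": gateway_arn,
        "Role": role_arn,
        "LocationARN": f"arn:aws:s3:::{bucket}",
        "ClientList": clients,
        "DefaultStorageClass": "S3_STANDARD",  # lifecycle rules archive it later
    }


share_request = build_nfs_file_share_request(
    gateway_arn="arn:aws:storagegateway:us-east-1:111122223333:gateway/sgw-12A3456B",
    role_arn="arn:aws:iam::111122223333:role/example-file-gateway-role",
    bucket="my-onprem-backups",
    client_cidrs=["10.0.1.0/24", "10.0.2.15"],
)
print(share_request["ClientList"])
# With credentials configured, the call would be roughly:
#   import boto3
#   boto3.client("storagegateway").create_nfs_file_share(**share_request)
```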

Step 6: Mount the File Share On-Premises and Start Backing Up

  1. Get Mount Command: After creating the file share, the Storage Gateway console will provide the mount command for NFS or the network path for SMB.
    • NFS Example: sudo mount -t nfs -o nolock,hard <gateway-ip>:/<share-name> /mnt/gateway-share
    • SMB Example: \\<gateway-ip>\<share-name>
  2. Mount the Share: On your on-premises server (the one with the file share you want to back up), use the provided command/path to mount the Storage Gateway file share.
  3. Copy/Sync Data:
    • Manual Copy: Simply copy files to the mounted share.
    • Scheduled Copy: Use rsync (Linux), Robocopy (Windows), or your existing backup software (configured to copy to a network share) to copy the data to the mounted Storage Gateway share on a schedule.
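If rsync or Robocopy is not an option, a scheduled copy can be approximated with a short script. The sketch below is a simplified stand-in for those tools, not a replacement: it copies files that are missing or changed (by size or modification time) and does not propagate deletions. The paths in the usage comment are placeholders.

```python
# Sketch: a minimal one-way sync from a local directory to the mounted share.
# A simplified stand-in for rsync/Robocopy; deletions are not propagated.
import shutil
from pathlib import Path


def sync_tree(source, destination):
    """Copy files that are missing or changed from source to destination.

    Returns the relative paths (POSIX style) of the files that were copied.
    """
    source, destination = Path(source), Path(destination)
    copied = []
    for src in sorted(source.rglob("*")):
        if not src.is_file():
            continue
        dst = destination / src.relative_to(source)
        if dst.exists():
            s, d = src.stat(), dst.stat()
            # Skip files whose size matches and whose copy is not older
            if s.st_size == d.st_size and int(s.st_mtime) <= int(d.st_mtime):
                continue
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy2 preserves the modification time
        copied.append(src.relative_to(source).as_posix())
    return copied


# Example usage (paths are placeholders), e.g. from a nightly cron job:
#   changed = sync_tree("/srv/fileshare", "/mnt/gateway-share")
#   print(f"Copied {len(changed)} files")
```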

Step 7: Monitor and Verify

  1. Storage Gateway Console: Monitor the “File Shares” tab to see the upload status and health.
  2. S3 Console: Verify that files are appearing in your S3 backup bucket.
  3. CloudWatch: Monitor CloudWatch metrics for your Storage Gateway (e.g., CloudBytesUploaded, CacheHitPercent, CachePercentDirty) and your S3 bucket.
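Upload volume can also be checked programmatically. The sketch below builds a query for the last 24 hours, assuming the CloudBytesUploaded metric in the AWS/StorageGateway namespace with GatewayId and GatewayName dimensions, and the boto3 CloudWatch client's get_metric_statistics call. The gateway ID and name are hypothetical.

```python
# Sketch: build a CloudWatch query for hourly upload totals from the gateway.
# Gateway ID and name are hypothetical; metric and dimension names are assumptions
# to verify against your gateway's metrics in the CloudWatch console.
from datetime import datetime, timedelta, timezone


def build_upload_metric_query(gateway_id, gateway_name, hours=24, now=None):
    """Return kwargs for the CloudWatch client's get_metric_statistics call."""
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/StorageGateway",
        "MetricName": "CloudBytesUploaded",
        "Dimensions": [
            {"Name": "GatewayId", "Value": gateway_id},
            {"Name": "GatewayName", "Value": gateway_name},
        ],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": 3600,         # one datapoint per hour
        "Statistics": ["Sum"],  # total bytes uploaded in each hour
    }


query = build_upload_metric_query("sgw-12A3456B", "onprem-file-gateway")
print(query["MetricName"], query["StartTime"], query["EndTime"])
# With credentials configured, the call would be roughly:
#   import boto3
#   points = boto3.client("cloudwatch").get_metric_statistics(**query)["Datapoints"]
```

A run of hours with zero bytes uploaded during a scheduled backup window is a useful signal to alarm on.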

Figure: On-Premises to Glacier Deep Archive Backup Flow

Conclusion: A Resilient and Cost-Optimized Hybrid Cloud Backup Solution on AWS

By combining AWS Storage Gateway with the very low storage cost of S3 Glacier Deep Archive, you can establish a robust and highly automated hybrid cloud backup solution on AWS. This approach removes the complexity and cost of traditional tape backups, scales without new hardware purchases, and keeps your critical on-premises data securely archived in the cloud, with restores from Deep Archive typically completing within hours rather than instantly. Use this pattern to strengthen your data protection strategy and reduce operational overhead.
