Day 9: AWS Solutions Architect Professional Prep — Storage Deep Dive
The right choice of storage service determines your speed, your ability to share data, and ultimately your monthly bill, so it is a critical decision. Amazon Web Services (AWS) offers several primary ways to store data, and today’s lesson examines how to pick the right one.
Here are four key questions that can guide us in selecting and optimizing AWS storage:
Q1: What are the three main types of storage, and when do I use each one?
There are three primary services, differentiated by the kind of data access they offer:
1. EBS (Elastic Block Store): It is like a personal hard drive. It is best for the computer’s operating system (boot volumes) and crucial, constantly changing data, like customer databases. By default, only one virtual computer can use a volume at a time.
2. EFS (Elastic File System): This is comparable to a shared network folder. It is best for sharing data among many virtual computers simultaneously. EFS is fully managed and elastic, meaning it grows and shrinks automatically as you add and remove files.
Because it is a regional service, the data is spread across multiple locations (Availability Zones, or AZs) for high availability.
3. FSx Family: FSx is like a specialized enterprise server. It provides tailored file systems for specific enterprise tasks. It is best for handling complex business needs, like running specific Windows applications, integrating with Active Directory, or doing heavy-duty data crunching (high-performance computing, or HPC).
It offers specialized, fully managed file systems (such as Windows file shares over SMB, or Lustre).
For Windows: FSx for Windows File Server is a good fit if your company uses Microsoft’s SMB protocol and Active Directory. It supports high availability across multiple Availability Zones (Multi-AZ).
For Supercomputers: FSx for Lustre is built for extreme speed, which makes it a good fit for machine learning and big data analytics. It can also be linked to an S3 bucket and sync data with it automatically.
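The selection rules above can be sketched as a small decision helper. This is only a rough first-pass filter, not an official AWS decision tree: the `Workload` class, its field names, and the order of the checks are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical description of what a workload needs from storage."""
    shared_across_instances: bool = False        # many instances use the data at once
    needs_smb_or_active_directory: bool = False  # Windows-style file shares
    is_hpc_or_ml: bool = False                   # heavy-duty data crunching


def pick_storage(w: Workload) -> str:
    """Return a first-guess storage service, following the Q1 summary."""
    # Specialized enterprise needs are checked first (assumed priority order).
    if w.needs_smb_or_active_directory:
        return "FSx for Windows File Server"
    if w.is_hpc_or_ml:
        return "FSx for Lustre"
    # General-purpose shared file access.
    if w.shared_across_instances:
        return "EFS"
    # Boot volumes and single-instance databases.
    return "EBS"


print(pick_storage(Workload()))
print(pick_storage(Workload(shared_across_instances=True)))
print(pick_storage(Workload(is_hpc_or_ml=True)))
```

A real architecture decision would also weigh cost, latency, and Availability Zone placement, which this sketch deliberately ignores.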
Q2: How do these services handle growth and sharing?
The key differences are in scaling and shared access:
EBS generally allows only a single instance to access a volume, although a Multi-Attach feature exists for io1/io2 volumes attached to instances in the same AZ. Scaling is a manual process: you resize the volume yourself.
EFS is designed for multiple instances and scales elastically, growing and shrinking on its own as files are added and removed. It is a regional service, so it is not limited to one Availability Zone (AZ).
FSx also supports multiple instances, but its scaling is managed: capacity is adjusted through the service rather than changing automatically with every file.
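The EBS sharing rule is easy to get wrong, so here is a minimal sketch of it as a check. The function name and arguments are hypothetical, and it encodes only the two conditions mentioned above (volume type and same AZ). Real Multi-Attach has further requirements that are not modeled here.

```python
from typing import List

# Volume types that support Multi-Attach, per the text above.
MULTI_ATTACH_TYPES = {"io1", "io2"}


def can_attach_ebs(volume_type: str, volume_az: str, instance_azs: List[str]) -> bool:
    """Check whether one EBS volume could serve all the given instances.

    instance_azs holds the Availability Zone of each instance that
    needs the volume (hypothetical input shape).
    """
    # An EBS volume can only attach to instances in its own AZ.
    if any(az != volume_az for az in instance_azs):
        return False
    # A single instance works with any volume type.
    if len(instance_azs) <= 1:
        return True
    # Several instances at once require a Multi-Attach capable type.
    return volume_type in MULTI_ATTACH_TYPES


print(can_attach_ebs("gp3", "us-east-1a", ["us-east-1a"]))
print(can_attach_ebs("gp3", "us-east-1a", ["us-east-1a", "us-east-1a"]))
print(can_attach_ebs("io2", "us-east-1a", ["us-east-1a", "us-east-1a"]))
```

When this check fails because instances sit in different AZs, that is usually the signal to look at EFS or FSx instead.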
Q3: How can I save money on my cloud storage?
You can implement lifecycle management and choose cost-effective options: the gp3 volume type for EBS and the Infrequent Access storage class for EFS.
For EBS, the recommendation is to choose the gp3 volume type over the older gp2 for general-purpose workloads. With gp3 you can provision IOPS and throughput independently of volume size, and for the same baseline performance its per-GB price can be about 20% lower.
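To make the 20% figure concrete, here is a small arithmetic sketch. The per-GB prices are assumptions for illustration only (roughly what us-east-1 has listed). Check the current EBS pricing page for your region, and note that extra provisioned IOPS or throughput on gp3 are billed separately and are ignored here.

```python
# Assumed illustrative prices in USD per GB-month; NOT authoritative.
GP2_PRICE_PER_GB = 0.10
GP3_PRICE_PER_GB = 0.08


def monthly_storage_cost(size_gb: float, price_per_gb: float) -> float:
    """Storage-only monthly cost for one volume."""
    return size_gb * price_per_gb


def gp3_saving_percent(size_gb: float) -> float:
    """Percentage saved on storage by moving a gp2 volume to gp3."""
    gp2 = monthly_storage_cost(size_gb, GP2_PRICE_PER_GB)
    gp3 = monthly_storage_cost(size_gb, GP3_PRICE_PER_GB)
    return (gp2 - gp3) / gp2 * 100


size = 500  # GB, hypothetical volume
print(f"gp2: ${monthly_storage_cost(size, GP2_PRICE_PER_GB):.2f} per month")
print(f"gp3: ${monthly_storage_cost(size, GP3_PRICE_PER_GB):.2f} per month")
print(f"saving: {gp3_saving_percent(size):.0f}%")
```

Because both prices are linear in size, the percentage saving is the same for any volume size under these assumptions.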
EFS Lifecycle Management is the biggest cost-saver. You can enable policies that automatically move files that haven’t been accessed for a certain number of days (say, 30 days) into the Infrequent Access (IA) storage class. This move can cut your storage costs for that cold data by up to 92%.
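As a rough sketch, a lifecycle policy like the one described can be built as plain data. The list below follows the shape of the `LifecyclePolicies` parameter that boto3's EFS client accepts in `put_lifecycle_configuration`. The helper names, the subset of allowed day values, and the savings estimate are assumptions for illustration. The estimate treats 92% as an upper bound and ignores IA access charges.

```python
# Subset of transition periods assumed here; EFS may accept other values too.
ALLOWED_IA_DAYS = (7, 14, 30, 60, 90)


def build_lifecycle_policies(days_until_ia: int) -> list:
    """Build the policy list for moving untouched files to Infrequent Access."""
    if days_until_ia not in ALLOWED_IA_DAYS:
        raise ValueError(f"days_until_ia must be one of {ALLOWED_IA_DAYS}")
    return [{"TransitionToIA": f"AFTER_{days_until_ia}_DAYS"}]


def max_bill_reduction(cold_fraction: float, ia_saving: float = 0.92) -> float:
    """Upper bound on the storage bill reduction if cold data moves to IA.

    cold_fraction is the share of stored data that is rarely accessed (0..1).
    """
    if not 0.0 <= cold_fraction <= 1.0:
        raise ValueError("cold_fraction must be between 0 and 1")
    return cold_fraction * ia_saving


policies = build_lifecycle_policies(30)
print(policies)

# With boto3 installed and credentials configured, the policies could be
# applied roughly like this (the file system ID is a placeholder):
#   boto3.client("efs").put_lifecycle_configuration(
#       FileSystemId="fs-EXAMPLE", LifecyclePolicies=policies)

print(f"{max_bill_reduction(0.6):.1%} maximum reduction if 60% of data is cold")
```

The second helper shows why the headline 92% only applies to the cold portion: the overall saving scales with how much of the file system is actually idle.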
General Optimization: Automate EBS snapshot creation using the Data Lifecycle Manager (DLM) and regularly delete unused snapshots.
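The cleanup half of that advice can be sketched as a filter over snapshot metadata. The dictionaries below mimic the `SnapshotId` and `StartTime` fields returned by EC2's describe-snapshots call. The function name and the rule that "unused" means "older than a retention window" are assumptions for illustration. DLM retention rules can handle this automatically for snapshots it creates.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List


def snapshots_past_retention(
    snapshots: List[Dict], now: datetime, max_age_days: int
) -> List[str]:
    """Return IDs of snapshots whose StartTime is older than the retention window."""
    cutoff = now - timedelta(days=max_age_days)
    return [s["SnapshotId"] for s in snapshots if s["StartTime"] < cutoff]


# Hypothetical sample data with placeholder IDs.
now = datetime(2024, 6, 30, tzinfo=timezone.utc)
sample = [
    {"SnapshotId": "snap-old", "StartTime": datetime(2024, 1, 1, tzinfo=timezone.utc)},
    {"SnapshotId": "snap-new", "StartTime": datetime(2024, 6, 25, tzinfo=timezone.utc)},
]

print(snapshots_past_retention(sample, now, max_age_days=30))
```

Before deleting anything in a real account, it is worth confirming that a snapshot is not backing a registered AMI or required for compliance, which this sketch does not check.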
Q4: Which storage options are best for the absolute highest performance?
For the absolute highest performance, choose faster drive types such as io2 and purpose-built file systems. There are specific options for specialized or critical workloads:
For Databases, use the io2 or io2 Block Express types, which are designed for high-performance, latency-critical workloads. io2 Block Express supports up to 256,000 IOPS per volume.
For Supercomputing, use FSx for Lustre. This file system is purpose-built for high-performance computing (HPC), machine learning (ML), and analytics. It can integrate directly with S3 buckets and automatically sync changes.
To conclude, EBS is block storage and is best used for databases, operating systems, and boot volumes. EFS is file storage and is best used for shared application data. FSx is file storage and is best used for enterprise workloads and high-performance computing (HPC).
