Amazon Elastic Block Store (Amazon EBS)
Amazon EBS provides designated storage for Amazon EC2 (Elastic Compute Cloud) instances.
We’ll dive into EC2 next week - for now, EC2 is a service that lets us create and use customised virtual computers.
Here are some things to note about Amazon EBS:
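- If you stop an Amazon EC2 instance, all the data on the attached EBS volume remains available. An EBS volume can also outlive a terminated instance, although by default the root volume is deleted when its instance is terminated.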
- It’s important to back up the data stored in EBS. You can take backups of EBS volumes by creating Amazon EBS snapshots.
- An EBS volume must be in the same Availability Zone as the EC2 instance it's attached to. This constraint makes sure there is fast access to the data stored in the EBS volume.
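As a rough sketch of the snapshot step mentioned above, assuming Python and the boto3 library: the `snapshot_volume` helper is a hypothetical name, and the client is passed in so the function can be tried with a stand-in object before touching a real AWS account. `create_snapshot` is the EC2 API call that boto3 exposes.

```python
def snapshot_volume(ec2_client, volume_id, description):
    """Ask EC2 to snapshot one EBS volume and return the new snapshot's ID.

    ec2_client is assumed to be a boto3 EC2 client, e.g. boto3.client("ec2").
    """
    response = ec2_client.create_snapshot(
        VolumeId=volume_id,
        Description=description,
    )
    # The response describes the snapshot that EC2 has started creating.
    return response["SnapshotId"]
```

With AWS credentials configured, this might be called as `snapshot_volume(boto3.client("ec2"), "vol-0123456789abcdef0", "Nightly backup")`, where the volume ID is only a placeholder.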
There are two storage types with Amazon EBS:
- Solid state drives (SSD) are great for heavier applications that need to retrieve data quickly. If you're a gamer, using SSD would help you load your video games faster. Businesses also love using SSD to store and retrieve data in their databases.
- Hard disk drives (HDD) are cost-effective, making them great for archival storage/backups. While they're typically slower than SSD, they still give an EC2 instance faster access to data than fetching it from S3.
So... what’s the difference between EBS and Amazon S3?
- We use EBS to store applications, databases, and operating systems for EC2 instances.
- Amazon S3, on the other hand, is object storage commonly used for backup, archiving, and sharing media over the internet.
- EBS HDD and S3 in particular can feel very similar. The big advantage of HDD to remember is that it gives applications direct, local access to block-level storage.
- For example: In video editing software, where real-time access to large media files is super important, an EC2 instance with attached HDDs can provide faster access compared to fetching data from a remote storage service like S3. This helps the video editing software give its users smooth playback and responsive editing.
- Another example: If you are storing video/image files for websites or backup files (i.e. where quick access is not a concern), S3 is a strong choice.
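To make the block-versus-object distinction a little more concrete, here is a toy model in Python. It is not how EBS or S3 work internally: a `bytearray` stands in for an EBS volume, a `dict` stands in for an S3 bucket, and the helper names and the tiny 4-byte block size are made up for illustration.

```python
# Toy model only: a bytearray plays the EBS volume, a dict plays the S3 bucket.
BLOCK_SIZE = 4  # unrealistically small, chosen to keep the example readable


def write_block(volume, block_index, data):
    """Overwrite one fixed-size block in place, leaving the rest untouched."""
    if len(data) != BLOCK_SIZE:
        raise ValueError("data must be exactly one block long")
    start = block_index * BLOCK_SIZE
    volume[start:start + BLOCK_SIZE] = data


def put_object(bucket, key, data):
    """Store a whole object under a key; an update replaces the entire object."""
    bucket[key] = bytes(data)


# Block storage: change only the middle block of the "volume".
volume = bytearray(b"AAAABBBBCCCC")
write_block(volume, 1, b"XXXX")

# Object storage: the same change means sending the whole object again.
bucket = {}
put_object(bucket, "clip.mp4", b"AAAABBBBCCCC")
put_object(bucket, "clip.mp4", b"AAAAXXXXCCCC")
```

Both end up holding the same bytes, but the block version touched only the part that changed. That is roughly why software making lots of small edits to big files, like the video editor above, tends to suit EBS better.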
Amazon EC2 Instance Store
Imagine your computer having a built-in flash drive (those small USB storage devices). This flash drive is super-fast, but there's a catch: if you turn off your computer or if it crashes, everything saved on that flash drive disappears. For temporary tasks or data you don’t need to keep long-term, this flash drive is handy.
Amazon EC2 Instance Store is like a built-in, temporary flash drive, but for your virtual computer (or "instance") in AWS. It provides temporary block-level storage for your EC2 instance.
Data stored in instance store volumes is ephemeral - meaning it will be lost if the instance is stopped, terminated, or if it fails. This makes it ideal for information that changes frequently and doesn't need to be saved in the long term, such as buffers, caches and scratch data. Ooo, lots of new words here - let's break it down:
- Buffers are like short-term storage spaces for data during its journey from one place to another. An example is a loading screen during video streaming.
- Caches (also called quick-access memory) store frequently used information for quick access and speedy processing. When you visit a website, your browser caches elements of that site, like images and frequently used files, on your computer's hard drive. This cache allows your browser to load those images/files faster the next time you visit!
- Scratch data is information you're using right now but don't need to keep once you're done. An example on your computer could be the clipboard content - you copy text or a file, you paste it somewhere else, and then you don't need it in the clipboard anymore.
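If you'd like to see caching in action, here is a small Python sketch using the standard library's `functools.lru_cache`. The `load_image` function and the `fetch_count` counter are hypothetical stand-ins for a slow fetch, not part of any AWS service.

```python
from functools import lru_cache

fetch_count = 0  # how many times the slow path actually ran


@lru_cache(maxsize=128)
def load_image(name):
    """Pretend to fetch an image from a slow source (hypothetical helper)."""
    global fetch_count
    fetch_count += 1
    return f"<bytes of {name}>"


load_image("logo.png")  # first visit: slow fetch, result goes into the cache
load_image("logo.png")  # second visit: answered from the cache, no fetch
```

After both calls, `fetch_count` is still 1 because the second call never reached the slow path. Like data in an instance store, this cache is temporary: it disappears when the program stops, which is fine because it can always be rebuilt.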
Amazon Elastic File System (Amazon EFS)
Amazon EBS works for EC2 instance storage, and S3 is good for backups and archiving data that doesn't change often.
Amazon Elastic File System (Amazon EFS) fills another need:
- EFS is a shared drive on the cloud where different users can access and edit files in real-time.
- It's great for apps that need a common space for files, and is often used with EC2 instances.
Amazon EFS is automatically scalable.
- Your applications won't have any problems if data suddenly increases - storage will scale accordingly.
- If the data decreases - the amount of storage will be reduced, so you won't pay for any unused storage.
- That's why EFS is especially helpful for storing the application code, files and data that come with running servers, big data analysis, and any scalable work you can think of.
Amazon EBS vs. Amazon EFS:
- An Amazon EBS volume stores data in a single Availability Zone. To attach an EBS volume to an EC2 instance, both must be in the same Availability Zone.
- Amazon EFS is a regional service. It stores data in and across multiple Availability Zones. The duplicate storage lets you access data from all Availability Zones in the Region where a file system is located. On-premises servers can also access Amazon EFS using AWS Direct Connect.
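The two rules above can be sketched as a pair of checks in Python. This is a simplified model with hypothetical function names: it assumes standard Availability Zone names (the Region name plus one letter, such as `eu-west-2a`) and ignores networking details like mount targets and security groups.

```python
def region_of(availability_zone):
    """Derive the Region from a standard AZ name, e.g. 'eu-west-2a' -> 'eu-west-2'."""
    return availability_zone[:-1]


def can_attach_ebs(volume_az, instance_az):
    """An EBS volume can only be attached to an instance in the same AZ."""
    return volume_az == instance_az


def can_mount_efs(file_system_region, instance_az):
    """A regional EFS file system is reachable from any AZ in its Region."""
    return region_of(instance_az) == file_system_region
```

For example, `can_attach_ebs("eu-west-2a", "eu-west-2b")` is false because the two zones differ, while `can_mount_efs("eu-west-2", "eu-west-2b")` is true because the instance sits in the file system's Region.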
Amazon FSx
In AWS, Amazon FSx is a series of services that provide fully managed file storage tailored to specific needs. They're great for processing big volumes of media and machine learning. FSx = File System for x, where x is a placeholder for different types of traditional file systems that it can virtualise.
Currently, there are two main flavours: FSx for Windows File Server, for Microsoft Windows applications, and FSx for Lustre*.
*Lustre is a powerful computer file system that helps you manage and process huge amounts of data quickly. It's like a super-speedy library for data that's popular for scientific research and big data projects. Lustre's name comes from Linux and cluster (i.e. interconnected computers used for large-scale computing).
Note that two services have joined the family - FSx for NetApp ONTAP and FSx for OpenZFS - but they're beyond the scope of the course!
Note: what's the difference between FSx and EFS?
- Amazon EFS is suited for a wide range of use cases, especially when you need a shared file system for Linux-based apps (it uses the NFS protocol and isn't supported on Windows instances). It's the go-to option for general-purpose file storage.
- FSx is more specialised, and designed for apps or scenarios that rely on Windows-specific functions or Lustre-based computing.
AWS Storage Gateway
Imagine you have a physical office with computers and servers (your on-premises environment), but you want to store some of your data in the cloud.
AWS Storage Gateway makes it super easy! AWS Storage Gateway is software that you can install in your on-premises environment. Once installed, it acts as a bridge between your on-premises applications and AWS cloud storage services like Amazon S3. This software makes it seamless for your on-premises systems to interact with and use cloud storage as if it were a local folder. No complicated setup required!
Let's say you have a photo editing app like Photoshop installed on your laptop (your on-premises environment), and you've configured AWS Storage Gateway to use Amazon S3 for storage.
- Use your photo editing app: You launch your photo editing application on your computer. After making some edits to a photo, you click "Save" within the photo editing app.
- Save location: A panel pops up that lets you pick where on your laptop you'd like to save this file. Instead of choosing your local Downloads folder, you choose another "local folder" that's actually the AWS Storage Gateway.
- Storage Gateway magic: The Storage Gateway takes your saved photo and efficiently transfers it to the designated location in Amazon S3, which is your cloud storage. Your photo editing app doesn't need to know that the data is going to the cloud. It thinks it's just saving the photo to a local folder, thanks to the seamless integration provided by AWS Storage Gateway.
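The flow above can be sketched as a toy Python class. This is only an illustration of the idea, not the real service: `ToyFileGateway` is a made-up name, a `dict` plays the role of the S3 bucket, and the real Storage Gateway presents a network file share and handles caching and uploads for you.

```python
import tempfile
from pathlib import Path


class ToyFileGateway:
    """Toy stand-in for a file gateway: a local folder backed by a fake bucket."""

    def __init__(self, local_folder, bucket):
        self.local_folder = Path(local_folder)
        self.bucket = bucket  # a dict playing the role of an S3 bucket

    def save(self, filename, data):
        """Write to the local folder, as the app expects, then copy to the bucket."""
        path = self.local_folder / filename
        path.write_bytes(data)
        self.bucket[filename] = path.read_bytes()
        return path


bucket = {}
with tempfile.TemporaryDirectory() as folder:
    gateway = ToyFileGateway(folder, bucket)
    saved_path = gateway.save("holiday.jpg", b"edited pixels")
    saved_locally = saved_path.exists()  # the app sees an ordinary local file
```

From the app's point of view, only `save` to a folder happened. The copy into `bucket` is the part the gateway does behind the scenes.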
AWS Elastic Disaster Recovery (AWS DRS)
Imagine you've built an elaborate domino setup, representing hours of hard work. It's intricate and valuable. Suddenly, a gust of wind (or a mischievous cat) knocks a part of it over. You're devastated. But then, you remember you've taken a photo of each step of the setup. Using those images, you can rebuild it without starting from scratch.
DRS is a bit like having those step-by-step photos. It lets you rewind your critical applications and databases to what they looked like at specific points in time.
Special Mention: AWS Backup
Think of AWS Backup as a time machine for your digital information stored in the AWS Cloud.
AWS Backup is highly relevant to Amazon's storage services as it provides a centralised and managed backup solution for critical data stored in services like Amazon S3 and Amazon EBS. It simplifies the restoration of data in case of accidental deletions, corruptions, or other data loss scenarios, ensuring quick recovery without the need for complex manual procedures. For example, it can store backups of entire buckets and objects stored in Amazon S3.