The best S3 storage option depends on your access patterns:
S3 Standard: For frequently accessed data.
S3 Standard-IA: For infrequently accessed data that still needs fast retrieval.
S3 Intelligent-Tiering: For variable access patterns; automatically moves data between frequent and infrequent access tiers.
S3 One Zone-IA: For infrequently accessed data that can be recreated if lost.
S3 Glacier (Instant and Flexible Retrieval): For long-term archiving, with retrieval times ranging from milliseconds to hours.
S3 Glacier Deep Archive: For the lowest cost, with retrieval times of around 12 to 48 hours.
Amazon S3, or Amazon Simple Storage Service, is a managed cloud storage service.
Managed means AWS will handle the storage infrastructure, maintenance, security and management – so users only need to care about storing and accessing the data! This is similar to Google Drive.
We don’t need to worry about updating the software at all when using it. S3 is also a serverless service, meaning the scaling of how much storage you need – up or down – is done for you.
S3 is designed to store unlimited amounts of data (with different formats) on AWS and retrieve any amount of data from anywhere on the web.
Concepts of Amazon S3
1. Buckets: Data in Amazon S3 is stored in containers called "buckets." Think of a bucket as a top-level folder or directory. Each bucket has a unique name globally across all of AWS.
For each bucket, you can:
Manage permissions for it (who can add, remove, and view items in the bucket).
Access the logs to see who has viewed it and its contents.
Select the geographic area in which you want to place the bucket and its data.
Some other things to note:
The name of the bucket needs to be globally unique and not already in use by any other AWS account.
Initially, each AWS account allows you to set up a maximum of 100 buckets.
Once created, the Region of the bucket (i.e. the geographic area it’s stored in) cannot change.
You can set up your bucket to host static websites.
If your S3 bucket has 100,000 objects or more, you can't remove it using the Amazon S3 console. Additionally, if versioning is turned on, you can't remove the S3 bucket through the AWS CLI.
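To make the naming rule concrete, here's a small Python sketch that checks a candidate name against the core S3 bucket naming rules (3-63 characters; lowercase letters, numbers, hyphens and dots; starting and ending with a letter or number; not shaped like an IP address). It only checks the format – actual global uniqueness can only be confirmed by AWS when you try to create the bucket:

```python
import re

def is_valid_bucket_name(name: str) -> bool:
    """Check a name against the core S3 bucket naming rules:
    3-63 characters, lowercase letters/numbers/hyphens/dots,
    starting and ending with a letter or number, and not
    formatted like an IP address."""
    if not 3 <= len(name) <= 63:
        return False
    if not re.fullmatch(r"[a-z0-9][a-z0-9.-]*[a-z0-9]", name):
        return False
    if re.fullmatch(r"(\d{1,3}\.){3}\d{1,3}", name):  # looks like an IP address
        return False
    return True

print(is_valid_bucket_name("my-photo-archive-2024"))  # True
print(is_valid_bucket_name("My_Bucket"))              # False: uppercase and underscore
print(is_valid_bucket_name("192.168.0.1"))            # False: IP-style name
```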
2. Objects: Within buckets, data is stored as "objects." An object consists of the data itself – almost any kind of file, such as a document, image or video – plus a unique key (a name that identifies the object within the bucket; these are called “object keys”).
3. Pricing: Pay only for what you use, with no minimum charge. You are charged when you store something in S3 and when you retrieve it after it’s been stored. The price of storing and retrieving with S3 is calculated per GB, and the rate you get charged differs depending on the storage class you use. You can learn more about the different rates on the Amazon S3 pricing page.
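To see how per-GB pricing works in practice, here's a toy Python calculation. The rates below are illustrative placeholders, not AWS's actual prices – always check the S3 pricing page for real numbers:

```python
def monthly_s3_cost(storage_gb, retrieval_gb, storage_rate, retrieval_rate):
    """Estimate a monthly S3 bill: you pay per GB stored and per GB
    retrieved, at rates that vary by storage class."""
    return storage_gb * storage_rate + retrieval_gb * retrieval_rate

# Hypothetical per-GB rates for two made-up storage classes,
# storing 500 GB and retrieving 200 GB in a month:
class_a = monthly_s3_cost(500, 200, storage_rate=0.023, retrieval_rate=0.0)
class_b = monthly_s3_cost(500, 200, storage_rate=0.0125, retrieval_rate=0.01)

print(f"Class A: ${class_a:.2f}")  # $11.50 - pricier storage, free retrieval
print(f"Class B: ${class_b:.2f}")  # $8.25  - cheaper storage, paid retrieval
```

Notice how the cheaper-storage class only wins while retrieval stays low – this trade-off is exactly what the storage classes below are about.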
Amazon S3 storage classes
Amazon S3 provides various storage options tailored to your specific needs in terms of data accessibility, durability, and pricing. Each S3 storage class is designed to offer the most cost-effective solution for a distinct data access scenario.
Big tip: all storage classes provide 11 9's of durability – which means there is a 99.999999999% chance that S3 keeps your objects safe over a given year. With odds that high, most of us will never lose data stored in S3. (This is something the Cloud Practitioner exam loves asking about!)
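You can get a feel for what 11 9's means with a quick back-of-the-envelope calculation, assuming each object independently has a 1-in-100-billion chance of being lost in a year:

```python
annual_loss_per_object = 1e-11  # 1 - 0.99999999999 (11 nines of durability)

def chance_of_any_loss(num_objects: int) -> float:
    """Probability that at least one object is lost in a year,
    assuming each object's loss is independent."""
    return 1 - (1 - annual_loss_per_object) ** num_objects

# Even with ten million objects stored, the chance of losing
# any of them in a year is roughly one in ten thousand:
print(chance_of_any_loss(10_000_000))  # ≈ 0.0001 (a 0.01% chance)
```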
Amazon S3 Standard: Amazon S3 Standard is designed primarily for frequently accessed data. It's a suitable choice for big data analytics (analysing large amounts of data to gain insights and make decisions), mobile and gaming applications, content distribution (sharing content with users across different locations quickly), and backups. Think of Amazon S3 Standard as a huge, secure, and smart online storage locker where you can put all kinds of digital stuff such as photos, documents and videos. Whenever you need them, you can get them back quickly, whether it’s just one file or a ton of them.
Amazon S3 Standard-IA: Amazon S3 Standard-IA is designed for data that you access less frequently (around once a month) but want quickly when you do need it. It provides the robust durability, high throughput (i.e. ability to handle a large amount of data efficiently), and swift response of Amazon S3 Standard, but at lower storage prices in exchange for a per-GB retrieval fee.
Amazon S3 One Zone-IA: Unlike the classes above, which store data redundantly across multiple Availability Zones, One Zone-IA stores object data in just one Availability Zone. It's more cost-effective than Standard-IA, but if that specific Availability Zone faces issues, the data could be at risk.
Amazon S3 Intelligent-Tiering: Unlike the Standard storage class, where you pay a fixed rate regardless of how frequently you access your data, Intelligent-Tiering helps you save costs by automatically transitioning objects between access tiers based on changing usage patterns. S3 Intelligent-Tiering is a popular choice because of these cost savings, but the Standard storage class works out cheaper if data access is consistent and frequent - Intelligent-Tiering charges a monitoring fee to keep track of access patterns.
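Here's a rough Python sketch of that trade-off. The per-GB rates and the per-object monitoring fee below are illustrative assumptions, not official AWS prices:

```python
def cost_standard(gb, rate_per_gb=0.023):
    """Flat per-GB rate regardless of access pattern (illustrative rate)."""
    return gb * rate_per_gb

def cost_intelligent_tiering(gb, objects, frequent_fraction,
                             frequent_rate=0.023, infrequent_rate=0.0125,
                             monitoring_fee_per_1k_objects=0.0025):
    """Blended storage cost plus a small per-object monitoring fee
    (all rates here are made-up placeholders)."""
    storage = gb * (frequent_fraction * frequent_rate
                    + (1 - frequent_fraction) * infrequent_rate)
    monitoring = objects / 1000 * monitoring_fee_per_1k_objects
    return storage + monitoring

# 1 TB of data, 50,000 objects, only 20% accessed frequently:
standard = cost_standard(1024)
tiered = cost_intelligent_tiering(1024, objects=50_000, frequent_fraction=0.2)

print(f"Standard:            ${standard:.2f}")  # $23.55
print(f"Intelligent-Tiering: ${tiered:.2f}")    # $15.08 - wins when most data goes cold
```

If `frequent_fraction` were 1.0, Intelligent-Tiering would cost slightly more than Standard because of the monitoring fee – which is why consistently hot data belongs in Standard.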
Amazon S3 Glacier Instant Retrieval: This offers millisecond retrieval of data in a low-cost archive S3 storage class. You might wonder – how is this different from Standard-IA? As an archive class, it's meant for data you almost never access, but which must be rapidly available if the need arises. The storage price is much cheaper than Standard-IA, but the retrieval prices are higher.
Amazon S3 Glacier Flexible Retrieval: Retrieval takes minutes to 12 hours, with free bulk retrievals, in a lower-cost archive S3 storage class. It is an ideal solution for backup, disaster recovery, and offsite data storage needs, where data occasionally needs to be retrieved in minutes without worrying about costs.
Amazon S3 Glacier Deep Archive: Utilise this class for data that's very rarely accessed. By default, retrieving information from S3 Glacier Deep Archive takes about 12 - 48 hours. It is designed for customers that retain data sets for 7-10 years or longer to meet regulatory compliance requirements. S3 Glacier Deep Archive can also be used for backup and disaster recovery use cases. It supports long-term retention and digital preservation for data that might be accessed once or twice a year.
Amazon S3 Outposts: Think of a Local Bookshelf with Books from a Giant Library: Imagine you often visit a vast library filled with numerous books (which represents AWS's cloud infrastructure). While you love the vast collection of books there, sometimes you wish you could have a few shelves from that library right in your home for immediate access without having to travel (which represents data transfer latency to the cloud). Amazon S3 on Outposts delivers object storage to your on-premises AWS Outposts environment. It’s like having a few bookshelves from that massive library right next to you, stocked with the specific things you want most.
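To recap the classes above, here's a rule-of-thumb chooser in Python. It's a simplification for building intuition – the thresholds are made up, and it's not an official AWS decision tree:

```python
def suggest_storage_class(accesses_per_month: float,
                          max_retrieval_hours: float) -> str:
    """Map a rough access pattern to an S3 storage class, condensing
    the descriptions above (illustrative thresholds, not AWS guidance)."""
    if accesses_per_month >= 1:
        # Accessed at least monthly: frequent vs infrequent access classes.
        return "S3 Standard" if accesses_per_month > 2 else "S3 Standard-IA"
    if max_retrieval_hours < 0.001:       # needs millisecond access
        return "S3 Glacier Instant Retrieval"
    if max_retrieval_hours <= 12:
        return "S3 Glacier Flexible Retrieval"
    return "S3 Glacier Deep Archive"

print(suggest_storage_class(30, 0))      # S3 Standard
print(suggest_storage_class(1, 0))       # S3 Standard-IA
print(suggest_storage_class(0.5, 0))     # S3 Glacier Instant Retrieval
print(suggest_storage_class(0.1, 5))     # S3 Glacier Flexible Retrieval
print(suggest_storage_class(0.05, 48))   # S3 Glacier Deep Archive
```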
Now that we know the basics and different classes of S3, we can get into some of the other features of S3...
Features of S3
Amazon S3 access management and security
Amazon S3 offers comprehensive tools and features to manage access and ensure security for your stored data. Properly configured, these tools can help prevent unauthorised access and safeguard sensitive information.
By default, S3 buckets and the objects in them are private: only you, the resource owner, can access what you create in S3. To grant granular resource permissions that support your specific use case, or to audit the permissions of your Amazon S3 resources, you can use the following features:
S3 Block Public Access: A setting in your S3 bucket that's enabled by default, meaning all contents in your bucket are not available to the public.
Recommended to keep enabled unless specific public access is necessary.
S3 object ownership: A setting that makes all objects in the bucket 'owned' by the bucket owner, even if the objects were uploaded by other accounts.
AWS Identity and Access Management (IAM): IAM is a service for controlling people's access to your resources, including S3. You can grant people (or groups of people) access to your S3 buckets by making IAM policies.
Bucket policies: Specific to S3, bucket policies allow you to define who can access the specific contents of a bucket and what actions they can take in the bucket.
S3 endpoints: Use Amazon Virtual Private Cloud (you'll learn more about this later!) to create a private connection between your VPC and S3 that doesn't go over the public internet, enhancing security.
Access control lists (ACLs): A traditional method for controlling permissions on individual buckets and objects. We'll learn more about access control lists later in the course!
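To make bucket policies concrete, here's what a simple policy document looks like, built as a Python dictionary and printed as JSON. The bucket name is hypothetical, and this particular policy – public read-only access to every object – is exactly the kind of setting S3 Block Public Access exists to stop you enabling by accident:

```python
import json

# A hypothetical bucket policy granting anyone ("Principal": "*")
# permission to read ("s3:GetObject") any object in the bucket
# "my-example-bucket". Illustrative only - adapt before real use.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "PublicReadGetObject",
            "Effect": "Allow",
            "Principal": "*",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::my-example-bucket/*",
        }
    ],
}

print(json.dumps(policy, indent=2))
```

You would attach a document like this to the bucket itself; IAM policies, by contrast, are attached to users and groups.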
Amazon S3 Versioning
Imagine you're working on a document everyday, and you suddenly realise you accidentally deleted important content a few days ago. Wouldn't it be great if you could go back and access each version of this document to retrieve previous work?
This is where Amazon S3 Versioning comes in! It’s like a super “undo” feature for your files (or objects) in S3.
When you enable versioning for an S3 bucket (note that Versioning is disabled by default):
Every time you change or delete a file, S3 keeps the older versions so you'll never accidentally delete something forever.
S3 versioning safeguards against accidental file changes and deletions: if you make a mistake like overwriting or deleting a file, you can always fetch the version you want and correct the unintended change.
You can permanently delete an object by specifying the version you want to delete. Only the owner of an Amazon S3 bucket can permanently delete a version.
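To build intuition for that behaviour, here's a toy in-memory model in Python. It's a sketch of the rules described above – every write keeps old versions, and a plain delete only adds a "delete marker" – not the real S3 API:

```python
import itertools

class VersionedBucket:
    """Toy model of S3 versioning (a sketch, not the real S3 API)."""

    def __init__(self):
        self._versions = {}            # key -> list of (version_id, data)
        self._ids = itertools.count(1)

    def put(self, key, data):
        """Store a new version; older versions are kept, never overwritten."""
        version_id = next(self._ids)
        self._versions.setdefault(key, []).append((version_id, data))
        return version_id

    def get(self, key, version_id=None):
        """Fetch the latest version, or a specific older one."""
        versions = self._versions[key]
        data = versions[-1][1] if version_id is None else dict(versions)[version_id]
        if data is None:
            raise KeyError(f"{key} currently has a delete marker")
        return data

    def delete(self, key):
        """A plain delete just adds a delete marker on top."""
        return self.put(key, None)

bucket = VersionedBucket()
v1 = bucket.put("notes.txt", "first draft")
v2 = bucket.put("notes.txt", "final draft")
bucket.delete("notes.txt")                     # latest "version" is a delete marker
print(bucket.get("notes.txt", version_id=v1))  # "first draft" is still recoverable
```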
Amazon S3 static websites
Static websites look the same for everyone - they display the same photos, videos, text, gifs and other content, regardless of who you are. A common example of a static website is a personal portfolio or company website.
The opposite is a dynamic website, which means content changes according to the user's location, actions on the website and user generated content. Social media sites (e.g. Facebook) and most e-commerce stores (e.g. Amazon) are dynamic websites because they show personalised recommendations!
You can set up an Amazon S3 bucket to store your static website. It's easy, cheap, and safe, making it a great option for simple websites, portfolios, or project demos. We'll be doing an exercise on this to show you how.
Amazon S3 Replication
When you store a file (like a photo, video, or document) in an S3 bucket, you can set up S3 Replication to automatically make a copy of that file in another bucket, which might even be in a different part of the world or within the same Region. Even if your original file gets damaged or lost, these copies will be available as backup.
Amazon S3 Encryption
In the digital world, encryption is a locked safe for your files.
When you store anything in S3, bucket encryption ensures that files are turned into unreadable code.
Only with the right "key" can it be translated back into something that makes sense.
It's crucial not to lose these keys - without them, the data remains unreadable to you too!
Once set up, Amazon S3 handles the encryption in the background.
Types of encryptions on Amazon S3
Encryption in transit: Securing data while it's being transferred to a different place - for example, if a copy of your data is being sent to a different region in the world.
Encryption at rest: Securing data while it's sitting in storage.
Methods of encryption in S3
Server-Side Encryption (SSE): Amazon S3 encrypts your data as it writes it to disks in its data centres and decrypts it for you when you access it. Think of this as S3 automatically locking the safe for you.
SSE-S3: Amazon handles everything, from generating keys to applying encryption.
SSE-KMS: You get more control, using a service called Key Management Service (KMS) to handle the keys.
SSE-C: You provide the encryption key, and Amazon handles the rest.
Client-Side Encryption: You encrypt (lock up) the data yourself before you send it to Amazon S3. It's like you lock the safe, and then give the safe to Amazon for safekeeping.
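To illustrate the "same key locks and unlocks" idea, here's a toy Python example. It uses XOR purely as a stand-in – real S3 encryption uses AES-256, and XOR should never be used as actual encryption:

```python
from itertools import cycle

def toy_scramble(data: bytes, key: bytes) -> bytes:
    """XOR the data with a repeating key. TOY illustration only:
    applying the same key twice gets the original back, which is the
    core idea of symmetric encryption. Real S3 uses AES-256, not XOR."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))

key = b"hypothetical-key"
ciphertext = toy_scramble(b"quarterly report", key)  # unreadable without the key
plaintext = toy_scramble(ciphertext, key)            # the same key reverses it

print(ciphertext != b"quarterly report")  # True: the stored form is scrambled
print(plaintext)                          # b'quarterly report'
```

Lose the key, and the ciphertext stays gibberish forever – which is why SSE-KMS exists to manage keys safely for you.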
Amazon S3 Transfer Acceleration
Think of Transfer Acceleration as an express courier service for your digital files!
Transfer Acceleration speeds up file uploading and downloading, making it handy for long-distance transfers of large files.
For example, a media company with teams around the world can use Transfer Acceleration to upload large, high-definition video files quickly to a central S3 bucket.