What is a database?
Databases are organised collections of structured data that let you store, manage, and retrieve information efficiently.
They're like digital filing cabinets that keep all your precious information in perfect order. Whether you're storing cat pictures, customer records, or your secret recipe for the world's best chocolate chip cookies, databases are there to keep things tidy and accessible.
Is an Excel spreadsheet a database?
Hmmm, it's a good question! Excel can be used for basic data storage and organisation, but it's not the same as a dedicated database system. For starters:
- Databases support a wider range of data types, while Excel is focused on numbers, text, and basic formulas.
- Databases support complex data relationships (e.g. a business connecting everything they know about a customer across many linked tables), while Excel treats each sheet as a standalone table.
- Databases can handle much larger volumes of data compared to Excel.
- Databases offer powerful searching and reporting functions, whereas Excel's capabilities are more limited.
- Databases provide all kinds of security features like user authentication and access control, while Excel doesn't offer much more than password-protected files.
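To make the points about relationships and searching a bit more concrete, here is a minimal sketch using SQLite through Python's built-in sqlite3 module. The table names, columns, and sample rows are invented for illustration, and a real system would look different. It links customers to their orders and answers a question across both tables in one query.

```python
import sqlite3

# In-memory database: nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Two related tables: each order points at the customer who placed it.
conn.executescript("""
    CREATE TABLE customers (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        item        TEXT NOT NULL,
        total       REAL NOT NULL
    );
""")

# Made-up sample data.
conn.executemany(
    "INSERT INTO customers (id, name) VALUES (?, ?)",
    [(1, "Aroha"), (2, "Ben")],
)
conn.executemany(
    "INSERT INTO orders (customer_id, item, total) VALUES (?, ?, ?)",
    [(1, "keyboard", 80.0), (1, "mouse", 25.0), (2, "monitor", 300.0)],
)

# One query joins both tables and summarises orders per customer.
spend_per_customer = conn.execute("""
    SELECT c.name, COUNT(o.id), SUM(o.total)
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.id, c.name
    ORDER BY c.name
""").fetchall()

print(spend_per_customer)  # [('Aroha', 2, 105.0), ('Ben', 1, 300.0)]
```

In a spreadsheet, you would typically stitch this together by hand with lookup formulas. Here the link between the two tables is part of the data model itself.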
Here are some popular database management systems (DBMS)* used in the industry today:
- MySQL is reliable, free, and known for its simplicity and speed. Many websites use MySQL to store information like user accounts and product details.
- PostgreSQL is also free and can handle complex data without breaking a sweat. PostgreSQL is often chosen when data integrity** or scalability is critical.
- Microsoft SQL Server is a commercial DBMS with a strong focus on integration with Microsoft technologies. It offers advanced features for data warehousing, reporting, and business intelligence. However, it comes at a cost, and some features may be specific to Windows-based environments***.
*Database vs DBMS:
- Database = how your data is organised; where your data lives.
- DBMS = the toolkit you use to interact with your data.
- It's similar to how blob storage is a way to organise your data, while S3 is the tool you use to actually store and organise that data.
- It can definitely be a little confusing because "database" is a common shorthand for DBMS too! When someone says "MySQL is a database", they actually mean "MySQL is a DBMS".
**Data integrity means that data is accurate, consistent, and free from errors.
***Environment = the specific setup and conditions your applications and data need to run. This includes hardware, software, and settings that can affect how your programs work in the cloud.
Of course, these are not all the DBMS options out there. Oracle, MongoDB, SQLite, MariaDB and Redis are quite popular too (but we'll be here all day if we go through all of 'em)!
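As a rough illustration of what data integrity looks like in practice, here is a small sketch using SQLite (one of the DBMS options mentioned above), which ships with Python as the sqlite3 module. The accounts table and its rules are hypothetical and chosen only for this example. The idea is that the DBMS itself refuses rows that break the rules, so bad data never gets in.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical rules: every account needs a unique email and a
# balance that is never negative.
conn.execute("""
    CREATE TABLE accounts (
        id      INTEGER PRIMARY KEY,
        email   TEXT NOT NULL UNIQUE,
        balance REAL NOT NULL CHECK (balance >= 0)
    )
""")

# A valid row is accepted.
conn.execute(
    "INSERT INTO accounts (email, balance) VALUES (?, ?)",
    ("a@example.com", 50.0),
)

# Each of these rows breaks one rule.
bad_rows = [
    ("a@example.com", 10.0),  # duplicate email
    ("b@example.com", -5.0),  # negative balance
    (None, 20.0),             # missing email
]

rejected = []
for email, balance in bad_rows:
    try:
        conn.execute(
            "INSERT INTO accounts (email, balance) VALUES (?, ?)",
            (email, balance),
        )
    except sqlite3.IntegrityError as error:
        # The DBMS blocks the insert and explains which rule failed.
        rejected.append(str(error))

row_count = conn.execute("SELECT COUNT(*) FROM accounts").fetchone()[0]
print(row_count)      # 1 (only the valid row was stored)
print(len(rejected))  # 3 (every bad row was refused)
```

Other systems such as MySQL and PostgreSQL offer similar constraints, though the exact syntax and error messages vary.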
Who uses databases?
Databases are used by a wide range of organisations and industries, including:
- Businesses: For managing customer data, inventory, and sales records.
- Healthcare: To store patient information, medical records, and research data.
- Government: To manage citizen data, tax records, and regulatory information.
With lots of different use cases, databases can look very different from each other! Some key types are:
- Customer databases: Companies like NextWork, Netflix, and Facebook run customer databases to manage user accounts, preferences, and purchase history.
- E-commerce databases: E-commerce sites like TradeMe, Amazon and Alibaba use databases to manage the products being sold, user reviews, and order processing.
- Inventory management databases: Retailers like Walmart, H&M and New World supermarket use databases to keep an eye on inventory levels, order products, and track shipping going into their warehouses.
But wait, what about storage services?
You've got a good eye 😉. Both databases and storage services do the same thing - store data - but they serve different purposes.
- Databases are more focused on structured data management and query capabilities.
- Storage services are designed for efficiently storing and serving large volumes of unstructured data like media files, documents, backups, and other binary objects.
🚨 Important note: 🚨 it's not that databases can't store pictures, videos, or documents.
Databases absolutely can store binary objects* too. It's just that your choice depends on how you plan to use the data you're storing.
*Binary objects are computer files like photos, videos, music, PDFs, and software. What's interesting is that all these files are actually made up of only two things: 0s and 1s! That's what makes them binary (binary = representing information using only two symbols, typically 0s and 1s). Strictly speaking, text files like code files are stored as 0s and 1s too, but because they're meant to be read as plain text, they usually aren't called binary objects. So, when we talk about binary objects, we're talking about files that only make sense to the right program, not to someone reading them as text.
- Do you need to store and retrieve large files like photos, videos, or documents? If yes, then storage services like Amazon S3 are your best bet. They're perfect for managing lots of media content or storing backups.
- Do you plan to organise structured data like customer information, product details, or transaction records? If that's the case, a database is the one for you. They're great for when you want to keep your structured data tidy and accessible.
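In practice the two are often used together: the large file goes into a storage service, and a database keeps the structured details about it. The sketch below illustrates that split under some loud assumptions: a temporary local folder stands in for a storage service like Amazon S3 (no real AWS calls are made), SQLite stands in for the database, and the save_file helper, table, and sample files are all hypothetical.

```python
import sqlite3
import tempfile
from pathlib import Path

# Assumption: a temporary local folder plays the role of a storage service.
storage_dir = Path(tempfile.mkdtemp())

# The database only holds structured details about each file.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE files (
        id          INTEGER PRIMARY KEY,
        owner       TEXT NOT NULL,
        storage_key TEXT NOT NULL,
        size_bytes  INTEGER NOT NULL
    )
""")


def save_file(owner: str, filename: str, data: bytes) -> None:
    """Hypothetical helper: bytes go to storage, details go to the database."""
    key = f"{owner}/{filename}"
    path = storage_dir / key
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(data)
    conn.execute(
        "INSERT INTO files (owner, storage_key, size_bytes) VALUES (?, ?, ?)",
        (owner, key, len(data)),
    )


# Made-up file contents, just enough bytes to have something to store.
save_file("aroha", "cat.jpg", b"\xff\xd8 pretend JPEG bytes")
save_file("aroha", "dog.jpg", b"\xff\xd8 more pretend bytes")
save_file("ben", "recipe.pdf", b"%PDF pretend document")

# The database answers the structured question: which files belong to aroha?
aroha_keys = [
    row[0]
    for row in conn.execute(
        "SELECT storage_key FROM files WHERE owner = ? ORDER BY storage_key",
        ("aroha",),
    )
]
print(aroha_keys)  # ['aroha/cat.jpg', 'aroha/dog.jpg']

# The storage layer hands back the actual bytes for a given key.
first_file = (storage_dir / aroha_keys[0]).read_bytes()
```

The query never touches the file contents, and the storage folder knows nothing about owners. Each side does the job it is suited to.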
Are there database services in AWS?
Yes! AWS offers a range of database services for different needs. You'll learn in depth about how to use these, but as a quick sneak peek, here are the popular AWS database services:
- Amazon RDS (Relational Database Service) is the classic database service. It runs popular relational database engines like MySQL, PostgreSQL, MariaDB, Oracle, and Microsoft SQL Server in the cloud for you.
- Amazon DynamoDB is a NoSQL key-value database for fast and flexible applications, great for when you want your apps to be quick and handle lots of users.
- Amazon Aurora is a relational database built for the cloud that's compatible with MySQL and PostgreSQL. It's a great option for big companies to manage lots of data and make sure it's available all the time, sort of like a digital librarian for big collections of information.
- Amazon Redshift helps you analyse large amounts of data quickly, perfect for when you need to find insights and trends in your information.
- Amazon Neptune is a service for building applications that need to work with interconnected data, like building a map of who knows who in a social network, or a recommendation system. That's why it's called a graph database!
- Amazon DocumentDB is great at handling flexible and changing information, making it perfect for web applications that deal with data in various shapes and sizes.
- Amazon ElastiCache is an in-memory caching service. This means it makes your applications faster by keeping frequently used information in memory instead of fetching it from a slower database every time, a bit like taking library books home for quick access.
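The "library books at home" idea behind in-memory caching can be sketched in a few lines. This is not ElastiCache code: a plain Python dictionary stands in for the cache, another dictionary stands in for the slower database, and every name here is hypothetical. It only shows the general pattern of checking the cache first and falling back to the database on a miss.

```python
# Hypothetical stand-in for a slower, disk-backed database.
PRODUCTS = {1: "keyboard", 2: "mouse"}

# Counts how often we had to go to the "database".
stats = {"database_reads": 0}

# A plain dictionary standing in for an in-memory cache.
cache = {}


def read_from_database(product_id):
    """Pretend this is the slow trip to the database."""
    stats["database_reads"] += 1
    return PRODUCTS[product_id]


def get_product(product_id):
    """Check the cache first; only hit the database on a miss."""
    if product_id in cache:
        return cache[product_id]
    value = read_from_database(product_id)
    cache[product_id] = value  # keep a copy nearby for next time
    return value


names = [get_product(1), get_product(1), get_product(2), get_product(1)]
print(names)                    # ['keyboard', 'keyboard', 'mouse', 'keyboard']
print(stats["database_reads"])  # 2
```

Four lookups only needed two trips to the database. A real cache also has to decide when stored copies are out of date, which this sketch leaves out.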
So are these services the only way to set up databases?
Nope! The services above are called managed database services, which means a lot of the database setup is done for you.
You also have the option to set up databases in a more manual way by using compute services. More on compute services later in the course!