Introduction to AWS Aurora

Amazon Aurora is a proprietary technology from Amazon Web Services (AWS); it is not open source. It is a fully managed relational database engine that is compatible with the following database management systems:

  • MySQL
  • PostgreSQL

aurora.png

It combines the speed and dependability of high-end commercial databases with the simplicity and cost-effectiveness of open-source databases.

Aurora is cloud optimized and claims:

  • 3x performance boost over PostgreSQL on RDS
  • 5x performance boost over MySQL on RDS

aurora-comparison.png

Storage

Aurora storage grows automatically as required. An Aurora cluster volume grows in increments of 10 GB up to a maximum size of 128 TB, so there is no need to continuously monitor the volume size and provision more storage by hand.
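The growth rule above can be illustrated with a small arithmetic sketch. The function name is hypothetical, and the sketch assumes that a volume starts at one 10 GB increment and that 1 TB = 1000 GB; neither detail comes from the text above.

```python
import math

INCREMENT_GB = 10            # growth step stated in the text
MAX_VOLUME_GB = 128 * 1000   # 128 TB cap, assuming 1 TB = 1000 GB for simplicity


def allocated_volume_gb(data_gb: float) -> int:
    """Return the smallest multiple of 10 GB that can hold data_gb.

    Assumption: an empty volume still occupies one 10 GB increment.
    """
    if data_gb < 0:
        raise ValueError("data size cannot be negative")
    increments = max(1, math.ceil(data_gb / INCREMENT_GB))
    size = increments * INCREMENT_GB
    if size > MAX_VOLUME_GB:
        raise ValueError("data exceeds the maximum cluster volume size")
    return size
```

For example, under these assumptions 10.5 GB of data would occupy two increments, that is, 20 GB.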

Aurora automates and standardizes database clustering and replication, which are typically among the most challenging aspects of database configuration and administration.

Aurora offers high availability by keeping 6 copies of the data across 3 Availability Zones (AZs), that is, 2 copies in each AZ. It also backs up the data to S3 continuously as a precaution against storage failure, and it recovers from a physical storage failure in less than 30 seconds.

High Availability and Read Scaling

  • 6 copies of data across 3 AZs (every write is stored as 2 copies in each AZ)
    • 4 copies out of 6 are required for writes (if 1 AZ is down, taking 2 copies with it, writes still succeed)
    • 3 copies out of 6 are required for reads (if 1 AZ plus one more copy is lost, reads still succeed)
  • It has a self-healing feature which continuously scans disks and data blocks for errors and repairs them when detected. It also isolates the database buffer cache from the database processes, so the cache survives a database restart.
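The quorum arithmetic can be checked with a short sketch. The function names are hypothetical; the numbers (2 copies per AZ, 3 AZs, 4 copies for writes, 3 copies for reads) are the ones listed above.

```python
COPIES_PER_AZ = 2
AZ_COUNT = 3
WRITE_QUORUM = 4   # copies required for a write
READ_QUORUM = 3    # copies required for a read


def surviving_copies(failed_azs: int, extra_failed_copies: int = 0) -> int:
    """Copies still available after whole-AZ failures plus individual copy failures."""
    total = COPIES_PER_AZ * AZ_COUNT
    lost = failed_azs * COPIES_PER_AZ + extra_failed_copies
    return max(0, total - lost)


def can_write(failed_azs: int, extra_failed_copies: int = 0) -> bool:
    return surviving_copies(failed_azs, extra_failed_copies) >= WRITE_QUORUM


def can_read(failed_azs: int, extra_failed_copies: int = 0) -> bool:
    return surviving_copies(failed_azs, extra_failed_copies) >= READ_QUORUM
```

With one AZ down, 4 copies remain, so `can_write(1)` holds. With one AZ plus one more copy lost, 3 copies remain: `can_read(1, 1)` holds but `can_write(1, 1)` does not. With two AZs down only 2 copies remain, which is below both quorums.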

Architecture

  • One Aurora instance serves as the Master (writer) and handles all write operations.
  • In addition to the Master, a cluster can have up to 15 Read Replicas spread across different AZs (RDS offers only 5 replicas).
  • If a Read Replica fails, Aurora can start a new one with minimal lag (~30 seconds).
  • If the Master fails, one of the Read Replicas is automatically promoted to serve as the new Master, thus maintaining resiliency.
  • Supports Cross-Region Replication
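The single-writer layout and the failover behaviour listed above can be modelled with a toy in-memory sketch. This is not an AWS API: the `Cluster` class and instance names are hypothetical, and promoting the first replica in the list is an assumption made only to keep the example deterministic.

```python
from dataclasses import dataclass, field
from typing import List

MAX_REPLICAS = 15  # replica limit stated in the text


@dataclass
class Cluster:
    """Toy model of an Aurora cluster: one writer plus read replicas."""
    writer: str
    replicas: List[str] = field(default_factory=list)

    def add_replica(self, name: str) -> None:
        if len(self.replicas) >= MAX_REPLICAS:
            raise ValueError("a cluster supports at most 15 read replicas")
        self.replicas.append(name)

    def fail_writer(self) -> str:
        """Simulate a writer failure by promoting a replica to writer.

        Assumption: the first replica in the list is the one promoted.
        """
        if not self.replicas:
            raise RuntimeError("no replica available to promote")
        self.writer = self.replicas.pop(0)
        return self.writer
```

For instance, a `Cluster("db-1", ["db-2", "db-3"])` ends up with `db-2` as writer and `db-3` as the only remaining replica after `fail_writer()`.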

How it works

aurora-architechture.png

Only the Master handles write operations.

The Writer (Master) instance may change, for example when a failover promotes one of the Read Replicas to Master. To handle such situations gracefully, Aurora provides a Writer Endpoint, a DNS name that always points to the current Master. If the Master fails, the endpoint is automatically redirected to the newly promoted Master instance.

Similarly, the Reader Endpoint provides connection load balancing across all the Read Replicas. Whenever a client connects to the Reader Endpoint, it is connected to one of the Read Replicas.
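On the application side, the two endpoints are typically used by sending writes to the Writer Endpoint and read-only queries to the Reader Endpoint. A minimal sketch of that routing decision follows; the hostnames are hypothetical placeholders, and classifying a statement by its first keyword is a simplifying assumption rather than a complete SQL parser.

```python
# Hypothetical placeholder hostnames; real endpoint names come from the cluster.
WRITER_ENDPOINT = "mycluster.cluster-example.us-east-1.rds.amazonaws.com"
READER_ENDPOINT = "mycluster.cluster-ro-example.us-east-1.rds.amazonaws.com"

# Assumption: statements starting with these keywords are read-only.
READ_ONLY_PREFIXES = ("select", "show", "explain")


def endpoint_for(sql: str) -> str:
    """Pick the endpoint a statement should be sent to."""
    statement = sql.lstrip().lower()
    if statement.startswith(READ_ONLY_PREFIXES):
        return READER_ENDPOINT
    return WRITER_ENDPOINT
```

Because the application only ever holds the two DNS names, a failover needs no configuration change on the client side. Statements such as `SELECT ... FOR UPDATE` would need the writer and are not handled by this simple prefix check.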

Features at a glance

Basic Features

  • Automatic Fail-Over
  • Backup and Recovery
  • Isolation and Security
  • Industry Compliant
  • Auto Scaling
  • Automatic Patching and Repairs with zero downtime
  • Advanced Monitoring
  • Routine Maintenance
  • Backtrack: rewind the cluster to an earlier point in time without restoring from a backup. (Separately, Aurora continuously backs up the data to Amazon S3 and provides point-in-time recovery.)

Security

  • Encryption at rest using KMS.
  • Automated backups, snapshots and replicas are encrypted as well.
  • Encryption in transit using SSL.
  • Authentication using IAM.
  • Protecting the instance with security groups is the end user's responsibility.

Aurora Serverless

  • Helpful when the workload cannot be predicted.
  • No instance size needs to be chosen.
  • The DB cluster starts, scales, and shuts down based on CPU usage and connections.
  • An Aurora cluster can be migrated to Aurora Serverless and vice versa.

Pricing

Aurora is more expensive than RDS for the same workloads (~20% more). Aurora pricing is mainly based on instance size, and storage is billed according to actual usage. For Aurora Serverless, database capacity is measured in Aurora Capacity Units (ACUs). 1 ACU provides 2 GiB of memory and corresponding compute and networking.
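The ACU-to-memory relationship is simple arithmetic, sketched below. The function names are hypothetical, and the 0.5 ACU rounding step is an assumption used for illustration; the text above only states the 2 GiB per ACU ratio.

```python
import math

GIB_PER_ACU = 2  # memory per ACU, as stated in the text


def memory_gib(acus: float) -> float:
    """Memory provided by a given number of ACUs."""
    if acus <= 0:
        raise ValueError("capacity must be positive")
    return acus * GIB_PER_ACU


def acus_for_memory(required_gib: float, step: float = 0.5) -> float:
    """Smallest capacity, in multiples of `step` ACUs, with at least required_gib of memory.

    Assumption: capacity can be set in 0.5 ACU steps.
    """
    if required_gib <= 0:
        raise ValueError("required memory must be positive")
    return math.ceil(required_gib / GIB_PER_ACU / step) * step
```

Under these assumptions, 4 ACUs provide 8 GiB of memory, and a workload that needs 5 GiB would be sized at 2.5 ACUs.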