AWS Solutions Architect

The hands-on AWS depth that sits on top of the fundamentals: IAM internals, EC2 and storage, VPC networking, and managed databases. These are the services you actually touch when something is on fire.

intermediate, 36 labs

Recommended first

AWS Cloud Practitioner

You can still start this course now. Earlier courses give you the mental model the labs here assume.

By the end you'll be able to

Lock down IAM without breaking deploys
Diagnose why an EC2 instance can't reach the internet
Read a VPC diagram and find the missing route
Find the right log to read at 2am

Labs

01
IAM users, roles, and policies
Every AWS permission question comes down to three nouns. A user is a person with credentials, a role is an identity something assumes temporarily, and a policy is the JSON that grants the permissions. Get these three straight and the rest of IAM follows.
15 min
02
Policies in depth, managed vs inline, statements, conditions
Move from knowing what a policy is to understanding how policies behave. A managed policy is a reusable object, an inline policy lives and dies with one identity, explicit Deny always wins, and a Condition block fences a grant to facts about the request.
15 min
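A minimal sketch of the "explicit Deny always wins" rule, assuming a heavily simplified evaluator. It looks only at Effect, Action, and Resource, matches wildcards with Python's fnmatch, and ignores Condition blocks, principals, and every other policy type that real IAM evaluation considers. The function name `evaluate` and the sample bucket are hypothetical.

```python
from fnmatch import fnmatchcase


def _matches(patterns, value):
    """Return True if value matches any pattern; '*' acts as a wildcard."""
    if isinstance(patterns, str):
        patterns = [patterns]
    return any(fnmatchcase(value, pattern) for pattern in patterns)


def evaluate(statements, action, resource):
    """Simplified decision order: explicit Deny, then Allow, else implicit deny."""
    decision = "ImplicitDeny"  # nothing has matched yet, so the request is denied
    for statement in statements:
        if not _matches(statement["Action"], action):
            continue
        if not _matches(statement["Resource"], resource):
            continue
        if statement["Effect"] == "Deny":
            return "ExplicitDeny"  # a matching Deny overrides any Allow
        decision = "Allow"
    return decision


# Hypothetical statements: a broad S3 grant plus one targeted Deny.
STATEMENTS = [
    {"Effect": "Allow", "Action": "s3:*", "Resource": "*"},
    {
        "Effect": "Deny",
        "Action": "s3:DeleteObject",
        "Resource": "arn:aws:s3:::example-prod-bucket/*",
    },
]
```

The Allow on `s3:*` covers the delete action too, but the Deny statement still decides the outcome for objects in the named bucket.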
03
Roles for services, EC2 instance roles, OIDC for GitHub Actions
Machines should never hold long-lived keys. Two patterns make that real: an instance role that EC2 assumes through an instance profile, and an OIDC provider that lets GitHub Actions deploy with credentials that expire in minutes. Both rest on the trust policy, the half of a role that says who may assume it.
15 min
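As an illustration of the trust policy half of a role, here is a sketch of what a GitHub Actions OIDC trust policy might look like, built as a Python dictionary and printed as JSON. The account ID, organization, repository, and branch are placeholders, and the conditions shown are one common way to scope the trust, not the only one.

```python
import json

# Hypothetical placeholders: substitute your own account ID and repository.
ACCOUNT_ID = "111122223333"
REPO = "example-org/example-repo"
OIDC_HOST = "token.actions.githubusercontent.com"

trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            # Who may assume the role: identities vouched for by the OIDC provider.
            "Principal": {
                "Federated": f"arn:aws:iam::{ACCOUNT_ID}:oidc-provider/{OIDC_HOST}"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            # Fence the grant to one repository and one branch.
            "Condition": {
                "StringEquals": {f"{OIDC_HOST}:aud": "sts.amazonaws.com"},
                "StringLike": {
                    f"{OIDC_HOST}:sub": f"repo:{REPO}:ref:refs/heads/main"
                },
            },
        }
    ],
}

print(json.dumps(trust_policy, indent=2))
```

Without the `sub` condition, any repository able to obtain a token from the provider could attempt to assume the role, so the condition is where the scoping happens.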
04
Least privilege in practice
Wildcard policies feel productive and age into incidents. Least privilege means giving an identity exactly the actions and resources its job needs, then tightening further as real usage shows it needs even less. It is the discipline that turns a leaked credential from a takeover into a contained event.
15 min
05
EC2 instances and their lifecycle
EC2 is rented compute, billed by the second while it runs. Understand the two inputs every launch needs, how to read what an instance type name like t3.micro promises, and the lifecycle from running to stopped to terminated, where stop parks and terminate destroys.
15 min
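To make the reading of a name like t3.micro concrete, here is a small sketch that splits an instance type into family, generation, extra attribute letters, and size. The regular expression is an assumption that fits common names such as t3.micro or m6gd.large; it deliberately rejects unusual ones (for example the high-memory `u-` types). The `InstanceType` tuple and `parse_instance_type` function are hypothetical helpers.

```python
import re
from typing import NamedTuple


class InstanceType(NamedTuple):
    family: str       # e.g. "t" for burstable, "m" for general purpose
    generation: int   # higher means newer hardware
    attributes: str   # extra capability letters, e.g. "g" or "d"; may be empty
    size: str         # e.g. "micro", "large", "2xlarge"


_PATTERN = re.compile(r"^([a-z]+)(\d+)([a-z-]*)\.([a-z0-9-]+)$")


def parse_instance_type(name: str) -> InstanceType:
    """Split a common instance type name into its parts."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized instance type name: {name!r}")
    family, generation, attributes, size = match.groups()
    return InstanceType(family, int(generation), attributes, size)


print(parse_instance_type("t3.micro"))
print(parse_instance_type("m6gd.large"))
```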
06
AMIs, the boot template
An AMI is the disk template every EC2 launch starts from. Understand how the image catalog is searched and scoped by owner, how a golden image is baked from a running instance, and how cross-account sharing distributes images without copying data.
15 min
07
Security groups, the stateful firewall
A security group is the firewall wrapped around every instance, deny inbound by default and stateful by design. Understand how ingress rules open ports, why source scoping is the whole game, and what stateful tracking buys you that a stateless network ACL does not.
15 min
08
Key pairs vs SSM Session Manager
There are two ways into an instance: an SSH key pair through port 22, or Session Manager through the SSM agent. Understand what each half of a key pair does, how Session Manager authenticates by IAM over an outbound channel, and how to choose between them for a given operational constraint.
15 min
09
S3, buckets, objects, policies, lifecycle
S3 is AWS's object store. It keeps blobs of bytes inside globally named buckets, addressed over HTTP, with no real folders. Understand the data model, how a bucket policy grants or denies access, and how lifecycle rules age data out so a bucket does not grow forever.
15 min
10
EBS, volumes, snapshots, encryption
EBS is the network-attached block storage behind EC2. A volume looks like a raw disk to one instance, lives in a single Availability Zone, and is backed up by snapshots that are stored at the Region level. Understand the volume model, what gp3 buys you, how snapshots move data across zones, and why encryption is a property a volume is born with.
15 min
11
EFS, shared files vs S3 vs EBS
EFS is the managed NFS filesystem that many instances mount at once. It is the third AWS storage model, alongside block storage (EBS) and object storage (S3). Understand what EFS gives you that the other two cannot, then sort all three by storage model, scope, access method, and capacity behavior so any workload picks itself.
15 min
12
Your first VPC
A VPC is your own private network inside an AWS region, defined by one CIDR block and isolated from everything until you wire it up. Understand what a VPC is, why it starts closed, how its address range is chosen, and how tags let you find resources that otherwise carry only random IDs.
15 min
13
Subnets and route tables
A VPC is one big address block until you carve it into subnets and decide where each subnet's traffic goes. Subnets pin addresses to a single AZ, route tables decide the path out, and the difference between a public and a private subnet is nothing more than which route table it follows.
15 min
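A short sketch of the two ideas in this lab, using only Python's standard library: carving a VPC block into subnets, and deciding whether a subnet is public from its route table alone. The route dictionaries (`destination`, `target`) are a hypothetical shape chosen for the example, not an AWS API response. The five reserved addresses per subnet reflect AWS's documented behavior.

```python
import ipaddress
from itertools import islice

AWS_RESERVED_PER_SUBNET = 5  # AWS reserves the first four addresses and the last one


def carve(vpc_cidr: str, new_prefix: int, count: int):
    """Return the first `count` subnets of the given prefix length."""
    vpc = ipaddress.ip_network(vpc_cidr)
    return list(islice(vpc.subnets(new_prefix=new_prefix), count))


def usable_addresses(subnet) -> int:
    """Addresses left for your resources after AWS's reserved ones."""
    return subnet.num_addresses - AWS_RESERVED_PER_SUBNET


def is_public(routes) -> bool:
    """Public means the default route points at an internet gateway."""
    return any(
        route["destination"] == "0.0.0.0/0" and route["target"].startswith("igw-")
        for route in routes
    )


subnets = carve("10.0.0.0/16", 24, 2)
public_routes = [
    {"destination": "10.0.0.0/16", "target": "local"},
    {"destination": "0.0.0.0/0", "target": "igw-0example"},
]
private_routes = [
    {"destination": "10.0.0.0/16", "target": "local"},
    {"destination": "0.0.0.0/0", "target": "nat-0example"},
]
```

Nothing about the subnets themselves differs here; only the target of the default route changes the answer.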
14
Internet gateway and NAT gateway
A VPC has no path to the internet until you give it one. An internet gateway makes a subnet reachable from the internet in both directions; a NAT gateway lets private instances reach out without becoming reachable. The deciding question for any flow is who opens the connection.
15 min
15
VPC peering and Transit Gateway
VPCs are isolated by default. Peering links two of them point to point; a Transit Gateway is a hub that connects many at once. The choice between them comes down to the CIDR overlap rule, how the connection count grows, and whether traffic needs to travel through a middle network, which peering does not allow because it is not transitive.
15 min
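The connection-count growth and the overlap rule can both be worked out in a few lines. This sketch assumes a full mesh for peering (every VPC talks to every other) and one attachment per VPC for a Transit Gateway; the VPC names and CIDR blocks are made up.

```python
import ipaddress
from itertools import combinations


def peering_connections(vpc_count: int) -> int:
    """Full mesh: one peering connection per pair of VPCs."""
    return vpc_count * (vpc_count - 1) // 2


def transit_gateway_attachments(vpc_count: int) -> int:
    """Hub and spoke: one attachment per VPC."""
    return vpc_count


def overlapping_pairs(cidrs: dict) -> list:
    """Return the name pairs whose CIDR blocks overlap and so cannot be connected."""
    networks = {name: ipaddress.ip_network(cidr) for name, cidr in cidrs.items()}
    return [
        (a, b)
        for (a, net_a), (b, net_b) in combinations(networks.items(), 2)
        if net_a.overlaps(net_b)
    ]


for n in (3, 10, 30):
    print(n, peering_connections(n), transit_gateway_attachments(n))

# Hypothetical address plan with one clash.
PLAN = {"app": "10.0.0.0/16", "legacy": "10.0.128.0/17", "data": "10.1.0.0/16"}
print(overlapping_pairs(PLAN))
```

The mesh count grows quadratically while the hub count grows linearly, which is why the difference is negligible at three VPCs and decisive at thirty.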
16
Launching a managed database with RDS
RDS runs the Postgres engine for you, taking over the host, storage, backups, and failover so you keep only the schema and the queries. Launching one is a single declarative decision that bundles identity, hardware, credentials, and network exposure, and each of those is a choice with consequences.
15 min
17
RDS parameter groups and tuning
On RDS you never get a shell or a postgresql.conf, so a parameter group is how you change engine settings. Some settings apply live, others wait for a reboot, and the skill of tuning is mapping a production symptom to the one setting that controls it.
15 min
18
RDS backups, PITR, and read replicas
Backups answer how much data you can lose, replicas answer how much traffic you can carry, and the two get confused constantly. The honest test of a recovery plan is whether it can state its RPO and RTO in minutes, and the trap is the dropped table, the one failure that replication makes worse.
15 min
19
AWS CLI configuration and named profiles
The aws CLI decides who you are and where you call by walking a fixed precedence chain across command-line flags, environment variables, and two plain-text files under ~/.aws. Understand where credentials and settings live, how named profiles let one machine talk to many accounts, and which source wins when they disagree.
15 min
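A rough sketch of that precedence idea for two settings only, the profile and the region. It assumes the order flag, then environment variable, then config file, and it models just `AWS_PROFILE` and `AWS_DEFAULT_REGION`; the real CLI consults more sources than this. The helper names and the sample config text are hypothetical. Note that the config file names non-default profiles as `[profile NAME]`.

```python
import configparser

# Hypothetical contents of ~/.aws/config.
CONFIG_TEXT = """
[default]
region = us-east-1

[profile dev]
region = eu-west-1
"""


def resolve_profile(flag_profile, env):
    """--profile wins, then AWS_PROFILE, then the default profile."""
    return flag_profile or env.get("AWS_PROFILE") or "default"


def resolve_region(flag_region, env, config_text, profile):
    """--region wins, then AWS_DEFAULT_REGION, then the profile's config entry."""
    if flag_region:
        return flag_region
    if env.get("AWS_DEFAULT_REGION"):
        return env["AWS_DEFAULT_REGION"]
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    section = "default" if profile == "default" else f"profile {profile}"
    return parser.get(section, "region", fallback=None)


env = {"AWS_PROFILE": "dev"}
profile = resolve_profile(None, env)
print(profile, resolve_region(None, env, CONFIG_TEXT, profile))
```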
20
Assume a role with STS
Temporary credentials from STS are how engineers and automation cross account and permission boundaries without long-lived keys. Understand what a role is, how its trust policy decides who may step into it, what the three-part temporary credential set contains, and how a tool like aws-vault automates the whole handoff.
15 min
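To show what the three-part credential set looks like in practice, this sketch maps the `Credentials` block of an AssumeRole response onto the environment variables that AWS tools read. The response values are fake, and `credentials_to_env` is a hypothetical helper; no call to AWS is made.

```python
def credentials_to_env(assume_role_response: dict) -> dict:
    """Map the three temporary credential parts to their environment variables."""
    creds = assume_role_response["Credentials"]
    return {
        "AWS_ACCESS_KEY_ID": creds["AccessKeyId"],
        "AWS_SECRET_ACCESS_KEY": creds["SecretAccessKey"],
        # The session token is what marks the credentials as temporary.
        "AWS_SESSION_TOKEN": creds["SessionToken"],
    }


# Fake response in the shape STS returns; the values are placeholders.
FAKE_RESPONSE = {
    "Credentials": {
        "AccessKeyId": "ASIAEXAMPLEKEYID",
        "SecretAccessKey": "example-secret",
        "SessionToken": "example-session-token",
        "Expiration": "2030-01-01T00:00:00Z",
    }
}

for name, value in credentials_to_env(FAKE_RESPONSE).items():
    print(f"export {name}={value}")
```

Omitting the session token is a common reason temporary credentials are rejected, since the key pair alone is not valid without it.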
21
Tagging standards and cost allocation
Tags are the only way to slice an AWS bill by team, project, or environment, and they only work when a standard is written down and enforced. Understand what a tag is, why AWS enforces none by default, how a written standard turns into an audit, and why cost reporting quality is decided at tagging time rather than at reporting time.
15 min
22
Budget alerts and reserved capacity math
A budget alert is arithmetic: month-to-date spend projected to month end and compared with a threshold. Reserved capacity is arithmetic too: a committed rate weighed against the on-demand rate. Understand both calculations so the Budgets console and reservation recommendations stop being black boxes.
15 min
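Both calculations fit in a few lines. This sketch assumes a plain linear projection (average daily spend so far, extended to the end of the month), which is a simplification and not necessarily the forecasting model AWS Budgets uses. The break-even figure assumes the reserved rate is paid for every hour while the on-demand rate is paid only for hours actually used. All dollar amounts are invented.

```python
import calendar
from datetime import date


def projected_month_end(mtd_spend: float, today: date) -> float:
    """Linear projection: average daily spend so far times days in the month."""
    days_in_month = calendar.monthrange(today.year, today.month)[1]
    return mtd_spend / today.day * days_in_month


def will_breach(mtd_spend: float, today: date, threshold: float) -> bool:
    """True when the projected month-end spend exceeds the budget threshold."""
    return projected_month_end(mtd_spend, today) > threshold


def break_even_utilization(reserved_hourly: float, on_demand_hourly: float) -> float:
    """Fraction of hours the instance must run for the reservation to pay off."""
    return reserved_hourly / on_demand_hourly


today = date(2024, 6, 10)  # June has 30 days
print(projected_month_end(300.0, today))
print(will_breach(300.0, today, threshold=800.0))
print(break_even_utilization(0.06, 0.10))
```

With these invented rates the reservation only wins if the instance runs more than about 60 percent of the time; below that, on-demand is cheaper despite the higher hourly price.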
23
Elastic Load Balancing
A load balancer takes one stream of incoming traffic and spreads it across many servers, dropping any that fail its health checks. It is the piece that turns a single server into a resilient service and the foundation of nearly every highly available design on AWS.
15 min
24
EC2 Auto Scaling
An Auto Scaling group keeps a fleet of EC2 instances at the size you want, adding instances when demand rises, removing them when it falls, and replacing any that fail. It is what makes a service both elastic and self-healing, and it pairs directly with a load balancer.
15 min
25
Route 53 routing policies
Route 53 is DNS, but its routing policies make it a traffic controller. Beyond turning a name into an address, it can split traffic by weight, send users to the lowest-latency region, fail over to a backup when health checks fail, or route by geography. These policies are how global, resilient designs steer traffic.
15 min
26
Decoupling with SQS and SNS
Connecting components directly makes each one depend on the other being up and fast. Putting a queue or a topic between them breaks that dependency. SQS buffers work in a queue for a consumer to process later, SNS fans a message out to many subscribers, and the result is a system that absorbs spikes and survives a component going down.
15 min
27
Disaster recovery strategies
Disaster recovery is the plan for surviving a major failure, like losing an entire region. Two numbers drive it, how fast you must be back and how much data you can afford to lose, and four strategies trade cost against speed, from cheap-and-slow backup and restore to expensive-and-instant multi-site.
15 min
28
DynamoDB at scale
DynamoDB is a serverless NoSQL database designed to hold single-digit-millisecond latency at any scale. It scales horizontally where a relational database scales up, and you design it around access patterns rather than joins. Knowing when it beats RDS, and how its keys work, is the goal.
15 min
29
Aurora, the cloud-native relational database
Aurora is AWS's own relational engine, compatible with MySQL and PostgreSQL but rebuilt for the cloud. It separates compute from a distributed storage layer that auto-grows and keeps six copies across three Availability Zones, giving higher performance, faster failover, and easier read scaling than standard RDS.
15 min
30
Caching with CloudFront and ElastiCache
Caching keeps frequently-used data somewhere faster and closer so it is not fetched the slow way every time. CloudFront caches content at edge locations near users, and ElastiCache holds hot data in memory in front of a database. Both cut latency and take load off the origin.
15 min
31
Serverless with Lambda and API Gateway
Serverless means running code without managing any servers. Lambda runs a function when an event triggers it and charges only for the milliseconds it runs, and API Gateway is the managed front door that turns HTTP requests into those triggers. Together with a database like DynamoDB they build a whole application with nothing to provision.
15 min
32
Containers with ECS, EKS, and Fargate
Containers package an application with its dependencies so it runs the same everywhere. AWS orchestrates them two ways, ECS as its own simpler orchestrator and EKS as managed Kubernetes, and runs them on two launch types, EC2 instances you manage or Fargate, where AWS manages the underlying servers for you. Knowing which combination fits is the goal.
15 min
33
Secrets management
Applications need passwords and API keys at runtime, and the wrong place for them is hardcoded in code or baked into an image where they leak. Secrets Manager stores and rotates them, Parameter Store offers a lighter option, and both hand secrets out only to IAM-authorized callers, encrypted and audited.
15 min
34
VPC endpoints and PrivateLink
By default, reaching a service like S3 from inside a VPC means going to the service's public endpoint, which requires a path through an internet gateway or a NAT gateway. VPC endpoints give that traffic a private path that stays on the AWS network and needs neither. Gateway endpoints cover S3 and DynamoDB at no additional charge, and interface endpoints, powered by PrivateLink, cover most other services.
15 min
35
Network security, security groups and NACLs
A VPC has two firewall layers. Security groups guard each instance and are stateful, so return traffic is automatic. Network ACLs guard each subnet and are stateless, so both directions must be allowed explicitly. Knowing which does what, and the stateful-versus-stateless trap, is the goal.
15 min
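The stateful-versus-stateless trap can be modeled with two tiny functions. This is a toy model under loud assumptions: rules are reduced to sets of allowed ports, and rule numbers, deny rules, protocols, and source addresses are all ignored. The ephemeral range used for client reply ports is an assumption chosen to be wide; function names are hypothetical.

```python
EPHEMERAL_PORTS = range(1024, 65536)  # assumed range for client-side reply ports


def security_group_roundtrip(inbound_ports, server_port: int) -> bool:
    """Stateful: once the request is allowed in, the reply is allowed out."""
    return server_port in inbound_ports


def nacl_roundtrip(inbound_ports, outbound_ports, server_port: int, client_port: int) -> bool:
    """Stateless: the request must pass inbound AND the reply must pass outbound."""
    request_allowed = server_port in inbound_ports
    reply_allowed = client_port in outbound_ports
    return request_allowed and reply_allowed


# An HTTPS request from a client using source port 50000.
print(security_group_roundtrip({443}, 443))
print(nacl_roundtrip({443}, set(), 443, 50000))            # reply is dropped
print(nacl_roundtrip({443}, EPHEMERAL_PORTS, 443, 50000))  # reply gets through
```

The middle case is the trap: the inbound rule looks correct, the request arrives, and the connection still hangs because nothing allows the reply back out.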
36
S3 storage classes and lifecycle
S3 is not one price. Its storage classes trade retrieval speed against cost, from Standard for hot data down to Glacier for archives, and lifecycle rules move objects between them automatically as they age. Matching the class to how often data is read is one of the most reliable ways to cut a bill.
15 min
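Here is a sketch of a lifecycle rule and of the class an object would sit in at a given age. The rule is written in the shape S3 lifecycle configurations use, but the rule ID, prefix, and day counts are invented, and the helper assumes objects start in STANDARD. Real transitions are applied by S3 on its own schedule, so treat the helper as a way to reason about a rule rather than a prediction of exact timing.

```python
# Hypothetical rule: logs move to cheaper classes as they age, then expire.
LIFECYCLE_RULE = {
    "ID": "age-out-logs",
    "Status": "Enabled",
    "Filter": {"Prefix": "logs/"},
    "Transitions": [
        {"Days": 30, "StorageClass": "STANDARD_IA"},
        {"Days": 90, "StorageClass": "GLACIER"},
    ],
    "Expiration": {"Days": 365},
}


def storage_class_at(age_days: int, rule: dict):
    """Return the class an object of this age would be in, or None if expired."""
    if age_days >= rule["Expiration"]["Days"]:
        return None
    current = "STANDARD"  # assumed starting class
    for transition in sorted(rule["Transitions"], key=lambda t: t["Days"]):
        if age_days >= transition["Days"]:
            current = transition["StorageClass"]
    return current


for age in (10, 45, 120, 400):
    print(age, storage_class_at(age, LIFECYCLE_RULE))
```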