DevOpsLabTH.dev

AWS Solutions Architect

The hands-on AWS depth that sits on top of the fundamentals: IAM internals, EC2 and storage, VPC networking, and managed databases, the services you actually touch when something is on fire.

intermediate · 36 labs
Recommended first

You can still start this course now. Earlier courses give you the mental model the labs here assume.

By the end you'll be able to
  • Lock down IAM without breaking deploys
  • Diagnose why an EC2 instance can't reach the internet
  • Read a VPC diagram and find the missing route
  • Find the right log to read at 2am

Labs

  1. 01
    IAM users, roles, and policies
    Every AWS permission question comes down to three nouns. A user is a person with credentials, a role is an identity something assumes temporarily, and a policy is the JSON that grants the permissions. Get these three straight and the rest of IAM follows.
    15 min
  2. 02
    Policies in depth, managed vs inline, statements, conditions
    Move from knowing what a policy is to understanding how policies behave. A managed policy is a reusable object, an inline policy lives and dies with one identity, explicit Deny always wins, and a Condition block fences a grant to facts about the request.
    15 min
  3. 03
    Roles for services, EC2 instance roles, OIDC for GitHub Actions
    Machines should never hold long-lived keys. Two patterns make that real: an instance role that EC2 assumes through an instance profile, and an OIDC provider that lets GitHub Actions deploy with credentials that expire in minutes. Both rest on the trust policy, the half of a role that says who may assume it.
    15 min
  4. 04
    Least privilege in practice
    Wildcard policies feel productive and age into incidents. Least privilege means giving an identity exactly the actions and resources its job needs, then tightening further as real usage shows it needs even less. It is the discipline that turns a leaked credential from a takeover into a contained event.
    15 min
  5. 05
    EC2 instances and their lifecycle
    EC2 is rented compute, billed by the second while it runs. Understand the two inputs every launch needs, how to read what an instance type name like t3.micro promises, and the lifecycle from running to stopped to terminated, where stop parks and terminate destroys.
    15 min
  6. 06
    AMIs, the boot template
    An AMI is the disk template every EC2 launch starts from. Understand how the image catalog is searched and scoped by owner, how a golden image is baked from a running instance, and how cross-account sharing distributes images without copying data.
    15 min
  7. 07
    Security groups, the stateful firewall
    A security group is the firewall wrapped around every instance, deny inbound by default and stateful by design. Understand how ingress rules open ports, why source scoping is the whole game, and what stateful tracking buys you that a stateless network ACL does not.
    15 min
  8. 08
    Key pairs vs SSM Session Manager
    There are two ways into an instance: an SSH key pair through port 22, or Session Manager through the SSM agent. Understand what each half of a key pair does, how Session Manager authenticates by IAM over an outbound channel, and how to choose between them for a given operational constraint.
    15 min
  9. 09
    S3, buckets, objects, policies, lifecycle
    S3 is AWS's object store. It keeps blobs of bytes inside globally named buckets, addressed over HTTP, with no real folders. Understand the data model, how a bucket policy grants or denies access, and how lifecycle rules age data out so a bucket does not grow forever.
    15 min
  10. 10
    EBS, volumes, snapshots, encryption
    EBS is the network attached block storage behind EC2. A volume looks like a raw disk to one instance, lives in a single Availability Zone, and is backed up by snapshots that live regionally. Understand the volume model, what gp3 buys you, how snapshots move data across zones, and why encryption is a property a volume is born with.
    15 min
  11. 11
    EFS, shared files vs S3 vs EBS
    EFS is the managed NFS filesystem that many instances mount at once. It is the third AWS storage model, alongside block storage (EBS) and object storage (S3). Understand what EFS gives you that the other two cannot, then sort all three by storage model, scope, access method, and capacity behavior so any workload picks itself.
    15 min
  12. 12
    Your first VPC
    A VPC is your own private network inside an AWS region, defined by one CIDR block and isolated from everything until you wire it up. Understand what a VPC is, why it starts closed, how its address range is chosen, and how tags let you find resources that otherwise carry only random IDs.
    15 min
  13. 13
    Subnets and route tables
    A VPC is one big address block until you carve it into subnets and decide where each subnet's traffic goes. Subnets pin addresses to a single AZ, route tables decide the path out, and the difference between a public and a private subnet is nothing more than which route table it follows.
    15 min
  14. 14
    Internet gateway and NAT gateway
    A VPC has no path to the internet until you give it one. An internet gateway makes a subnet reachable from the internet in both directions, while a NAT gateway lets private instances reach out without becoming reachable. The deciding question for any flow is who opens the connection.
    15 min
  15. 15
    VPC peering and Transit Gateway
    VPCs are isolated by default. Peering links two of them point to point, while a Transit Gateway is a hub that connects many at once. The choice between them comes down to the CIDR overlap rule, how the connection count grows, and whether traffic can travel through a middle network.
    15 min
  16. 16
    Launching a managed database with RDS
    RDS runs the Postgres engine for you, taking over the host, storage, backups, and failover so you keep only the schema and the queries. Launching one is a single declarative decision that bundles identity, hardware, credentials, and network exposure, and each of those is a choice with consequences.
    15 min
  17. 17
    RDS parameter groups and tuning
    On RDS you never get a shell or a postgresql.conf, so a parameter group is how you change engine settings. Some settings apply live, others wait for a reboot, and the skill of tuning is mapping a production symptom to the one setting that controls it.
    15 min
  18. 18
    RDS backups, PITR, and read replicas
    Backups answer how much data you can lose, replicas answer how much traffic you can carry, and the two get confused constantly. The honest test of a recovery plan is whether it can state its RPO and RTO in minutes, and the trap is the dropped table, the one failure that replication makes worse.
    15 min
  19. 19
    AWS CLI configuration and named profiles
    The aws CLI decides who you are and where you call by walking a fixed precedence chain across command-line flags, environment variables, and two plain text files under ~/.aws. Understand where credentials and settings live, how named profiles let one machine talk to many accounts, and which source wins when they disagree.
    15 min
  20. 20
    Assume a role with STS
    Temporary credentials from STS are how engineers and automation cross account and permission boundaries without long-lived keys. Understand what a role is, how its trust policy decides who may step into it, what the three-part temporary credential set contains, and how a tool like aws-vault automates the whole handoff.
    15 min
  21. 21
    Tagging standards and cost allocation
    Tags are the main way to slice an AWS bill by team, project, or environment within an account, and they only work when a standard is written down and enforced. Understand what a tag is, why AWS enforces none by default, how a written standard turns into an audit, and why cost reporting quality is decided at tagging time rather than at reporting time.
    15 min
  22. 22
    Budget alerts and reserved capacity math
    A budget alert is arithmetic: month-to-date spend projected to month end and compared with a threshold. Reserved capacity is arithmetic too: a committed rate weighed against the on-demand rate. Understand both calculations so the Budgets console and reservation recommendations stop being black boxes.
    15 min
  23. 23
    Elastic Load Balancing
    A load balancer takes one stream of incoming traffic and spreads it across many servers, dropping any that fail its health checks. It is the piece that turns a single server into a resilient service and the foundation of nearly every highly available design on AWS.
    15 min
  24. 24
    EC2 Auto Scaling
    An Auto Scaling group keeps a fleet of EC2 instances at the size you want, adding instances when demand rises, removing them when it falls, and replacing any that fail. It is what makes a service both elastic and self-healing, and it pairs directly with a load balancer.
    15 min
  25. 25
    Route 53 routing policies
    Route 53 is DNS, but its routing policies make it a traffic controller. Beyond turning a name into an address, it can split traffic by weight, send users to the lowest-latency region, fail over to a backup when health checks fail, or route by geography. These policies are how global, resilient designs steer traffic.
    15 min
  26. 26
    Decoupling with SQS and SNS
    Connecting components directly makes each one depend on the other being up and fast. Putting a queue or a topic between them breaks that dependency. SQS buffers work in a queue for a consumer to process later, SNS fans a message out to many subscribers, and the result is a system that absorbs spikes and survives a component going down.
    15 min
  27. 27
    Disaster recovery strategies
    Disaster recovery is the plan for surviving a major failure, like losing an entire region. Two numbers drive it: how fast you must be back (RTO) and how much data you can afford to lose (RPO). Four strategies trade cost against speed, from cheap-and-slow backup and restore to expensive-and-instant multi-site.
    15 min
  28. 28
    DynamoDB at scale
    DynamoDB is a serverless NoSQL database that holds single-digit-millisecond latency no matter how large it grows. It scales horizontally where a relational database scales up, and you design it around access patterns rather than joins. Knowing when it beats RDS, and how its keys work, is the goal.
    15 min
  29. 29
    Aurora, the cloud-native relational database
    Aurora is AWS's own relational engine, compatible with MySQL and PostgreSQL but rebuilt for the cloud. It separates compute from a distributed storage layer that auto-grows and keeps six copies across three Availability Zones, giving higher performance, faster failover, and easier read scaling than standard RDS.
    15 min
  30. 30
    Caching with CloudFront and ElastiCache
    Caching keeps frequently-used data somewhere faster and closer so it is not fetched the slow way every time. CloudFront caches content at edge locations near users, and ElastiCache holds hot data in memory in front of a database. Both cut latency and take load off the origin.
    15 min
  31. 31
    Serverless with Lambda and API Gateway
    Serverless means running code without managing any servers. Lambda runs a function when an event triggers it and charges only for the milliseconds it runs, and API Gateway is the managed front door that turns HTTP requests into those triggers. Together with a database like DynamoDB they build a whole application with nothing to provision.
    15 min
  32. 32
    Containers with ECS, EKS, and Fargate
    Containers package an application with its dependencies so it runs the same everywhere. AWS orchestrates them two ways: ECS as its own simpler orchestrator and EKS as managed Kubernetes. It runs them on two launch types: EC2 instances you manage, or Fargate, where AWS manages the hosts for you. Knowing which combination fits is the goal.
    15 min
  33. 33
    Secrets management
    Applications need passwords and API keys at runtime, and the wrong place for them is hardcoded in code or baked into an image where they leak. Secrets Manager stores and rotates them, Parameter Store offers a lighter option, and both hand secrets out only to IAM-authorized callers, encrypted and audited.
    15 min
  34. 34
    VPC endpoints and PrivateLink
    By default, reaching a service like S3 from inside a VPC means calling its public endpoint, which needs a route through an internet gateway or NAT gateway. VPC endpoints give the VPC a private path on the AWS network instead, with no internet route required. Gateway endpoints cover S3 and DynamoDB for free, and interface endpoints, powered by PrivateLink, cover most other services.
    15 min
  35. 35
    Network security, security groups and NACLs
    A VPC has two firewall layers. Security groups guard each instance and are stateful, so return traffic is automatic. Network ACLs guard each subnet and are stateless, so both directions must be allowed explicitly. Knowing which does what, and the stateful-versus-stateless trap, is the goal.
    15 min
  36. 36
    S3 storage classes and lifecycle
    S3 is not one price. Its storage classes trade retrieval speed and cost, from Standard for hot data down to Glacier for archives, and lifecycle rules move objects between them automatically as they age. Matching the class to how often data is read is one of the most reliable ways to cut a bill.
    15 min
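
As a taste of the arithmetic lab 22 describes, here is a minimal Python sketch of both calculations. It assumes a plain linear projection (average daily spend so far, extended to month end), which is a simplification and not necessarily how the Budgets console forecasts. The function names, dollar amounts, and hourly rates are hypothetical, chosen only for illustration.

```python
import calendar
from datetime import date


def projected_month_end_spend(month_to_date: float, today: date) -> float:
    """Linear projection (an assumption): average daily spend so far,
    multiplied by the number of days in the current month."""
    days_in_month = calendar.monthrange(today.year, today.month)[1]
    daily_rate = month_to_date / today.day
    return daily_rate * days_in_month


def budget_alert(month_to_date: float, today: date, budget: float,
                 threshold_pct: float = 100.0) -> bool:
    """True when the projected spend reaches threshold_pct of the budget."""
    forecast = projected_month_end_spend(month_to_date, today)
    return forecast >= budget * threshold_pct / 100.0


def break_even_utilization(on_demand_hourly: float,
                           reserved_effective_hourly: float) -> float:
    """Fraction of hours an instance must run before a commitment,
    paid for every hour, costs less than paying on demand."""
    return reserved_effective_hourly / on_demand_hourly


if __name__ == "__main__":
    # Hypothetical figures: $300 spent by June 10 against an $800 budget.
    today = date(2024, 6, 10)
    print(projected_month_end_spend(300.0, today))
    print(budget_alert(300.0, today, budget=800.0))
    # Hypothetical rates: $0.10/h on demand vs $0.05/h effective reserved.
    print(break_even_utilization(0.10, 0.05))
```

With these made-up numbers, the daily rate is $30 over a 30-day month, so the projection exceeds the $800 budget well before the bill does. The reservation pays off only if the instance runs at least half the time.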