How I Select AWS Services for Running My Apps

As a seasoned full-stack developer, I've deployed hundreds of applications on Amazon Web Services (AWS). One of the most crucial design decisions is selecting which compute services to use for each component of the system. With options like EC2 instances, Docker containers, Elastic Beanstalk, and Lambda functions, there are many factors to consider that can have a major impact on performance, scalability, cost, and DevOps overhead.

In this in-depth guide, I'll share my hard-earned lessons for evaluating the needs of an application and systematically mapping them to the optimal AWS services. I'll discuss the key technical and business considerations, provide real-world examples of the thought process, and share best practices for implementing each compute option. Whether you're a solo developer, an architect at a startup, or an enterprise tech lead, this advice will help you make better decisions and avoid costly mistakes.

Key Factors to Consider

The three main factors I look at when choosing AWS application services are:

  1. Desired level of control and management – Do you need full control over the OS and infrastructure or is a managed service sufficient? IaaS maximizes control, PaaS offloads management, and FaaS abstracts away the servers entirely.

  2. Application usage patterns – Will the workload be "always on" or execute on-demand? Continuously running services need provisioned capacity, while irregularly used apps can take advantage of serverless.

  3. Resource requirements – How much memory, CPU, and I/O does the application need? Will it process large volumes of data or perform complex computations? The task duration and intensity determine the suitable options.

Let's dive deeper into each of these areas and how I assess the tradeoffs.

Infrastructure Control and Management

AWS provides services across the spectrum of infrastructure management:

  • Infrastructure as a Service (IaaS) – Maximum control over servers
    • EC2 – Configurable Linux and Windows VMs
    • ECS – Run containerized apps on managed EC2 clusters
    • EKS – Kubernetes as a service
  • Platform as a Service (PaaS) – Managed runtime for apps
    • Elastic Beanstalk – Deploy apps to managed environments
    • Lightsail – Simple app servers with bundled resources
  • Functions as a Service (FaaS) – Serverless compute
    • Lambda – Run code without managing infrastructure
    • Fargate – Serverless compute for containers

IaaS options like EC2 offer the most flexibility and control. You can fully customize the OS, apply security policies, configure the network, and attach storage. They are necessary for legacy workloads that require specific OS versions/packages. You can rightsize instances, use Spot Instances for discounts, and get volume pricing.

But all this control comes with significant management overhead. You have to handle OS patching, security hardening, monitoring, and scaling, and a substantial share of the real cost of running EC2 fleets ends up going to the people managing them rather than to the infrastructure itself.

At the other end, FaaS services like Lambda completely abstract the servers away. Simply upload your code, configure the memory/timeout, and AWS provisions and scales the execution environment automatically. For applications that can be broken into small, independent tasks, Lambda can dramatically reduce ops burden and cost.
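To make that concrete, here is a minimal sketch of the contract Lambda expects: a handler function that receives an event and a context, with the platform provisioning and scaling everything underneath. The event shape here is illustrative.

```python
import json

def lambda_handler(event, context):
    """Entry point AWS invokes once per event.

    There are no servers to provision: you configure only memory and
    timeout, and the platform scales invocations with the incoming load.
    """
    name = event.get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }
```

You point the function's configuration at `lambda_handler`, and every trigger (HTTP request, S3 upload, queue message) arrives as the `event` argument.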

In between are PaaS options like Elastic Beanstalk that automate management of the underlying instances while still giving some control over the stack and supporting resources. They provide a comfortable middle ground when you need more than FaaS but less than IaaS.

Personally, I bias towards managed services and serverless as much as possible for new apps. Focusing on the differentiating business logic instead of undifferentiated heavy lifting pays off in faster time to market. But there are situations when IaaS is necessary, like:

  • Strict compliance requirements (e.g., HIPAA, FedRAMP)
  • Proprietary or legacy dependencies
  • Extremely large scale (think top 100 websites)
  • Applications with major investments in IaC

Application Usage Patterns

The expected usage profile of an application is a key driver of the compute model. There are two main profiles I look at:

  1. Consistent, predictable traffic – Applications that need to be highly available and responsive, handling a steady volume of requests. Common examples:

    • Public websites and web apps
    • Backend APIs and microservices
    • Streaming and real-time services
    • Mission-critical business platforms
  2. Intermittent, event-driven workloads – Applications that only need to execute in response to triggers and can tolerate some startup latency. Typical cases:

    • Scheduled jobs and maintenance tasks
    • Data processing pipelines and ETL
    • Asynchronous background jobs
    • Chatbots and voice assistants

Continuously running services require permanent compute resources and favor IaaS/PaaS approaches. With EC2 or Elastic Beanstalk, you always have instances ready to handle incoming requests, resulting in predictable performance. You can scale out when needed using Auto Scaling groups and Elastic Load Balancing.

But that comes at a cost: you're paying for those servers 24/7, whether they are processing requests or not. Utilization studies routinely find that many EC2 instances sit idle for hours every day. That's a lot of wasted spend on unused capacity.

Event-driven apps only need to execute on-demand in response to triggers like HTTP requests, uploaded files, database changes, scheduled events, etc. Serverless technologies are ideal for these workloads. Lambda functions automatically activate when triggered and scale precisely with the request load, so you only pay for computing time actually used.

For example, an image processing app I recently built used Lambda to create thumbnails whenever a new photo was uploaded to S3. The Lambda was invoked maybe 100 times per day and had an average duration of 500ms, so it only cost a few dollars per month. Processing the same volume on dedicated EC2 instances would have been 10x more expensive.
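The cost math behind that comparison is easy to sketch. The rates below are illustrative placeholders, not current AWS pricing, and the memory allocation is an assumption:

```python
# Rough Lambda cost model for the thumbnail example.
# Rates are illustrative -- check current AWS pricing before relying on them.
PRICE_PER_GB_SECOND = 0.0000166667  # compute charge
PRICE_PER_REQUEST = 0.0000002       # invocation charge

def monthly_lambda_cost(invocations_per_day, avg_duration_s, memory_gb, days=30):
    """Estimate a month of Lambda charges: per-request fee plus GB-seconds."""
    invocations = invocations_per_day * days
    gb_seconds = invocations * avg_duration_s * memory_gb
    return invocations * PRICE_PER_REQUEST + gb_seconds * PRICE_PER_GB_SECOND

# ~100 invocations/day at 500ms, assuming 512MB allocated:
cost = monthly_lambda_cost(100, 0.5, 0.5)
```

At that volume the bill is trivially small, which is exactly why paying for always-on instances to handle it makes no sense.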

In practice, many of my projects use a hybrid of these patterns, with a core set of critical services running on IaaS/PaaS and on-demand tasks on FaaS. The key is being intentional about the compute for each piece and avoiding servers for components that can be serverless.

Estimating Resource Requirements

Every application has different computing requirements based on the type of processing it performs and the volume of data it handles. Accurately predicting the CPU, memory, and I/O needs is essential for selecting services that can deliver responsive performance without over-provisioning.

For IaaS, this means choosing the optimal instance types and sizes. At the low end, a t2.nano EC2 instance has 0.5GB of RAM and minimal CPU, suitable for basic testing. At the high end, an x1.32xlarge has 128 vCPUs and 1,952GB of memory for in-memory databases and big data processing.

To choose, I start by load testing the application to get a baseline performance profile. Tools like Apache Bench or Locust simulate real-world traffic and measure key metrics:

  • Requests per second
  • Average and percentile response times
  • CPU and memory utilization

With this data, I can estimate the computing resources needed at different scales by doing capacity planning calculations. For an API serving 100 requests per second, where each request takes about a second and consumes 100MB of memory, roughly 100 requests are in flight at any moment, so I need at least 10GB of memory available. If the processing is CPU-bound, I select instance types with more cores.

It's important to have accurate data so you don't under- or over-provision. I've been burned by guessing too low on memory and hitting out-of-memory crashes in production. Now I always include a safety margin of 2-3x expected usage.
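That capacity math, including the safety margin, is simple enough to script. A sketch with made-up numbers:

```python
def required_memory_gb(req_per_sec, avg_duration_s, mb_per_request, safety=2.0):
    """Estimate memory needed: requests in flight at steady state
    (arrival rate x service time, per Little's law) times the
    per-request footprint, padded by a safety factor."""
    in_flight = req_per_sec * avg_duration_s
    return in_flight * mb_per_request / 1024 * safety

# 100 req/sec, 1s average latency, 100MB per request, 2x margin:
needed = required_memory_gb(100, 1.0, 100)
```

Running the numbers from the API example through this with a 2x margin tells you how much headroom to provision across the fleet.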

For FaaS, the main limitations are execution duration, memory, and concurrency. Lambda functions can run for up to 15 minutes, use up to 10GB of memory, and scale to 1,000 concurrent executions by default (higher limits are available on request). Most of my Lambda use cases, like web APIs, event streaming, and data transformations, are well within these bounds.

However, I have run into challenges with long-running and memory-intensive workloads on Lambda. One project used Lambda to process large XML files, but the complex parsing logic caused timeouts; moving that step to an ECS task with 8GB of memory solved it. Another project exceeded the Lambda concurrency limit during a traffic spike, so I switched to API Gateway's built-in throttling.

In general, FaaS is best suited for short-duration, stateless, I/O-bound tasks. Compute-heavy processing, large data volumes, and long-running jobs are better served on IaaS with appropriate sizing.
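That rule of thumb can be written down as a first-pass triage helper. The thresholds reflect the Lambda limits mentioned above; the helper itself is just my own heuristic, not an AWS-provided tool:

```python
LAMBDA_MAX_DURATION_S = 15 * 60  # hard Lambda timeout
LAMBDA_MAX_MEMORY_GB = 10        # max configurable Lambda memory

def suggest_compute(duration_s, memory_gb, stateless=True):
    """Naive first-pass triage between FaaS and container/instance compute.

    Anything stateless that fits inside Lambda's limits is a Lambda
    candidate; everything else goes to ECS or EC2 for sizing."""
    if (stateless
            and duration_s <= LAMBDA_MAX_DURATION_S
            and memory_gb <= LAMBDA_MAX_MEMORY_GB):
        return "lambda"
    return "ecs-or-ec2"
```

It deliberately ignores cost and concurrency; it just rules Lambda out early when a workload clearly cannot fit.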

Matching Services to Workloads

With an understanding of control level, usage pattern, and resource needs, I can narrow down the viable compute options for a given application. Here are a few real-world examples of my thought process:

Public API – Lambda and API Gateway

I frequently build REST APIs to expose microservices or data from backend systems. My default architecture is:

  • Lambda functions for the API logic
  • API Gateway for managed routing and authentication
  • DynamoDB or Aurora for the database

For example, I created a product catalog API to list inventory from an ecommerce system. The Lambda queries DynamoDB and returns products in the requested category. API Gateway secures it with an API key. This stack auto-scales to any load and costs almost nothing when there is no traffic, since you only pay per request.

Key considerations:

  • Well-defined API contract and minimal business logic fit FaaS perfectly
  • Pay-per-request serverless is ideal for unpredictable loads
  • Quick iteration and deployment using Serverless Framework
  • Easy to add features using other Lambda triggers and services
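A stripped-down version of that catalog Lambda might look like this. The table schema and attribute names are hypothetical, and the DynamoDB table object is injected (e.g. `boto3.resource("dynamodb").Table("products")`) so the handler logic stays testable without AWS:

```python
import json

def make_handler(table):
    """Build the API handler around an injected DynamoDB-style table."""
    def handler(event, context):
        params = event.get("queryStringParameters") or {}
        category = params.get("category")
        if not category:
            return {"statusCode": 400,
                    "body": json.dumps({"error": "category required"})}
        # Hypothetical schema: items partitioned/indexed by category.
        result = table.query(
            KeyConditionExpression="category = :c",
            ExpressionAttributeValues={":c": category},
        )
        return {"statusCode": 200,
                "body": json.dumps(result.get("Items", []))}
    return handler
```

API Gateway maps the HTTP request into `event`, and the same handler can be exercised locally against a stub table in unit tests.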

High Throughput Data Pipeline – AWS Batch

At the other end of the spectrum, I worked on an application to process and analyze terabytes of time-series data from IoT sensors. The scale and complexity were too great for Lambda, so we used AWS Batch:

  • Data uploaded from sensors to S3 throughout the day
  • Batch jobs pulled data into Redshift and ran statistical analysis
  • Results used to train machine learning models and generate reports

Batch let us decouple the data ingestion from the processing and run at massive scale. We could optimize the EC2 instances for memory and CPU separately for each step of the pipeline. Spot instances cut the cost by 70% compared to on-demand. CloudWatch Events scheduled the jobs to run hourly.

Key considerations:

  • Huge data volume and intense processing fit IaaS, not FaaS
  • Batch for scheduling, orchestration, and auto-scaling
  • Spot instances for big cost savings on fault-tolerant workloads
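Submitting one of those hourly runs through boto3 looks roughly like this. The job name, queue, and definition are placeholders, not the real pipeline's resources:

```python
def build_batch_job(hour, job_queue, job_definition):
    """Assemble submit_job parameters for one hourly analysis run.

    Pass the result to boto3.client("batch").submit_job(**params);
    the job definition's container reads HOUR to pick its data slice.
    """
    return {
        "jobName": f"sensor-analysis-{hour:02d}",
        "jobQueue": job_queue,
        "jobDefinition": job_definition,
        "containerOverrides": {
            "environment": [{"name": "HOUR", "value": str(hour)}],
        },
    }
```

In the real pipeline a CloudWatch Events rule fired hourly and a thin scheduler function built and submitted the job, so the ingestion and processing stayed fully decoupled.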

Containerized Web App – ECS on Fargate

Containers have become my preferred way to package and deploy web apps and services. I use Docker to encapsulate the application runtime, dependencies, and configuration into a reproducible image. ECS and Fargate provide a serverless container platform:

  • Build and test container images locally
  • Push to Elastic Container Registry (ECR)
  • Define application and scaling parameters in Task Definition
  • Run on ECS using Fargate for serverless compute

This approach gives an ideal balance of control and automation. I can fine-tune the memory and CPU for each task, while ECS handles the underlying server management. Rolling updates and service discovery make it easy to run highly available and scalable services.

For example, I migrated a Node.js app from Elastic Beanstalk to ECS. It was simple to Dockerize the app and set up an ECS service. The cluster auto-scaled to handle 4x the previous peak load with no downtime. And it was 20% cheaper, because I could rightsize each task's CPU and memory instead of paying for underutilized instances.

Key considerations:

  • Containers for packaging and runtime consistency across environments
  • ECS as the orchestration layer to manage deployment and availability
  • Fargate for serverless compute when possible, EC2 for more control
  • Service discovery and auto scaling to add capacity seamlessly
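For reference, the core of a Fargate task definition for a service like that might look as follows. The family, image URI, port, and sizes are placeholders, not values from the project described above:

```json
{
  "family": "web-app",
  "requiresCompatibilities": ["FARGATE"],
  "networkMode": "awsvpc",
  "cpu": "512",
  "memory": "1024",
  "containerDefinitions": [
    {
      "name": "web",
      "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/web-app:latest",
      "portMappings": [{"containerPort": 3000}],
      "essential": true
    }
  ]
}
```

The `cpu` and `memory` fields are where the per-task rightsizing happens; Fargate bills for exactly what the task definition requests.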

Putting It All Together

AWS provides an array of powerful building blocks for deploying applications. But that flexibility means we have important architectural choices to make upfront and continuously reassess.

For modern apps, I follow these principles:

  1. Use serverless first – Start with Lambda and Fargate to minimize ops overhead. Only use EC2 when absolutely needed for performance or control.

  2. Decompose by workload – Break apps into components with clear interfaces. Let the compute fit the workload – web tiers as containers, batch jobs as Batch, event streaming as Lambda.

  3. Leverage higher level services – Don't reinvent the wheel. Prefer fully managed services like API Gateway, S3, and DynamoDB to reduce custom code.

  4. Design for iteration – Make choices that enable experimentation and pivoting as requirements evolve. Containerize services, use IaC, and decouple with events and queues.

  5. Automate ruthlessly – Implement CI/CD pipelines, infrastructure as code, and auto scaling from the start. Automation increases agility and reduces risk.

But it's not one size fits all. Use the framework of evaluating control, usage pattern, and resource needs to guide decisions. Measure everything and don't be afraid to refactor as your scale or requirements change. No application architecture is static.

Ultimately, focus on delivering business value, not managing infrastructure. Choose the simplest deployment that meets requirements. Rightsize resources and always look for efficiencies. With the wide range of AWS compute options, there's sure to be a fit for your use case. The key is being informed, intentional, and proactive in your choices.
