Amazon Simple Notification Service

Amazon Simple Notification Service (SNS) is an AWS web service that coordinates and manages the delivery of messages to subscribing endpoints or clients. Amazon SNS gives developers a highly scalable, flexible, and cost-effective way to publish messages from an application and immediately deliver them to subscribers or other applications. Amazon SNS follows the “publish-subscribe” (pub-sub) model, a form of asynchronous service-to-service communication used in serverless and microservices architectures.

  • Amazon SNS enables applications to send time-critical messages to multiple subscribers through a “push” mechanism, eliminating the need to periodically check or “poll” for updates.
  • Using Amazon SNS topics, publisher systems can fan out messages to a large number of subscriber endpoints for parallel processing, including Amazon SQS queues, AWS Lambda functions, and HTTP/S webhooks (see the sketch after this list). Additionally, SNS can be used to fan out notifications to end users using mobile push, SMS, and email.
  • The Amazon SNS service supports a wide variety of customer needs, including event notification, monitoring applications, workflow systems, time-sensitive information updates, mobile applications, and any other application that generates or consumes notifications.
  • SNS supports AWS CloudTrail, an AWS service that records AWS API calls for customer accounts and delivers log files to them. With CloudTrail, AWS customers can obtain a history of information such as the identity of the API caller, the time of the API call, the source IP address of the API caller, the request parameters, and the response elements returned by SNS.
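Below is a minimal sketch of that publish-subscribe flow using boto3 (the AWS SDK for Python). The topic and queue names are illustrative placeholders, and the SQS access policy that allows SNS to deliver to the queue is omitted for brevity.

    # Minimal publish-subscribe sketch: one topic, one SQS subscriber, one publish.
    import json
    import boto3

    sns = boto3.client("sns")
    sqs = boto3.client("sqs")

    # Create a topic and an SQS queue that will act as one of its subscribers.
    topic_arn = sns.create_topic(Name="order-events")["TopicArn"]
    queue_url = sqs.create_queue(QueueName="order-processing")["QueueUrl"]
    queue_arn = sqs.get_queue_attributes(
        QueueUrl=queue_url, AttributeNames=["QueueArn"]
    )["Attributes"]["QueueArn"]

    # Subscribe the queue to the topic (the queue policy that permits SNS to
    # send messages to it is not shown here).
    sns.subscribe(TopicArn=topic_arn, Protocol="sqs", Endpoint=queue_arn)

    # Publish once; SNS pushes a copy to every subscribed endpoint.
    sns.publish(
        TopicArn=topic_arn,
        Subject="OrderCreated",
        Message=json.dumps({"orderId": "1234", "status": "created"}),
    )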

SNS Features

Message fanout occurs when a message is sent to a topic and then replicated and pushed to multiple endpoints. Fanout provides asynchronous event notifications, which in turn allows for parallel processing. 

  • All messages published to Amazon SNS are stored redundantly across multiple geographically separated servers and data centers. 
  • Amazon SNS reliably delivers messages to all supported AWS endpoints, such as Amazon SQS queues and AWS Lambda functions.
  • Amazon SNS can filter and fan out events to the following destinations to support event-driven computing use cases:
    • Amazon Simple Queue Service
    • AWS Lambda
    • AWS Event Fork Pipelines
    • Webhook (HTTP/S)

Amazon SNS provides encrypted topics to protect customers’ messages from unauthorized and anonymous access. When customers publish messages to encrypted topics, Amazon SNS immediately encrypts those messages.

  • The messages are stored in encrypted form, and decrypted as they are delivered to subscribing endpoints (Amazon SQS queues, AWS Lambda functions, HTTP/S webhooks). 
  • All messages published to Amazon SNS are stored redundantly across multiple geographically separated servers and data centers. 
  • Amazon SNS delivers messages to all supported AWS endpoints, such as Amazon SQS queues and AWS Lambda functions. If the subscribed endpoint isn’t available, Amazon SNS executes message delivery retry policies and can also move messages to dead-letter queues (DLQs).
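A small sketch, assuming boto3 and the AWS-managed SNS key (alias/aws/sns), of how a server-side-encrypted topic can be created; the topic name is a placeholder.

    import boto3

    sns = boto3.client("sns")

    # Creating the topic with a KMS key enables server-side encryption at rest.
    encrypted_topic_arn = sns.create_topic(
        Name="sensitive-events",
        Attributes={"KmsMasterKeyId": "alias/aws/sns"},
    )["TopicArn"]

    # Messages published to this topic are stored encrypted and decrypted by SNS
    # as they are delivered to subscribed endpoints.
    sns.publish(TopicArn=encrypted_topic_arn, Message="example payload")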

AWS Batch supports GPU scheduling, which allows customers to specify the number and type of accelerators their jobs require as job definition input variables (a job-definition sketch follows the list below).

  • A Graphics Processing Unit (GPU) is a processor designed to handle graphics operations, including both 2D and 3D calculations, though GPUs primarily excel at rendering 3D graphics.
  • AWS Batch will scale up instances appropriate for customers’ jobs based on the required number of GPUs and isolate the accelerators according to each job’s needs, so only the appropriate containers can access them.
  • All instance types in a compute environment that will run GPU jobs should be from the p2, p3, g3, g3s, or g4 instance families; otherwise, a GPU job could get stuck in the RUNNABLE status.
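A hedged sketch of how a GPU requirement might be expressed in an AWS Batch job definition with boto3; the job definition name, container image, and resource values are placeholders.

    import boto3

    batch = boto3.client("batch")

    batch.register_job_definition(
        jobDefinitionName="gpu-training-job",
        type="container",
        containerProperties={
            "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/trainer:latest",
            "resourceRequirements": [
                {"type": "VCPU", "value": "4"},
                {"type": "MEMORY", "value": "16384"},
                # The GPU requirement tells Batch to place the job on an accelerated
                # instance (p2/p3/g3/g3s/g4 families) and to isolate the requested
                # accelerators so only this container can access them.
                {"type": "GPU", "value": "1"},
            ],
        },
    )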

Message filtering lets a subscriber create a filter policy so that it only receives the notifications it is interested in, rather than every message posted to the topic.

  • Customers can monitor their Amazon SNS message filtering activity with Amazon CloudWatch and manage Amazon SNS filter policies with AWS CloudFormation.
  • With Amazon SNS message filtering, subscribing endpoints receive only the messages of interest, instead of all messages published to the topic. 
  • Amazon CloudWatch gives visibility into customers’ filtering activity, and AWS CloudFormation enables customers to deploy subscription filter policies in an automated and secure manner.
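A minimal sketch of message filtering with boto3; the topic and subscription ARNs, attribute names, and values are placeholders.

    import json
    import boto3

    sns = boto3.client("sns")

    # Attach a filter policy so this endpoint only receives messages whose
    # "event_type" message attribute matches one of the listed values.
    sns.set_subscription_attributes(
        SubscriptionArn="arn:aws:sns:us-east-1:123456789012:orders:example-sub-id",
        AttributeName="FilterPolicy",
        AttributeValue=json.dumps({"event_type": ["order_placed", "order_cancelled"]}),
    )

    # Publishers set message attributes that the filter policy is evaluated against.
    sns.publish(
        TopicArn="arn:aws:sns:us-east-1:123456789012:orders",
        Message=json.dumps({"orderId": "1234"}),
        MessageAttributes={
            "event_type": {"DataType": "String", "StringValue": "order_placed"}
        },
    )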

Amazon SNS uses cross availability zone message storage to provide high message durability. All messages published are stored redundantly across multiple geographically-separated servers and data centers. 

  • Amazon SNS reliably delivers messages to all supported AWS endpoints, such as Amazon SQS queues and AWS Lambda functions. 
  • If the subscriber endpoint isn’t available, Amazon SNS executes a message delivery retry policy and can also move messages to dead-letter queues (DLQ). 
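A hedged sketch of attaching a dead-letter queue to an SNS subscription with boto3; the ARNs are placeholders. Messages that exhaust the delivery retry policy are moved to the specified SQS queue instead of being lost.

    import json
    import boto3

    sns = boto3.client("sns")

    sns.set_subscription_attributes(
        SubscriptionArn="arn:aws:sns:us-east-1:123456789012:orders:example-sub-id",
        AttributeName="RedrivePolicy",
        AttributeValue=json.dumps(
            {"deadLetterTargetArn": "arn:aws:sqs:us-east-1:123456789012:orders-dlq"}
        ),
    )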

Amazon SNS supports VPC Endpoints (VPCE) via AWS PrivateLink. AWS customers can use VPC Endpoints to privately publish messages to Amazon SNS topics, from an Amazon Virtual Private Cloud (VPC), without traversing the public internet. This feature brings additional security, helps promote data privacy, and aligns with assurance programs.

  • To use AWS PrivateLink, customers don’t need to set up an Internet Gateway (IGW), Network Address Translation (NAT) device, or Virtual Private Network (VPN) connection. They don’t need to use public IP addresses, either.
  • AWS customers can deploy Amazon VPC endpoints for Amazon SNS with AWS CloudFormation.
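A minimal sketch of creating an interface VPC endpoint for SNS (AWS PrivateLink) with boto3; the region, VPC, subnet, and security-group IDs are placeholders.

    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")

    ec2.create_vpc_endpoint(
        VpcEndpointType="Interface",
        VpcId="vpc-0123456789abcdef0",
        ServiceName="com.amazonaws.us-east-1.sns",
        SubnetIds=["subnet-0123456789abcdef0"],
        SecurityGroupIds=["sg-0123456789abcdef0"],
        PrivateDnsEnabled=True,  # lets in-VPC clients keep using the regular SNS endpoint name
    )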

Amazon SNS mobile notifications make it simple and cost effective to fan out mobile push notifications to iOS, Android, Fire OS, Windows, and Baidu-based devices.

  • AWS customers can also use SNS to fan out text messages (SMS) to 200+ countries and fan out email messages (SMTP).

SNS Simple API

Amazon SNS allows notifications over multiple transport protocols. Customers can select one of these transport protocols as part of the subscription request:

  • “HTTP”, “HTTPS” – Subscribers specify a URL as part of the subscription registration; notifications will be delivered through an HTTP POST to the specified URL.
  • “Email”, “Email-JSON” – Messages are sent to registered addresses as email. Email-JSON sends notifications as a JSON object, while Email sends text-based email.
  • “SQS” – Users can specify an SQS standard queue as the endpoint; Amazon SNS will enqueue a notification message to the specified queue (which subscribers can then process using SQS APIs such as ReceiveMessage, DeleteMessage, etc.). Note that FIFO queues are not currently supported.
  • “SMS” – Messages are sent to registered phone numbers as SMS text messages.

Amazon SNS provides a set of simple APIs to enable event notifications for topic owners, subscribers and publishers.

Owner operations:

  • CreateTopic – Create a new topic.
  • DeleteTopic – Delete a previously created topic.
  • ListTopics – List topics owned by a particular user (AWS ID).
  • ListSubscriptionsByTopic – List subscriptions for a particular topic.
  • SetTopicAttributes – Set/modify topic attributes, including setting and modifying publisher/subscriber permissions, transports supported, etc.
  • GetTopicAttributes – Get/view existing attributes of a topic.
  • AddPermission – Grant access to selected users for the specified actions.
  • RemovePermission – Remove permissions for selected users for the specified actions.

Subscriber operations:

  • Subscribe – Register a new subscription on a particular topic, which will generate a confirmation message from Amazon SNS.
  • ConfirmSubscription – Respond to a subscription confirmation message, confirming the subscription request to receive notifications from the subscribed topic.
  • Unsubscribe – Cancel a previously registered subscription.
  • ListSubscriptions – List subscriptions owned by a particular user (AWS ID).

Publisher operations:

  • Publish – Publish a new message to a topic.
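A short sketch that exercises a few of the owner, subscriber, and publisher operations above through boto3; the account ID and e-mail address are placeholders.

    import boto3

    sns = boto3.client("sns")

    # Owner operations
    topic_arn = sns.create_topic(Name="example-topic")["TopicArn"]
    sns.add_permission(
        TopicArn=topic_arn,
        Label="AllowPartnerPublish",
        AWSAccountId=["111122223333"],
        ActionName=["Publish"],
    )
    print(sns.get_topic_attributes(TopicArn=topic_arn)["Attributes"])

    # Subscriber operation (e-mail subscriptions must be confirmed by the recipient)
    sns.subscribe(TopicArn=topic_arn, Protocol="email", Endpoint="ops@example.com")

    # Publisher operation
    sns.publish(TopicArn=topic_arn, Subject="Hello", Message="Example notification")

    # Owner cleanup
    sns.delete_topic(TopicArn=topic_arn)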

SNS Topic

AWS Batch can be integrated with commercial and open-source workflow engines and languages such as Pegasus WMS, Luigi, Nextflow, Metaflow, Apache Airflow, and AWS Step Functions, enabling customers to use familiar workflow languages to model their batch computing pipelines.

Video On Demand

Video on Demand on AWS is a reference implementation that automatically provisions the AWS services necessary to build a scalable, distributed video-on-demand workflow.

  • The solution leverages Amazon CloudWatch to monitor log files and sends Amazon SNS notifications for encoding, publishing, and errors.

AWS Ops Automator

The AWS Ops Automator is a customizable solution designed to provide a core framework for automated tasks, allowing customers to focus on extending functionality rather than managing underlying infrastructure operations. 

  • Warning and error messages are published to a solution-created Amazon SNS topic which sends messages to a subscribed email address.

AWS Answers

AWS Answers is a repository of instructional documents and solutions developed by AWS Solutions Architects to help customers build and grow their businesses on the AWS Cloud.

  • The AWS Well-Architected Framework provides a consistent approach for customers and partners to evaluate architectures, which includes operational excellence, security, reliability, performance efficiency, and cost optimization.

AWS Limit Monitor

The AWS Limit Monitor enables tracking of service usage against quotas. With this easy-to-deploy solution, customers can audit the usage and make informed decisions regarding resources. 

  • If actual usage exceeds 80% of a given service quota, AWS Lambda publishes a message to the Amazon SNS topic, which is then sent to an email address specified during setup.

SNS transport protocols

The notification message sent by Amazon SNS for deliveries over the HTTP, HTTPS, Email-JSON and SQS transport protocols consists of a simple JSON object that includes the following fields:

  • MessageId: A Universally Unique Identifier, unique for each notification published.
  • Timestamp: The time (in GMT) at which the notification was published.
  • TopicArn: The topic to which this message was published.
  • Type: The type of the delivery message, set to “Notification” for notification deliveries.
  • UnsubscribeURL: A link to unsubscribe the end-point from this topic, and prevent receiving any further notifications.
  • Message: The payload (body) of the message, as received from the publisher.
  • Subject: The Subject field – if one was included as an optional parameter to the publish API call along with the message.
  • Signature: Base64-encoded “SHA1withRSA” signature of the Message, MessageId, Subject (if present), Type, Timestamp, and Topic values.
  • SignatureVersion: Version of the Amazon SNS signature used.

Notification messages sent over the “Email” transport only contain the payload (message body) as received from the publisher.
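A hedged sketch of handling the JSON document that SNS POSTs to an HTTP/S endpoint, using only the Python standard library. The field names follow the list above; handle_sns_post is a hypothetical hook into whatever web framework actually receives the request, and the SubscriptionConfirmation branch anticipates the opt-in handshake described later in this section.

    import json
    import urllib.request

    def handle_sns_post(body: str) -> None:
        doc = json.loads(body)

        if doc["Type"] == "SubscriptionConfirmation":
            # Visiting SubscribeURL completes the opt-in handshake.
            urllib.request.urlopen(doc["SubscribeURL"])
        elif doc["Type"] == "Notification":
            print("Topic:   ", doc["TopicArn"])
            print("Sent at: ", doc["Timestamp"])
            print("Subject: ", doc.get("Subject"))
            print("Payload: ", doc["Message"])
            # Production code should also verify the Signature before trusting
            # the message.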

API subscriptions list

There are two APIs for listing subscriptions; they perform different functions and return different results:

  • The ListSubscriptionsByTopic API allows a topic owner to see the list of all subscribers actively registered to a topic.
  • The ListSubscriptions API allows a user to get a list of all their active subscriptions (to one or more topics).

Subscribers can be unsubscribed by the topic owner, the subscription owner, or others, depending on how the subscription was set up.

  • A subscription that was confirmed with the AuthenticateOnUnsubscribe flag set to True in the call to the ConfirmSubscription API call can only be unsubscribed by a topic owner or the subscription owner.
  • If the subscription was confirmed anonymously without the AuthenticateOnUnsubscribe flag set to True, then it can be anonymously unsubscribed.

In all cases except when unsubscribed by the subscription owner, a final cancellation message will be sent to the end-point, allowing the endpoint owner to easily re-subscribe to the topic (if the Unsubscribe request was unintended or in error). 

As part of the subscription registration, Amazon SNS will ensure that notifications are only sent to valid, registered subscribers/end-points. To prevent spam and ensure that a subscriber end-point is really interested in receiving notifications from a particular topic, Amazon SNS requires an explicit opt-in from subscribers using a 2-part handshake:

  1. When a user first calls the Subscribe API and subscribes an end-point, Amazon SNS will send a confirmation message to the specified end-point.
  2. On receiving the confirmation message at the end-point, the subscriber should confirm the subscription request by sending a valid response. 

Only then will Amazon SNS consider the subscription request to be valid. If there is no response to the challenge, Amazon SNS will not send any notifications to that end-point. The exact mechanism of confirming the subscription varies by the transport protocol selected:

  • For HTTP/HTTPS notifications, Amazon SNS will first POST the confirmation message (containing a token) to the specified URL. The application monitoring the URL will have to call the ConfirmSubscription API with the included token.
  • For Email and Email-JSON notifications, Amazon SNS will send an email to the specified address containing an embedded link. The user will need to click on the embedded link to confirm the subscription request.
  • For SQS notifications, Amazon SNS will enqueue a challenge message containing a token to the specified queue. The application monitoring the queue will have to call the ConfirmSubscription API with the token.

Note: The explicit “opt-in” steps described above are not required for the specific case where you subscribe your Amazon SQS queue to your Amazon SNS topic – and both are “owned” by the same AWS account.
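A minimal sketch of completing that handshake for an SQS subscription (for example, a queue in a different account), where the confirmation challenge must be read from the queue and answered explicitly; the queue URL and topic ARN are placeholders.

    import json
    import boto3

    sns = boto3.client("sns")
    sqs = boto3.client("sqs")

    queue_url = "https://sqs.us-east-1.amazonaws.com/111122223333/subscriber-queue"
    topic_arn = "arn:aws:sns:us-east-1:123456789012:example-topic"

    # The confirmation challenge arrives as an ordinary message on the queue.
    messages = sqs.receive_message(QueueUrl=queue_url, WaitTimeSeconds=10).get("Messages", [])
    for message in messages:
        body = json.loads(message["Body"])
        if body.get("Type") == "SubscriptionConfirmation":
            sns.confirm_subscription(TopicArn=topic_arn, Token=body["Token"])
            sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"])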

Simple Queue Service (SQS)

Amazon Simple Queue Service (SQS) is an AWS message queuing service that enables customers to decouple and scale microservices, distributed systems, and serverless applications. Amazon SQS offers common constructs such as dead-letter queues and cost allocation tags. It provides a generic web services API and can be accessed from any programming language that the AWS SDK supports.

  • Using SQS, AWS customers can send, store, and receive messages between software components at any volume, without losing messages or requiring other services to be available.
  • Amazon SQS offers a FIFO (first-in, first-out) queue type, which preserves the exact order in which messages are sent and received. Customers who use a FIFO queue don’t have to place sequencing information in their messages.
    • FIFO queues provide exactly-once processing, which means that each message is delivered once and remains available until a consumer processes it and deletes it. Duplicates are not introduced into the queue.
  • Standard queues provide at-least-once delivery, which means each message is delivered at least once (and occasionally more than once).
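A minimal send/receive/delete round trip with boto3; the queue name is a placeholder.

    import boto3

    sqs = boto3.client("sqs")

    queue_url = sqs.create_queue(QueueName="work-items")["QueueUrl"]

    # Producer side: send a message.
    sqs.send_message(QueueUrl=queue_url, MessageBody="process order 1234")

    # Consumer side: receive, process, then delete so the message is not redelivered.
    response = sqs.receive_message(QueueUrl=queue_url, MaxNumberOfMessages=1, WaitTimeSeconds=10)
    for message in response.get("Messages", []):
        print("Processing:", message["Body"])
        sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"])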

SQS Features

AWS customers can control who can send messages to and receive messages from an Amazon SQS queue.

  • Server-side encryption (SSE) enables customers to transmit sensitive data by protecting the contents of messages in queues using keys managed in AWS Key Management Service (AWS KMS).
  • There is no limit to the number of messages per queue, and standard queues provide nearly unlimited throughput. 
  • Costs are based on usage which provides significant cost saving versus the “always-on” model of self-managed messaging middleware. 

Customers can store the contents of messages larger than 256 KB using Amazon Simple Storage Service (Amazon S3) or Amazon DynamoDB, with Amazon SQS holding a pointer to the Amazon S3 object, or they can split a large message into smaller messages.

  • Use Amazon SQS to transmit any volume of data, at any level of throughput, without losing messages or requiring other services to be available. 
  • SQS enables customers to decouple application components so that they run and fail independently, increasing the overall fault tolerance of the system. 
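A hedged sketch of that large-payload pattern: store the body in S3 and send only a pointer through SQS. The bucket, key prefix, and queue URL are placeholders (the Amazon SQS Extended Client Library for Java implements this pattern for you).

    import json
    import uuid
    import boto3

    s3 = boto3.client("s3")
    sqs = boto3.client("sqs")

    bucket = "example-large-payloads"
    queue_url = "https://sqs.us-east-1.amazonaws.com/111122223333/work-items"

    def send_large_message(payload: bytes) -> None:
        key = f"payloads/{uuid.uuid4()}"
        s3.put_object(Bucket=bucket, Key=key, Body=payload)       # store the body in S3
        pointer = json.dumps({"s3Bucket": bucket, "s3Key": key})   # send only the pointer
        sqs.send_message(QueueUrl=queue_url, MessageBody=pointer)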

Amazon SQS locks customers’ messages during processing, so that multiple producers can send and multiple consumers can receive messages at the same time.

  • Multiple copies of every message are stored redundantly across multiple Availability Zones so that they are available whenever needed.

To ensure the safety of customers’ messages, Amazon SQS stores them on multiple servers. Standard queues support at-least-once message delivery, and FIFO queues support exactly-once message processing.

  • Amazon SQS uses redundant infrastructure to provide highly-concurrent access to messages and high availability for producing and consuming messages. 

Amazon SQS can process each buffered request independently, scaling transparently to handle any load increases or spikes without any provisioning instructions.

  • Using Amazon SQS, AWS customers can exchange sensitive data between applications using server-side encryption (SSE) to encrypt each message body.
  • Amazon SQS SSE integration with AWS Key Management Service (KMS) allows customers to centrally manage the keys that protect SQS messages along with the keys that protect their other AWS resources.
  • AWS KMS logs every use of customers’ encryption keys to AWS CloudTrail to help meet their regulatory and compliance needs.
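A small sketch of creating an SQS queue with server-side encryption via a KMS key; the queue name and key alias are placeholders (alias/aws/sqs is the AWS-managed key for SQS).

    import boto3

    sqs = boto3.client("sqs")

    sqs.create_queue(
        QueueName="sensitive-work-items",
        Attributes={
            "KmsMasterKeyId": "alias/aws/sqs",
            "KmsDataKeyReusePeriodSeconds": "300",  # how long SQS may reuse a data key
        },
    )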

SQS Functionality

  • Create unlimited Amazon SQS queues with an unlimited number of messages in any region.
  • Message payloads can contain up to 256KB of text in any format. Each 64KB ‘chunk’ of payload is billed as 1 request. For example, a single API call with a 256KB payload will be billed as four requests. To send messages larger than 256KB, you can use the Amazon SQS Extended Client Library for Java, which uses Amazon S3 to store the message payload. A reference to the message payload is sent using SQS.
  • Using batches, AWS customers can send, receive, or delete messages in groups of up to 10 messages or 256KB (see the sketch after this list). Batches cost the same amount as single messages, meaning SQS can be even more cost-effective for customers that use batching.
  • Long polling is used to reduce extraneous polling, and minimize cost while receiving new messages as quickly as possible. When your queue is empty, long-poll requests wait up to 20 seconds for the next message to arrive. Long poll requests cost the same amount as regular requests.
  • Retain messages in queues for up to 14 days. Send and read messages simultaneously.
  • When a message is received, it becomes “locked” while being processed. This keeps other computers from processing the message simultaneously. If the message processing fails, the lock will expire and the message will be available again.
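A minimal sketch of the batching and long-polling behavior described above, using boto3; the queue URL is a placeholder.

    import boto3

    sqs = boto3.client("sqs")
    queue_url = "https://sqs.us-east-1.amazonaws.com/111122223333/work-items"

    # Up to 10 messages per batch request; the batch is billed like a single request.
    sqs.send_message_batch(
        QueueUrl=queue_url,
        Entries=[{"Id": str(i), "MessageBody": f"job-{i}"} for i in range(10)],
    )

    # Long polling: wait up to 20 seconds for a message instead of polling in a tight loop.
    messages = sqs.receive_message(
        QueueUrl=queue_url, MaxNumberOfMessages=10, WaitTimeSeconds=20
    ).get("Messages", [])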

Queue Types

Amazon SQS offers two queue types for different application requirements:

FIFO Queues

FIFO queues are designed to enhance messaging between applications when the order of operations and events is critical, or where duplicates can’t be tolerated. For example, FIFO queues ensure that user-entered commands are executed in the right order, display the correct product price by sending price modifications in the right order, and prevent a student from enrolling in a course before registering for an account.

  • High Throughput: By default, FIFO queues support up to 300 messages per second (300 send, receive, or delete operations per second). When you batch 10 messages per operation (maximum), FIFO queues can support up to 3,000 messages per second. To request a quota increase, file a support request.
  • Exactly-Once Processing: A message is delivered once and remains available until a consumer processes and deletes it. Duplicates aren’t introduced into the queue.
  • First-In-First-Out Delivery: The order in which messages are sent and received is strictly preserved (i.e. First-In-First-Out).
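A hedged sketch of a FIFO queue with boto3; the queue name is a placeholder (FIFO queue names must end in “.fifo”).

    import boto3

    sqs = boto3.client("sqs")

    queue_url = sqs.create_queue(
        QueueName="orders.fifo",
        Attributes={
            "FifoQueue": "true",
            "ContentBasedDeduplication": "true",  # deduplicate on a hash of the body
        },
    )["QueueUrl"]

    # Messages with the same MessageGroupId are delivered strictly in order.
    for step in ("created", "paid", "shipped"):
        sqs.send_message(
            QueueUrl=queue_url,
            MessageBody=f"order 1234 {step}",
            MessageGroupId="order-1234",
        )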

Standard Queues

AWS customers can use standard message queues in many scenarios, as long as their application can process messages that arrive more than once and out of order, for example: 

Standard queues let customers upload media while resizing or encoding it, process a high number of credit card validation requests, and schedule multiple entries to be added to a database.

  • Unlimited Throughput: Standard queues support a nearly unlimited number of transactions per second (TPS) per API action.
  • At-Least-Once Delivery: A message is delivered at least once, but occasionally more than one copy of a message is delivered.
  • Best-Effort Ordering: Occasionally, messages might be delivered in an order different from the order in which they were sent.

Amazon SQS message queuing can be used with other AWS Services such as Redshift, DynamoDB, RDS, EC2, ECS, Lambda, and S3, to make distributed applications more scalable and reliable. Below are some common design patterns:

  • Work Queues decouple components of a distributed application that may not all process the same amount of work simultaneously.
  • Buffer and Batch Operations add scalability and reliability to an architecture and smooth out temporary volume spikes without losing messages or increasing latency.
  • Request Offloading enables customers to move slow operations off of interactive request paths by queueing the requests.
  • Fanout combines SQS with Simple Notification Service (SNS) to send identical copies of a message to multiple queues in parallel.
  • Priority uses separate queues to prioritize work.
  • Since message queues decouple customers’ processes, it’s easy to scale up the send or receive rate of messages by adding another process.

Simple Workflow Service (SWF)

Amazon Simple Workflow Service (SWF) is an AWS service that coordinates work across distributed application components. Amazon SWF enables developers to build, run, and scale background jobs that have parallel or sequential steps. The coordination of tasks involves managing execution dependencies, scheduling, and concurrency in accordance with the logical flow of the application. With Amazon SWF, developers get full control over implementing processing steps and coordinating the tasks that drive them.

  • Amazon SWF enables applications for a range of use cases, such as media processing, web application back-ends, business process workflows, and analytics pipelines, to be designed as a coordination of tasks.
  • Amazon SWF provides the AWS Flow Framework to allow developers to use asynchronous programming in the development of their applications.
  • By using Amazon SWF, developers benefit from ease of programming and have the ability to improve their applications’ resource usage, latencies, and throughputs.
  • Amazon SWF stores tasks and assigns them to workers when they are ready, tracks their progress, and maintains their state, including details on their completion. 
  • Amazon SWF maintains an application’s execution state durably so that the application is resilient to failures in individual components. With Amazon SWF, AWS customers can implement, deploy, scale, and modify these application components independently.
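A minimal sketch of registering SWF types and starting a workflow execution with boto3; the domain, type names, task lists, and timeouts are placeholders (registration calls fail if the domain or type already exists).

    import boto3

    swf = boto3.client("swf")

    swf.register_domain(name="media", workflowExecutionRetentionPeriodInDays="7")

    swf.register_workflow_type(
        domain="media",
        name="EncodeVideo",
        version="1.0",
        defaultTaskList={"name": "encode-deciders"},
        defaultTaskStartToCloseTimeout="60",
        defaultExecutionStartToCloseTimeout="3600",
    )

    swf.register_activity_type(
        domain="media",
        name="EncodeChunk",
        version="1.0",
        defaultTaskList={"name": "encode-workers"},
        defaultTaskStartToCloseTimeout="600",
    )

    # Each call starts one workflow execution that deciders and workers then drive.
    swf.start_workflow_execution(
        domain="media",
        workflowId="video-1234",
        workflowType={"name": "EncodeVideo", "version": "1.0"},
        input='{"videoId": "1234"}',
    )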

SWF Features

Amazon SWF promotes a separation between the control flow of a background job’s stepwise logic and the actual units of work that contain its unique business logic.

  • It enables customers to separately manage, maintain, and scale “state machinery” of their application from the core business logic that differentiates it.

Amazon SWF replaces the complexity of custom-coded workflow solutions and process automation software with a fully managed cloud workflow web service. 

  • This eliminates the need for developers to manage the infrastructure plumbing of process automation so they can focus their energy on the unique functionality of their application.

Amazon SWF runs within Amazon’s high-availability data centers, so the state tracking and task processing engine is available whenever applications need it.

  • Amazon SWF redundantly stores the tasks, reliably dispatches them to application components, tracks their progress, and keeps their latest state.

Amazon SWF lets you write your application components and coordination logic in any programming language and run them in the cloud or on-premises.

  • It centralizes the coordination of steps in customers’ applications. The coordination logic does not have to be scattered across different components; it can be encapsulated in a single program.

Simple Workflow Concepts

Both the activity workers and the decider receive their tasks by polling the Amazon SWF service. Amazon SWF informs the decider of the state of the workflow by including, with each decision task, a copy of the current workflow execution history.

  • The workflow execution history is composed of events, where an event represents a significant change in the state of the workflow execution. The history is a complete, consistent, and authoritative record of the workflow’s progress.

Amazon SWF access control uses AWS Identity and Access Management (IAM), allowing customers to provide access to AWS resources in a controlled and limited way.

Customers can deploy one or more worker nodes into a node group. Nodes are worker machines in Kubernetes. Amazon EKS worker nodes run in customers’ AWS accounts and connect to their cluster’s control plane via the cluster API server endpoint. A node group is one or more Amazon EC2 instances that are deployed in an Amazon EC2 Auto Scaling group.

A cluster can contain several node groups, and each node group can contain several worker nodes. Managed node groups have a maximum number of nodes. All instances in a node group must:

  • Be the same instance type
  • Be running the same Amazon Machine Image (AMI)
  • Use the same Amazon EKS worker node IAM role

Amazon EKS provides a specialized Amazon Machine Image (AMI) called the Amazon EKS-optimized AMI. This AMI is built on top of Amazon Linux 2 and is configured to serve as the base image for Amazon EKS worker nodes.

  • The AMI is configured to work with Amazon EKS out of the box, and it includes Docker, kubelet, and the AWS IAM Authenticator. The AMI also contains a specialized bootstrap script that allows it to discover and connect to the customer’s cluster control plane automatically.

The fundamental concept in Amazon SWF is the workflow. A workflow is a set of activities that carry out some objective, together with logic that coordinates the activities. Each workflow runs in an AWS resource called a domain, which controls the workflow’s scope. An AWS account can have multiple domains, each of which can contain multiple workflows, but workflows in different domains can’t interact.

When designing an Amazon SWF workflow, AWS clients need to precisely define each of the required activities. Each activity is then registered with Amazon SWF as an activity type. When registering an activity, clients also provide information such as a name and version, along with timeout values based on how long they expect the activity to take.

In the process of carrying out the workflow, some activities may need to be performed more than once, perhaps with varying inputs. Amazon SWF has the concept of an activity task that represents one invocation of an activity. 

  • An activity worker is a program that receives activity tasks, performs them, and provides results back. Note that the task itself might actually be performed by a person, in which case the person would use the activity worker software for the receipt and disposition of the task. 
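A hedged sketch of an activity worker loop with boto3; the domain, task list, and the do_work helper are placeholders for application-specific logic.

    import boto3

    swf = boto3.client("swf")

    def do_work(task_input):
        # Application-specific processing of one activity task.
        return "done"

    while True:
        # Long-poll SWF for the next activity task on this worker's task list.
        task = swf.poll_for_activity_task(
            domain="media", taskList={"name": "encode-workers"}
        )
        if not task.get("taskToken"):
            continue  # the poll timed out with no work; poll again

        result = do_work(task.get("input"))
        swf.respond_activity_task_completed(taskToken=task["taskToken"], result=result)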

Both activity tasks and the activity workers that perform them can run synchronously or asynchronously. They can be distributed across multiple computers, potentially in different geographic regions, or they can all run on the same computer. Different activity workers can be written in different programming languages and run on different operating systems.

  • The coordination logic in a workflow is contained in a software program called a decider. The decider schedules activity tasks, provides input data to the activity workers, processes events that arrive while the workflow is in progress, and ultimately ends (or closes) the workflow when the objective has been completed.

The role of the Amazon SWF service is to function as a reliable central hub through which data is exchanged between the decider, the activity workers, and other relevant entities such as the person administering the workflow. Amazon SWF also maintains the state of each workflow execution, which saves customers’ applications from having to store the state in a durable way.

The decider directs the workflow by receiving decision tasks from Amazon SWF and responding back to Amazon SWF with decisions. A decision represents an action or set of actions which are the next steps in the workflow.

  • A typical decision would be to schedule an activity task. Decisions can also be used to set timers to delay the execution of an activity task, to request cancellation of activity tasks already in progress, and to complete or close the workflow.
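A minimal decider sketch with boto3: poll for a decision task, then respond with a decision that schedules one activity task. The domain, task lists, and type names are placeholders, and a real decider would inspect the execution history in the "events" field before deciding.

    import boto3

    swf = boto3.client("swf")

    decision_task = swf.poll_for_decision_task(
        domain="media", taskList={"name": "encode-deciders"}
    )

    if decision_task.get("taskToken"):
        swf.respond_decision_task_completed(
            taskToken=decision_task["taskToken"],
            decisions=[
                {
                    "decisionType": "ScheduleActivityTask",
                    "scheduleActivityTaskDecisionAttributes": {
                        "activityType": {"name": "EncodeChunk", "version": "1.0"},
                        "activityId": "encode-chunk-1",
                        "taskList": {"name": "encode-workers"},
                        "input": '{"chunk": 1}',
                    },
                }
            ],
        )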

SWF Use cases

Data Center Migration

Business critical operations are hosted in a private datacenter but need to be moved entirely to the cloud without causing disruptions. Amazon SWF-based applications can combine workers that wrap components running in the datacenter with workers that run in the cloud. 

  • To transition a datacenter worker seamlessly, new workers of the same type are first deployed in the cloud. 
  • The workers in the datacenter continue to run as usual, along with the new cloud-based workers. 
  • The cloud-based workers are tested and validated by routing a portion of the load through them. During this testing, the application is not disrupted because the workers in the datacenter continue to run. 
  • After successful testing, the workers in the datacenter are gradually stopped and those in the cloud are scaled up, so that they move entirely to a cloud workflow management application. 
    • This cloud workflow process can be repeated for all other workers in the datacenter so that the application moves entirely to the cloud.

Video Encoding

AWS customers can encode video using Amazon S3 and Amazon EC2. To do this, customers’ large videos are uploaded to Amazon S3 in chunks, and the upload of chunks has to be monitored. After a chunk is uploaded, it is encoded by downloading it to an Amazon EC2 instance. The encoded chunk is stored to another Amazon S3 location. After all of the chunks have been encoded in this manner, they are combined into a complete encoded file, which is stored back in its entirety to Amazon S3.

  • The entire application is built as a workflow where each video file is handled as one workflow execution. 
  • The tasks that are processed by different workers are: upload a chunk to Amazon S3, download a chunk from Amazon S3 to an Amazon EC2 instance and encode it, store a chunk back to Amazon S3, combine multiple chunks into a single file, and upload a complete file to Amazon S3. 
  • The decider initiates concurrent tasks to exploit the parallelism in the use case. 
  • The application state kept by Amazon SWF helps the decider control the workflow. 
  • The execution progress is continuously tracked in the Amazon SWF Management Console. 
    • If there are failures, the specific tasks that failed are identified and used to pinpoint the failed chunks.

Product Catalogs With Human Workers

While validating data in large catalogs, the products in the catalog are processed in batches. Different batches can be processed concurrently. For each batch, the product data is extracted from servers in the datacenter and transformed into CSV (Comma Separated Values) files required by Amazon Mechanical Turk’s Requester User Interface (RUI). 

  • The CSV is uploaded to populate and run the HITs (Human Intelligence Tasks). 
  • When HITs complete, the resulting CSV file is reverse transformed to get the data back into the original format. 
  • The results are then assessed and Amazon Mechanical Turk workers are paid for acceptable results. 
  • Failures are weeded out and reprocessed, while the acceptable HIT results are used to update the catalog. 
  • As batches are processed, the system needs to track the quality of the Amazon Mechanical Turk workers and adjust the payments accordingly. 
    • Failed HITs are re-batched and sent through the pipeline again

Implementing SWF

When using Amazon SWF, AWS customers implement workers to perform tasks. These workers can run either on cloud infrastructure, such as Amazon Elastic Compute Cloud (Amazon EC2), or on customers’ own premises. They can create tasks that are long-running, or that may fail, time out, or require restarts, or that may complete with varying throughput and latency.

Amazon SWF is suitable for a range of use cases that require coordination of tasks, including media processing, web application back-ends, business process workflows, and analytics pipelines.

AWS SDKs

Amazon SWF is supported by the AWS SDKs for Java, .NET, Node.js, PHP, PHP version 2, Python and Ruby, providing a convenient way to use the Amazon SWF HTTP API in the programming language of your choice.

  • AWS customers can develop deciders, activities, or workflow starters using the API exposed by these libraries. 
    • Customers can access visibility operations through these libraries, and develop their own Amazon SWF monitoring and reporting tools.

AWS Flow Framework

The AWS Flow Framework is an enhanced SDK for writing distributed, asynchronous programs that can run as workflows on Amazon SWF. It is available for the Java and Ruby programming languages, and it provides classes that simplify writing complex distributed programs.

  • Using AWS Flow Framework, customers can use preconfigured types to map the definition of their workflow directly to methods they program.
  • The AWS Flow Framework supports standard object-oriented concepts, such as exception-based error handling. 
    • Programs written with the AWS Flow Framework can be created, executed, and debugged entirely within customers’ preferred editor or IDE. 

HTTP Service API

Amazon SWF provides service operations that are accessible through HTTP requests. Customers can use these operations to communicate directly with Amazon SWF. They can also use them to develop their own libraries in any language that can communicate with Amazon SWF through HTTP.

  • By using the service API customers can develop deciders, activity workers, or workflow starters. 
  • Customers can access visibility operations through the API to develop their own monitoring and reporting tools.