• AWS re:Invent | Beyond The Shiny New Toys | Networking


    This is part of the Beyond The Shiny New Toys series where I write about AWS re:Invent 2019 announcements

    Networking in AWS has been continuously evolving to support a wide range of customer use cases. Back in the early days of EC2, all customer Instances were in a flat EC2 network called EC2-Classic (there is still documentation that talks about this). Ten years ago, AWS introduced VPC, which fundamentally changed the way we think about Networking and rapidly accelerated the adoption of public cloud.

    Since the introduction of VPC, the networking landscape within AWS has evolved pretty rapidly. You started getting capabilities such as VPC peering, NAT Gateways, and private endpoints for AWS services (such as S3 and DynamoDB) within your VPC (so that traffic to those services does not leave your VPC). AWS also launched PrivateLink, which allows applications running in your VPC to securely access other applications (yours or a third party's). And with its acquisition of Annapurna Labs, AWS started pushing network capabilities even further: 100 Gbps networking, faster EBS-optimized performance, and transparent hardware-based encryption (with almost no performance overhead for your workloads).


    New Networking Capabilities

    While the AWS Networking stack continues to improve throughout the year, re:Invent 2019 had its fair share of announcements related to Networking. Here are some of the new features that I specifically found to be interesting.

    Transit Gateway Inter Region Peering

    https://aws.amazon.com/about-aws/whats-new/2019/12/aws-transit-gateway-supports-inter-region-peering/

    AWS Transit Gateway allows you to establish a single gateway on AWS and connect all your VPCs, your data center (through Direct Connect / VPN) and office locations (VPN). You no longer have to maintain multiple peering connections between VPCs. This changes your network topology to a Hub and Spoke model, as depicted below

    So, if you have multiple VPCs that need to connect to each other (and to on-premises data centers or branch offices), Transit Gateway simplifies the networking topology. However, if you are operating in multiple AWS regions (which most large customers do) and want to build a global network, then you need connectivity between the Transit Gateways.

    This type of cross-region connectivity between Transit Gateways was not possible earlier. Now, with Transit Gateway peering, you can easily connect multiple AWS regions and build a global network.

    Transit Gateway peering across AWS regions

    Traffic through the peering connection is automatically encrypted and is routed over the AWS backbone. At the time of writing, this is available in the US East (N. Virginia), US East (Ohio), US West (Oregon), EU (Ireland) and EU (Frankfurt) regions, and I would expect AWS to roll it out globally based on customer demand.
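    Under the hood this is a two-step handshake: you request a peering attachment from one Transit Gateway and the peer side accepts it. A minimal AWS CLI sketch (the gateway IDs, attachment ID and account number below are placeholders, not values from this post):

    ```shell
    # Request a peering attachment from a TGW in us-east-1 to one in eu-west-1
    aws ec2 create-transit-gateway-peering-attachment \
        --transit-gateway-id tgw-0123456789abcdef0 \
        --peer-transit-gateway-id tgw-0fedcba9876543210 \
        --peer-account-id 111122223333 \
        --peer-region eu-west-1 \
        --region us-east-1

    # The owner of the peer Transit Gateway then accepts the attachment
    aws ec2 accept-transit-gateway-peering-attachment \
        --transit-gateway-attachment-id tgw-attach-0123456789abcdef0 \
        --region eu-west-1
    ```

    After the attachment is available, you still add static routes pointing at it in each Transit Gateway's route table.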

    Building a Global Network

    Building a Global Network that spans multiple AWS regions and connects your data centers and branch offices has become simpler with this release.

    There were also quite a few other features announced around re:Invent 2019 that up the game when it comes to building a scalable and reliable Global Network. A few notable mentions:

    • Accelerated Site-to-Site VPN: This uses AWS Global Accelerator to route traffic through AWS edge locations and the AWS global backbone, improving VPN performance from branch offices
    • Multicast workloads: Transit Gateway now supports multicast traffic between the attached VPCs. This is a feature requested by many large enterprises for running clusters and media workloads
    • Visualizing your Global Network: If you are building out such a complex Global Network, you need a simple way to visualize your network and take action. The new Transit Gateway Network Manager allows you to centrally monitor your Global Network

    If you would like to understand more about various architecture patterns, check out this deep dive talk by Nick Mathews at re:Invent 2019.

    Application Load Balancer – Weighted Load Balancing

    When you are rolling out a new version of your app, you need some kind of strategy to transition users from the older version to the new version. Depending upon your needs, you employ techniques such as Rolling Deployments, Blue/Green Deployments or Canary Deployments.

    All of these techniques require some component in your infrastructure to selectively route traffic between the different versions. There are a few ways in which you can do this in AWS today:

    • Using a Route 53 DNS routing policy. For example, you can use the Weighted Routing policy to control how much traffic is routed to each resource (say, two different ALBs/ELBs running different versions of your app)
    • Attaching different Auto Scaling Groups to the same ALB and adjusting the ASGs to shape your traffic

    With this new “Weighted Load Balancing” feature in ALB, you can achieve the same with one ALB.

    • You create multiple Target Groups, each with EC2 Instances running a different version of your app
    • Under the “Listeners” of your ALB, you add the Target Groups and assign a different weight to each one

    Once you do the above, the ALB will split the incoming requests based on the weights and route them to the appropriate Target Groups.
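    On the CLI, the same weighted split is expressed as a forward action with per-Target-Group weights on the listener. A sketch with placeholder ARNs, sending roughly 90% of traffic to v1 and 10% to v2:

    ```shell
    # Listener and target group ARNs are placeholders
    aws elbv2 modify-listener \
        --listener-arn arn:aws:elasticloadbalancing:eu-west-1:111122223333:listener/app/my-alb/abc/def \
        --default-actions '[{
            "Type": "forward",
            "ForwardConfig": {
                "TargetGroups": [
                    {"TargetGroupArn": "arn:aws:elasticloadbalancing:eu-west-1:111122223333:targetgroup/app-v1/aaa", "Weight": 90},
                    {"TargetGroupArn": "arn:aws:elasticloadbalancing:eu-west-1:111122223333:targetgroup/app-v2/bbb", "Weight": 10}
                ]
            }
        }]'
    ```

    Shifting more traffic to v2 is then just re-running the command with adjusted weights.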

    Migration Use Cases

    One other great use case for this feature is migration. Target Groups are not limited to just EC2 Instances; they can contain EC2 Instances, Lambda functions, containers running on ECS/EKS or even plain IP addresses (say, a VM in your data center). So, you can use this capability to migrate from one environment to another by gradually shifting traffic.

    If you would like to learn more about this new feature, check out this blog.

    Application Load Balancer – LOR Algorithm

    https://aws.amazon.com/about-aws/whats-new/2019/11/application-load-balancer-now-supports-least-outstanding-requests-algorithm-for-load-balancing-requests/

    The default algorithm that an Application Load Balancer (ALB) uses to distribute traffic is Round Robin. This fundamentally assumes that the processing time for all types of requests to your app is the same. While this may be true for many consumer-facing apps, a lot of business apps have varying processing times for different functionalities. This can leave some of the instances over-utilized and others under-utilized.

    ALB now supports the Least Outstanding Requests (LOR) algorithm. If you configure your ALB with the LOR algorithm, the ALB monitors the number of outstanding requests for each target and sends each new request to the target with the fewest outstanding requests. If some targets (such as EC2 Instances) are currently processing long-running requests, they will not be burdened with more requests.

    This is available to all existing and new ALBs across all regions. A great feature! There are some caveats too (such as how this works with Sticky Sessions). Check out the documentation for more details.
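    The algorithm is a per-Target-Group attribute, so you can flip an existing Target Group over to LOR without recreating anything. A sketch with a placeholder ARN:

    ```shell
    # Switch an existing target group from round_robin to
    # least_outstanding_requests (target group ARN is a placeholder)
    aws elbv2 modify-target-group-attributes \
        --target-group-arn arn:aws:elasticloadbalancing:eu-west-1:111122223333:targetgroup/my-tg/abc \
        --attributes Key=load_balancing.algorithm.type,Value=least_outstanding_requests
    ```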

    Better EBS Performance

    https://aws.amazon.com/about-aws/whats-new/2019/12/amazon-ec2-nitro-system-based-instances-now-support-36-faster-amazon-ebs-optimized-instance-performance/

    These are the things that I love about being on public cloud. Nitro-based instances now get up to 36% faster EBS-optimized instance performance, so any EBS-intensive workload automatically gets better throughput. At no additional cost. Without me doing anything. Just like that!


    Well those are the announcements that I found interesting in the Networking domain. Did I miss anything? Let me know in the comments.

  • AWS re:Invent | Beyond The Shiny New Toys | Security


    This is part of the Beyond The Shiny New Toys series where I write about AWS re:Invent 2019 announcements

    Security in Architectures is an area I am passionate about. One of the most important things that I learnt during my tenure at Amazon is to think about Security at the beginning of everything. You cannot build a great feature and then think about Security. Rather, you think about Security even before you start building out features.

    Jeff Bezos always says “Customers are not going to wake up tomorrow morning and say they are willing to pay more”. Similarly, I would argue that “Customers are not going to tell you that they are OK to use a product/feature that is not secure by design”

    Most folks call out on-demand provisioning, elasticity and cost efficiency as the benefits of Cloud. While those are definitely true, one of the most important aspects of Cloud is that it drastically changes the Security posture for everyone. A customer who pays $100 a month gets the same security baseline and features as an Enterprise running a highly regulated workload. And the baseline continuously improves.

    Let's look at some of the important announcements made in the Security domain during re:Invent 2019.

    Unused IAM Roles

    https://aws.amazon.com/about-aws/whats-new/2019/11/identify-unused-iam-roles-easily-and-remove-them-confidently-by-using-the-last-used-timestamp/

    IAM Roles have become such a fundamental building block for applications running on AWS. An IAM Role provides temporary access credentials to anyone assuming the Role. There are plenty of use cases for IAM Roles such as: applications running on EC2 Instances assuming a Role to get temporary creds, services such as Lambda/Kinesis talking to other services, users assuming roles to log in to AWS accounts (through SSO) and so on.

    Pretty quickly, your AWS account can get filled with plenty of such IAM Roles and you may start losing track of which IAM Roles are actually in use.

    To know whether a particular IAM Role is being used, you can now check, through Access Advisor, the latest timestamp when the Role's credentials were used to access an AWS API. Under IAM, go to Roles and choose any Role to open its Summary page. You will find a tab called “Access Advisor” where you can view when the Role was last used.

    Access Advisor Tab of an IAM Role showing service permissions granted by the role and when those services were last accessed

    This is a pretty useful feature for looking at all your IAM Roles and understanding which ones are actively used, and then drilling down into “Access Advisor” to find out which policies of the Role are being used. In the above example, it appears that the Application Auto Scaling and App Mesh permissions are not used by this Role, and you can start investigating whether you really need these policies as part of the role.
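    The same last-used information is available programmatically, which is handy if you want to sweep an account with many Roles. A sketch using a hypothetical role name and account number:

    ```shell
    # RoleLastUsed shows the last time this role's credentials were
    # used to make an AWS request (role name is a placeholder)
    aws iam get-role --role-name my-app-role \
        --query 'Role.RoleLastUsed'

    # The data behind the Access Advisor tab, via the API: kick off a
    # report job, then fetch the per-service last-accessed details
    JOB_ID=$(aws iam generate-service-last-accessed-details \
        --arn arn:aws:iam::111122223333:role/my-app-role \
        --query 'JobId' --output text)
    aws iam get-service-last-accessed-details --job-id "$JOB_ID"
    ```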

    Attribute Based Access Control

    https://aws.amazon.com/blogs/security/simplify-granting-access-to-your-aws-resources-by-using-tags-on-aws-iam-users-and-roles/

    We have traditionally been used to the concept of Role Based Access Control (RBAC). RBAC lets us provide access to users and machines based on their “Role”. On AWS, this is typically implemented through a combination of IAM Users, Roles and Groups.

    For example, you could create IAM Groups such as “devs”, “admins” and “superadmins” with different sets of IAM policies attached to them. You would then add users to those Groups and they automatically get the relevant access.

    While this model works in general, if you have lots of different projects and teams, it can get pretty complex to manage. For example, users move from one project to another and may need a new set of permissions. You then end up creating newer sets of policies and groups, and this can go on forever.

    IAM recently introduced the ability to add “Tags” on IAM Principals – Users and Roles. With this, administrators can write simple policies based on “Tags” associated with the “Users” and the “Resources”.

    As an example, let’s say you have a tagging structure where EC2 resources are tagged with the key “team”. You could have a generic policy that allows “Users” to Start/Stop EC2 Instances only if the Instance has a Tag called “team” whose value matches the User’s Tag called “team”.

    This can be achieved through a policy like:

    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "ec2:DescribeInstances"
                ],
                "Resource": "*"
            },
            {
                "Effect": "Allow",
                "Action": [
                    "ec2:StartInstances",
                    "ec2:StopInstances"
                ],
                "Resource": "*",
                "Condition": {
                    "StringEquals": {
                        "ec2:ResourceTag/team": "${aws:PrincipalTag/team}"
                    }
                }
            }
        ]
    }

    This dramatically simplifies how you manage policies at scale: you no longer have to create different sets of policies for different use cases.
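    For the policy above to take effect, both sides of the comparison need the tag. A sketch with illustrative names, tagging a Role (the principal) and an Instance (the resource) with the same “team” value:

    ```shell
    # Role name, instance ID and tag value are illustrative
    aws iam tag-role --role-name alice-dev-role \
        --tags Key=team,Value=data-platform

    # Tag the EC2 instance so ec2:ResourceTag/team matches
    # aws:PrincipalTag/team for anyone assuming the role above
    aws ec2 create-tags --resources i-0123456789abcdef0 \
        --tags Key=team,Value=data-platform
    ```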

    If you use a SAML based Identity Provider (such as Active Directory) to federate into AWS console and APIs, you could use attributes in your IdP to achieve the same. You could learn more about this here: https://aws.amazon.com/blogs/security/rely-employee-attributes-from-corporate-directory-create-fine-grained-permissions-aws/

    If you want to learn more about this, watch this awesome session by Brigid Johnson:

    IAM Access Analyzer

    https://aws.amazon.com/blogs/aws/identify-unintended-resource-access-with-aws-identity-and-access-management-iam-access-analyzer/

    This is another example of how being on a large cloud platform like AWS automatically improves security for everyone. We saw earlier how you can identify usage of your IAM Roles. While IAM Roles give temporary creds to access services, on the services side you can apply policies (called Resource-Based Policies) to control who has access to services (such as S3 and SQS) and what actions they can perform. With this, even if someone gets a credential through an IAM Role to access a service, if the service's policies do not allow access then those requests will fail.

    So, Resource Based Policies are super important and you can find the list of supported services here.

    While this is a great way to protect your services, how do you know that these policies are set the way you intended them to be? Consider these questions:

    • I created an S3 bucket and allowed Read access from another AWS account. Does that AWS account still have only Read access?
    • I created an IAM Role for a federated user to assume when they log in using their SAML-based IdP. This was a few months ago. Are we sure nobody else has permission to assume the Role now?

    These are the kind of questions that you care about deeply. Validating this manually can be pretty daunting. That’s where IAM Access Analyzer can help.

    Once enabled, IAM Access Analyzer continuously monitors resources (it currently supports S3, SQS, IAM Roles, KMS and Lambda) and reports a finding wherever a resource is accessible by a Principal outside the AWS account. The findings are provided within the IAM Console itself, from where you can start taking action. If you find the access to be as intended, you can simply “Archive” the finding. If you see a resource that looks suspicious, you can jump to the respective console and take action (such as turning off public access to your S3 bucket).

    source: https://aws.amazon.com/blogs/aws/identify-unintended-resource-access-with-aws-identity-and-access-management-iam-access-analyzer/

    Here’s a detailed blog post on how you can use Access Analyzer to understand which of your S3 buckets have “Public Access” or access from other AWS accounts: https://aws.amazon.com/blogs/storage/protect-amazon-s3-buckets-using-access-analyzer-for-s3/

    The best part of this feature: it doesn’t cost anything. And it can be turned ON in a click.

    Instance Metadata Service V2

    https://aws.amazon.com/blogs/security/defense-in-depth-open-firewalls-reverse-proxies-ssrf-vulnerabilities-ec2-instance-metadata-service/

    The Instance Metadata Service (IMDS) is the internal AWS service that provides instance metadata (including temporary AWS credentials for the IAM Role attached to the Instance) to EC2 Instances. From within an EC2 Instance, you can hit http://169.254.169.254 to access the IMDS.

    This service has been available for more than 10 years now quietly working behind the scenes doing the heavy lifting of providing temporary, frequently rotated AWS credentials.

    Earlier this year, Capital One had a breach where there was unauthorized access by an outsider who obtained Capital One’s customer data. This breach happened through a technique called SSRF (Server Side Request Forgery). You can read more details about this incident here.

    This incident involved the EC2 IMDS. Once someone has access to an EC2 Instance with an attached IAM Role, they have access to the IMDS, and depending on the policies of that IAM Role, they can start interacting with AWS services. In Capital One's case, the IAM Role seems to have had excessive permissions to S3, thereby allowing the attacker to download data from S3.

    This isn’t pointing the blame at AWS. As AWS calls out, Security is a Shared Responsibility model where the customer equally owns the Security of the infrastructure that they are running.

    However, AWS seems to have thought about how to make this better and protect their customers. The result is V2 of IMDS. IMDS V2 adds additional protections to the service against attacks such as SSRF. Specifically, V2 provides:

    • Protection against Open Reverse Proxies
    • Protection against SSRF Vulnerabilities
    • Protection against Open Layer 3 Firewalls and NATs

    Using IMDS V2

    • You can start using IMDS V2 by upgrading the AWS SDKs (that your apps running on EC2 use) to their latest versions. The latest AWS SDKs automatically use IMDS V2
    • In addition, there is a new CloudWatch metric for EC2 Instances called “MetadataNoToken”, which tells you the number of times the IMDS was accessed through V1. You can monitor this metric to understand how many Instances are still using V1
    • Once this metric reaches 0 (meaning everything is using V2), you can use IAM Policies to block Instances from being launched with V1 (essentially enforcing usage of V2 only)
    • CloudTrail also now records whether the credentials were provided using V1 or V2
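    To make the difference concrete, here is the V2 token handshake as run from within an instance, plus the call to enforce V2-only on an existing instance (the instance ID is a placeholder):

    ```shell
    # IMDSv2 is session-oriented: first obtain a token with an HTTP PUT,
    # then present it on every metadata request
    TOKEN=$(curl -s -X PUT "http://169.254.169.254/latest/api/token" \
        -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")
    curl -s -H "X-aws-ec2-metadata-token: $TOKEN" \
        http://169.254.169.254/latest/meta-data/iam/security-credentials/

    # Enforce V2-only on an existing instance: V1 requests (no token)
    # will then be rejected
    aws ec2 modify-instance-metadata-options \
        --instance-id i-0123456789abcdef0 \
        --http-tokens required --http-endpoint enabled
    ```

    A plain `curl http://169.254.169.254/latest/meta-data/` with no token is a V1 request; it is exactly that pattern that SSRF attacks rely on, and the PUT-based token step is what breaks it.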

    If you want to learn more about IMDS V2, read this blog post and this documentation.

    AWS recommends that you start adopting V2. V1 will continue to be supported indefinitely and continues to be secure. However V2 provides additional protection. No harm in being more secure 🙂


    Well those are the announcements that I found interesting in the security domain. Did I miss anything? Let me know in the comments.

  • AWS re:Invent | Beyond The Shiny New Toys | Redshift


    This is part of the Beyond The Shiny New Toys series where I write about AWS re:Invent 2019 announcements

    Redshift has been going through a series of major changes that tremendously simplify schema design and the overall management of workloads. Here are some of the new features announced around the re:Invent 2019 timeframe that I think a lot of customers (based on my earlier interactions with them) would look to put to use.

    Materialized Views

    https://aws.amazon.com/about-aws/whats-new/2019/11/amazon-redshift-introduces-support-for-materialized-views-preview/

    This has been one of the most requested features from many customers who migrate from other DW systems to Redshift. Materialized Views (MVs) significantly improve query performance for repeated workloads such as dashboarding, queries from BI tools, or certain predictable steps in ETL pipelines.

    Until now, Redshift lacked support for MVs and the recommendation has been to either modify your workloads or implement architectural changes such as performing a query rewrite using pgbouncer-rr.

    You can now use the native MV capability (available in preview) to address such needs. There are some current limitations, though. For example, you need to manually refresh the MV whenever your base tables undergo changes. Over time, I am sure the AWS folks will address these limitations based on customer feedback. You can find the complete set of current limitations here: https://docs.aws.amazon.com/redshift/latest/dg/mv-usage-notes.html
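    As a quick sketch of the syntax (the table and view names are made up and the connection string is a placeholder; any psql-compatible client works against Redshift):

    ```shell
    psql "host=mycluster.example.redshift.amazonaws.com port=5439 dbname=dev user=awsuser" <<'SQL'
    -- Precompute an aggregation that dashboards hit repeatedly
    CREATE MATERIALIZED VIEW mv_daily_sales AS
    SELECT sold_date, SUM(amount) AS total_amount
    FROM sales
    GROUP BY sold_date;

    -- In the preview there is no auto-refresh: after base-table changes,
    -- refresh the view manually (or from a scheduled ETL step)
    REFRESH MATERIALIZED VIEW mv_daily_sales;
    SQL
    ```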

    Automated Table Sort

    https://aws.amazon.com/about-aws/whats-new/2019/11/amazon-redshift-introduces-automatic-table-sort-alternative-vacuum-sort/

    One of the most important best practices when it comes to Redshift is keeping the data sorted. This directly improves query performance, as Redshift can read only the relevant blocks of data (when your query has a filter) and also apply compression better. If your data is not sorted well enough, Redshift may read unneeded blocks and then discard them in memory. So, you choose a SORT key for your table initially, and on incremental data loads you earlier had to run the “VACUUM SORT” command to make sure the data blocks stay sorted.

    With this new feature, Redshift automatically performs the sorting activity in the background without any interruption to query processing. However, if you do have large data loads, you may still want to run “VACUUM SORT” manually (as Automatic Sorting may take a while to fully Sort in the background).

    You can also monitor the “vacuum_sort_benefit” and “unsorted” columns in the SVV_TABLE_INFO table. Together, these columns tell you the following:

    1. What percentage of a particular table is unsorted
    2. What percentage benefit you would derive by running “VACUUM SORT” against the table

    Check the following documentation for more details: https://docs.aws.amazon.com/redshift/latest/dg/t_Reclaiming_storage_space202.html#automatic-table-sort
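    A query along these lines surfaces the tables most likely to benefit from a manual “VACUUM SORT” (the connection string is a placeholder):

    ```shell
    # "table", unsorted and vacuum_sort_benefit are columns of the
    # SVV_TABLE_INFO system view
    psql "host=mycluster.example.redshift.amazonaws.com port=5439 dbname=dev user=awsuser" -c "
    SELECT \"table\", unsorted, vacuum_sort_benefit
    FROM svv_table_info
    ORDER BY unsorted DESC NULLS LAST;"
    ```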

    ALTER SORT KEY Dynamically

    https://aws.amazon.com/about-aws/whats-new/2019/11/amazon-redshift-supports-changing-table-sort-keys-dynamically/

    When you start using Redshift, you pick Distribution and Sort Keys for your tables. However, over time, as your workload evolves, there may be a need to modify the Sort Keys that you originally picked. Previously, this meant recreating your table with the new set of Sort Keys and loading all the data into that newly created table. This was required because Redshift physically sorts the data in the underlying storage; changing your Sort Keys meant re-sorting your data.

    With this new feature, you can now dynamically change the Sort Keys of an existing table. Behind the scenes, Redshift will re-sort the data while your table continues to be available for querying. This provides more flexibility when it comes to schema design.
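    The change is a single DDL statement. A sketch with made-up table and column names (the connection string is a placeholder):

    ```shell
    # The table stays queryable while Redshift re-sorts in the background
    psql "host=mycluster.example.redshift.amazonaws.com port=5439 dbname=dev user=awsuser" -c \
        "ALTER TABLE sales ALTER COMPOUND SORTKEY (region, sold_date);"
    ```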

    Cross Instance Restore

    https://aws.amazon.com/about-aws/whats-new/2019/11/amazon-redshift-launches-cross-instance-restore/

    This is another important feature, and one that has long been requested by customers. You may want to restore a snapshot of a production DC2.8XL cluster into a smaller DC2.Large cluster for test/dev purposes. Or you may have a DC2.Large cluster with a large number of nodes, and wish to restore its snapshot into a cluster with a smaller number of DC2.8XL nodes. This wasn’t possible until this capability was introduced.

    One of the important aspects to consider when doing this exercise is to understand what your “target” cluster’s storage utilization on each node would look like. The following AWS CLI command will give you some options to consider:

    aws redshift describe-node-configuration-options --snapshot-identifier <mycluster-snapshot> --region eu-west-1 --action-type restore-cluster

    Automatic Workload Management

    https://aws.amazon.com/about-aws/whats-new/2019/09/amazon-redshift-announces-automatic-workload-management-and-query-priorities/

    This isn’t a re:Invent timeframe announcement as such. This was announced in September. But I am including it here because this is a big one and simplifies day to day operations of a Redshift cluster for an administrator.

    Even some of the largest Redshift customers find it cumbersome to perform Workload Management (WLM) on Redshift. WLM by itself is a pretty deep topic and is something you cannot avoid once your workloads start scaling on Redshift.

    WLM provides many controls for a Redshift administrator to manage different workloads and give a better experience to all types of users of the system. Over the years, WLM has evolved from a static configuration to a dynamic configuration (of queues and memory) with Queue Priorities, Query Monitoring Rules, Queue Hopping, Short Query Acceleration and Concurrency Scaling.

    However, all of these require someone to continuously observe the workloads on the cluster and keep tweaking the configurations. With Automatic WLM, Redshift removes much of this overhead from the administrator.

    With Automatic WLM, you still define Queues, Queue Priorities, User/Query Groups and configure Concurrency Scaling (for required Queues). Automatic WLM will then dynamically manage memory allocation and concurrency amongst these queues based on the workload. Automatic WLM also works with Short Query Acceleration allowing short running queries to complete.

    If you are managing WLM manually today, it might be worthwhile taking a look at this feature. You can read more about how Automatic WLM works here: https://docs.aws.amazon.com/redshift/latest/dg/automatic-wlm.html

    A few more noteworthy ones

    These are a few more features that were added over the course of 2019 – just in case you missed them

    I believe that with all these new capabilities, Redshift has now automated a whole lot of operations, making administrators’ lives simpler. To put it the typical Amazon way, Redshift now takes care of most of the “undifferentiated heavy lifting” 🙂


    Well those are the Redshift announcements that I found interesting. Did I miss anything? Let me know in the comments.

  • Beyond The Shiny New Toys | Redshift

    Beyond The Shiny New Toys | Redshift

    This is part of the Beyond The Shiny New Toys series where I write about AWS reInvent 2019 announcements

    Amazon Redshift has been going through a series of major changes which tremendously simplifies schema design and overall management of workloads. Here are some of the new features that were announced around the re:Invent 2019 timeframe that I specifically think a lot of customers (based on my earlier interaction with them) would look to put in use

    Materialized Views

    https://aws.amazon.com/about-aws/whats-new/2019/11/amazon-redshift-introduces-support-for-materialized-views-preview/

    This has been one of the most wanted asks from many customers who migrate from other DW systems into Redshift. Materialized Views (MV) have a significant improvement on query performance for repeated workloads such as Dashboarding, queries from BI tools or certain predictable steps in ETL pipelines.

    Till now, Redshift lacked support for MV and the recommendation has been to either modify your workloads or implement architectural changes such as performing a query rewrite using pg_bouncer

    You can now use the native MV (available in preview) capability to address such needs. There are some current limitations though. For example, you need to manually refresh the MV whenever your base tables undergo changes. Over time, I am sure AWS folks would address these limitations based on customer feedback. You can find the complete set current limitations here: https://docs.aws.amazon.com/redshift/latest/dg/mv-usage-notes.html

    Automated Table Sort

    https://aws.amazon.com/about-aws/whats-new/2019/11/amazon-redshift-introduces-automatic-table-sort-alternative-vacuum-sort/

    One of the most important best practices when it comes to Redshift is to keep the data Sorted. This would directly improve query performance as Redshift can read specific blocks of data (when your query has a filter) and also apply compression better. If your data is NOT sorted well enough, Redshift may read unwanted blocks and then later skip them in the memory. So, on incremental data loads, you had to earlier run “VACUUM SORT” command to make sure the data blocks are sorted.

    With this new feature, Redshift automatically performs the Sorting activity in the background without any interruption to query processing. However, if you do have large data loads, you may still want to run “VACUUM SORT” manually (as Automatic Sorting may take a while to fully Sort in the background).

    You can also monitor the “vacuum_sort_benefit” and “unsorted” columns in the SVV_TABLE_INFO table. Together, these columns tell you the following:

    1. What percentage of a particular table is “unsorted”
    2. How much percentage benefit would you derive by running “VACUUM SORT” against the table

    Check the following documentation for more details: https://docs.aws.amazon.com/redshift/latest/dg/t_Reclaiming_storage_space202.html#automatic-table-sort

    ALTER SORT KEY Dynamically

    https://aws.amazon.com/about-aws/whats-new/2019/11/amazon-redshift-supports-changing-table-sort-keys-dynamically/

    When you start using Redshift, you pick Distribution and Sort Keys for your tables. However, over time, as your workload evolves there may be a need to modify the Sort Keys that you originally picked. Previously, this meant, recreating your table with the new set of Sort Keys and loading all the data into that newly created table. This was required because, Redshift physically sorts the data in the underlying disks. Changing your Sort Keys meant re-sorting your data.

    With this new feature, you can now dynamically change the Sort Keys of your existing table. Redshift, behind the scenes will re-sort the data while your table continues to be available for querying. This provides more flexibility when it comes to schema design.

    Cross Instance Restore

    https://aws.amazon.com/about-aws/whats-new/2019/11/amazon-redshift-launches-cross-instance-restore/

    This is another important feature and one that has been long requested by customers. You may want to restore a snapshot of production DC2.8XL cluster into a smaller DC2.Large cluster for your test/dev purposes. Or you may have a DC2.Large cluster with many number of nodes. You have a snapshot of that cluster and wish to launch a cluster with smaller number of DC2.8XL cluster. This wasn’t possible until this capability was introduced.

    One of the important aspects that you want to consider when doing this exercise is to undersatnd how would your “target” cluster’s storage utilization on each node would look like. The following command in the AWS CLI would throw you some options to consider:

    aws redshift describe-node-configuration-options --snapshot-identifier <mycluster-snapshot> --region eu-west-1 -—action-type restore-cluster

    Automatic Workload Management

    https://aws.amazon.com/about-aws/whats-new/2019/09/amazon-redshift-announces-automatic-workload-management-and-query-priorities/

    This isn’t a re:Invent timeframe announcement as such. This was announced in September. But I am including it here because this is a big one and simplifies day to day operations of a Redshift cluster for an administrator.

    Even some of the large Redshift customers find it cumbersome to perform Workload Management (WLM) on Redshift. WLM on itself is a pretty deep topic and is something that you cannot avoid once your workloads start scaling on Redshift.

    WLM provides many controls for a Redshift administrator to manage different workloads and deliver a better experience for all types of users of the system. Over the years, WLM has evolved from a static configuration to a dynamic configuration (of queues and memory) with Queue Priorities, Query Monitoring Rules, Queue Hopping, Short Query Acceleration and Concurrency Scaling.

    However, all of these require someone to continuously observe the workloads on the cluster and keep tweaking the configuration. With Automatic WLM, Redshift removes much of this overhead from the administrator.

    With Automatic WLM, you still define Queues, Queue Priorities, and User/Query Groups, and configure Concurrency Scaling (for the queues that require it). Automatic WLM then dynamically manages memory allocation and concurrency amongst these queues based on the workload. Automatic WLM also works with Short Query Acceleration, allowing short-running queries to complete quickly.
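    As a minimal sketch, Automatic WLM is switched on through the cluster parameter group’s wlm_json_configuration parameter. The parameter group name below is a placeholder, and this shows only the simplest case of a single automatically managed queue:

    ```shell
    # Enable Automatic WLM with a single auto-managed default queue.
    # "my-redshift-params" is a placeholder for your own parameter group.
    aws redshift modify-cluster-parameter-group \
      --parameter-group-name my-redshift-params \
      --parameters '[{
        "ParameterName": "wlm_json_configuration",
        "ParameterValue": "[{\"auto_wlm\": true}]"
      }]'
    ```

    Queues with their own priorities and user/query groups can be added to the same JSON array; the linked documentation covers the full configuration shape.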

    If you are managing WLM manually today, it might be worthwhile taking a look at this feature. You can read more about how Automatic WLM works here: https://docs.aws.amazon.com/redshift/latest/dg/automatic-wlm.html

    A few more noteworthy ones

    These are a few more features that got added over the course of 2019 – just ICYMI

    I believe that with all these new capabilities, Redshift has now automated a whole lot of operations, making administrators’ lives simpler. To put it the typical Amazon way, Redshift now takes care of most of the “undifferentiated heavy lifting” 🙂

    Did I miss any new major announcement? What do you think about these features? Do let me know your thoughts in the comments section below.

  • AWS re:Invent | Beyond The Shiny New Toys

    AWS re:Invent | Beyond The Shiny New Toys

    This is one of my favorite times of the year when it comes to AWS. Yes, the time around re:Invent, when AWS launches a whole bunch of new Services and Features. Over the years, it has become a bit overwhelming for me with the sheer number of announcements. Last year, during the week of re:Invent 2018, AWS announced about 100 major new Services and Features. This year AWS has gone one step further with pre:Invent: more than 100 new Features/Services were already announced in the two-week run-up to re:Invent 2019.

    At every re:Invent, AWS continues to push the platform forward with some amazingly innovative services. Nobody imagined that AWS would drive a truck to your datacenter. And who thought you could get a Ground Station at the click of a button? Maybe this year (I am writing this post a week before re:Invent) they will launch a service for “Inter Planetary Travel”. Or a feature in SageMaker that builds an AWS service automatically from your thoughts.

    Jokes apart, AWS will continue to innovate at a rapid clip on behalf of customers, and it is super exciting to see all these services coming our way. At the same time, these are shiny new toys that address some very specific use cases. What the majority of customers end up adopting are the Features & Services released in the areas of “fundamental building blocks”. These are the areas of Compute, Storage, Networking, Security, Analytics, Management Tools and Cost Optimization that are core to most workloads that businesses run on the Cloud.

    This series of posts focuses on new Services and Features in these building blocks. I will try to collate as much as possible from the rampage of announcements under specific topics. As I add more posts, I will also collate and list them here so that this post becomes a “master” post by itself.

  • Hello World! Again!

    Hello World! Again!

    This is my second attempt at blogging about technology. I was blogging earlier at https://www.raghuramanb.com/. In late 2013, I co-founded a startup, and then AWS happened. AWS can keep you really busy!! I was spending an enormous amount of time with customers (which I absolutely loved!). The pace at which AWS (and Amazon in general) operates means a minimum of 6-8 meetings every week, Workshops, POCs, Public Speaking, and keeping pace with the ever-growing AWS Platform. I just couldn’t make enough time to maintain the blog. But the learning was tremendous. With so much customer interaction, one learns a whole lot of architecture patterns, different ways of solving problems, and multiple ways of stitching together AWS services.

    Now that I have moved out of AWS, here’s another attempt to resume blogging. Here are some of the areas that I plan to write about:

    • Cloud Architecture: I am very passionate about architectures. By architectures I mean thinking about different aspects of a solution, such as Security, Availability, Decoupling, Cost, and Evolution – can it start small and continue to evolve?
    • AWS: The blog will still be heavily on AWS, at least to start with, because that’s where my core expertise is (it is what I worked on for the past 10 years)
    • Privacy: This is an area that I have become very passionate about recently, more on the personal-life side. I am actively moving away from big tech companies (those who monetize data) and choosing services whose business model doesn’t depend on my data. I plan to write about some of my experiences in this area

    Those are the areas that I want to start writing about. Hopefully I am able to spend enough time keeping this blog alive this time!!