How to use AWS DMS to migrate Data & capture CDC from source RDS to S3

What is AWS Data Migration Service?

Migration Flow Overview

AWS DMS Migration Architecture using Source RDS and Destination S3 Bucket

Steps to perform migration using DMS?

  • Create a migration subnet group within the VPC
  • Create a DMS replication instance in the Private Subnet
  • Edit the Security group of the RDS Source instance
  • Create a source endpoint and Test the connection
  • Create a target endpoint and Test the connection
  • Create a task to perform the one time migration of the data
  • Create another task to perform CDC migration from the source
  • Open the DMS console in your AWS account and click on the Subnet groups on the left hand side panel and then click on Create subnet group button
  • In the Create Replication Subnet group form give a valid name and a description
  • Select the VPC in which you want to create the migration subnet group
  • Then add subnets from the above selected VPC
  • If required you can also add tags while creating the migration subnet group
  • Then click on Create Subnet
Create Migration Subnet Group
  • Once the DMS subnet group is created then click on the replication instance on the left panel and click on create replication instance
  • Please give a valid name to the replication instance and add a description
  • Then select an instance type for the replication instance as per the volume of data which is to be migrated. The speed of migration will depend on the size of the migration instance
  • Then select the VPC in which you want to launch the DMS replication instance
  • Make sure that un check the Publicly Accessible option for DMS instance as the Source is within the VPC itself
  • Then open Advanced security and network configuration and select the replication subnet group which we created in the previous step
  • Select the Security group for your replication instance and then select a KMS key for encryption (Optional) of the data
  • You can also define a maintenance window for the Replication instance and can also tags to the instance (Optional)
Create DMS Replication Instance - Image 1
Create DMS Replication Instance - Image 2
  • In the Create endpoint form select the option of Source Endpoint as we are creating endpoint for the source
  • Check the box of Select RDS DB instance as our source is an RDS instance (Note : If the source is not an RDS instance but a MySQL instance on on-premise or on another cloud GCP or Azure, then you can provide the endpoint information of the DB server manually, without checking this option)
  • Provide a Endpoint name and an ARN name for the source endpoint
  • Select the Database Engine for your source database
  • Then chose the Provide access information manually radio button to enter the details of your source database
  • As we have selected the RDS instance above, so majority of the fields will be auto populated (Note : If the source is not a RDS instance then you will have to manually enter all the details of the source DB)
  • You can select the KMS key for the encryption for you source endpoint (Optional)
  • You can also add tags if necessary (Optional)
  • Then open the Test endpoint connection and select VPC in which you want you have created your DMS replication instance and also select the DMS replication instance created in the earlier step
  • Then click on Run Test to check whether the source endpoint is able to connect the DMS Migration instance with Source RDS instance
Important Note : If your Run Test Failed, then this means that the Source Endpoint is not able to connect the DMS replication instance to the RDS Database. For that check the Step 3 as Security Group configuration is an important step to make sure that Firewalls are configured properly for the connection
Create DMS Source Endpoint - Image 1
Create DMS Source Endpoint - Image 2
  • Firstly, select the Target Endpoint option over here as we are now creating an endpoint for the target or the destination which is S3 in our case
  • Provide a valid name and an ARN description to the target endpoint
  • Select the target engine as S3 (as per our example)
  • Then add the IAM role ARN of the IAM role which has access to write on the S3 bucket (Note : This IAM role should be properly created as this will ensure that whether the target endpoint is able to connect to the destination or not)
IAM Role ARN which provides access to write to the destination S3 bucket
  • Provide the target S3 bucket name where the data is to be migrated by the DMS replication instance
  • Provide the exact folders and sub folders in which the data is to written on the target S3 bucket
  • You can also add tags if necessary (Optional)
  • Then open the Test endpoint connection and select VPC in which you want you have created your DMS replication instance and also select the DMS replication instance created in the earlier step
  • Then click on Run Test to check whether the target endpoint is able to connect the DMS Migration instance with Target S3 bucket
Important Note : If your Run Test Failed, then this means that the Target Endpoint is not able to connect the DMS replication instance to the target S3 Bucket. For that check the IAM policies attached to the IAM role whose ARN have been used in the above step. Make sure the the IAM role includes the policy which gives write access to the S3 bucket which we are using as target bucket
Create DMS Target Endpoint - Image 1
Create DMS Target Endpoint - Image 2
Create DMS Target Endpoint - Image 3
  • We have created a DMS subnet group along with a DMS replication instance which is going to perform the actual data migration
  • We have created a Source and Target endpoints with RDS MySQL database as source and an S3 bucket as a target.
  • Both the Endpoints have also been tested and both are successful in connecting to their respective targets
  • Provide a valid name to the migration task and the also add a description for it
  • Select the DMS replication instance which we created in the Step 2
  • Select the Source Endpoint and the Target Endpoint which we created Step 4 and Step 5
  • In the Migration type select the option of Migrate Existing Data
  • You can keep all the Task settings default and if required then you can enable the Cloudwatch Logs just to debug in case of any failures
  • In Table Mapping click on the Add new selection rule button and then select Enter a Schema and add the schema name which you want to migrate (Note : If you want to migrate all the schema then just add modulas (%) symbol)
  • Keep all the other configurations default and add tags if required and then click on Create Task
Create DMS Migration Task - Image 1
Create DMS Migration Task - Image 2
Create DMS Migration Task - Image 3
Create DMS Migration Task - Image 4
  • All the steps remains exactly same which we did in our last step, just one thing which we will have to select in the Migration Type drop down as Replication data changes only which is nothing but CDC
Create DMS CDC Migration Task - Image 1
Important Note : When the data source is a MySQL database then for DMS replication instance to capture the changes in the database you need to make changes to the source database configurations as suggested above in the red box

Conclusion

--

--

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store
Rajas Walavalkar

Associate Technical Architect at Quantiphi Analytics. AWS & GCP Certified. Worked on ETL, Data Warehouses, Big Data (AWS Glue, Spark), BI & Dashboarding (D&A)