DynamoDB - Nejati Notes

*DynamoDB* is a serverless key-value OLTP database from AWS. Its main characteristics:

- Schema-less
- ACID compliant
- Encryption at rest
- Point-in-time recovery

### Data Model

1. **Table**: similar to other DBs, a table is a collection of *items*. Each table can have zero or more items.
2. **Item**: similar to a row, an item is a group of *attributes*.
3. **Attribute**: a name-value pair, the basic data element of an item. Values can be scalars, sets, or nested JSON-like documents.

![[Pasted image 20250217043202.png]]

### Primary Key

DynamoDB has two kinds of primary key:

1. **Partition Key**: a simple primary key made of a single *attribute* whose value is unique per item.
2. **Partition Key + Sort Key**: a composite primary key made of two attributes. Items may share a partition key, but the combination of both values must be unique.

![[Pasted image 20250217043009.png]]

> [!note]
> The partition key is also known as the *hash attribute*, because DynamoDB feeds its value to an internal hash function to decide where the item is stored.
> The sort key is also known as the *range attribute*, because DynamoDB stores items with the same partition key close together, sorted by the sort key value.

### Indexes

An index (also known as a *secondary index*) lets you query using an alternate key. DynamoDB supports two kinds of indexes:

- **Global secondary index**: an index with a partition key and sort key that can be different from those on the table. Can be created or deleted at any time.
- **Local secondary index**: an index that has the same partition key as the table, but a different sort key. Can only be created when the table is created.

> [!info]
> When a secondary index is used as a [sparse index](https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/bp-indexes-general-sparse-indexes.html), only the items that contain the index's key attribute appear in it, so a query on the index returns just that subset of items.

### Availability

Since DynamoDB runs on AWS, a table can be replicated to different locations in order to be near your users. Within a region, DynamoDB maintains several copies of each item in fault-independent zones.

### Read consistency

When you *write* a change to a table in DynamoDB, the change is first applied to *one* of the replicas and **eventually propagated** to the rest.
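
As a rough illustration of the write path just described, here is a toy in-memory model in Python. It is only a sketch of the idea of asynchronous replication, not how DynamoDB is implemented; the class name, the replica count of three, and the explicit `propagate` step are all assumptions made up for this example.

```python
class ToyReplicatedTable:
    """Toy model of asynchronous replication; NOT how DynamoDB is implemented."""

    def __init__(self, replica_count=3):
        # Each replica is a plain dict holding its own copy of the items.
        self.replicas = [{} for _ in range(replica_count)]
        # Writes that reached replica 0 but have not been copied to the others.
        self.pending = []

    def write(self, key, value):
        # The write is applied to one replica first...
        self.replicas[0][key] = value
        # ...and queued for the remaining replicas.
        self.pending.append((key, value))

    def propagate(self):
        # Stand-in for the background replication that eventually catches up.
        for key, value in self.pending:
            for replica in self.replicas[1:]:
                replica[key] = value
        self.pending.clear()

    def read(self, key, replica_index):
        # A read may be served by any replica, including a lagging one.
        return self.replicas[replica_index].get(key)


table = ToyReplicatedTable()
table.write("user#1", "v1")
table.propagate()
table.write("user#1", "v2")

print(table.read("user#1", 0))  # v2: this replica already has the new write
print(table.read("user#1", 2))  # v1: this replica has not caught up yet
table.propagate()
print(table.read("user#1", 2))  # v2: all replicas agree again
```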
In the meantime, if you *read* the changed data, the read can be served by any of the replicas, so an older version may still be returned. This is the default behaviour of DynamoDB and is called **eventually consistent reads**.

> [!tip]
> [DynamoDB Accelerator](https://aws.amazon.com/dynamodbaccelerator/) (DAX) is an in-memory, write-through cache in front of DynamoDB. Writes made through it update the cache as well, so later reads can be served from the cache. This also improves the read performance of DynamoDB.

There is another mode that makes sure a read responds with the most up-to-date data, reflecting the updates from all prior successful write operations. It *costs more* and is called **strongly consistent reads**.

> [!note]
> It is good practice to design the application around eventually consistent reads.

### Basic Item Requests

![[Pasted image 20250217045109.png]]

- With **Query**, the partition key *must* be given as an equality condition; a sort key condition is optional and narrows the match.
- By using a secondary index, data can be fetched by an attribute other than the primary key.

### DynamoDB Streams

Like Kinesis streams, DynamoDB can enable [streams](https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Streams.html) on a *table*. Whenever an item in the table changes, a record describing the change is appended to the stream, which acts as a changelog. Stream records are kept for 24 hours.

### TTL expiring

By setting a Time To Live (TTL) on a DynamoDB *item*, you can mark it as no longer relevant after a given time. DynamoDB then automatically deletes expired items within days of their expiration time, without consuming write throughput.

By using *streams*, expired items can be transferred to cold storage (e.g. S3) instead of being removed entirely. This pattern uses AWS Lambda to detect the deletions and Kinesis Firehose to transfer them to S3.

### Backup and Restore

With point-in-time recovery, DynamoDB continuously backs up *tables* for up to 35 days. A table can be restored to any point in time within this time span. Restoring a table can take up to 10 hours.
However, the restore time does not grow linearly with table size.

### Billing

DynamoDB bills for read and write throughput. The capacity mode can be either *provisioned* or *on-demand*.

### Workbench

To design a data model for DynamoDB and visualise it, [AWS NoSQL Workbench](https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/workbench.html) can be used. It is installed locally and can run a local version of DynamoDB on your machine. It also has the option to deploy the model to AWS.

### Design considerations

- For hot partition key reads, use DAX.
- For hot partition key writes, buffer the writes through AWS SQS queues.
- Plan ahead for hot partition key writes by using the [scatter-gather pattern](https://docs.aws.amazon.com/prescriptive-guidance/latest/cloud-design-patterns/scatter-gather.html).
- GSIs are often recommended over LSIs, unless there is a need for strong read consistency.
- Don't create secondary indexes you don't need. Every write to the table also has to be written to its indexes, so extra indexes can cause writes to be throttled. This is known as back-pressure.
- To make sure an item has not been changed by another writer since you read it, use [optimistic locking with version number](https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/DynamoDBMapper.OptimisticLocking.html).
- Use [one-to-many tables](https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/WorkingWithItemCollections.html) instead of a large number of attributes.

![[Pasted image 20250217061826.png]]
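
The optimistic-locking item above can be sketched by hand with a condition expression. The linked page describes the Java `DynamoDBMapper` feature; the Python below is a separate, minimal sketch of the same idea, not that library's implementation. The helper name `build_versioned_update`, the `version` attribute name, and the `pk` key are hypothetical choices for this example. The function only builds the request arguments, so it runs without AWS access.

```python
def build_versioned_update(key, new_attributes, expected_version, version_attr="version"):
    """Build keyword arguments for an UpdateItem call guarded by a version check.

    Hypothetical helper: the item is assumed to carry a numeric version attribute.
    """
    names = {"#ver": version_attr}
    values = {":expected": expected_version, ":next": expected_version + 1}
    # Always bump the version together with the real changes.
    assignments = ["#ver = :next"]
    for i, (attr, value) in enumerate(sorted(new_attributes.items())):
        # Placeholders avoid clashes with DynamoDB reserved words.
        names[f"#a{i}"] = attr
        values[f":v{i}"] = value
        assignments.append(f"#a{i} = :v{i}")
    return {
        "Key": key,
        "UpdateExpression": "SET " + ", ".join(assignments),
        # The write only succeeds if nobody bumped the version since we read it.
        "ConditionExpression": "#ver = :expected",
        "ExpressionAttributeNames": names,
        "ExpressionAttributeValues": values,
    }


kwargs = build_versioned_update(
    key={"pk": "user#1"},
    new_attributes={"email": "new@example.com"},
    expected_version=3,
)
print(kwargs["UpdateExpression"])     # SET #ver = :next, #a0 = :v0
print(kwargs["ConditionExpression"])  # #ver = :expected
```

With boto3, these arguments could be passed as `table.update_item(**kwargs)` on a `Table` resource. If another writer changed the item first, DynamoDB rejects the update with a `ConditionalCheckFailedException`, and the caller would typically re-read the item and retry.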