You can create 20 GSI indexes for DynamoDB tables as of the time we build this workshop. However, sometimes our application may need to support more access patterns than that and exceed the current limit of per-table GSI index tables. From this, the GSI Key Overloading index table design pattern is generated by selecting and reusing an attribute (column header) among the items and storing the value of that attribute depending on the context of the attribute type. .
When you create a GSI Index table on that property, you are indexing multiple access patterns, each for a different item type — and using only 1 GSI Index table. For example, with the Employees table, an employee can contain items of type metadata (for details about the employee), employee-title (all job titles that the employee has held), or employee-location ( all office buildings and locations where the employee worked).
The access patterns required for this scenario include:
Query all employees of a table
Query all employees with a specific title currently
Query all employees who used to have a specific title
Query all employees by name
The figure below shows the design of Employees table. Attribute PK (Parition key) has a value that is a combination of employeeID with the prefix e. The pound sign (#) is placed between the entity type identifier (e) and the actual ID of the employee employeeID. The SK (Soft key) attribute acts as an overloaded attribute and its value can be current title, previous title, or the root keyword indicating this is a primary item of the Employee table, which has values in the most important attributes The GSI_1_PK attribute includes the employee’s title or name. Reusing the GSI Index table for multiple entity types such as employees, employee locations, and employee titles allows us to simplify DynamoDB table management because we only need to track and lose cost for one Index table sub GSI instead of three separate indexes.
aws dynamodb create-table --table-name employees \
--attribute-definitions AttributeName=PK,AttributeType=S AttributeName=SK,AttributeType=S \
AttributeName=GSI_1_PK,AttributeType=S AttributeName=GSI_1_SK,AttributeType=S \
--key-schema AttributeName=PK,KeyType=HASH AttributeName=SK,KeyType=RANGE \
--provisioned-throughput ReadCapacityUnits=100,WriteCapacityUnits=100 \
--tags Key=workshop-design-patterns,Value=targeted-for-cleanup \
--global-secondary-indexes "IndexName=GSI_1,\
KeySchema=[{AttributeName=GSI_1_PK,KeyType=HASH},{AttributeName=GSI_1_SK,KeyType=RANGE}],\
Projection={ProjectionType=ALL},\
ProvisionedThroughput={ReadCapacityUnits=100,WriteCapacityUnits=100}"
aws dynamodb wait table-exists --table-name employees
The Base table is named employees
The table’s Partition key is the PK attribute and it contains the employee ID value.
The Sort key of the Base table is the SK attribute, with the value being processed in the script.
The Index table named GSI_1 - is the GSI Overloading Index table used in the next step
The Partition key of the GSI Index table is GSI_1_PK, and is equal to the value of the SK attribute in the Base table.
Sort key of table GSI Index is GSI_1_SK whose value is Employee’s Name, attribute Name
Key schema: HASH, RANGE (partition and sort key)
Table read capacity units (RCUs) = 100
Table write capacity units (WCUs) = 100
Global secondary index: GSI_1 (100 RCUs, 100 WCUs) - Allows querying by host’s IP address
Attribute Name (Type) | Special Attribute | Attribute Use Case | Sample Attribute Value |
---|---|---|---|
PK (STRING) | Partition Key | Employee ID | e#129 |
SK (STRING) | Sort key | Derived value | root, state#MI |
GSI_1_PK (STRING) | GSI_1 partition key | Derived value | root, state#MI |
GSI_1_SK (STRING) | GSI_1 sort key | Employee name | Christine Milsted |
python load_employees.py employees ./data/employees.csv
1000,Nanine Denacamp,Programmer Analyst,Development,San Francisco,CA,1981-09-30,2014-06-01,Senior Database Administrator,2014-01-25
The output of the load command is similar to the following:
python load_employees.py employees ./data/employees.csv
employee count: 100 in 3.7393667697906494
employee count: 200 in 3.7162938117980957
...
employee count: 900 in 3.6725080013275146
employee count: 1000 in 3.6174678802490234
RowCount: 1000, Total seconds: 36.70457601547241
The output shows that 1000 records have been written to the table. Go to DynamoDB Console, and then click view employees table.
Execute a query to find all employees working at Washington headquarters by running the script query_employees.py, which contains the code to query the table using an overloaded global secondary index
python query_employees.py employees state 'WA'
Query command results
List of employees with WA in the attribute state:
Employee name: Alice Beilby - hire date: 2014-12-03
Employee name: Alla Absalom - hire date: 2015-06-25
Employee name: Alvan Heliar - hire date: 2016-05-15
Employee name: Anders Galtone - hire date: 2015-12-22
Employee name: Ashil Hutchin - hire date: 2015-02-11
...
Employee name: Sula Prattin - hire date: 2014-01-11
Employee name: Vittoria Edelman - hire date: 2014-10-01
Employee name: Willie McCuthais - hire date: 2015-05-27
Total of employees: 46. Execution time: 0.13477110862731934 seconds
python query_employees.py employees current_title 'Software Engineer'
Query results
List of employees with Software Engineer in the attribute current_title:
Employee name: Alice Beilby - hire date: 2014-11-03
Employee name: Anetta Byrne - hire date: 2017-03-15
Employee name: Ardis Panting - hire date: 2015-08-06
Employee name: Chris Randals - hire date: 2016-10-27
Employee name: Constantine Barendtsen - hire date: 2016-06-10
Employee name: Eudora Janton - hire date: 2015-01-05
Employee name: Florella Allsep - hire date: 2015-03-31
Employee name: Horatius Trangmar - hire date: 2013-10-21
Employee name: Korey Daugherty - hire date: 2016-11-03
Employee name: Lenka Luquet - hire date: 2014-10-01
Employee name: Leonora Hyland - hire date: 2016-06-14
Employee name: Lucretia Ruffell - hire date: 2015-07-04
Employee name: Malcolm Adiscot - hire date: 2014-04-17
Employee name: Melodie Sebire - hire date: 2013-08-27
Employee name: Menard Ogborn - hire date: 2014-06-27
Employee name: Merwyn Petters - hire date: 2014-06-19
Employee name: Niels Buston - hire date: 2014-10-30
Employee name: Noelani Studde - hire date: 2015-03-30
Total of employees: 18. Execution time: 0.11937260627746582 seconds
python query_employees.py employees name 'Dale Marlin'
Query results
List of employees with Dale Marlin in the attribute name:
Employee name: Dale Marlin - hire date: 2014-10-19
Total of employees: 1. Execution time: 0.1274700164794922 seconds