How the pandas and docx libraries can streamline repetitive COVID-19 communications / by Daniel Berry

During the first several months of the COVID-19 pandemic, my colleagues and I were responsible for creating communications packets for a client with close to 2,000 retail locations across the U.S. These packets helped inform regional vice presidents, district managers, store managers and store employees about confirmed cases of COVID-19, and included information regarding how to communicate the information to store employees, what the company was doing to protect employees and customers, and reminders about best practices for maintaining a sanitary and safe working environment. 

Many of these packets were repetitive, and we filled them out in a clear, logical order. To complete them, we consulted several Excel files containing information such as the names of regional, district and store leadership, the address of the store in question and how many cases the store had had previously.

Given my recent experience with data frames, importing and exporting files using Python, and building functions, I believe that the creation of these materials could be automated, which would streamline the process and allow the team to get the communications out to the necessary individuals faster. The pandas and docx libraries would be the most important tools for building this process.

Below, I’ve outlined how I think these libraries could be used to create functions that would have automated much of the communications packet creation for us, as well as the considerations that would go into each step.

Pandas

As I mentioned, when we created these packets, my team and I had to look through several different Excel files to get the information we needed. Using pandas, I would read those files into data frames and merge them, mapping regional vice president, district manager and store manager names onto their respective regions, districts and stores. I would then save the merged data frame to a file such as store-lookup.csv, which would be easily accessible and could be used by a function to find the information we would use in the memos.
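The merge step could be sketched roughly as follows. The three small data frames are hypothetical stand-ins for the Excel files (in practice each would come from pd.read_excel), and every column name and value is an assumption made for illustration rather than the client's actual layout.

```python
import pandas as pd


def build_store_lookup(stores, districts, regions):
    """Merge the three tables so each store row carries its full leadership chain."""
    # Left merges keep every store, even if a district or region row is missing.
    return (
        stores
        .merge(districts, on="district", how="left")
        .merge(regions, on="region", how="left")
    )


# Hypothetical sample data standing in for the Excel files.
stores = pd.DataFrame({
    "store_number": [101, 102, 201],
    "district": ["D1", "D1", "D2"],
    "address": ["1 Main St", "2 Oak Ave", "3 Pine Rd"],
    "store_manager": ["A. Lee", "B. Cruz", "C. Shah"],
})
districts = pd.DataFrame({
    "district": ["D1", "D2"],
    "region": ["East", "West"],
    "district_manager": ["D. Kim", "E. Ortiz"],
})
regions = pd.DataFrame({
    "region": ["East", "West"],
    "regional_vp": ["F. Novak", "G. Patel"],
})

store_lookup = build_store_lookup(stores, districts, regions)
```

From here, store_lookup.to_csv("store-lookup.csv", index=False) would write the lookup file described above, with one row per store.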

Each time we had to enter a new confirmed case into the system, we could use Python's input function to collect the details into a dictionary and then append that dictionary as a new row to a second data frame housing all of the confirmed case information, which could be saved as something along the lines of confirmed-case-log.csv. This would keep a running log of the region, state, store number, date of notification and any other information we might need for future reference.
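A minimal sketch of that logging step might look like this. The column names, the prompt wording and the sample rows are assumptions; prompt_for_case is defined but not called here, so the example runs without keyboard input.

```python
import pandas as pd

# Hypothetical columns for the case log.
LOG_COLUMNS = ["region", "state", "store_number", "date_notified"]


def prompt_for_case():
    """Collect one confirmed case from the keyboard as a dictionary."""
    return {
        "region": input("Region: ").strip(),
        "state": input("State: ").strip(),
        "store_number": int(input("Store number: ")),
        "date_notified": input("Date of notification (YYYY-MM-DD): ").strip(),
    }


def add_case(case_log, case):
    """Return a new log with the case appended as the last row."""
    new_row = pd.DataFrame([case], columns=LOG_COLUMNS)
    return pd.concat([case_log, new_row], ignore_index=True)


# An existing log with one made-up entry, then a second case added to it.
case_log = pd.DataFrame(
    [{"region": "East", "state": "NY", "store_number": 101,
      "date_notified": "2020-04-02"}],
    columns=LOG_COLUMNS,
)
case_log = add_case(
    case_log,
    {"region": "West", "state": "CA", "store_number": 201,
     "date_notified": "2020-04-09"},
)
```

In day-to-day use, the call would be add_case(case_log, prompt_for_case()), followed by case_log.to_csv("confirmed-case-log.csv", index=False) to persist the updated log.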

DocX

I recently found the docx library (distributed as python-docx) while considering ways to streamline the repetitive communications tasks that we had to perform during the pandemic. The library adds text to a document in much the same way that we might print information in Python. Once the text of the memos and the other communications materials had been settled upon with the client, we could use the input function to collect the values needed to look up a store in the store-lookup.csv file. We could then use f-strings to insert the names of the regional vice presidents, district and store managers, and store addresses into the documents as appropriate.
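As a rough illustration, the f-string and docx steps could be combined along these lines. The memo wording, the field names and the sample store record are all placeholders, not the client's approved text; the docx calls used are Document, add_heading, add_paragraph and save.

```python
def build_memo_lines(store):
    """Fill a placeholder memo template with one store's details using f-strings."""
    return [
        f"To: {store['regional_vp']}, {store['district_manager']}, "
        f"{store['store_manager']}",
        f"Re: Confirmed COVID-19 case at store {store['store_number']}",
        f"Location: {store['address']}",
    ]


def write_memo(store, path):
    """Write the filled-in memo to a Word document at the given path."""
    # The package is installed as "python-docx" but imported as "docx".
    # It is imported here so build_memo_lines can be tried without it installed.
    from docx import Document

    document = Document()
    document.add_heading("Confirmed Case Notification", level=1)
    for line in build_memo_lines(store):
        document.add_paragraph(line)
    document.save(path)


# Hypothetical record, shaped like one row of store-lookup.csv.
sample_store = {
    "store_number": 201,
    "address": "3 Pine Rd",
    "store_manager": "C. Shah",
    "district_manager": "E. Ortiz",
    "regional_vp": "G. Patel",
}
memo_lines = build_memo_lines(sample_store)
```

Calling write_memo(sample_store, "store-201-memo.docx") would then produce the Word file. Keeping the text-building step separate from the file-writing step makes it easier to check the wording before any document is created.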

Additionally, we could keep two sets of talking points for each communication: one for a store's first confirmed case, and one for a store that had already had a confirmed case (i.e., one that already had an entry in the confirmed-case-log.csv file). Using if/elif/else statements, the appropriate wording could be placed in the file as needed, and the finished document could then be exported as a whole to the directory of choice.
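The branching could be sketched as below, assuming the log has a store_number column and that the lookup runs before the new case is appended to the log. The talking-point sentences are placeholders for whatever wording the client approved.

```python
import pandas as pd


def choose_talking_points(case_log, store_number):
    """Pick placeholder wording based on how many cases the store already has logged."""
    prior_cases = int((case_log["store_number"] == store_number).sum())
    if prior_cases == 0:
        return "This is the first confirmed case at this store."
    elif prior_cases == 1:
        return "This store has had one previous confirmed case."
    else:
        return f"This store has had {prior_cases} previous confirmed cases."


# Made-up log: store 101 appears twice, store 201 once, store 102 never.
sample_log = pd.DataFrame({
    "store_number": [101, 201, 101],
    "date_notified": ["2020-04-02", "2020-04-09", "2020-04-15"],
})

first_case_text = choose_talking_points(sample_log, 102)
repeat_case_text = choose_talking_points(sample_log, 101)
```

The returned string could be passed to add_paragraph in the docx step, so the same memo function serves both first-time and repeat cases.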

Ultimately, given the urgency with which these packets have to be communicated to store employees, and the repetitive nature of the packets, creating a program to create and export these packets quickly could significantly reduce the overall time spent on them, creating value for both the client and the account team.