This post is the technical accompaniment to Multilevel Models to Explore the Impact of the Affordable Care Act’s Shared Savings Program, Part I.
The Affordable Care Act encompasses a host of programs and provisions, which are the subject of much discussion and debate right now. This post offers a nonpartisan, genuinely curious exploration of one of the Affordable Care Act’s less frequently debated programs, the Shared Savings Program (SSP). The program’s objective is to reduce Medicare costs and increase the quality of care provided to Medicare patients. In this short sequence of posts, I explore whether the program looks to be meeting these objectives. Given that this topic may appeal to both non-technical and technical audiences, I’ve split the posts into a higher-level description of the findings (part I) and a more technical post with code (part II).
This post demonstrates how to map change in a variable over time in a geographic area, allowing the user to scroll through time and selectively view dates of interest. It produces an interactive choropleth map, as the last post did, but whereas the last post was interactive in the sense that the user could zoom in on a specific geographic area, this map is interactive in the sense that the user can ‘zoom in’ on a specific point in time.
This map: In the wake of Hurricane Katrina, multiple New Orleans committees generated plans to rebuild the city; in some cases, these plans involved shifting the city’s footprint to move citizens out of more topographically vulnerable areas. The sequence of maps produced here answers the question: how quickly did various New Orleans zip codes re-populate after Hurricane Katrina, and how does the city’s current address density relate to pre-Katrina levels?
If you've needed to perform the same sequence of tasks or analyses over multiple units, you've probably found for loops helpful. They aren't without their challenges, however - as the number of units increases, the processing time increases. For large data sets, the processing time associated with a sequential for loop can become so cumbersome and unwieldy as to be unworkable. Parallel processing is a really nice alternative in these situations. It makes use of your computer's multiple processing cores to run the for loop code simultaneously across your list of units. This post presents code to:
- Perform an analysis using a conventional for loop.
- Modify this code for parallel processing.
To illustrate these approaches, I'll be working with the New Orleans, LA Postal Service addresses data set from the past couple of posts. You can obtain the data set here, and code to quickly transform it for these analyses here.
The question we'll be looking to answer with these analyses is: which areas in and around New Orleans have exhibited the greatest growth in the past couple of years?
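To give a feel for the contrast between a conventional for loop and its parallel counterpart, here is a minimal sketch in Python. It is not the code from the post, which works in R. The zip codes, the counts, and the function names (`growth`, `growth_sequential`, `growth_parallel`) are all made up for illustration, and the growth measure (change from first to last count, as a fraction of the first) is just one assumed way to frame the question.

```python
from concurrent.futures import ProcessPoolExecutor

# Hypothetical data: active-address counts per zip code, oldest to newest.
COUNTS = {
    "70112": [1000, 1100, 1250],
    "70115": [5000, 5050, 5100],
    "70117": [2000, 2400, 3000],
}


def growth(item):
    """Return (zip_code, fractional growth from the first to the last count)."""
    zip_code, series = item
    return zip_code, (series[-1] - series[0]) / series[0]


def growth_sequential(counts):
    """Conventional for loop: process one unit after another."""
    results = {}
    for item in counts.items():
        zip_code, value = growth(item)
        results[zip_code] = value
    return results


def growth_parallel(counts, max_workers=2):
    """Same per-unit work, distributed across worker processes."""
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(growth, counts.items()))


if __name__ == "__main__":
    sequential = growth_sequential(COUNTS)
    parallel = growth_parallel(COUNTS)
    # Both approaches should agree; only the scheduling differs.
    assert sequential == parallel
    print(max(sequential, key=sequential.get))
```

With only three units the parallel version will likely be slower, since starting worker processes has a cost of its own. The payoff tends to appear when there are many units or when each unit takes a long time to process.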
This post, the last in a sequence of four, combines the code samples from the previous two posts and resolves the lingering issue that R interprets the column with the counts of active addresses as a character variable. In the code below, I set up the unique identifier for each zip code-parish combination and move the parish information into a variable before reshaping the data long, as this is more concise than trying to do these things afterwards, and leverages the unique identifier to reshape the data properly.
This post, the second in a sequence of four, works with the New Orleans active addresses dataset introduced in the last post and addresses the challenge of transposing the data long while preserving the date information.
The challenge is that the date information is spread out over two rows (one for year and one for month), and we want to make sure that when we flip the data long, the date information is connected to the correct values. Additionally, some of the year information is missing, so we'll also need to fill this in.
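The shape of this problem can be illustrated with a small, self-contained sketch in Python. It is not the post's own code, which is written in R, and the layout below is an assumed simplification of the real table: a year row with blanks where the year repeats, a month row, and one row per zip code. All values and the helper names `fill_forward` and `reshape_long` are hypothetical.

```python
# Hypothetical wide layout: the first row holds years (blank where the year
# repeats), the second row holds months, and each remaining row is a zip code.
year_row = ["zip", "2005", "", "2006", ""]
month_row = ["", "Jul", "Aug", "Jan", "Feb"]
data_rows = [
    ["70112", "1200", "1180", "400", "450"],
    ["70115", "5300", "5290", "4100", "4200"],
]


def fill_forward(values):
    """Replace blank entries with the most recent non-blank entry."""
    filled, last = [], ""
    for value in values:
        if value != "":
            last = value
        filled.append(last)
    return filled


def reshape_long(year_row, month_row, data_rows):
    """Return one (zip, year, month, count) record per zip-month cell."""
    years = fill_forward(year_row[1:])  # fill in the missing years first
    months = month_row[1:]
    records = []
    for row in data_rows:
        zip_code, cells = row[0], row[1:]
        # Pair each cell with the year and month from its own column,
        # so the date stays attached to the right value after the flip.
        for year, month, cell in zip(years, months, cells):
            records.append((zip_code, int(year), month, int(cell)))
    return records


long_records = reshape_long(year_row, month_row, data_rows)
for record in long_records[:4]:
    print(record)
```

Filling the years before reshaping is the key ordering choice in this sketch: once every column carries a complete year and month, each value can be matched to its date by position alone.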
The following couple of posts address a common data science challenge: when sourcing data over the internet or from disparate departments within an organization, it's often necessary to substantially reformat the data before analysis. The Excel table considered in these posts, available online courtesy of The Data Center, features monthly counts of active postal addresses, by zip code, in New Orleans during the decade after Hurricane Katrina.