Social Scientific Applications of Historical GIS, Part 2: Linking and Visualizing Population Data in ArcGIS

This post has been contributed by Emre Amasyali. Information about the author is included at the end of the post. The first part of this contribution can be read here.

In Part 1 of this exercise we went over how you may import a digitized image, georeference it and record administrative boundary information contained in the map. The shapefiles that we created now have geographic information ascribed to them. Yet, this is all they have. In Part 2, I will go over how one might add population information to these shapefiles and how we can visualize this information. Central to accomplishing this goal is making sure we can identify each shapefile accurately.

Step 1: Identifying Shapefiles Accurately

Each shapefile I created in Part 1 has a unique FID (Feature ID; to see it, right-click the shapefile in the Table of Contents → Open Attribute Table). This is an important identifier and should not be dropped. To this unique identifier we must add a naming convention that will allow us to know which district each shapefile belongs to. To this end, I will run a simple Python script. Copy the code below into a plain-text editor such as Notepad or WordPad (in Windows), change the directory to suit your own PC, and save the file with a .py extension, i.e., as a Python file.

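import arcpy
from arcpy import env

# Folder that holds the district shapefiles created in Part 1
env.workspace = r"C:\Users\damasy\Dropbox\Digital Orientalist\Borders"

for shapefile in arcpy.ListFeatureClasses("*"):
    # Use the file name without its extension as the district label
    name = shapefile.split(".")[0]
    arcpy.AddField_management(shapefile, "RTENO", "TEXT")
    # Write the quoted name into RTENO for every feature in the shapefile
    arcpy.CalculateField_management(shapefile, "RTENO", '"' + name + '"', "PYTHON")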

Change the directory above to your workspace location, save the script, close ArcGIS, and run the script from Command Prompt or Terminal. When you open ArcGIS again, right-click one of your shapefiles and open the attribute table; you should now see a new column, RTENO.
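To see which RTENO value each shapefile will receive, the naming rule can be reproduced without arcpy. The following is only a sketch: the function name `preview_rteno` is my own, and it assumes the district shapefiles sit directly in one folder with a lowercase .shp extension.

```python
import glob
import os


def preview_rteno(workspace):
    """Return {shapefile name: RTENO value} for every .shp file in `workspace`."""
    mapping = {}
    for path in sorted(glob.glob(os.path.join(workspace, "*.shp"))):
        shapefile = os.path.basename(path)
        # Same rule as the arcpy script: keep the text before the first dot.
        mapping[shapefile] = shapefile.split(".")[0]
    return mapping
```

Calling `preview_rteno` on your own workspace folder returns a dictionary you can print and compare against your district list. Note that a file name containing an extra dot would be cut at the first dot, so it may be worth renaming such files beforehand.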

Now we are able to identify which FID belongs to which district. As a final step, we will merge all of our shapefiles into one large shapefile so that we can export it. To do this, click the search icon and search the term “merge” and click on “Merge (Data Management).”

Select all of your districts from your Table of Contents on the left side of the screen and drag-drop them into the “Input Datasets” box in the Merge tool window (see below). Name your output dataset something that makes sense; I will use: “AllKazas_Erzurum.”
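If you prefer scripting to drag-and-drop, the same merge can be sketched with arcpy's Merge tool. This is an untested sketch rather than the method used in this post: the function names are hypothetical, and it assumes the district shapefiles are the only other feature classes in the workspace folder. The small helper keeps the merged output out of the input list, which matters if you run the script a second time.

```python
def merge_inputs(feature_classes, output_name):
    """Drop the merge output from the input list so a re-run does not merge it into itself."""
    stem = output_name.lower()
    return [fc for fc in feature_classes if fc.split(".")[0].lower() != stem]


def merge_districts(workspace, output_name="AllKazas_Erzurum"):
    """Merge every district shapefile in `workspace` into one shapefile."""
    import arcpy  # imported here so the helper above also works without ArcGIS

    arcpy.env.workspace = workspace
    inputs = merge_inputs(arcpy.ListFeatureClasses("*"), output_name)
    # Merge_management(inputs, output) writes the combined features to `output`.
    arcpy.Merge_management(inputs, output_name + ".shp")
    return inputs
```

Like the earlier script, `merge_districts` would need to be run with the Python installation that ships with ArcGIS, passing your own workspace path.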

When you are done, click OK. The resulting image should look like this:

You will notice that if you right click on your new merged shapefile, “AllKazas_Erzurum,” in the Table of Contents and click on Open Attribute Table, the column RTENO will be preserved. Stay in this window. Click the dropdown arrow on the top left corner (seen below) and select “Export…” → Choose Folder → Choose Name → Save File as “text”.

The next step requires us to open and save the text file in Excel. To do this, open Excel, select the “Data” ribbon, and click on “Get External Data.” Select “From Text” and choose the file we just exported. Select “Delimited” and click Next. In the next window, check “Comma” as the delimiter, then click Next and Finish. When Excel asks where to put the data, make sure the first cell of your spreadsheet is selected and click OK. Save this file as .csv in your GIS project folder.
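The Excel round trip can also be skipped with a few lines of Python. The sketch below assumes the exported text file is comma-delimited, as in the import steps above, and that it is run with Python 3 outside ArcGIS; the function name `text_table_to_csv` is hypothetical.

```python
import csv


def text_table_to_csv(txt_path, csv_path, delimiter=","):
    """Copy a delimited text export to a .csv file and return its header row."""
    with open(txt_path, newline="") as src:
        rows = list(csv.reader(src, delimiter=delimiter))
    with open(csv_path, "w", newline="") as dst:
        csv.writer(dst).writerows(rows)
    # The header row lets you confirm that FID and RTENO survived the export.
    return rows[0] if rows else []
```

The returned header is a quick way to check that both identifier columns are present before any census figures are added.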

Step 2: Merging Non-Spatial Data

Now that we have our map spatially located and administrative borders drawn, the next step is merging historical census information with this spatial data. This will involve using the .csv file that we created, which includes both district names and FIDs.

To this .csv file we will now add census information. This information might be based on archival sources, historical databases or secondary sources that include such data.

You will need to have your census data transcribed to an Excel spreadsheet. I will use the 1893 Ottoman Census because it was created around the same time as the Cuinet map. It was also the first Ottoman Census to count both men and women, and categorize the population according to religious affiliation. Using this data, you may want to calculate new variables to include in later analyses. This is possible in ArcGIS, but easier and faster in Excel. For this example, I calculate the Armenian share of the population for each district in Excel.
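For readers who would rather script this calculation than do it in Excel, the share is simply the Armenian population divided by the total population of each district. The sketch below is illustrative only: the column names `Armenian`, `Total` and `ArmShare` are assumptions, not the headers of the census tables.

```python
def armenian_share(armenian, total):
    """Return the Armenian share of a district's population as a fraction of 1."""
    if total <= 0:
        raise ValueError("total population must be positive")
    return armenian / total


def add_share(rows, armenian_col="Armenian", total_col="Total", out_col="ArmShare"):
    """Add a rounded share column to each row (one dict per district)."""
    for row in rows:
        share = armenian_share(float(row[armenian_col]), float(row[total_col]))
        row[out_col] = round(share, 4)
    return rows
```

With invented figures, a district of 1,000 people of whom 250 are Armenian receives `ArmShare` = 0.25. Raising an error on a zero or negative total is a design choice: it surfaces transcription mistakes instead of silently writing a meaningless share.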

The next step is merging the .csv file with your population figures. Given the small number of cases in this province, I will do this manually in Excel. If you have a larger dataset, you will need to get creative with your merging techniques. For my own projects, I use Stata and create a .do file that assigns a unique code to each district, which I later use for merging. When you have merged both files, save the final product in your project folder as Population.csv.
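For larger datasets, one possible approach is to match the two tables on the district name. The sketch below is not the Stata workflow mentioned above; it assumes both tables have been read into lists of dictionaries (for example with `csv.DictReader`) and share a column named `RTENO`, and the function name is hypothetical. Names are compared after trimming spaces and ignoring case, and districts without a census match are reported rather than dropped silently.

```python
def merge_on_key(attribute_rows, census_rows, key="RTENO"):
    """Attach census columns to attribute rows that share the same key value."""
    census_by_key = {row[key].strip().lower(): row for row in census_rows}
    merged, unmatched = [], []
    for row in attribute_rows:
        match = census_by_key.get(row[key].strip().lower())
        if match is None:
            # Keep track of districts that have no census record.
            unmatched.append(row[key])
            continue
        combined = dict(row)
        combined.update({k: v for k, v in match.items() if k != key})
        merged.append(combined)
    return merged, unmatched
```

Any name that lands in the `unmatched` list usually points to a spelling or transliteration difference between the map and the census source, which is worth resolving by hand.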

Step 3: Joining Non-Spatial Data to Existing Spatial Data

Open ArcGIS once again, click “Catalog,” right-click your project folder and click “Refresh.” The Population.csv file we have added should now appear here. Drag this file to the Table of Contents.

Now we have to join this spreadsheet to our merged shapefile “AllKazas_Erzurum.” Right-click the merged shapefile (AllKazas_Erzurum) and select Joins and Relates → Join. Choose Population.csv as your table to join, select FID in both dropdowns and click OK.
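A join on FID only behaves as expected when every FID appears exactly once in Population.csv. A minimal check, assuming the join column is literally named `FID` and the script is run with Python 3, might look like this (the function name is hypothetical):

```python
import csv
from collections import Counter


def check_join_key(csv_path, key="FID"):
    """Return (duplicated key values, number of blank keys) for a CSV file."""
    with open(csv_path, newline="") as f:
        values = [row[key].strip() for row in csv.DictReader(f)]
    counts = Counter(v for v in values if v)
    duplicates = sorted(v for v, n in counts.items() if n > 1)
    blanks = sum(1 for v in values if not v)
    return duplicates, blanks
```

An empty list of duplicates and zero blanks suggest the table is safe to join; anything else is easier to fix in the spreadsheet than to diagnose in the joined attribute table.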

After some processing time, your population information should be joined to your shapefile. To double-check, right-click the shapefile and open the attribute table. You should now be able to see the population figures in your shapefile attributes like this:

Step 4: Visualizing Results

As one final step, I would like to demonstrate how we may use ArcGIS to visualize our information. To do this, right click your AllKazas_Erzurum shapefile → Properties → Symbology → Quantities → Value → Armenian Share. You can play around and set the breaks and colour code to your liking. At the end your image should look something like this:

So that’s it! If you have made it this far in the exercise, you should have an understanding of how to start your own HGIS dataset. All it takes is practice and repetition. You can also use the logic presented here to merge additional non-spatial information. Social scientific databases often include additional spatial correlates of development such as distance to lakes, ports, borders etc. Those variables are easier to obtain, since they can be calculated through spatial computations in ArcGIS. But all that for another time…

About the Author

Emre Amasyali is a PhD candidate at the Sociology Department at McGill University. Amasyali’s research examines topics of political sociology, comparative historical sociology and history and combines qualitative methods with statistical analysis, geospatial analysis, and archival research.
