Frequently Asked Questions

For an overview of the project, check the About page.

What's new, since the project launched in 2005?
Can I reproduce your maps?
What is the ultimate purpose of CommonCensus?
Once you have enough participants, will you shut down voting and produce a 'final map'?
Do you plan to expand CommonCensus internationally?
When will you include maps for the fourth question, or more regional or city maps?
Why do you require people's addresses? Why not just use ZIP codes?

Can you contribute for more than one address? What about multiple contributions from the same computer?
What about Alaska and Hawaii?
Why don't you make more detailed cultural distinctions?
How accurate are the maps?
Can you be more specific on how CommonCensus can help fight against gerrymandering?
Isn't the map skewed because of self-selection or multiple votes?
Can you include other sports leagues, like NASCAR, soccer, or women's sports?
What are you doing with the #2 and #3 sports team choices?
Are you making any money off of the project?
Do you have GIS polygon data available for download?
How often do you update the maps? Can you make it more frequent?
What software do you use to generate the maps? Do you use projections?
Can you show isolines around the cities to show their degrees of influence on particular areas?
I don't think your maps show the information well. Will you show the data better?
What is the exact math used to calculate the maps?
You had almost 32,000 contributors before. Why did the number suddenly drop by four thousand?

What's new, since the project launched in 2005?

It's August 2010 as I write this, having just transferred CommonCensus to a new hosting provider. And since site blogs weren't de rigueur back when I started CommonCensus, this first FAQ entry is where I'll have to stick everything that's new.

CommonCensus is now five years old, and has even been "officially" recognized in the 9th edition of De Blij's textbook Human Geography (page 291). Not bad, considering I've never taken a geography course in my life!

CommonCensus received around 45,000 votes in its first few months, and has received around 10,000 since. So it's not going to provide any definitive results anytime soon; it rather functions as a proof-of-concept of how "natural" boundary lines can be drawn in a scientific manner.

In the switch to a new web host, I've been able to automate the map-drawing process so that maps are now updated once per month instead of the previous sporadic schedule. (And let me tell you, it's not easy dusting off and reconfiguring computer code you haven't touched in five years!)

However, I don't currently have any great plans to expand the site further on its own. Of course, if there's an organization out there which has the resources to perform a larger survey and is interested in a partnership with CommonCensus, I'm certainly open to the idea.

I realize that the lists of sports teams are slightly out of date, that there's no option for Major League Soccer, that Alaska, Hawaii and Canada are not included... Sorry about that! But for now the site is just going to remain "as is"... including the rest of the FAQ. (Anything I've just said takes precedence over anything I say below!)

But a big thanks to everyone who's participated so far — the site would be nothing without you! — and to everyone who's new and will vote for the first time!

Cheers,
Michael

Can I reproduce your maps?

All map images are made available under the Creative Commons Attribution-NonCommercial License. If you put them on a blog, please copy the images to your server instead of referring to mine. And if you want to publish them commercially, I'll say yes; I just want to know first!

What is the ultimate purpose of CommonCensus?

Its purpose is simply to educate people about what geographic groups Americans 'feel' they belong to, as opposed to where politicians and post offices say they belong. This matters because every day people categorize Americans into geographical shapes that are not representative. Nobody has ever before created a detailed map of the 'sphere of influence' of every city on local, regional and national levels.

CommonCensus does not espouse any particular political viewpoint, but imagines that the kind of data it creates should be invaluable to the debate on Congressional redistricting, to anyone involved in urban planning or studying markets, and to anyone with an interest in geography or demographics. It is also the first systematic way to exactly determine the limits of a metropolitan area without relying on guesswork or one person's opinion, which could contribute to consistent measurement of metropolitan area populations across the world.

Once you have enough participants, will you shut down voting and produce a 'final map'?

Maps of increasing accuracy will be produced throughout the process, and the more votes we have, the more accurate the maps will be. Thus, there will be no 'final map', just a 'current map'. After all, cities grow and change: a new neighborhood might rise to prominence in New York City, or the sphere of influence of New Orleans might have changed after Hurricane Katrina. The CommonCensus project seeks to always have the most up-to-date data possible.

Do you plan to expand CommonCensus internationally?

Yes. Once the project has received enough data from participants to draw an accurate map of the US, I plan on expanding into Canada, the rest of the Americas, and Europe initially (I've been getting requests from football fans from England particularly). So far, I've been relying on freely available national data from the USGS that gives place names, latitude and longitude, and population. Canada is a problem because the freely available data doesn't include population, which my sorting algorithms need. In fact, the free Canada data comes from the US, because Canada itself charges for it. For parts of Europe and all of Asia, the problems are both translation and data. If you would be willing to translate parts of the site for free, and/or have the ability to provide me with a database of place names and coordinates that includes population data for any country, I look forward to speaking with you! Full acknowledgement and thanks will be given on the site.

When will you include maps for the fourth question, or more regional or city maps?

Right now (May 2006) I'm quite busy with other projects, and currently don't have a lot of time to dedicate to this site. However, as soon as the vote total reaches 100,000, I will do a bit of an overhaul of the site, and add a lot more maps at various levels, including the maps for the fourth question.

Why do you require people's addresses? Why not just use ZIP codes?

Using street addresses allows the exact latitude and longitude of your location to be calculated, and this is necessary in order to draw the boundaries between towns and neighborhoods at a local level. ZIP codes are not nearly accurate enough. Please note that the site does not ask for your name or email address. Only your location is identified, not you personally. You are encouraged to enter your own full street address, but if you would prefer not to, then please enter the name of your street without the number.

Can you contribute for more than one address? What about multiple contributions from the same computer?

The survey has two parts. Steps 1-3 regard your area, and Step 4 and the sports step are about you personally. If you would like to enter data for Steps 1-3 for more than one address (e.g. where you live, where you work, or where you used to live), that is more than welcome. In other words, first do the whole survey for where you live now. Then, when you do it again for a different address and get to Step 4, just go back to the home page by clicking the site logo at the top of the page. This prevents you from giving your personal data more than once. Even though you don't "finish" the survey, the data is still recorded and will be used.

Some people have noticed that I've had to delete a couple thousand responses from people using the same computer over and over to 'pad' some of the sports votes. Don't worry, I don't automatically delete multiple votes that come from the same computer. I only delete sets of votes that are obviously fraudulent, and there are many factors that go into determining that besides just the IP address.

What about Alaska and Hawaii?

So far, all of Alaska identifies with Anchorage, and all of Hawaii with Honolulu on the national scale. I'll put up maps for them once I start putting up the local area maps, after the vote total reaches the hundreds of thousands.

I identify with one major city culturally, another economically, and go to art exhibitions in another, but the format of your questions doesn't let me make these distinctions. Why not? Or why don't you ask people as well what TV stations they watch, or newspapers they read?

I want to keep the project simple and easy, so that as many people can participate as possible, and the process stays enjoyable. I experimented with various sets of questions in the beginning, and found that too many questions confused the project rather than helped it. Instead, the questions ask you to consider the influence of cities overall, and not in any particular aspects.

How accurate are the maps?

The national map, shown on the main page, covers a very large area, and a rough version can be produced with just tens of thousands of votes, even though the results are still not very accurate. To produce a decently accurate (but not perfect) map, responses in the hundreds of thousands will probably be necessary. The main page reports how many people have voted so far.

The second question in the survey concerns local areas, and I expect that to make even a suggestive map for much of the US will require votes in the hundreds of thousands. During the first year of the project (late 2005 and most of 2006), I expect rough initial maps for some of the most populated areas of the country to start appearing.

The first question concerns drawing the boundaries for local communities, and tens of votes will be required per community just to appear, and hundreds or thousands to draw reasonable boundaries. For extremely dense urban areas, a few maps will be possible in the near future. As to a comprehensive map of local community areas in the US, don't be in too much of a hurry.

Can you be more specific on how CommonCensus can help fight against gerrymandering?

Once the maps for local areas start to become available (see previous question), we will be able to see the 'natural' shapes that people segment themselves into, at something more-or-less analogous to a county level. The shapes of current Congressional districts will be overlaid onto these maps, so that the two sets of shapes can be compared. Maps of Congressional districts are already available, with some really bizarre shapes for some of them, but unfortunately there is often no common-sense alternative to compare them to visually, in terms of areas and shapes. I believe that having Congressional districts laid over the natural population shapes will be a powerful image. For more information on gerrymandering and examples of particularly egregious shapes, see the excellent Wikipedia article on gerrymandering.

How do you prevent people from voting more than once and skewing the results? Why don't you make people register? Isn't the map inherently flawed because it relies on a self-selected sample?

Contributions are recorded along with their exact coordinates, IP address and time. Thus, if there are 10 or 100 votes in a short period of time from the same computer for exactly the same set of sports teams, they're going to be deleted. Whatever strategy is used for inventing street "addresses" (or lack thereof) in an area also stands out clearly from the normal distribution, and I'm not just talking about ones like the "123 Main St", "234 Main St", "345 Main St" sequence I found... When you look at the data, it's blindingly obvious when the same person is trying to promote one or more teams. The fact that the survey is several pages long probably deters it, but I've already had to delete almost 2,000 votes in about twenty cases of "ballot-stuffing". I'm flattered by the "effort" these people have made, but it's wasted time. Just because it's computerized doesn't mean a human isn't watching over it, guys...
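To make the "same computer, same teams, short period" check above a little more concrete, here is a minimal Python sketch. It is only an illustration, not the code used on the site: the function name, the ten-vote minimum and the one-hour window are all assumptions, and in practice the deletions described above also rely on other factors and on a human looking at the data.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def flag_suspicious_bursts(votes, min_votes=10, window=timedelta(hours=1)):
    """Return the (ip, teams) keys that cast at least `min_votes` identical
    votes inside any time span of length `window`.

    `votes` is an iterable of (ip, timestamp, teams) tuples. The thresholds
    are hypothetical defaults, not values taken from the site.
    """
    times_by_key = defaultdict(list)
    for ip, when, teams in votes:
        # Treat the team choices as a set: order does not matter.
        times_by_key[(ip, tuple(sorted(teams)))].append(when)

    flagged = set()
    for key, times in times_by_key.items():
        times.sort()
        start = 0
        # Sliding window over the sorted timestamps.
        for end in range(len(times)):
            while times[end] - times[start] > window:
                start += 1
            if end - start + 1 >= min_votes:
                flagged.add(key)
                break
    return flagged


if __name__ == "__main__":
    t0 = datetime(2006, 1, 1, 12, 0)
    burst = [("10.0.0.1", t0 + timedelta(minutes=i), ("Mets", "Jets")) for i in range(10)]
    normal = [("10.0.0.2", t0, ("Cubs",)), ("10.0.0.2", t0 + timedelta(days=1), ("Cubs",))]
    print(flag_suspicious_bursts(burst + normal))
```

A flagged key is only a candidate for review. Several people legitimately sharing one computer would not normally produce ten identical ballots within an hour, but a rule like this cannot tell on its own.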

Certainly, some sports teams have been over- or under-represented on the maps and tables because of a particularly popular thread on a message board, or lack thereof. I hope that as the votes increase, things will even out reasonably. I want to avoid any kind of registration process because I want it to be as easy as possible for people to contribute, and I suspect a lot of people would avoid voting if I asked for email addresses.

Yes, the survey is self-selected for Internet users. The purpose of this project is not to create a reference map with perfect accuracy, but rather to create something interesting that opens people's eyes and makes them think. That being said, I doubt any skew on the main (non-sports) maps due to self-selection would be major. Is it likely that Internet users in New Jersey think they associate more with New York, while non-Internet users would associate with Philadelphia? It's not impossible, but I don't think it's very likely either. Also, the map only draws a boundary at the 'tipping point' between two areas. The responses within the two areas could be off by 20%, but the 'tipping point' will not move drastically. In any case, so far small sample size is a vastly larger source of inaccuracy than self-selection.

Can you include other sports leagues, like NASCAR, soccer, or women's sports?

Right now I'm limiting sports voting to the largest, most popular sports that are based on teams associated with cities. Depending on requests for other leagues, I may add more in the future. Contact me to give me an idea of what else people would like.

What are you doing with the #2 and #3 sports team choices?

So far, nothing. The option to give 2nd and 3rd place teams you support was added in December 2005, and there isn't enough data to show much yet. Time will tell if the data turns out to be meaningful geographically or not.

Are you making any money off of the project?

Not currently. The main purpose is to create something useful to the public, created by the public itself. I believe that if I'm asking people for their free contributions, then I should show them the results for free.

Once enough data has been received, I would like to make printed posters of the maps available for sale on the site, the profits of which will help support the cost of its development and hosting. However, the maps will always be available for free viewing on the site.

Do you have GIS polygon data available for download?

I have received some inquiries about making polygon data available for the maps for use in GIS applications. I don't think this is something useful right now, because the shapes are still changing so much. The data I produce is pixel-based, so it would take quite a lot of work on my part to convert the map data to polygon format. It seems that polygon files would be useful for government agencies and market researchers to combine with their own data. I may decide to do this, and this would be something I would charge for, because of the work it would involve. If you're interested, please contact me to let me know there's demand. Please note that this data would not contain addresses, just the shapes of areas. Again, the pixel-based maps will always be available for free viewing on the site.

How often do you update the maps? Can you make it more frequent?

Each 1280-pixel-wide map takes a few hours to generate on my computer, and as the data increases, it just gets slower. Daily updates are just not possible. Right now I try to update the maps as often as new activity warrants, which is generally every increase of 1,000 or 2,000 votes.

What software do you use to generate the maps? Do you use projections?

My own. A lot of people have asked why I don't use commercially available GIS software. The answer is that GIS software involves coloring regions (e.g. states, counties or towns) that already have set borders. My project is to draw those very borders, so I had to write something myself. It's an application I put together in Visual Basic 6 that accesses the data in a MySQL database to draw the map pixel-by-pixel, against a mask image of the US. I don't use map projections because it would be a headache to program by hand. Instead, I just plot by latitude and longitude, and expand the Y axis by 20% to get the proportions about right. It's not perfect, but it's not too bad when you're just dealing with part of a continent.
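As a rough illustration of plotting straight from latitude and longitude with a stretched Y axis, here is a small Python sketch. It is not the Visual Basic code itself: the function name and the bounding box (125°W to 66°W, top edge at 50°N) are assumptions chosen to cover the contiguous US, while the 1280-pixel width and the 20% stretch come from this FAQ.

```python
def latlon_to_pixel(lat, lon, west=-125.0, east=-66.0, north=50.0,
                    width_px=1280, y_stretch=1.2):
    """Map a latitude/longitude pair to (x, y) pixel coordinates.

    No map projection is used: longitude maps linearly to x, latitude maps
    linearly to y, and y is stretched by 20%. The bounding box defaults are
    hypothetical, not the site's real values.
    """
    px_per_degree = width_px / (east - west)
    x = (lon - west) * px_per_degree
    # Pixel rows grow downward, so measure latitude from the top edge.
    y = (north - lat) * px_per_degree * y_stretch
    return round(x), round(y)


if __name__ == "__main__":
    print(latlon_to_pixel(50.0, -125.0))  # top-left corner of the assumed box
    print(latlon_to_pixel(39.74, -104.99))  # roughly Denver
```

The stretch is needed because a degree of longitude covers less ground than a degree of latitude away from the equator, by a factor of the cosine of the latitude. A factor of 1.2 is roughly 1/cos(34°), so the proportions come out about right near the southern half of the contiguous US and drift a little further north.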

Can you show isolines around cities to show their degrees of influence on particular areas?

This is precisely why I ask people to rank how strong of an influence their major city has on their local area, on a scale of 1 to 5, in the third question. I intend to color five rings around each city, which will make the map even more interesting and informative. However, much more data from non-urban areas is needed to be able to draw five distinct rings, so we will have to wait for these contributions to come in before this can be shown on the map.

I don't think your maps show the information well. Can you show statistics tables, or dots for fans from individual teams, or draw separate maps for teams that share the same city (e.g. Mets/Yankees), or use a different algorithm?

The maps are a work in progress. Here are some of the deficiencies, and my take on them:

  1. Areas with very few votes, like Montana, or non-urban areas in general (like much of California or Arizona) can be extremely inaccurate, as just one or two people could make a city area appear or disappear. I expect this situation to improve as more people contribute, and this should stop being a problem once votes have reached the low hundreds of thousands.
  2. Showing the areas of coverage for cities or teams says nothing about the number of people who are there. Denver covers a huge area on the national map, and New York is quite small, even though there are far more people in New York. I've tried to improve this by showing the number of votes for each location next to the label.
  3. Perhaps most irritating is the fact that cities that have two sports teams are shown colored in entirely by the team with majority fans. The other team essentially disappears. I experimented with a kind of cross-hatching to show first- and second-place teams in any area, but the resulting map made things more confusing, not less. In the future, I plan on making a separate map available for each and every team, showing a gradient of the percent fans of just that team across the country. New: I've added an experimental 'sports hotspot' page that lets you see the exact statistics for an area, accessible from the main sports map pages. Thus, you can see for yourself the current proportions of all teams in a given area.

What is the exact math used to calculate the maps?

Iterate through each pixel on the map, and for each:

Step 1: Locate the n nearest contributions by latitude and longitude (n = 50 for the national maps; a smaller value is used for the regional and local maps until they have enough votes to make 50 work)
Step 2: Assign each contribution a 'vote strength' s = 1 / (1 + d ^ 2), where d is the distance from the pixel to the contribution. The scale used for d matters: d = 1 corresponds to 60 miles for the national city map, 10 miles for the regional maps, 4 miles for the sports maps (this makes them much more sensitive to local fluctuations), and 0.1 miles for the neighborhood maps
Step 3: Calculate the cumulative vote tally for the n contributions according to their votes and vote strengths. Because of the vote strengths, votes closer to the pixel will count more. Select the winner
Step 4: For the national and regional maps, color the pixel according to the winner. For the local and sports maps, color the pixel according to the winner only if the winning vote strength is above a particular threshold t, otherwise leave it gray. For neighborhood maps t = 3 (so only consensus-conglomerations show up); for sports maps t = 0.2 (so every isolated vote shows up)
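The four steps above can be sketched for a single pixel in Python. This is an illustration under stated assumptions rather than the actual Visual Basic code: the function names are made up, and the distance is computed with the haversine great-circle formula, which is a guess because the steps above do not say how distance is measured. The values of n, the distance scale and the threshold t are the ones given in the steps.

```python
import math
from collections import defaultdict

EARTH_RADIUS_MILES = 3958.8


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance in miles. Assumed distance measure."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def pixel_winner(pixel_lat, pixel_lon, contributions, n=50,
                 scale_miles=60.0, threshold=None):
    """Return the winning vote for one pixel, or None to leave it gray.

    `contributions` is a list of (lat, lon, vote) tuples.
    """
    # Step 1: keep the n contributions nearest to the pixel.
    nearest = sorted(
        ((distance_miles(pixel_lat, pixel_lon, lat, lon), vote)
         for lat, lon, vote in contributions),
        key=lambda pair: pair[0],
    )[:n]
    if not nearest:
        return None

    # Steps 2 and 3: weight each vote by s = 1 / (1 + d^2) and tally per choice.
    tally = defaultdict(float)
    for miles, vote in nearest:
        d = miles / scale_miles  # d = 1 corresponds to scale_miles
        tally[vote] += 1.0 / (1.0 + d ** 2)
    winner, strength = max(tally.items(), key=lambda item: item[1])

    # Step 4: with a threshold t, color only if the winning strength is above it.
    if threshold is not None and strength <= threshold:
        return None
    return winner


if __name__ == "__main__":
    votes = [(40.0, -105.0, "Denver"), (40.1, -105.0, "Denver"), (41.0, -96.0, "Omaha")]
    print(pixel_winner(40.0, -105.0, votes))                 # national-style map
    print(pixel_winner(40.0, -105.0, votes, threshold=3.0))  # neighborhood-style threshold
```

Sorting every contribution for every pixel is the simplest way to express Step 1, but it is slow for tens of thousands of votes; a spatial index such as a k-d tree would be the usual way to speed up the nearest-neighbor search.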

For a physical analogy, imagine that each contribution is a small mountain of a particular color (according to the vote). The mountain has a peak in the middle, and tails off in height the further away you go. On a flat surface cut out in the shape of the US, mountains of different colors get piled on top of other mountains, one per vote. The surface of the landscape is colored by whichever 'mountain mineral color' is most common below it (the winning vote). For the sports maps, the mountain landscape is filled with a lake of a particular height, so only the tips of mountains and mountain ranges appear, while the lake itself is colored gray.

You had almost 32,000 contributors before. Why did the number suddenly drop by four thousand?

The number was actually inflated somewhat, because I was counting people who hadn't finished the contribution process. I decided to change the count to reflect people that finished all four steps, so that the number would match the number of contributors automatically printed on the maps. Don't worry, all the data is still there! Except for the 2,000 or so votes that were repeatedly placed by a few individual enterprising sports fans trying to boost their team numbers.


Have a question or suggestion not mentioned here? Contact me

 
 

This website and all its contents copyright © 2005 Michael Baldwin

Map images are also covered by a Creative Commons Attribution-NonCommercial License — please share!